Higher order Attribute Grammars

Hogere orde Attributen Grammatica's (with a summary in Dutch)

Thesis submitted for the degree of doctor at the University of Utrecht, on the authority of the Rector Magnificus, Prof. Dr J.A. van Ginkel, pursuant to the decision of the College of Deans, to be defended in public on Monday 1 February 1993 at 2.30 p.m.

by

Harald Heinz Vogt, born on 8 May 1965 in Rotterdam

Promotor: Prof. Dr S.D. Swierstra, Faculty of Mathematics and Computer Science

Support has been received from the Netherlands Organization for Scientific Research (NWO) under grant NF 63/62-518, NFI-project "Specification and Transformation Of Programs" (STOP).
Contents

1 Introduction
  1.1 This thesis
    1.1.1 Structure of this thesis
  1.2 The description of programming languages
    1.2.1 Syntax and semantics
    1.2.2 Attribute grammars (AGs)
  1.3 Higher order attribute grammars (HAGs)
    1.3.1 Shortcomings of AGs
    1.3.2 HAGs and related formalisms

2 Higher order attribute grammars
  2.1 Attribute evaluation of HAGs
  2.2 Definition and classes of HAGs
    2.2.1 Definition of HAGs
    2.2.2 Strongly and weakly terminating HAGs
  2.3 Ordered HAGs (OHAGs)
    2.3.1 Deriving partial orders from AGs
    2.3.2 Visit sequences for an OHAG
  2.4 The expressive power of HAGs
    2.4.1 Turing machines
    2.4.2 Implementing Turing machines with HAGs

3 Incremental evaluation of HAGs
  3.1 Basic ideas
  3.2 Problems with HAGs
  3.3 Conventional techniques
  3.4 Single visit OHAGs
    3.4.1 Consider a single visit HAG as a functional program
    3.4.2 Visit function caching/tree caching
    3.4.3 A large example
  3.5 Multiple visit OHAGs
    3.5.1 Informal definition of visit functions and bindings
    3.5.2 Visit functions and bindings for an example grammar
    3.5.3 The mapping VIS
    3.5.4 Other mappings from AGs to functional programs
  3.6 Incremental evaluation performance
    3.6.1 Definitions
    3.6.2 Bounds
  3.7 Problems with HAGs solved
  3.8 Pasting together visit functions
    3.8.1 Skipping subtrees
    3.8.2 Removing copy rules

4 A HAG-machine and optimizations
  4.1 Design dimensions and performance criteria
  4.2 Static optimizations
    4.2.1 Binding optimizations
    4.2.2 Visit function optimizations
    4.2.3 Effect on amount of bindings in "real" grammars
  4.3 An abstract HAG-machine
    4.3.1 Major data structures
    4.3.2 Objects
    4.3.3 Visit functions
    4.3.4 The lifetime of objects in the heap
    4.3.5 Definition of purging and garbage collection
  4.4 A space for time optimization
    4.4.1 The pruning optimization
    4.4.2 Static detection
  4.5 Implementation methods for the HAG-machine
    4.5.1 Garbage collection methods
    4.5.2 Purging methods
  4.6 A prototype HAG-machine in Gofer
    4.6.1 Full and lazy memo functions
    4.6.2 Lazy memo functions in Gofer
    4.6.3 A Gofer HAG-machine
  4.7 Tests with the prototype HAG-machine
    4.7.1 Visit function optimizations versus cache behaviour
    4.7.2 Purge methods versus cache behaviour
  4.8 Future work and conclusions

5 Applications
  5.1 The BMF-editor
    5.1.1 The Bird-Meertens Formalism (BMF)
    5.1.2 The BMF-editor
    5.1.3 Further suggestions
    5.1.4 Conclusion
  5.2 A compiler for supercombinators
    5.2.1 Lambda expressions
    5.2.2 Supercombinators
    5.2.3 Compiling

6 Conclusions and future work
  6.1 Conclusions
  6.2 Future work
    6.2.1 HAGs and editing environments
    6.2.2 The new incremental evaluator
    6.2.3 The BMF-editor

References
Bibliography
Samenvatting
Curriculum Vitae
Acknowledgements
Chapter 1

Introduction

In recent years there has been an explosion in computer software complexity. One of the main reasons is the trend to use the incremental evaluation paradigm. This is also described in [RT87, TC90], on which parts of this introduction are based. In the incremental evaluation paradigm each modification of the input-data has an instantaneous effect on the output-data.

An example of incrementally evaluated systems are word-processors. In traditional batch-oriented word-processors the input-data consists of the textual data, interleaved with formatting commands. The output-data contains the page-layout and is only created when the input-data is processed by the document-processor. In modern desk-top publishing systems the page-layout is shown at all times and is modified instantaneously after each edit-action on the input-data. Another example of incremental evaluation is a spreadsheet. A spreadsheet consists of cells which depend on each other via arithmetic expressions. Changing the value of a cell causes all cells depending on the changed cell to be updated immediately. Other examples of incremental evaluation occur in drawing packages, incremental compilers and program transformation systems. The study of incremental algorithms has become very important because of the widespread use of incremental evaluation in modern programs.

Let f be a function and suppose the input-data is x. When incremental evaluation is used and x is changed into x′, then f(x′) is computed and f(x) is discarded. f(x′) could be computed from scratch, but this is usually too slow to provide an adequate response. What is needed is an algorithm that reuses old information to avoid as much recomputation as possible. Because the increment from x to x′ is often small, the increment from f(x) to f(x′) is frequently also small. An algorithm that uses information in the old value f(x) to compute the new value f(x′) is called incremental.

We can distinguish between two approaches to incremental evaluation: selective recomputation and finite differencing (also known as differential evaluation).
In selective recomputation, values independent of changed data are never recomputed. Values dependent on changed data are recomputed, but after each partial result is obtained, the old and new values of that part are compared; when changes die out, no further recomputations take place. In finite differencing, rather than recomputing f(x′) in terms of the new data x′, the old value f(x) is updated by some difference function δf: f(x′) = f(x) ⊕ δf(x′, x).

In traditional batch-mode systems, such as word-processors and compilers, items from the input-data are processed sequentially. In contrast, in systems that use incremental evaluation, data-items are inserted and deleted in arbitrary order. The absence of any predetermined order for processing data, together with the desire to employ incremental algorithms for this task, creates additional complexity in the design of systems that perform incremental evaluation. The actions of batch-mode systems are specified imperatively; that is, they are implemented with an imperative programming language in which a computation follows an ordered sequence of state transitions. Although imperative specifications have also been employed in incrementally evaluated systems, several systems have taken an alternative approach: declarative specifications, defined as collections of simultaneous equations whose solution describes the desired result. The advantages of declarative specifications are that
- the order of the computation of a solution is left unspecified, and
- the dependence of variables on input-data and other "variables" is implicit in the equations.

Whenever the data change, an incremental algorithm can be used to re-solve the equations, retaining as much of the previous solution as possible.
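As a small illustration of the finite-differencing approach (our own sketch, not from [RT87, TC90]; all names are hypothetical), consider maintaining f(x) = sum x under single-cell updates, written here in Haskell, which is close to the Gofer used later in this thesis:

    -- f computed from scratch, and its incremental counterpart: instead
    -- of recomputing the sum of x', update the old result by the change.
    f :: [Int] -> Int
    f = sum

    -- updateAt i v x yields x', the input after one cell modification
    updateAt :: Int -> Int -> [Int] -> [Int]
    updateAt i v x = take i x ++ [v] ++ drop (i + 1) x

    -- deltaF oldResult oldValue newValue equals f x'
    deltaF :: Int -> Int -> Int -> Int
    deltaF fx old new = fx - old + new   -- O(1) instead of O(|x|)

A single-cell change then updates the result in constant time: f (updateAt i v x) == deltaF (f x) (x !! i) v.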
The attribute grammar [Knu68] formalism is a declarative specification language for which incremental algorithms can be generated. In the area of compiler construction there is a relatively long tradition with respect to the "automation of the automation" [Knu68, Knu71]. Attribute grammars have their roots in the compiler construction world and serve as the underlying formal basis for a number of language-based environments and environment generators [RT88, SDB84, JF85, BC85, Pfr86, LMOW88, FZ89, Rit88, BFHP89, JPJ+90]. Just as a parser generator creates a parser from a grammar that specifies the syntax of a language, a language-based environment generator creates a language-based editor from the language's syntax, context-sensitive relationships, display format specifications and transformation rules for restructuring trees. A state-of-the-art language-based environment generator is the "Synthesizer Generator" (SG) [RT88]. It has turned out that the facilities provided by the SG elevate this tool far beyond the conventional area of generating language-based editors and make it possible to generate smart incremental editors like pocket calculators, formatters, proof checkers, type inference systems, and program transformation systems.
One of the main reasons for this success is that by the use of attribute grammars it has become possible to generate the incremental algorithms needed for incremental evaluation. These generated incremental algorithms are
- correct by construction,
- almost as fast as hand-written code,
- nearly impossible to construct by hand because of their complexity, and
- do not need any explicit programming.
1.1 This thesis

The work described in this thesis was carried out in the STOP (Specifications and Transformations Of Programs) project, financed by NWO (the Netherlands Organization for Scientific Research) under grant NF 63/62-518. This thesis is a contribution to the third item of the following list of goals of the STOP-project:

- Development of easily manipulable formalisms for the calculational approach to program development. The calculational approach to program development means that a program should be developed in stages. During the first stage, the programmer should not be concerned with the efficiency of his initial specification. The initial specification should be a (not necessarily executable) solution for which it is easy to prove that the problem's requirements are satisfied. In later stages, the specification is rewritten through a sequence of correctness-preserving transformations, until an efficient executable specification is attained. Note that the resulting executable specification may be very complex, but will still be correct because of the application of transformations which were correctness preserving.

- The construction of program transformation systems. Because we believe that derivations of efficient programs have to be engineered by a human rather than by the computer, we insist on manual operation. Therefore, the program transformation system is a kind of editor.

- The construction of tools for the construction of transformation systems. Because the program transformation system we have in mind is a kind of editor, and the development of such an incrementally evaluated system is hard to do by hand, we need tools for constructing such editors. As was indicated in the introduction, attribute grammars are a good starting point for the development of such editors.
This thesis defines and discusses an extension to attribute grammars, namely the so-called higher order attribute grammars (HAGs). An attribute grammar defines trees and the attributes attached to the nodes of these trees. An attribute evaluator for normal (or first order) AGs takes as input a tree and computes the attributes attached to the nodes of the tree. There is thus a strict separation between the tree and the attributes. HAGs allow the tree to be expanded as a result of attribute evaluation. This is achieved by introducing so-called nonterminal attributes, which are both nonterminals and attributes. An attribute evaluator for HAGs takes as input a tree and computes (nonterminal) attributes. The tree is expanded each time a nonterminal attribute is computed. HAGs can be used to define multi-pass compilers and in language-based environments.

An incremental attribute evaluator for HAGs takes as input a tree and a sequence of subtree replacements. The incremental attribute evaluator applies all subtree replacements, updating the attributes after each subtree replacement. An incremental attribute evaluator should reuse old information to avoid as much recomputation as possible. There are no complications in pushing incremental attribution through an unchanged nonterminal attribute; the algorithms for incremental attribution of AGs extend immediately. What is not so immediate, however, is what to do when the nonterminal attribute itself changes. Consider for example a change to an environment, containing a list of declared identifiers, which is modeled with a nonterminal attribute and is instantiated at several places in the tree.

This thesis presents a new algorithm that solves the problems with incremental evaluation of HAGs. The algorithm that will be presented is almost as good as the best incremental algorithms known for first order AGs. The new algorithm forms the basis for a so-called HAG-machine, an abstract machine for incremental evaluation of HAGs. This thesis discusses the HAG-machine and its design dimensions, performance criteria and optimizations. A (prototype) instantiation of a HAG-machine was built in the functional language Gofer, and test results of HAGs will be discussed. Furthermore, this thesis reports on a prototype program transformation system and a supercombinator compiler which were built with (H)AGs.
1.1.1 Structure of this thesis

The first Chapter of this thesis gives an introduction to attribute grammars, higher order attribute grammars, and related formalisms. It also contains a formal definition of AGs. The second Chapter presents a formal definition of HAGs, several classes of HAGs, and discusses the expressive power of HAGs.
Chapter three presents a new incremental evaluation algorithm for higher order as well as first order AGs. An abstract machine (called the HAG-machine) for the incremental evaluation of HAGs is discussed in Chapter four. Furthermore, Chapter four discusses optimizations for the HAG-machine and a prototype HAG-machine instantiation in Gofer. Chapter four ends with the results of tests on some "real" HAGs. Chapter five discusses two applications of (H)AGs. First, a prototype program transformation system based on AGs is discussed. Second, an example HAG for a supercombinator compiler is presented. Chapter six contains the conclusions and some final remarks about future work.
1.2 The description of programming languages

This section consists of two parts. The first part explains what role syntax, semantics and related terms play in the description of programming languages. The second part gives an informal description and example of attribute grammars (a formal basis for describing programming languages), a comparison with related formalisms, and a formal definition of attribute grammars for the interested reader.
1.2.1 Syntax and semantics

In programming languages there is a distinction between the physical manifestation ("the representation"), the underlying structure and the nature of the composing components ("the context-free syntax"), the conditions which hold when components are composed ("the context-sensitive syntax") and the meaning ("semantics"). A definition of a computer language covers all these items. Furthermore, a definition should be concise and comprehensible. Once there is a definition for a programming language it can be used by compiler writers to implement compilers, and programmers will use it for programming. Programming language definitions themselves are also written in a language, which we call a meta-language. Traditionally, meta-languages consist of two parts, a definition for the syntax part and a definition for the semantics. We discuss each of them in turn.

The syntax of programming languages is commonly described with context-free grammars. The one for ALGOL60 in [B+76] is a famous example. A context-free grammar describes exactly which sequences of symbols will be accepted as syntactically correct programs. A limitation of context-free grammars is that they offer no means for describing the context-sensitive syntax (like checking whether a variable is declared before it is used). Other syntax languages were developed to overcome this limitation, of which we mention two-level grammars, used in the definition of ALGOL68 [vWMP+75], as an example.
The semantics of programming languages were described informally in the early days, because no useful formalisms were available at that time. A more formal way of specifying a programming language is provided by operational semantics. This sort of semantics specifies the meaning of a construct in a language by specifying the operations it induces when it is executed on a machine. In particular it is of interest how the effect of a computation is achieved. Another sort of semantics is denotational semantics. In this kind of semantics meanings are modeled by mathematical objects (e.g. functions) that represent the effect of the constructs. Thus only the description of the effect is of interest, not how it is obtained.

It is often not clear where the syntax of a programming language ends and the semantics start. The separation is not only a decision which the language designer has to make; a compiler writer has to solve a similar problem, namely in deciding what the compiler should do at compile time and what must be delayed until run-time. Static semantics is that part of a definition of a programming language which has to be treated at compile time.
1.2.2 Attribute grammars (AGs)

First an informal definition and an example of AGs are given, followed by a comparison with related formalisms and a formal definition of AGs.
1.2.2.1 Informal definition and example of AGs

Attribute grammars (AGs) are a formalism that is often used for defining the static semantics of a programming language. An AG consists of a context-free grammar with the following extensions: the symbols of the grammar are equipped with attributes and the productions are augmented with attribution equations (which are also known as attribution rules). An attribute equation describes how an attribute value depends on and can be computed from other attributes. In every production p : X₀ → X₁ ... Xₖ each Xᵢ denotes an occurrence of a grammar symbol. Associated with each nonterminal occurrence is a set of attribute occurrences corresponding to the nonterminal's attributes. Each production has a set of attribute equations; each equation defines one of the production's attribute occurrences as the value of an attribute definition function (a so-called semantic function) applied to other attribute occurrences in the production. The semantic functions are often specified in a separate functional kind of language with no side-effects. The attributes of a nonterminal are divided into two disjoint classes: synthesized attributes and inherited attributes. Each attribute equation defines a value for a synthesized attribute occurrence of the left-hand side nonterminal or an inherited attribute occurrence of a right-hand side nonterminal. By convention, we deal only with attribute grammars that are noncircular, that is, grammars for which none of the derivation trees have circularly defined attributes.
As an example consider the attribute grammar in Figure 1.1, which describes the mapping of a structure consisting of a sequence of defining identifier occurrences and a sequence of applied identifier occurrences onto a sequence of integers containing the index positions of the applied occurrences in the defining sequence. Thus the program

    let a,b,c in a,c,c,b ni

is mapped onto the sequence [1, 3, 3, 2].
We will describe example attribute grammars in a notation which bears a strong resemblance to the BNF-notation [B+76]. In BNF nonterminals are written in lowercase between < and > brackets; in our notation nonterminals are written in uppercase ITALICS font without brackets. The terminals are written in typewriter font between string quotes ("). Furthermore, the productions are labeled explicitly in lowercase sans serif font. The concrete syntax for the sentence let a,b,c in a,c,c,b ni will contain a production that might look like

    ROOT ::= concrete block "let" DECLS "in" APPS "ni"
Here concrete block is the name of the production, ROOT is the left-hand side nonterminal, DECLS and APPS are the right-hand side nonterminals, and let, in and ni are keywords that must occur literally in programs. The production in the abstract syntax does not mention the keywords let, in and ni. The AG in Figure 1.1 shows the abstract syntax.

In the AG in Figure 1.1 the definition of the productions is preceded by a (type) definition of the inherited and the synthesized attributes of the nonterminals. The inherited and synthesized attributes are separated by a →, and their types have been indicated explicitly. The types and the semantic functions are specified in the functional language Gofer [Jon91]. The productions for nonterminal ID are not shown. In the attribute equations of Figure 1.1 we have used "." as the operator for selecting an attribute of a nonterminal, and subscripts to distinguish among multiple occurrences of the same nonterminal. The list of declared identifiers and their corresponding numbers is computed via the attribute env attached to certain nonterminals of the grammar. env is a synthesized attribute of DECLS and an inherited attribute of APPS; its value is a list of tuples where each tuple contains an identifier name and its number. The semantic function lookup in production use searches for the number of a given identifier in the environment list. The synthesized attribute seq contains the result sequence of integers, i.e., the index positions of the applied occurrences in the defining sequence.
ROOT  ::                      → [Int] seq
DECLS ::                      → Int number, [([Char], Int)] env
APPS  :: [([Char], Int)] env  → [Int] seq
ID    ::                      → [Char] name

ROOT ::= block DECLS APPS
    APPS.env := DECLS.env
    ROOT.seq := APPS.seq

DECLS ::= def DECLS ID
    DECLS₀.number := DECLS₁.number + 1
    DECLS₀.env    := [(ID.name, DECLS₀.number)] ++ DECLS₁.env
  | empty decls
    DECLS.number := 0
    DECLS.env    := []

APPS ::= use APPS ID
    APPS₀.seq := APPS₁.seq ++ [(lookup ID.name APPS₀.env)]
    APPS₁.env := APPS₀.env
  | empty apps
    APPS₀.seq := []

lookup id ((i, n) : l) = if (id = i) then n else (lookup id l)
lookup id []           = errorvalue

Figure 1.1: An attribute grammar
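The grammar of Figure 1.1 can also be read directly as a functional program, in the style of the AG-to-functional correspondence shown later in Figures 1.3 and 1.4. The following transcription is our own sketch in Haskell (Gofer syntax is nearly identical); the function names are hypothetical:

    -- Synthesized attributes become results, inherited attributes become
    -- parameters; the declarations are numbered as in production def.
    type Env = [(String, Int)]

    evalDecls :: [String] -> (Int, Env)      -- synthesized: number, env
    evalDecls = foldl step (0, [])
      where step (n, env) ident = (n + 1, (ident, n + 1) : env)

    evalApps :: Env -> [String] -> [Int]     -- inherited: env; synthesized: seq
    evalApps env = map (lookupEnv env)

    lookupEnv :: Env -> String -> Int
    lookupEnv ((i, n) : l) ident = if ident == i then n else lookupEnv l ident
    lookupEnv []           _     = error "errorvalue: undeclared identifier"

    evalRoot :: [String] -> [String] -> [Int]
    evalRoot decls apps = evalApps env apps
      where (_, env) = evalDecls decls

For the running example, evalRoot ["a","b","c"] ["a","c","c","b"] yields [1,3,3,2], as in the text.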
A node of the structure tree that is labeled by an instance of nonterminal symbol X has an associated set of attribute instances corresponding to the attributes of X. An attributed tree is a structure tree together with an assignment of either a value or the special token null to each attribute instance of the tree. To analyze a program according to its attribute-grammar specification, first a structure tree is constructed with an assignment of null to each attribute instance, and then as many attribute instances as possible are evaluated, using the appropriate attribute equation as an assignment statement and replacing null by the actual value. The latter process is termed attribute evaluation. Functional dependencies among attribute instances in a tree can be represented by a directed graph, called the dependency graph. A grammar is noncircular when the dependency graphs of all of the grammar's derivation trees are acyclic. Figure 1.2 shows the derivation tree and a partial dependency graph of the sentence let a,b,c in a,c,c,b ni. The nonterminals of the derivation tree are connected by dashed lines; the dependency graph consists of the instances of the attributes env, number, name, and seq linked by their functional dependencies, shown as solid arrows.
Figure 1.2: A partial derivation tree and its associated dependency graph

In incrementally evaluated systems the attributed tree is modified by replacing one of its subtrees. After a subtree replacement some of the attributes may no longer have consistent values.
Incremental analysis is performed by updating attribute values throughout the tree in response to modifications. By following the dependency relationships between attributes it is possible to reestablish consistent values throughout the tree. Fundamental to this approach is the idea of an incremental attribute evaluator, an algorithm to produce a consistent, fully attributed tree after each restructuring operation. Of course, any nonincremental attribute evaluator could be applied to completely reevaluate the tree, but the goal is to minimize work by confining the extent of reevaluation required. After each modification to a program tree, only a subset of attribute instances, denoted by Affected, requires new values. It should be understood that when updating begins, it is not known which attributes are members of Affected; Affected is determined as a result of the updating process itself. Reps [RTD83] describes algorithms that identify attributes in Affected and recompute their values. Some of these algorithms have costs proportional to the size of Affected. This means that they are asymptotically optimal in time, because, by definition, the work needed to update the tree can be no less than |Affected|.
1.2.2.2 Relation to other formalisms

This paragraph consists of two parts. The first part discusses attribute grammars from a functional programming view. The second part discusses attribute grammars and their relation to object-oriented languages. Furthermore, the differences between incremental evaluation in AGs and object-oriented languages are discussed.
Attribute grammars from a functional programming view

One of the main advantages of the use of attribute grammars is the static (or equational) character of the specification. The description of relations between data is purely functional, and thus completely void of any sequencing of computations and of explicit garbage collection (i.e., use of assignments). We demonstrate this by giving two formulations for the same problem: given a list of positive integers, compute the list where all maximum elements are removed. The correspondence with functional programming languages is demonstrated by the grammar in Figure 1.3, which has been transcribed into a Gofer [Jon91] program in Figure 1.4. In the program texts cmax is used to compute the maximum, and max contains the maximum value in the list. Note that inherited attributes in the attribute grammar correspond directly to parameters and synthesized attributes correspond to a component in the result of eval.
ROOT ::          → [Int] seq
L    :: Int max  → [Int] seq, Int cmax
INT  ::          → Int val

ROOT ::= root L
    L.max    := L.cmax
    ROOT.seq := L.seq

L ::= cons INT L
    L₀.cmax := if INT.val > L₁.cmax then INT.val else L₁.cmax
    L₀.seq  := if INT.val < L₀.max then INT.val : L₁.seq else L₁.seq
    L₁.max  := L₀.max
  | empty L
    L.cmax := 0
    L.seq  := []

Figure 1.3: Attribute grammar

eval_ROOT l = seq
  where (seq, max) = eval_L l max

eval_L (i : l) max = (seq, cmax)
  where cmax          = if (i > cmax2) then i else cmax2
        seq           = if (i < max) then i : seq2 else seq2
        (seq2, cmax2) = eval_L l max
eval_L [] max = ([], 0)

Figure 1.4: Gofer program
The lazy evaluation of Gofer allows the use of so-called "circular programs", roughly corresponding to multiple visits in attribute grammars. Having a single set of synthesized attributes is in direct correspondence with the result of a program transformation called tupling. In [Joh87, KS87] it is shown that this correspondence can be used in transforming functional programs into more efficient ones, thus avoiding the use of e.g. memo-functions [Hug85]. Often inherited attribute dependencies are threaded through an abstract syntax tree, which corresponds closely to another functional programming optimization called accumulation [BW88a, Bir84]. As a consequence, the results of many program transformations which are performed on functional programs in order to increase efficiency are automatically achieved when using attribute grammars as the starting formalism. This is mainly caused by the fact that in attribute grammars the underlying data structures play a more central role than the associated attributes and functions, whereas in the functional programming case the emphasis is reversed. From this correspondence it follows that attribute grammars may be considered as a functional programming language, without however providing the advantages of many functional languages such as higher order functions and polymorphism.
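To see the circularity of Figure 1.4 at work, the program can be run as ordinary Haskell (Gofer behaves the same); this self-contained demonstration is our own, not code from the thesis:

    -- The result component m (max) is passed as an argument to the very
    -- call that produces it; since cmax never depends on max, lazy
    -- evaluation resolves the apparent cycle in a single traversal.
    main :: IO ()
    main = print (eval_ROOT [3, 1, 3, 2])        -- prints [1,2]
      where
        eval_ROOT l = s where (s, m) = eval_L l m
        eval_L (i : l) m = (s, c)
          where c            = if i > c2 then i else c2
                s            = if i < m then i : s2 else s2
                (s2, c2)     = eval_L l m
        eval_L [] _ = ([], 0)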
AGs from an object-oriented view

When comparing an attribute grammar with an object-oriented system we may note the correspondences shown in Figure 1.5.

    AG                      object-oriented program
    individual nodes        set of objects
    tree structure          references between objects
    tree transformations    outside messages to objects
    attribute updating      inter-object messages

Figure 1.5: AGs from an object-oriented view

An interesting difference with most object-oriented systems, however, is that the propagation of updating information is done implicitly by the system, as e.g. in the Higgins [HK88] system, and not explicitly, as in e.g. the Andrew [M+86] system or Smalltalk. The advantage of this implicit approach is that the extra code associated with correctly scheduling the updating process need not be provided. Because in object-oriented systems this part of the code is extremely hard to get correct and efficient, this is considered a great advantage. In conventional object-oriented systems there are basically two ways to maintain functional dependencies:
maintaining view relations

In this case an object notifies its so-called observers that its value has been changed, and leaves it up to some scheduling mechanism to initiate the updating of those observers. Because of the absence of a formal description of the dependencies underlying a specific system, such a scheduler has to be of a fairly general nature: either the observation relations have to be restricted to a fairly simple form, e.g. simple hierarchies, or a potentially very inefficient scheduling has to be accepted.

sending difference messages

In this case an object sends updating messages to objects depending on it. Thus not only does an object have to maintain explicitly which other objects depend on it, but it can also be gleaned from the code on which parts another object depends. A major disadvantage of this approach is thus that, whenever a new object-class B is introduced depending on objects of class A, the code of A also has to be updated. An advantage of this approach is that by introducing a large set of messages it can be precisely indicated which arguments of which functional dependencies have changed in which way, and potentially costly complete reevaluations can be avoided. Although this fact is not often noticed, such systems contain a considerable amount of user-programmed finite differencing [PK82] or strength reduction. As a consequence these systems are sometimes hard to understand and maintain.
1.2.2.3 Definition of AGs

This paragraph gives a formal definition of AGs, based on the one given in [WG84]. There is one important difference with the original definition. A new kind of attribute, the so-called local attribute, is introduced. Local attributes are not attached to a nonterminal, as inherited and synthesized attributes are, but to a production. The reason for introducing local attributes here is that they will be used for modeling higher order AGs, which are defined on top of AGs.

Definition 1.2.1 A context-free grammar is a 4-tuple G = (T, N, P, Z), where

- T is a set of terminal symbols,
- N is a non-empty set of nonterminal symbols,
- P is a finite set of productions, and
- Z ∈ N is the start symbol.
Furthermore, we define V = T ∪ N. The set of all finite strings x₁ ... xₙ, n ≥ 1, formed by concatenating elements of V is denoted by V⁺. V* denotes V⁺ augmented by adding the empty string (which contains no symbols and is denoted by ε). V is called the vocabulary of G. Nonterminal symbols are usually called nonterminals and terminal symbols are called terminals. Each production has the form A → α, with A ∈ N and α ∈ V*.

The derivation relation ⇒ is defined as follows. For any σ, τ ∈ V*, σ ⇒ τ if σ = α₁Aα₂, τ = α₁α₀α₂ and A → α₀ ∈ P, where A ∈ N and α₀, α₁, α₂ ∈ V*. If σ ⇒* τ, we say that τ is obtained by a derivation from σ (⇒* denotes the reflexive and transitive closure of the relation ⇒). The set of strings derived from the start symbol Z is denoted by L(G).

A structure tree for a terminal string w ∈ L(G) is a finite ordered tree in which every node is labeled by some X ∈ V or by ε. If a node n labeled X has sons n₁, n₂, ..., nₘ labeled X₁, X₂, ..., Xₘ, then X → X₁ ... Xₘ must be a production in P. The leaves of the tree for w, concatenated from left to right, form w.

Definition 1.2.2 An attribute grammar is a 3-tuple AG = (G, A, R), where
- G = (T, N, P, Z) is a context-free grammar,
- A = ⋃_{X ∈ T∪N} AIS(X) ∪ ⋃_{p ∈ P} AL(p) is a finite set of attributes, and
- R = ⋃_{p ∈ P} R(p) is a finite set of attribution rules.
AIS(X) ∩ AIS(Y) ≠ ∅ implies X = Y. For each occurrence of nonterminal X in the structure tree corresponding to a sentence of L(G), exactly one attribution rule is applicable for the computation of each attribute a ∈ A. AIS(X) is the set of inherited and synthesized attributes of X. AL(p) is the set of local attributes of production p.

An occurrence of a nonterminal X is the occurrence of X in a production. An instance of X is a node in a structure tree which is labeled with X. Associated with each occurrence of a nonterminal is a set of attribute occurrences corresponding to the nonterminal's attributes. Likewise, with each instance of a nonterminal, instances of all attributes of that nonterminal are associated.

Elements of R(p) have the form

    α := f(..., β, ...)

In this attribution rule, f is the name of a function, and α and β are attributes of the form X.a or p.b. In the latter case p.b ∈ AL(p). In the sequel we will use the notation b for p.b whenever possible. We assume that the functions used in the attribution rules are strict in all arguments.
Definition 1.2.3 For each p : X₀ → X₁ ... Xₙ ∈ P the set of defining occurrences of attributes is

    AF(p) = {Xᵢ.a | Xᵢ.a := f(...) ∈ R(p)} ∪ {p.b | p.b := f(...) ∈ R(p)}

An attribute X.a is called synthesized if there exists a production p : X → α with X.a ∈ AF(p); it is inherited if there exists a production q : Y → αXβ with X.a ∈ AF(q). An attribute b is called local if there exists a production p such that p.b ∈ AF(p). AS(X) is the set of synthesized attributes of X; AI(X) is the set of inherited attributes of X.
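For instance, in the grammar of Figure 1.1, seq is a synthesized attribute of APPS (it is defined in the productions use and empty apps, where APPS is the left-hand side), while env is an inherited attribute of APPS (it is defined in the productions block and use, where APPS occurs in the right-hand side); thus AS(APPS) = {seq} and AI(APPS) = {env}.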
Definition 1.2.4 An attribute grammar is complete if the following statements hold for all X in the vocabulary of G:

- for all p : X → α ∈ P: AS(X) ⊆ AF(p),
- for all q : Y → αXβ ∈ P: AI(X) ⊆ AF(q),
- for all p ∈ P: AL(p) ⊆ AF(p),
- AS(X) ∪ AI(X) = AIS(X), and
- AS(X) ∩ AI(X) = ∅.

Further, AI(Z) is empty (Z is the root of the grammar).
Definition 1.2.5 An attribute grammar is well-defined if, for each structure tree corresponding to a sentence of L(G), all attributes are computable.
Definition 1.2.6 For each p : X₀ → X₁ ... Xₙ ∈ P the set of direct attribute dependencies is given by

    DDP(p) = {(α, β) | β := f(... α ...) ∈ R(p)}

where α and β are of the form Xᵢ.a or b. The grammar is locally acyclic if the graph of DDP(p) is acyclic for each p ∈ P. We often write (α, β) ∈ DDP(p) as (α → β) ∈ DDP(p), and follow the same conventions for the relations defined below. If no misunderstanding can occur, we omit the specification of the relation. We obtain the complete dependency graph for a structure tree by "pasting together" the direct dependencies according to the syntactic structure of the tree.
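As a concrete case, for production use of Figure 1.1, whose rules are APPS₀.seq := APPS₁.seq ++ [(lookup ID.name APPS₀.env)] and APPS₁.env := APPS₀.env, we obtain

    DDP(use) = {APPS₁.seq → APPS₀.seq, ID.name → APPS₀.seq,
                APPS₀.env → APPS₀.seq, APPS₀.env → APPS₁.env}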
Definition 1.2.7 Let T be an attributed structure tree corresponding to a sentence in L(G), and let K₀ ... Kₙ be the nodes corresponding to an application of p : X₀ → X₁ ... Xₙ, with α′ and β′ attributes of the form Kᵢ.a or b corresponding to the attributes α and β of the form Xᵢ.a or b. We write (α′ → β′) if (α → β) ∈ DDP(p). The set DT(T) = {α′ → β′}, where we consider all applications of productions in T, is called the dependency relation over the tree T.
1.3 Higher order attribute grammars (HAGs)

Higher order attribute grammars are an extension of normal attribute grammars in the sense that the distinction between the domain of parse trees and the domain of attributes has disappeared:

- Non-attributed trees computed in attributes may be grafted onto the parse tree at different places.

- Parts of the parse tree can be stored in an attribute. This feature can be modeled with the help of a synthesized attribute self for each nonterminal. An attribute instance self will contain, as its value, the non-attributed tree below the instantiated nonterminal. This kind of construction will not be discussed any further.
The term higher order is used because of the analogy with higher order functions: a function can be the result or parameter of another function. Trees defined by attribution are known as nonterminal attributes (NTAs).
1.3.1 Shortcomings of AGs

One of the main shortcomings of attribute grammars has been that often a computation has to be specified which is not easily expressible by some form of induction over the abstract syntax tree. The cause of this shortcoming is the fact that the grammar used for parsing the input into a data structure often dictates the form of the syntax tree. It is not obvious why exactly that form of syntax tree would be the optimal starting point for performing further computations. A further, probably more aesthetic than fundamental, shortcoming of attribute grammars is that there usually exists no correspondence between the grammar part of the system and the (functional) language which is used to describe the semantic functions. AGs also show some weaknesses when used in editors. HAGs provide a solution for some of those weaknesses.
The following paragraphs will discuss the above-mentioned shortcomings in more detail. The second paragraph will show an example HAG.
1.3.1.1 Multi-pass compilers

The term compilation is mostly used to denote the conversion of a program expressed in a human-oriented source language into an equivalent program expressed in a hardware-oriented target language. A compilation is often implemented as a sequence of transformations (SL, L₁), (L₁, L₂), ..., (Lₖ, TL), where SL is the source language, TL the target language and all Lᵢ are intermediate languages. With attribute grammars, SL is parsed, then a structure tree corresponding with SL is built, and finally attribute evaluation takes place. The TL is obtained as the value of an attribute. So an attribute grammar implements the direct transformation (SL, TL) and no special intermediate languages can be used. The concept of an intermediate language does not occur naturally in the attribute grammar formalism. Using attributes to emulate intermediate languages is difficult to do and hard to understand.

Higher order attribute grammars (HAGs) provide an elegant and powerful solution for this weakness, as attribute values can be used to define the expansion of the structure tree during attribute evaluation. In a multi-pass compiler compilation takes place in a fixed number of steps, which we will model by computing the intermediate trees as a synthesized attribute of trees computed earlier. These attributes are then used in further attribute evaluation, by grafting them onto the tree on which the attribute evaluator is working. A pictorial description of this process is shown below.

Figure 1.6: The tree of a 4-pass compiler after evaluation

Attribute coupled grammars (ACGs) [GG84] exactly define this extension, but nothing more.
The Cornell Synthesizer Generator [RT88] provides only one step: the abstract syntax tree, which is used as the starting point for the attribution, is computed as a synthesized attribute of the parse tree. A large example of the application of this mechanism can be found in [VSK89].
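To make the multi-pass idea concrete, here is a minimal sketch in Haskell (our own construction, not from the thesis; the languages SL and L1 and all names are hypothetical). The first pass synthesizes the intermediate tree as an attribute; grafting it onto the tree corresponds to simply subjecting it to further attribute evaluation:

    data SL = Num Int | Add SL SL            -- a tiny source language
    data L1 = Push Int | AddOp | Seq L1 L1   -- a stack-code intermediate tree

    pass1 :: SL -> L1                        -- the L1 tree as a synthesized attribute
    pass1 (Num n)   = Push n
    pass1 (Add a b) = Seq (pass1 a) (Seq (pass1 b) AddOp)

    pass2 :: L1 -> [String]                  -- attribution of the grafted tree
    pass2 (Push n)  = ["PUSH " ++ show n]
    pass2 AddOp     = ["ADD"]
    pass2 (Seq a b) = pass2 a ++ pass2 b

    compile :: SL -> [String]                -- the composed transformation (SL, TL)
    compile = pass2 . pass1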
1.3.1.2 An example HAG

A direct consequence of the dual-formalism approach (attribute grammar part versus semantic functions) is that a lot of properties present in one of the two formalisms are totally absent in the other, resulting in the following anomalies:

- Often, at the semantic function level, considerable computations are being performed which could be more easily expressed by an attribute grammar. It is not uncommon to find descriptions of semantic functions which are several pages long, and which could have been elegantly described by an attribute grammar.

- In the case of an incrementally evaluated system the semantic functions do not profit from this incrementality property, and are either completely evaluated or completely re-used.
Here we show an example HAG and demonstrate the possibility to avoid the use of a separate formalism for describing semantic functions. The HAG example in Figure 1.7 accepts the same language as the example AG in Figure 1.1, except that the environment list is now modeled by a tree describing a list. Figure 1.8 shows the tree corresponding to the sentence let a,b,c in c,c,b,c ni. In the example HAG the following can be noted:

- The strict separation between trees and semantic functions has disappeared:
  - the nonterminal ENV occurs as a type definition for the attribute env in the attribute (type) definitions for DECLS and APPS, and
  - the attribute ENV is a nonterminal attribute (the overline in ENV̄ is used to indicate that ENV is an NTA in production use).

- The tree structure is built using the constructor functions envcons and empty env, which correspond to the respective productions for ENV.

- The attribute APPS.env is instantiated (i.e. a copy of the tree is attributed) in the occurrences of the first production of APPS, and takes the role of the semantic function lookup in the AG of Figure 1.1. Notice that there may exist many instantiations of the ENV-tree, all with different attributes.

- The productions for ID and INT are omitted.
ROOT  ::              → [Int] seq
DECLS ::              → Int number, ENV env
APPS  :: ENV env      → [Int] seq
ENV   :: [Char] param → Int index
ID    ::              → [Char] name
INT   ::              → Int val

ROOT ::= block DECLS APPS
    APPS.env := DECLS.env
    ROOT.seq := APPS.seq

DECLS ::= def DECLS ID
    DECLS₀.env    := envcons ID (mkint DECLS₀.number) DECLS₁.env
    DECLS₀.number := DECLS₁.number + 1
  | empty decls
    DECLS.env    := empty env
    DECLS.number := 0

APPS ::= use APPS ID ENV̄
    APPS₀.seq := APPS₁.seq ++ [ENV.index]
    ENV̄       := APPS₀.env
    ENV.param := ID.name
    APPS₁.env := APPS₀.env
  | empty apps
    APPS.seq := []

ENV ::= envcons ID INT ENV
    ENV₀.index := if ENV₀.param = ID.name then INT.val else ENV₁.index
    ENV₁.param := ENV₀.param
  | empty env
    ENV.index := errorvalue

Figure 1.7: A higher order attribute grammar
Figure 1.8: The tree corresponding to the sentence let a,b,c in c,c,b,c ni. Note the many instantiations of the same ENV-tree.
Just as the constructor function envcons constructs a tree structure of type ENV, the constructor function mkint, which is used in an attribute equation of production def, constructs a tree structure of type INT.

There are no complications in pushing incremental attribution through an unchanged NTA; the methods of [Yeh83] and [RT88] extend immediately. What is not so immediate, however, is what to do when the nonterminal attribute itself changes, as can be seen in the recently published algorithm in [TC90]. A correct and nearly optimal solution for this problem is presented in Chapter 3. We finish this paragraph by noticing that any function defined in a functional language can be computed by a HAG which only uses copy rules and tree building rules as semantic functions. A proof can be found in Chapter 2, Section 2.4.
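In functional terms, each instantiation of the ENV-tree behaves like a separately attributed copy of the same data structure. A sketch of this reading in Haskell (our own rendering, not code from the thesis):

    -- The ENV tree as data, and its attribution as a function from the
    -- inherited attribute param to the synthesized attribute index; each
    -- instantiation of the same tree may receive a different param.
    data Env = EnvCons String Int Env | EmptyEnv

    index :: Env -> String -> Int
    index (EnvCons name val rest) param
      | param == name = val
      | otherwise     = index rest param
    index EmptyEnv _  = error "errorvalue"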
1.3.1.3 HAGs and editing environments

This paragraph expresses some thoughts about HAGs and editing environments and is based on [TC90]. A weakness of the normal first-order attribute grammars is their strict separation of syntactic and semantic levels, with priority given to syntax. The attributes are completely constrained by their defining equations, whereas the abstract syntax tree is unconstrained, except by the restrictions of the underlying context-free grammar. The attributes, which are relied on to communicate context-sensitive information throughout the syntax tree, have no way of generating derivation trees. They can be used to diagnose or reject incorrect syntax a posteriori but cannot be used to guide the syntax a priori. A few examples illustrate the desirability of permitting syntax to be guided by attribution:
- In a forms processing environment, we might want the contents of a male/female field to restrict which other fields appear throughout the rest of a form.

- In a programming language environment, we might want a partially successful type inference to provide a declaration template that the user can further refine by manual editing.

- In a proof development or program transformation environment, we might want a theorem prover to grow the proof tree automatically whenever possible, leaving subgoals for the user to work on wherever necessary.
1.3.2 HAGs and related formalisms

In this subsection we discuss a number of related approaches. At the end of this subsection HAGs are positioned between several other programming formalisms, and their strengths and weaknesses are placed into context.
1.3.2.1 ACGs

Attribute coupled grammars were introduced in [GG84] in an attempt to model the multi-pass compilation process. Their model can be considered as a limited application of HAGs, in the sense that they allow a computed synthesized attribute of a grammar to be a tree which will be attributed again. This boils down to a HAG with the restriction that an NTA may only be instantiated at the outermost level.
1.3.2.2 EAGs

Extended affix grammars [Kos91] may be considered as a practical implementation of two-level grammars. By making use of the pattern matching facilities in the predicates (i.e. nonterminals generating the empty sequence) it is possible to realize a form of control over a specific tree. This style of programming strongly resembles the conventional Gofer or Miranda style. An (implicitly) distinguished argument governs the actual computation which is taking place. Extensive examples of this style of formulation can be found in [CU77]. Here one may find a thorough introduction to two-level grammars and, as an example, a complete description of a programming language, including its dynamic semantics. A generator (PREGMATIC) for incremental programming environments based on EAGs is described in [vdB92].
1.3.2.3 ASF+SDF

The ASF+SDF specification formalism is a combination of two independently developed formalisms:

- ASF, the algebraic specification formalism [BHK89, Hen91], and
- SDF, the syntax definition formalism [HHKR89].

The ASF+SDF Meta-environment is an interactive development environment for the automatic generation of interactive systems for manipulating programs, specifications or other texts written in a formal language. In [vdM91] layered primitive recursive schemes (layered PRSs), a subclass of algebraic specifications, are defined, which are used to obtain fine-grain incremental implementations in the ASF+SDF Meta-environment. Furthermore, [vdM91] gives translations from a layered PRS to a HAG and from a HAG to a (not necessarily layered) PRS.
1.3.2.4 Functional languages with lazy evaluation

In paragraph 1.2.2.2 it was shown that attribute grammars may be directly mapped onto lazily evaluated functional programming languages: the nonterminals correspond to functions, the productions to different parameter patterns and associated bodies, the inherited attributes to parameters, and the synthesized attributes to elements of the result record. This mapping depends essentially on the fact that the functional language is evaluated lazily. This makes it possible to pass an argument which depends on a part of the function result. In functional implementations of AGs this seeming circularity is transformed away by splitting the function into a number of functions corresponding to the repeated visits to the nodes; a sketch of this splitting is shown below. In this way some functional programs may be converted to a form which no longer essentially depends on lazy evaluation. All parameters in the attribute grammar formalism correspond to strict parameters in the functional formalism because of the absence of circularities. Most functional languages which are lazily evaluated, however, allow circularities. In that sense they may be considered to be more powerful.
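As a hedged illustration (our own, not the thesis's construction): the circular program of Figure 1.4 can be split into two visit functions, after which no argument depends on the result and lazy evaluation is no longer needed:

    -- Visit 1 synthesizes cmax; visit 2 receives max as a (now strict)
    -- inherited parameter and synthesizes seq. Two passes over the list
    -- replace the single circular one.
    visit1 :: [Int] -> Int                 -- bottom-up: the maximum (cmax)
    visit1 = foldr (\i c -> if i > c then i else c) 0

    visit2 :: [Int] -> Int -> [Int]        -- top-down max, bottom-up seq
    visit2 l m = [i | i <- l, i < m]

    evalRoot :: [Int] -> [Int]
    evalRoot l = visit2 l (visit1 l)       -- evalRoot [3,1,3,2] == [1,2]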
1.3.2.5 Schema

In this paragraph we will try to give a schema which may be used to position different programming formalisms against each other. The basic task to be solved by the different implementations will be to solve a set of equations. As a running example we will consider the following set:

    (1) x = 5
    (2) y = x + z
    (3) z = v
    (4) v = 7
garbage collection (GC)

One of the first issues we mention captures the essence of the difference between the functional and declarative styles on the one hand and the imperative styles on the other. While solving such a set of equations there may be a point at which a specific variable no longer occurs in the set, because it has received a value and this value has been substituted in all the formulae. The location associated with this variable may thus be reused for storing a new binding. In an imperative programming language a programmer has to schedule his solution strategy in such a way that the possibility for reuse is encoded explicitly in the program. An assignment not only binds a value to a variable, but it also destroys the previously bound value, and thus has the character of an explicitly programmed garbage collection action. So after substituting x in equation (2), we might forget about x and use its location for the solution of further equations.
direction (DIR)

The next distinction we can make is whether the equations are always used for substitution in the same direction, i.e. whether it is always the case that the left-hand side is a variable which is being replaced by the right-hand side in the other equations. This distinction marks the difference between the functional and the logical languages. The first are characterized by exhibiting a direction in the binding, whereas the latter allow substitutions to be bi-directional. Depending on the direction we might substitute (3) and (4) by a new equation z = 7, or (2) and (3) by y = x + v.
sequencing (SEQ)

Sequencing governs whether the equations have to be solved in the order they are presented, or whether there is still dynamic scheduling involved, based on the dependencies. In the latter case we often speak of a demand-driven implementation, corresponding to lazy evaluation; in the former case we speak of applicative order evaluation, which has a much more restricted scheduling model. In the example it is clear that we cannot first determine the value for x, then y and finally z and v. As a consequence some languages are not capable of handling the above set of equations; a small demonstration follows below.
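For instance, in Haskell (and Gofer) the set can be written down literally; lazy evaluation supplies the dynamic scheduling, so the textual order of the equations is irrelevant (a sketch of ours):

    -- The four equations as mutually recursive bindings: y demands x and
    -- z, and z demands v, regardless of the order in which they appear.
    x, y, z, v :: Int
    x = 5
    y = x + z
    z = v
    v = 7
    -- y evaluates to 12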
dynamic set of equations (DSE)

One of the things we have not shown in our equations above is that often we have to deal with a recursively defined set of equations or indexed variables. In languages these are often represented by the use of recursion in combination with conditional expressions, or with loops. We make this distinction in order to distinguish between the normal AGs and the HAGs.

                 GC   DIR   SEQ   DSE
    Pascal       −    −     −     +
    Lisp         +    −     −     +
    Gofer        +    −     +     +
    AG           +    −     +     −
    HAG          +    −     +     +
    Prolog       +    +     +/−   +
    Pred. Logic  +    +     +     +

Figure 1.9: An overview of language properties
In the table in Figure 1.9 we have given an overview of the different characteristics of several programming languages. The +'s and −'s are used to indicate the ease of use for a programmer with respect to his programming task, and thus do not reflect things like efficient execution or general availability. Based on this table we may conclude that HAGs bear a strong resemblance to functional languages like Gofer, Miranda [Tur85] or Haskell [HW+91]. Things which are still lacking are infinite data structures, polymorphism, and more powerful data structures. The term structures which play such a prominent role in attribute grammars are not always the most natural representation.
Chapter 2

Higher order attribute grammars

In this chapter higher order attribute grammars (HAGs) are defined. In AGs there exists a strict separation between attributes and the parse tree. HAGs remove this separation. This is achieved by introducing a new kind of attribute, the so-called nonterminal attribute (NTA). Such nonterminal attributes play both the role of a nonterminal and that of an attribute. NTAs occur in the right-hand side of a production of the grammar and as attributes defined by a semantic function in attribution rules. NTAs will be indicated by an overline, so NTA X will be written as X̄.

During the (initial) construction of a parse tree a NTA X̄ is considered as a nonterminal for which only the empty production (X̄ → ε) exists. During attribute evaluation X̄ is assigned a value, which is constrained to be a non-attributed tree derivable from X. As a result of this assignment the original parse tree is expanded with the non-attributed tree computed in X̄, and its associated attributes are scheduled for computation. A necessary condition for a HAG to be well-formed is that the dependency graphs of the (partial) parse trees do not give rise to circularities; a direct consequence of this is that attributes belonging to an instance of a NTA should not be used in the computation leading to this NTA.

In [Kas80] ordered AGs (OAGs), a subclass of AGs, are defined. In the same way ordered HAGs can be defined, such that an efficient and easy to implement algorithm, as for OAGs, can be used to evaluate the attributes in a HAG.

First, attribute evaluation of HAGs is explained. The next section gives a definition of HAGs based on normal AGs, several classes of HAGs and a definition of ordered HAGs. In the last section it is shown that pure HAGs, which use only tree building rules and copy rules in attribution equations, have expressive power equal to Turing machines.
2.1 Attribute evaluation of HAGs

procedure evaluate(T: a non-attributed labeled tree)
let D = a dependency relation on attribute instances in T
    S = a set of attribute instances that are ready for evaluation
    α, β = attribute instances
D := DT(T)   { the dependency relation over the tree T }
S := the attribute instances in D which are ready for evaluation
while S ≠ ∅ do
    select and remove an attribute instance α from S
    evaluate α
    if α is a NTA of the form X̄ in T
    then Tnew := the non-attributed labeled tree computed in α
         expand T at X̄ with Tnew
         D := D ∪ DT(Tnew)
         S := S ∪ the attribute instances in D ready for evaluation
    forall β ∈ successor(α) in D do
        if β is ready for evaluation then insert β in S
    od
od
Figure 2.1: Attribute evaluation algorithm

Computation of attribute instances, expansion of a tree and adding new attribute instances is called attribute evaluation, and might be thought to proceed as follows. To analyze a string according to its higher order attribute grammar specification, first construct the parse tree where each X̄ is considered as a nonterminal for which only the empty production (X̄ → ε) exists. Then evaluate as many attribute instances as possible. As soon as the semantic function returning the value of X̄ is computed, expand the tree at X̄ and add the attribute instances resulting from the expansion. Continue the evaluation until there are no more attribute instances to evaluate and all possible expansions have been performed. The order in which attributes are evaluated is left unspecified here, but is subject to the constraint that each semantic function is evaluated only when all its argument attributes have become available. When all the arguments of an unavailable attribute instance have become available, we say it is ready for evaluation.
Using the observation that we can maintain a work-list S of all attribute instances that are ready for evaluation, we get, as stated in [Knu68, Knu71] and [Rep82], the attribute evaluation algorithm in Figure 2.1 (for a definition of a labeled tree see Definition 2.2.3). The difference with the algorithm defined by [Rep82] is that the labeled tree T can be expanded during semantic analysis. This means that if we evaluate a NTA X̄, we have to expand the tree at the corresponding leaf X̄ with the tree Tnew computed in X̄. Furthermore, the new attribute instances and their dependencies resulting from the expansion (the set DT(Tnew)) have to be added to the already existing attribute instances and their dependencies, and the work-list S must be expanded with all the attribute instances in D that are ready for evaluation.
2.2 Definition and classes of HAGs

In this section the definition of HAGs based on AGs will be given. Then strongly and weakly terminating HAGs will be discussed.
2.2.1 Definition of HAGs

In this subsection we will repeatedly use the attribute evaluation algorithm of Figure 2.1, the dependency relation D on attribute instances, and the set S of attribute instances that are ready for evaluation mentioned in the attribute evaluation algorithm. Furthermore, the dependency relation DT(T) of attribute instances over a tree T is used (Definition 1.2.7, with one difference: the term "structure tree corresponding to a sentence in L(G)" should be reduced to "structure tree"). This adaptation of Definition 1.2.7 is necessary because we will use the relation DT(T) for trees which are "under construction". A higher order AG is an extension of an AG and is defined as follows:
Definition 2.2.1 A higher order attribute grammar is a 2-tuple HAG = (AG, NA). AG is an attribute grammar, and NA is the set of all nonterminal attributes as defined in Definition 2.2.2.
Definition 2.2.2 For each p : X₀ → X₁ … X_{n−1} ∈ P the set of nonterminal attributes (NTAs) is defined by

NTA(p) = { Xj | Xj := f(…) ∈ R(p) and 0 < j < n }
The set of all nonterminal attributes (NA) is defined by

NA = ∪_{p∈P} NTA(p)
If X ∈ NTA(p) we write X̄. We have defined a NTA as a part of the tree that is defined by a semantic function. In the completeness definition of a HAG (Definition 2.2.6) a NTA will be forced to be an element of the set of local attributes. An actual tree may contain NTAs (not yet computed nonterminal attributes) as leaves. Therefore we extend the notion of a tree by distinguishing two kinds of nonterminal instances: virtual nonterminal instances (NTAs without a value) and instantiated nonterminal instances (NTAs with a value, and normal nonterminals). This extension of the notion of a normal structure tree is called a labeled tree.
Definition 2.2.3 A labeled tree is defined as follows:
- the leaves of a labeled tree are labeled with terminal instance symbols or virtual nonterminal instance symbols,
- the nodes of a labeled tree are labeled with instantiated nonterminal symbols.
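For concreteness, such a labeled tree could be represented by an ordinary datatype. The following Haskell sketch (the type and function names are ours, not part of the formal definition) also shows the expansion step described informally above:

    data LabeledTree
      = Inst String [LabeledTree]   -- node: instantiated nonterminal instance
      | Term String                 -- leaf: terminal instance
      | Virt String                 -- leaf: virtual nonterminal instance (NTA without a value)

    -- Expanding the tree at a virtual nonterminal named x replaces the Virt
    -- leaf by the (non-attributed) tree computed in the NTA. For simplicity
    -- this sketch replaces every pending occurrence with that name; a real
    -- evaluator would address one particular node.
    expand :: String -> LabeledTree -> LabeledTree -> LabeledTree
    expand x tNew (Virt y) | x == y = tNew
    expand x tNew (Inst n ts) = Inst n (map (expand x tNew) ts)
    expand _ _ t = t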
Definition 2.2.4 A nonterminal instance of nonterminal X is labeled with symbol X and is called
- a virtual nonterminal instance if X ∈ NA and the semantic function defining X̄ has not yet been evaluated,
- an instantiated nonterminal instance if X ∉ NA, or X ∈ NA and the semantic function defining X̄ has been evaluated.

From now on, the terms "structure tree" and "tree" are both used to refer to a labeled tree. It is the task of the parser to construct, for a given string, a labeled tree
- which is derived from the root of the underlying context-free grammar, and
- which contains no instantiated nonterminal attributes (because they are filled in by attribution).

This is a slightly different approach from the one suggested in the introduction, where a NTA is considered as a nonterminal for which only the empty production exists. The reason for this approach is that a labeled tree makes it easy to argue about trees which are under construction (i.e., in the middle of attribute evaluation). The language
accepted by the parser, however, is the language described by the underlying context-free grammar where a NTA is considered as a nonterminal for which only the empty production exists.

The semantic functions and the types used in the semantic functions are left unspecified in the definition of an AG. For HAGs, however, we add the remark that tree constructor functions and tree-types should be available as semantic functions and as types for semantic functions, respectively. Furthermore, the semantic function which defines a NTA X̄ should compute (just like a parser) a labeled tree
- which is derivable from the nonterminal X of the underlying context-free grammar, and
- which contains no instantiated nonterminal attributes.

This condition is stated below.
Definition 2.2.5 A semantic function f in a rule X̄ := f(…) is correctly typed if f computes a non-attributed labeled tree derivable from X with no instantiated nonterminal attributes.
The set of local attributes is extended with NTAs in the following completeness definition of a HAG.
Definition 2.2.6 A higher order attribute grammar is complete if
- the underlying AG is complete,
- for all productions p : Y → α ∈ P, NTA(p) ⊆ AL(p), and
- for all rules X̄ := f(…) in R(p), f is correctly typed.

If we look at the attribute evaluation algorithm in Figure 2.1, there are two potential problems:
- nontermination,
- attribute instances may fail to receive a value.

The attribute evaluation algorithm in Figure 2.1 might not terminate if the labeled tree grows indefinitely, in which case there will always be virtual nonterminal attribute instances which can be instantiated. Figure 2.2 shows an example of a tree which may grow indefinitely, depending on the function f.
Figure 2.2 also shows how we present trees graphically. Productions are displayed as rectangles. The name of the production is given on the left in the rectangle. Nonterminals are shown in circles; the left-hand side nonterminal of a production is displayed at the top line of the rectangle, and the right-hand side nonterminal(s) at the bottom line. Attributes are displayed as squares (see for example Figure 2.3 or Figure 2.6); all input attributes (i.e. inherited attributes of the left-hand side nonterminal and synthesized attributes of the right-hand side nonterminals) are drawn inside the rectangle of a production, and all output attributes (i.e. synthesized attributes of the left-hand side nonterminal and inherited attributes of the right-hand side nonterminals) are drawn outside of it. Note that when an entire tree is depicted with these productions, the "pieces" fit nicely together.

There are two reasons why the attribute evaluation algorithm in Figure 2.1 might fail to evaluate attribute instances:
- a cycle shows up in the dependency relation D: attribute instances involved in the cycle will never be ready for evaluation, so they will never receive a value;
- there is a virtual nonterminal attribute instance, say X̄, which depends on a synthesized attribute of X.

R :: →
X :: →

R ::= root X̄
        X̄ := callX
X ::= callX X̄
        X̄ := if f(…) then callX else stop
    | stop

Figure 2.2: Finite expansion is not guaranteed

The second reason deserves some explanation. Suppose we have a tree T and X̄ is a virtual nonterminal attribute instance in T. Furthermore, the dependency relation D of all the attribute instances in T contains no cycles (Figure 2.3). If we take a closer look at node X̄ in T, then if X̄ did not depend on synthesized attributes of X it could be computed. But should X̄ depend on synthesized attributes of X, as in Figure 2.3, it cannot be computed. This is because the synthesized attributes
of X are computed after the tree is expanded. So a nonterminal attribute should depend neither directly nor indirectly on its own synthesized attributes. To prevent this we let every synthesized attribute of X depend on the NTA X̄. To this end the set of extended direct attribute dependencies is defined.
Definition 2.2.7 For each p : X₀ → X₁ … Xₙ ∈ P the set of extended direct attribute dependencies is given by

EDDP(p) = { (β → α) | α := f(… β …) ∈ R(p) }
        ∪ { (X̄ → σ) | X ∈ NTA(p) and σ ∈ AS(X) }

Thus a nonterminal attribute is computable if the dependency relation DT(T) (using the EDDPs) contains no cycles for any, possibly infinite, tree T. This result is stated in the following lemma.
Lemma 2.2.1 Every virtual nonterminal attribute is computable if there are no cycles in DT(T) (using the EDDPs) for any, possibly infinite, tree T.
Proof The use of EDDP(p) prohibits a nonterminal attribute α from being defined in terms of attribute instances in the tree which will be computed in α. Suppose α, with root node X, depends on attributes in the tree which is constructed in α. The only way to achieve this is that α somehow depends on the synthesized attributes of X; but by definition of EDDP(p) all the synthesized attributes of X depend on α, and we have a cycle. □

R :: →
X :: → Int s

R ::= root X̄
        X̄ := f X.s
X ::= stop
        X.s := 1
Figure 2.3: The nonterminal attribute cannot be computed: a cycle occurs if the extra dependency is added (dashed arrow)
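Definition 2.2.7 can also be read operationally. The following Haskell sketch (with hypothetical flattened types; a real implementation would work on attribute instances in a tree) collects the ordinary rule-induced edges together with the extra NTA edges:

    type Attr = String

    -- A rule a := f(... b ...) contributes the edges b -> a; a NTA x with
    -- synthesized attributes ss contributes the edges x -> s for all s in ss.
    eddp :: [(Attr, [Attr])]   -- rules, as (defined attribute, used attributes)
         -> [(Attr, [Attr])]   -- NTAs, as (NTA, its synthesized attributes)
         -> [(Attr, Attr)]     -- dependency edges (from, to)
    eddp rules ntas =
         [ (b, a) | (a, bs) <- rules, b <- bs ]
      ++ [ (x, s) | (x, ss) <- ntas, s <- ss ]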
We are only interested in HAGs that allow us to compute all of the attribute equations in any structure tree. In traditional AGs there are two sources for attribute evaluation failing to compute all attributes: a cycle in the dependency graph of attributes, and nontermination of a semantic function. In HAGs there are two extra sources for failing to compute all attributes: a NTA may be defined in terms of a synthesized attribute of itself, and there could be infinite expansions of the tree. In traditional AGs no restriction on the termination of the semantic functions is posed in order for an AG to be well-defined. In the sequel we will not do so for HAGs either; furthermore, we will pose no restriction on the termination of tree expansion for a HAG in order for it to be well-defined. The reason for this is that HAGs for which finite expansion of the tree is not guaranteed have the same expressive power as Turing machines, as will be shown in Section 2.4. We want these kinds of HAGs to be well-defined.
Definition 2.2.8 A higher order attribute grammar is well-defined if, for each labeled structure tree, all attributes are computable using the algorithm in Figure 2.1.
It is clear that if D in the algorithm of Figure 2.1 never contains a cycle during attribute evaluation, all the (nonterminal) attribute instances are computable. Whether they will eventually be computed depends on the scheduling algorithm used in selecting elements from the set S. It is generally undecidable whether a given HAG will have only finite expansions (see Section 2.4). A sufficient condition for well-definedness of HAGs is the following.
Theorem 2.2.1 A higher order attribute grammar is well-defined if
- the HAG is complete, and
- no labeled structure tree T contains cycles in DT(T), using EDDP as the relation to construct DT(T).
Proof It is clear that a well-defined HAG must be complete. The second item guarantees that every (nonterminal) attribute is computable (Lemma 2.2.1). □

Some classes of well-defined HAGs, with respect to finite expansion, are considered in the next subsection. We used the terms "attribute evaluation" and "attribute evaluation algorithm" to define whether a HAG is well-defined. Instead of using an algorithm we could have defined a relation on labeled trees, indicating whether a non-attributed labeled tree is well-defined. We used the algorithm because from it conditions under which a HAG is well-defined are easily derived.
2.2.2 Strongly and weakly terminating HAGs

A HAG is called strongly terminating if finite expansion of the tree is guaranteed. A HAG is called weakly terminating if finite expansion is not guaranteed but at least possible. This section gives definitions for both classes and a condition under which a HAG is strongly terminating.
Definition 2.2.9 A higher order attribute grammar is strongly terminating if it is well-defined and there are only finite expansions of the tree during attribute evaluation.
A sufficient, but not necessary, condition for strongly terminating grammars is given in the following theorem.
Theorem 2.2.2 A higher order attribute grammar HAG is strongly terminating if
- the HAG is well-defined, and
- on every path in every structure tree a particular nonterminal attribute occurs at most once.
Proof The attribute evaluation algorithm is activated starting with a finite labeled tree. Every expansion costs one nonterminal attribute. Suppose the starting finite labeled tree meets the requirements of the above theorem and there are infinite expansions of the labeled tree. Then it is necessary for a branch in the tree to grow beyond any bound. So there will be more nodes in that branch than nonterminal attributes. This leads to a contradiction. □
It is a decidable problem to verify whether a HAG obeys Theorem 2.2.2, and it can be solved in time polynomial in the size of the grammar. In weakly terminating grammars there is at least the guarantee that finite expansion is possible.

Definition 2.2.10 A higher order attribute grammar is weakly terminating if it is well-defined and all NTAs X̄ generate at least one finite derivation.

As for Theorem 2.2.2, it is a decidable problem to find out whether a HAG is weakly terminating, and it can be solved in time polynomial in the size of the grammar. A weakly terminating HAG gives us the power to define and evaluate partial recursive functions. A HAG computing the factorial function is shown as an example in Figure 2.4.
R :: Int arg → Int result
F :: Int arg → Int result

R ::= root F̄
        F.arg := R.arg
        F̄ := callF
        R.result := F.result

F ::= callF F̄
        F₁.arg := F₀.arg − 1
        F̄₁ := if F₀.arg ≠ 0 then callF else stop
        F₀.result := F₀.arg * F₁.result
    | stop
        F.result := 1
Figure 2.4: Computation of the factorial function with a HAG.
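Read as a functional program, the grammar of Figure 2.4 intertwines tree expansion with attribute evaluation: every expansion of the NTA F̄ with callF corresponds to one recursive unfolding. A minimal Haskell sketch of this reading (the function names are ours):

    -- One visit to F: the inherited attribute arg determines both the
    -- expansion chosen for the NTA (callF or stop) and the synthesized result.
    visitF :: Integer -> Integer
    visitF arg
      | arg /= 0  = arg * visitF (arg - 1)   -- production callF: F1.arg = F0.arg - 1
      | otherwise = 1                        -- production stop:  F.result = 1

    -- The root production: R.result = F.result, with F.arg = R.arg.
    factorial :: Integer -> Integer
    factorial = visitF

For example, factorial 4 evaluates to 24, mirroring four expansions with callF followed by one with stop.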
2.3 Ordered HAGs (OHAGs)

[Kas80] defines ordered attribute grammars (OAGs), a subclass of the well-defined AGs. Whether a grammar is ordered can be checked by an algorithm which depends polynomially in time on the size of the grammar. Furthermore, efficient incremental evaluators, using visit sequences, can be generated for OAGs. An AG is l-ordered if for each symbol a total order over the associated attributes can be found, such that in any context of the symbol the attributes may be evaluated in that order. [Kas80] specifies an algorithm to construct a particular total order out of a partial order which describes the possible dependencies between the attributes of a nonterminal. If the total order found this way does not introduce circularities, the grammar is called ordered by Kastens. So the class of OAGs is a proper subset of the class of l-ordered grammars. It would have been more obvious to call the l-ordered AGs ordered and the OAGs Kastens-ordered. We will use this approach for the definition of ordered HAGs.
Definition 2.3.1 A HAG is ordered (OHAG) if for each symbol a total order over the associated attributes can be found, such that in any context of the symbol the attributes may be evaluated in that order.
First, a condition, based on OAGs, is given which may be used to check whether a HAG is ordered. Then visit sequences for OHAGs will be defined.
2.3.1 Deriving partial orders from AGs

To decide whether a HAG is ordered, the HAG is transformed into an AG and it is checked whether this AG is an OAG. The derived orders on defining attribute occurrences in the OAG can easily be transformed back to orders on the defining occurrences of the HAG.
Figure 2.5: The same part of a structure tree in a HAG and in the corresponding reduced AG

In a previous section (Lemma 2.2.1) it was shown that the EDDP ensures that every NTA can be computed. The reduced AG of a HAG is now defined as follows:
Definition 2.3.2 Let H be a HAG. The reduced AG H′ is the result of the following transformations applied to H:
1. in all right-hand sides of the productions, all occurrences of X̄ are replaced by the corresponding X;
2. all thus converted nonterminals X are equipped with an extra inherited attribute X.atree;
3. all occurrences of X̄ in the left-hand sides of the attribution rules are replaced by X.atree;
4. all synthesized attributes of former NTAs X̄ now contain the attribute X.atree in the right-hand side of their defining semantic function, and are thus explicitly made dependent on this attribute.

The transformation is demonstrated in Figure 2.5. This definition ensures that all synthesized attributes of a NTA X̄ (X.atree in the reduced AG) in the HAG can only be computed after the NTA X̄ (X.atree in the reduced AG) is computed.
Theorem 2.3.1 A HAG is ordered if the corresponding reduced AG is an OAG.

Proof Map the occurrences of X.atree in the orders of the reduced AG derived from a HAG to the NTAs X̄. The results are orders for the HAG, in the sense that the HAG is ordered. □
The HAG:

R :: X tree →
X :: Int i, y → Int s, z

R ::= root X̄ X̄
        X̄₀ := R.tree
        X̄₁ := R.tree
        X₀.i := X₁.s
        X₁.y := X₀.z
X ::= stop0
        X.z := X.i
    | stop1
        X.s := X.y

The corresponding reduced AG (where π₁ x y = x):

R :: X tree →
X :: Int i, y, X atree → Int s, z

R ::= root X X
        X₀.atree := R.tree
        X₁.atree := R.tree
        X₀.i := X₁.s
        X₁.y := X₀.z
X ::= stop0
        X.z := π₁ X.i X.atree
    | stop1
        X.s := π₁ X.y X.atree
[Pictorial view of the productions of the reduced AG and three possible attributed trees (legend: squares denote attributes, with s, z synthesized and i, y inherited; arrows denote dependencies).]
Figure 2.6: A HAG is shown at the top-left, and at the top-right the corresponding reduced AG is shown. Below them, a pictorial view of the productions of the reduced AG and three possible attributed trees are shown. The lowest attributed tree shows a cycle in the attribute dependencies which is only possible in the reduced AG (the attribute atree and its dependencies are omitted).
We note that this theorem may reject a HAG because the derived AG is not ordered; the test may be too pessimistic. Sometimes a HAG is ordered although the reduced AG is not an OAG, as is shown in Figure 2.6. The class of OAGs is a sufficiently large class for defining programming languages, and it is expected that the above-described way to derive evaluation orders for OHAGs provides a large enough class of HAGs.
2.3.2 Visit sequences for an OHAG

The difference between the OAG visit sequences as defined by [Kas80] and the OHAG visit sequences is that in a HAG the instruction set is extended with an instruction to evaluate a nonterminal attribute and expand the labeled tree at the corresponding virtual nonterminal. The following introduction to visit sequences is almost literally taken from [Kas80].

The evaluation order is the basis for the construction of a flexible and efficient attribute evaluation algorithm. It is closely adapted to the particular attribute dependencies of the AG. The principle is demonstrated here. Assume that an instance of X is derived by

S ⇒* u Y y →p u v X x y →q u v w x y ⇒* s.

Then the corresponding part of the structure tree is:
          S
          |
      u   Y   y
         /|\          production p
        v X x
          |           production q
          w
An attribute evaluation algorithm traverses the structure tree using the operations "move down to a descendant node" (e.g. from Y to X) and "move up to the ancestor node" (e.g. from X to Y). During a visit to node Y some attributes of AF(p) are evaluated according to semantic functions, if p is applied at Y. In general several visits to each node are needed before all attributes are evaluated. A local tree walk rule is associated with each p. It is a sequence of four types of instructions: move up to the ancestor, move down to a certain descendant, evaluate a certain attribute,
and evaluate followed by expansion of the labeled tree with the value of a certain nonterminal attribute. The last instruction is specific to a HAG. Visit sequences for a HAG can easily be derived from the visit sequences of the corresponding reduced AG. In an OAG the visit sequences are derived from the evaluation order on the defining attribute occurrences. A description of the computation of the visit sequences in an OAG is given in [Kas80]. The visit sequence of a production p in an AG will be denoted as VS(p), and in the HAG as HVS(p).
Definition 2.3.3 Each visit sequence VS(p) associated to a rule p ∈ P (p : X₀ → X₁ … X_{‖p‖−1}) in an AG is a linearly ordered relation over defining attribute occurrences and visits:

VS(p) ⊆ AV(p) × AV(p),   AV(p) = AF(p) ∪ V(p),
V(p) = { v_{k,i} | 0 ≤ i < ‖p‖, 1 ≤ k ≤ nov_{X_i} }

Here v_{k,0} denotes the k-th ancestor visit, v_{k,i} (i > 0) denotes the k-th visit of the descendant X_i, ‖p‖ denotes the number of nonterminals in production p, and nov_X denotes the number of visits that will be made to X. For the definition of VS(p) see [Kas80]. We now define HVS(p) in terms of VS(p).
Definition 2.3.4 Each visit sequence HVS(p) associated to a rule p ∈ P in a HAG is a linearly ordered relation over defining attribute occurrences, visits and expansions:

HVS(p) ⊆ HAV(p) × HAV(p),   HAV(p) = AV(p) ∪ VE(p),   VE(p) = { e_i | 1 ≤ i < ‖p‖ }

where AV(p) is defined as in the previous definition.

HVS(p) = { g(a) → g(b) | (a → b) ∈ VS(p) }

with g : AV(p) → HAV(p) defined as

g(a) = e_i   if a is of the form X_i.atree
g(a) = a     otherwise

Here e_i denotes the computation of the nonterminal attribute X̄_i and the expansion of the labeled tree at X̄_i with the tree computed in X̄_i.

Note that a virtual nonterminal can only be visited after the virtual nonterminal is instantiated. The visit sequences for OAGs are defined in such a way that during a visit to a node at least one synthesized attribute is computed. Because all synthesized attributes of a virtual nonterminal X̄ depend by construction on the nonterminal
attribute, the corresponding attribute X.atree in the OAG will be computed before the first visit. In [Kas80] it is proved that the check and the computation of the visit sequences VS(p) for an OAG depend polynomially in time on the size of the grammar. The mapping from the HAG to the reduced AG and the computation of the visit sequences HVS(p) also depend polynomially in time on the size of the grammar. So membership of the subclass of well-defined HAGs, derived by computing the reduced AG, analyzing whether the reduced AG is an OAG, and computing the visit sequences for the HAG, can be checked in polynomial time. Furthermore, an efficient and easy to implement algorithm based on visit sequences, as for OAGs, can be used to evaluate the attributes in a HAG.
2.4 The expressive power of HAGs

In this section it is shown that pure HAGs have the same expressive power as Turing machines and are thus more powerful than pure AGs. A pure (H)AG is defined as follows.
Definition 2.4.1 A (H)AG is called pure if the (H)AG uses only tree-building and copy rules in attribution equations.
First Turing machines will be defined; then it is shown how Turing machines can be implemented with HAGs. The definitions for Turing machines are largely taken from [HU79] and [MAK88].
2.4.1 Turing machines

The Turing machine model we use has a finite control, an input tape which is divided into cells, and a tape head which scans one cell of the tape at a time. The tape is infinite both to the left and to the right. Each cell of the tape holds exactly one of a finite number of tape symbols. Initially, the tape head scans the leftmost of the m (0 ≤ m < ∞) cells holding the input, which is a string of symbols chosen from a subset of the tape symbols called the input symbols. The remaining cells each hold the blank, which is a special tape symbol that is not an input symbol. In one move of the Turing machine, depending upon the symbol scanned by the tape head and the state of the finite control, the Turing machine

1. moves to a next state,
2. prints a symbol on the tape cell scanned, replacing what was written there, and
3. moves its head left or right one cell.
The definition of a Turing machine is as follows:

Definition 2.4.2 A Turing machine is a 7-tuple M = (Q, Γ, B, Σ, δ, q₀, F), where
- Q = {q₀, …, q_{|||M|||−1}} is the finite set of states (|||M||| denotes the number of states),
- Γ is the finite set of allowable tape symbols,
- B, a symbol of Γ, is the blank,
- Σ, a subset of Γ not including B, is the set of input symbols,
- δ is the next move function, a mapping from Q × Γ to Q × Γ × {L, R} (δ may, however, be undefined for some arguments),
- q₀ ∈ Q is the start state, and
- F ⊆ Q is the set of final states.

Definition 2.4.3 The language accepted by M, denoted L(M), is the set of words w in Σ* that cause M to reach a final state as a result of a finite computation when w is placed on the tape of M, with M in starting state q₀, and the tape head of M scanning the leftmost symbol of w.

Given a Turing machine accepting a language L, we may assume, without loss of generality, that the Turing machine halts, i.e., has no next move, whenever the input is accepted. However, for words not accepted, it is possible that the Turing machine will never halt.

In the next subsection a Turing machine will be modeled by a HAG with a so-called instantaneous description (ID), which is described below. A configuration of Turing machine M can be specified by

1. the string α₁ printed on the tape to the left of the read/write head,
2. the current state q, and
3. the string α₂ on the tape, starting at the read/write head and moving right.

We may summarize this in the string α₁ q α₂, which is called an instantaneous description (ID) of M. Note that α₁ and α₂ are not unique; the string α₁α₂ must contain all the non-blank cells of M's tape, but it can be extended by adding blanks at either end.
2.4.2 Implementing Turing machines with HAGs

In this section we will consider a fixed Turing machine M. M will be modeled by a HAG with an instantaneous description (ID). A pure HAG GM will be constructed which models, for any input string, the computation of M on this input string by expanding a tree with the successive IDs. The pure HAG GM for a given Turing machine M is shown in Figure 2.7, in which the following can be noted for the productions:
- root. Sons of R are the input symbols in T and the NTA ĪD. The NTA ĪD (representing an instantaneous description) initially contains the starting configuration of M. The root nonterminal R synthesizes one attribute result, which contains accept or reject when the machine halts.

- q_i, accept and reject. The q_i production is not really one production but a family of productions (indicated by a box in the sequel), one for each state of M. The NTA ĪD will contain, after attribute evaluation, the next ID (thereby modeling one change of state in M) as a value. This is reflected by the equation ĪD := S.nextid_i. Other information, namely what the tape looks like, comes in three parts: the first T denotes the tape to the left of the cell that is being scanned, the S denotes the symbol on the scanned cell itself, and the second T denotes the tape to the right of the scanned cell. The other attribution equations tell symbol S what its environment, i.e. the rest of the tape, looks like. This seemingly redundant copying of information is necessary to avoid if-then constructions in our attribution equations. The nonterminal ID computes the attribute result, which contains accept or reject when the machine halts.

- s_t. These are once again a family of productions, this time one for every terminal t ∈ Γ. Nonterminal S has six inherited attributes: left, lhead, ltail, right, rhead and rtail. The first three of these describe the tape to the left of the cell represented by S together with its head and tail, and the last three do the same for the tape to the right of this cell. Furthermore, S has a synthesized attribute S.nextid_i for each q_i ∈ Q. These are very important, as they contain the description of the next ID in the sequence. The attribution rules are not the same for all these productions, but depend on what the transition function δ looks like for a given state q_i and a particular terminal symbol t. Writing δ(q_i, t) = (q_j, t′, L/R) when it is defined, we discern the following cases (a functional rendering of these moves is sketched after this list):
  - L: S.nextid_i := q_j S.ltail S.lhead (newtape t′ S.right)
  - R: S.nextid_i := q_j (newtape t′ S.left) S.rhead S.rtail

  and if δ(q_i, t) is undefined:

  - q_i a final state: S.nextid_i := accept
  - q_i not a final state: S.nextid_i := reject
- newtape and endtape. The nonterminal T represents the semi-infinite tape on one side of a tape cell. This can be done because this tape contains only finitely many non-blank symbols. T has two synthesized attributes, head and tail, containing the head and the tail of the tape represented by T. The somewhat peculiar attribute equation for T.tail in production endtape reflects the fact that in Turing machines the empty list consists of an infinity of blanks (of which, at any time during the computation, we need to represent only the first one).
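For concreteness, the tape manipulation expressed by the L and R equations above can be mirrored by an ordinary function on instantaneous descriptions. The following Haskell sketch (all names are ours; δ is passed as a parameter) performs one move of M; the left tape is stored nearest cell first, matching the lhead/ltail decomposition:

    type State = Int
    type Sym   = Char

    blank :: Sym
    blank = 'B'

    -- Head and tail of a semi-infinite tape: beyond the represented cells
    -- the tape consists of blanks only (cf. production endtape).
    hd :: [Sym] -> Sym
    hd []      = blank
    hd (s : _) = s

    tl :: [Sym] -> [Sym]
    tl []       = []
    tl (_ : ss) = ss

    data Move = L | R
    data ID = ID [Sym] State [Sym]   -- left tape (nearest cell first), state, scanned cell onward

    -- One move of M; Nothing means delta is undefined and M halts.
    step :: (State -> Sym -> Maybe (State, Sym, Move)) -> ID -> Maybe ID
    step delta (ID left q right) =
      case delta q (hd right) of
        Nothing          -> Nothing
        Just (q', t', L) -> Just (ID (tl left) q' (hd left : t' : tl right))
        Just (q', t', R) -> Just (ID (t' : left) q' (tl right))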
Definition 2.4.4 The language accepted by GM, denoted L(GM), is the set of words w in Σ* for which the attribute R.result contains the value accept after termination of attribute evaluation.

When attribute evaluation ends, the attribute result of the root of the tree indicates acceptance or rejection of a string. When attribute evaluation does not terminate, then neither does the corresponding computation of M. We have thus established the following:
Theorem 2.4.1 Given a Turing machine M, a HAG GM can be constructed such that L(GM) = L(M).

Using the above theorem we may conclude:
Corollary 2.4.1 Pure HAGs have Turing machine computing power and are thus more powerful than pure AGs, which have no Turing machine computing power.
Finally, note that we could have put much more information in attribute result, e.g. the contents of the tape when the computation ends (implying that we can also directly compute partial recursive functions with HAGs) or even the entire sequence of IDs constituting the computation.
R :: → ID result
ID :: → ID result
S :: T left, S lhead, T ltail, T right, S rhead, T rtail → ID nextid₀ … ID nextid_{|||M|||−1}
T :: → S head, T tail

R ::= root T ĪD
        ĪD := q₀ (endtape s_blank) T.head T.tail
        R.result := ID.result

ID ::= q_i T S T ĪD
        S.left := T₀; S.lhead := T₀.head; S.ltail := T₀.tail
        S.right := T₁; S.rhead := T₁.head; S.rtail := T₁.tail
        ĪD := S.nextid_i
        ID₀.result := ID₁.result
    | accept
        ID.result := accept
    | reject
        ID.result := reject

S ::= s_t
        S.nextid_i := see text

T ::= newtape S T
        T₀.head := S; T₀.tail := T₁
    | endtape S
        T.head := S; T.tail := endtape s_blank
Figure 2.7: The HAG GM for a Turing machine M.
Chapter 3

Incremental evaluation of HAGs

This chapter presents a new algorithm for the incremental evaluation of Ordered Attribute Grammars (OAGs) which also solves the problem of the incremental evaluation of Ordered higher order attribute grammars (OHAGs). Two new approaches are used in the algorithm.

First, instead of storing the results of the semantic functions in a tree, all results of visits to trees are cached. None of the attributes are stored in a tree, but in a cache. Trees are built using "hashing cons" [SL78], thus sharing multiple instances of the same tree and avoiding repeated attribution of the same tree with the same inherited attributes.

Second, each visit computes not only synthesized attributes but also bindings for future visits. Bindings, which contain attribute values computed in one visit and used in future visits, are also stored in a cache. Future visits get the necessary earlier computed attributes (the bindings) as a parameter.

One of the advantages of having all attribute values in a cache is that we finally managed to introduce a relatively simple method for trading space for time in the AG world. A small cache means a longer time for incremental evaluation; a larger cache means faster incremental evaluation. So it is no longer necessary to have much memory available for incremental AG-based systems; instead one can choose a size of cache memory which behaves sufficiently well. Another advantage is that the new algorithm performs almost as well as the best evaluators known for normal AGs.
3.1 Basic ideas

It is known that the (incremental) attribute evaluator for ordered AGs [Kas80, Yeh83, RT88] can be trivially adapted to handle ordered higher order AGs [VSK89]. The adapted evaluator, however, attributes each instance of a NTA separately. This leads to non-optimal incremental behavior after a change to a NTA, as can be seen in
the recently published algorithm of [TC90]. The algorithm presented in this chapter handles multiple occurrences of the same NTA (like multiple occurrences of any subtree) efficiently, and runs in O(|Affected| + |paths to roots|) steps after modifying subtrees, where |paths to roots| is the sum of the lengths of all paths from the root to all modified subtrees. This run-time is almost as good as that of an optimal algorithm for first-order AGs, which runs in O(|Affected|). The new incremental evaluator can be used for language-based editors like those generated by the Synthesizer Generator [RT88] and for minimizing the amount of work for restoring semantic values in tree-based program transformation systems [VvBF90]. The new algorithm is based on the combination of the following four ideas:
- The algorithm computes attribute values by using visit functions. A visit function takes as first parameter a tree, and further a part of the inherited attributes of the root of that tree. It returns a subset of the synthesized attributes. Our evaluator consists of visit functions that recursively call each other. A call to a visit function corresponds to a visit in a visit sequence of an ordered HAG.

- Instead of storing the results of semantic functions in the tree, as in conventional incremental evaluators, the results of visit functions are cached. This approach allows equally structured trees to be shared. It is also more efficient, because a cache hit of a visit function means that this visit to a (possibly large) tree can be skipped. Furthermore, a visit function may return the results of several semantic functions at a time.

- As in [TC90]'s algorithm, equally structured trees will be shared. In our algorithm this is the only representation for trees; thus multiple instances of the same tree will be shared. Because many instantiations of a NTA may exist, each with its own attributes, attributes are no longer stored within the tree, but in a cache. This enables sharing of trees without having to care for the associated attributes. In a normal incremental treewalk evaluator a partially attributed tree can be considered as an efficient way of storing memoisation information. During a reevaluation newly computed attributes can be compared with their previous values.

- Although the above idea seems appealing at first sight, a complication is the fact that attributes computed in an earlier visit may have to be available for later visits. In order to solve this problem, so-called bindings are introduced. Bindings contain attribute values computed in one visit and used in future visits to the same tree: each visit function computes synthesized attributes and bindings for future visits. Bindings computed by earlier visits are passed as an extra parameter to visit functions.
The visit functions may be implemented in any imperative or functional language. Furthermore, as a result of introducing bindings, visit functions correspond directly to supercombinators [Hug82]. Efficient caching is partly achieved by efficient equality testing between parameters of visit functions, which are trees, inherited attributes and bindings. Therefore, hash-consing is used for constructing trees and bindings, which reduces testing for equality between trees and between bindings to a fast pointer comparison. Although the computation of bindings may appear to be cumbersome, they have a considerable advantage in incremental evaluation: they contain precisely the information on which visits depend, and nothing more.
3.2 Problems with HAGs

The two main new problems in the incremental evaluation of HAGs are the efficient evaluation of multiple instantiations of the same NTA and the incremental evaluation after a change to a NTA. In Chapter 1, Figure 1.7 we saw the replacement of a (semantic) lookup-function by a NTA. This NTA then takes the role of a semantic function. As a consequence, at all places in an attributed tree where the lookup-function would have been called, the (same-shaped) NTA will be instantiated. Such a situation is shown in Figure 3.1, where T2 is the tree modeling e.g. part of the environment in T5, and is being joined with T3 and T4, giving rise to two larger environments. NTA1 and NTA2 are the locations in the attributed tree where these two environments are instantiated. These instantiations thus include a copy of the tree T2. The following can be noted with respect to incremental evaluation in Figure 3.1, where situation (a) models the state before an edit action in the subtree indicated with NEW, and (b) the final situation after the edit action and the reevaluation needed:
- NTA1 and NTA2 are defined by attribution.

- Trees T2 and T2' are multiply instantiated trees in both (a) and (b). How can we achieve an efficient representation for multiply instantiated (equally or non-equally attributed) trees like T2 and T2'?

- NTA1 and NTA2 are updated when a subtree modification occurs at node NEW. How can we efficiently identify those parts of an attributed tree, derived from an NTA, which have not changed (like T3 and T4 in (b)), so that they can be reused after NTA1 and NTA2 have been updated?
[Figure: situation (a) before and situation (b) after the edit at node NEW; the trees instantiated at NTA1 and NTA2 each contain T2 in (a) and T2' in (b), joined with T3 and T4 at nodes X1 and X2.]
Figure 3.1: A subtree modification at node NEW induces subtree modifications at nodes X1 and X2 in the trees instantiated at NTA1 and NTA2.
3.3 Conventional techniques

Below, several incremental AG evaluators are listed. All of them can be trivially adapted to the higher order case, but none of them is capable of efficiently handling multiple instantiations of the same NTA, nor of reusing slightly modified NTAs.
- OAG [Kas80, RTD83]. See the previous chapter for more details.

- Optimal time-change propagation [RTD83]. This approach to incremental attribute evaluation involves propagating changes of attribute values through a fully attributed tree. Throughout this process, each attribute is available, although possibly inconsistent; however, if reevaluating an attribute instance yields a value equal to its old value, changes need not be propagated further.

- Approximate topological ordering [Hoo86]. This approach is a graph evaluation strategy that relies upon a heuristic approximation of a topological ordering of the graph of attribute dependencies.

- Function caching [Pug88]. In this approach Pugh's caching algorithm was implemented in the functional language used for the semantic equations and functions in the AG.
The following observations hold for all of the above-mentioned incremental evaluators:
- Attributes are stored in the tree. The tree functions as a memoisation table for the semantic functions during incremental evaluation.
- Equally structured trees are not shared. This is no surprise, because the attributes are stored within the tree, so that sharing is difficult, if not impossible. Furthermore, the opportunity for sharing does not arise very often in conventional AGs.
As will be shown later, the above two observations limit efficient incremental evaluation of HAGs.
3.4 Single visit OHAGs

In this section we will introduce some methods needed for the efficient incremental evaluator. These methods will be explained by constructing an efficient incremental evaluator for single visit OHAGs. The class of single visit OHAGs is defined as the subclass of the ordered HAGs in which there is precisely one visit associated with each production.
3.4.1 Consider a single visit HAG as a functional program

The HAG shown in Chapter 1, Figure 1.7 is an example of a single visit HAG. The single visit property guarantees that the visit sequences VS(p) can be directly transformed into visit functions, mapping the inherited attributes to the synthesized attributes.
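The idea can be illustrated with a schematic single-visit grammar (a hypothetical sketch, not the grammar of Figure 1.7): every nonterminal gives rise to one Haskell function from the tree and the inherited attributes to the synthesized attributes.

    -- A toy single visit tree language: leaves carry a value.
    data Tree = Leaf Int | Bin Tree Tree

    -- The single visit maps the tree and the inherited attribute (here an
    -- offset distributed to all leaves) to the synthesized attribute (a sum).
    visitTree :: Tree -> Int -> Int
    visitTree (Leaf n)  off = n + off
    visitTree (Bin l r) off = visitTree l off + visitTree r off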
3.4.2 Visit function caching/tree caching

We now decide to cache the results of the visit functions instead of storing the results of semantic functions in the tree. In this way copies of equally structured trees can be shared. It is also more efficient, because a cache hit of a visit function means that this visit to a (possibly large) tree may be skipped. Furthermore, a visit function returns the results of several semantic functions at the same time. Note furthermore that in this way we have modeled the administration of the incremental evaluation by the function caching: no separate bookkeeping for determining which attributes have changed and which visits should be performed is necessary. The possible implementation of function caching explained hereafter was inspired by [Pug88]. A hash table can be used to implement the cache. A single cache is used to store the cached results for all functions. A tree T, labeled with root N, is attributed by calling

visit_N T arguments
The result of this function is uniquely determined by the function name, the input tree and the arguments of the function. In the following algorithms two predicates for equality testing, EQUAL and EQ, are used. EQUAL(x, y) is true if and only if x and y are equal values. EQ(x, y) is true if and only if x and y either are equal atoms or are the same instance of a non-atomic value (i.e., if x and y are non-atomic values, EQ(x, y) is true if and only if both x and y point to the same data structure). The visit function calls can then be memoized by encapsulating all calls to visit functions with the function in Figure 3.2, for which we assume that our language is untyped.
function cached_apply(visit_N, T, args) =
    index := hash(visit_N, T, args)
    forall ⟨function, tree, arguments, result⟩ ∈ cache[index] do
        if function = visit_N and EQUAL(tree, T) and EQUAL(arguments, args)
        then return result
    od
    result := visit_N T args
    cache[index] := cache[index] ∪ {⟨visit_N, T, args, result⟩}
    return result

Figure 3.2: The function cached_apply
To implement visit function caching, we need efficient solutions to several problems. We need to be able to

- compare two visit functions efficiently. This is possible because all visit functions are declared at the global level and do not reference global variables. In such a case function comparison boils down to a fast pointer comparison of the start location of the code of the functions;

- compute a hash index based on a function name and an argument list. For a discussion of this problem, see [Pug88];

- determine whether a pending function call matches a cache entry, which requires efficient testing for equality between the arguments (in the case of trees very large structures!) in the pending function call and in a candidate match.
Tree comparison is solved by using a technique which has become known as hash-consing for trees [SL78]. When hash-consing for trees is used, the constructor functions for trees are implemented in such a way that they never allocate new nodes with the same value as an already existing node; instead a pointer to that already
existing node is returned. As a consequence, all equal subtrees of all structures which are being built up are automatically shared. Hash-consing for trees can be obtained by using an algorithm such as the one described in Figure 3.3 (EQ tests pointer equality). As a result, hash-consing allows constant-time equality tests for trees.
function hash_cons(CONSTR, (p1, p2, …, pn)) =
    index := hash(CONSTR, (p1, p2, …, pn))
    forall p ∈ cache[index] do
        if p^.constructor = CONSTR and EQ(p^.pointers, (p1, p2, …, pn))
        then return p
    od
    p := allocate_constructor_cell()
    p^ := new(CONSTR, (p1, p2, …, pn))
    cache[index] := cache[index] ∪ {p}
    return p

Figure 3.3: The function hash_cons
Now the function call EQUAL(tree1, tree2) in cached_apply may be replaced by a pointer comparison (tree1 = tree2). As for function caching, we need an efficient solution for computing a hash index based on a constructor and pointers to nodes.
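The same idea can be sketched in Haskell, with Data.Map standing in for a genuine hash table: every tree node carries a unique identity assigned at construction time, so EQUAL on hash-consed trees degenerates to an integer comparison. This is only a sketch of the technique, not the implementation used in this thesis:

    import qualified Data.Map as M
    import Data.IORef

    data Node = Node { nodeId :: Int, constr :: String, sons :: [Node] }

    type ConsTable = IORef (M.Map (String, [Int]) Node, Int)

    newConsTable :: IO ConsTable
    newConsTable = newIORef (M.empty, 0)

    -- Never allocate a node with the same constructor and sons as an
    -- already existing node; return the shared instance instead.
    hashCons :: ConsTable -> String -> [Node] -> IO Node
    hashCons tab c ks = do
      (m, next) <- readIORef tab
      let key = (c, map nodeId ks)
      case M.lookup key m of
        Just n  -> return n
        Nothing -> do
          let n = Node next c ks
          writeIORef tab (M.insert key n m, next + 1)
          return n

    -- Constant-time "EQUAL" on hash-consed trees: an identity test.
    eqNode :: Node -> Node -> Bool
    eqNode a b = nodeId a == nodeId b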
3.4.3 A large example

Consider again the HAG in Chapter 1, Figure 1.7, which describes the mapping of a structure consisting of a sequence of defining identifier occurrences and a sequence of applied identifier occurrences onto a sequence of integers containing the index positions of the applied occurrences in the defining sequence. Figure 3.4.a shows the tree for the sentence let a,b,c in c,c,b,c ni, which was attributed by a call to

visit_ROOT (block (def (def (def empty_decls a) b) c)
                  (use (use (use (use empty_apps c) c) b) c))
Incremental reevaluation after removing the declaration of c is done by calling

visit_ROOT (block (def (def empty_decls a) b)
                  (use (use (use (use empty_apps c) c) b) c))
[Figure: the attributed tree (a), with result [3,3,2,3], and the attributed tree (b), with result [error,2,error,error]; DECLS, ENV and other shared nodes are boxed, and cache hits in ENV when looking up c are marked *.]
Figure 3.4: The tree before (a) and after removing c (b) from the declarations in let a,b,c in c,c,b,c ni. The *'s indicate cache hits in ENV when looking up c. The dashed lines between boxed nodes denote that these nodes are shared.

The resulting tree is shown in Figure 3.4.b. Note that only the APPS-tree will be totally revisited (since the inherited attribute env changed); the first visits to the DECLS and ENV trees generate cache hits, and further visits to them are skipped. Simulation shows that, when using caching, in this example 75% of all visit function calls and tree-build calls that have to be computed in Figure 3.4.b are found in the cache constructed in the evaluation of Figure 3.4.a. So 75% of the "work" was saved. Of course, removing a instead of c would not yield the same results.
3.5 Multiple visit OHAGs

Although the idea of caching visit functions seems appealing at first sight, a complication is the fact that attributes computed in an earlier visit sometimes have to be available for later visits, and thus the model is not directly applicable to multi-visit HAGs. To solve this problem so-called bindings are introduced. Bindings contain attribute values computed in one visit and used in a subsequent visit to the same tree. So each visit function computes synthesized attributes and bindings for subsequent visits. Each visit function will be passed extra parameters, containing the attribute values which were computed by earlier visits and that will be used in this visit. All the relevant information for the function is passed explicitly as an argument, and nothing more.
3.5.1 Informal definition of visit functions and bindings

First, visit sequences from which the visit functions will be derived are presented and illustrated by an example. Then visit functions and bindings for the example will be shown. Finally, incremental evaluation will be discussed.
3.5.1.1 Visit subsequences

In the previous chapter the higher order equivalent of OAGs [Kas80], the so-called ordered higher order attribute grammars (OHAGs), was defined. An OHAG is characterized by the existence of a total order on the defining attribute occurrences for each production p. This order induces a fixed sequence of computation for the defining attribute occurrences, applicable in any tree production p occurs in. Such a fixed sequence is called a visit sequence and is denoted by VS(p) for AGs and HVS(p) for OHAGs. In the rest of this thesis we will use the shorter VS(p) for HVS(p). VS(p) is split into visit subsequences VSS(p,v) by splitting after each "move up to the ancestor" instruction in VS(p). The attribute grammar in Figure 3.5 is used in the sequel to demonstrate the binding and visit function concepts.
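Operationally, this splitting can be described by a small function over instruction lists; the following Haskell sketch uses a simplified instruction type (the names are ours):

    -- A simplified instruction type for visit sequences.
    data Instr = DefAttr String    -- evaluate an attribute occurrence
               | Visit Int Int     -- k-th visit to descendant i
               | VisitParent Int   -- k-th move up to the ancestor
      deriving Show

    -- Split VS(p) into visit subsequences after each VisitParent instruction.
    splitVSS :: [Instr] -> [[Instr]]
    splitVSS [] = []
    splitVSS is = case break isUp is of
                    (pre, up : rest) -> (pre ++ [up]) : splitVSS rest
                    (pre, [])        -> [pre]
      where
        isUp (VisitParent _) = True
        isUp _               = False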
3.5.2 Visit functions and bindings for an example grammar

The evaluator is obtained by translating each visit subsequence VSS(p,v) into a visit function visit_N_v, where N is the left-hand side of p. All visit functions together form a functional (attribute) evaluator. A Gofer-like notation [Jon91] for visit functions will be used. Because visit functions are strict, which results in explicit scheduling of the computation, visit functions could also easily be translated into Pascal or any other non-lazy imperative language. Following the functional style we will have one set of visit functions for each production with left-hand side N. The arguments of a visit function consist of three parts. The first part is one parameter which is a pattern describing the subtree to which this visit is applied. The first element of the pattern is a constant name which indicates the applied production rule. The other elements are identifiers representing the subtrees of the node. The second part of the arguments represents the inherited attributes used in VSS(p,v). Before the third part of the arguments is discussed, note the following in Figure 3.5:
- Attribute X.i is computed in VSS(p,1) and will be given as an argument to visit function visit_X_1, because X.i is used in the first visit to X (for the computation of X.s). Furthermore, attribute X.i is needed in the second visit to X (for the computation of X.z). In such a case, the dependency X.i → X.z is said to cross a visit border (denoted by the dashed lines).
R :: Int i → Int z
N :: Int i, y → Int s, z
X :: Int i, y → Int s, z

R ::= r N
        N.i := R.i; N.y := N.s; R.z := N.z;
N ::= p X
        X.i := N.i; N.s := X.s; X.y := N.y; N.z := X.z + X.s;
X ::= q INT
        X.s := X.i; X.z := X.y + X.i + INT.v;

VS(p) = VSS(p,1) = Def X.i ; Visit X,1 ; Def N.s ; VisitParent 1
        VSS(p,2) = Def X.y ; Visit X,2 ; Def N.z ; VisitParent 2
VS(q) = VSS(q,1) = Def X.s ; VisitParent 1
        VSS(q,2) = Def X.z ; VisitParent 2
[Pictorial view: the attribute dependencies of the grammar, and the dependencies with bindings. Legend: squares denote attributes (s, z synthesized; i, y inherited); dashed lines mark the visit border and dependencies crossing it; X.b denotes a synthesized binding, X.c an inherited binding.]
Figure 3.5: An example AG (top-left), the dependencies (bottom-left), the visit sequences (top-right) and the dependencies with bindings (bottom-right). The dashed lines indicate dependencies of an attribute computed in the second visit on an attribute defined in the first visit. VS(r) is omitted.
- Because attribute X.i is not stored within the tree and because we do not recompute X.i in visit_X_2, attribute X.i will turn up as one of the results (in a binding) of visit_X_1 and will be passed to visit_X_2. A pictorial view of this idea is shown in the dependencies with bindings on the bottom-right, where the same idea is applied to attribute X.s. Note that all dependencies crossing a visit border are now eliminated, and that the binding computed by visit_N_1 not only contains X.s but also the binding computed by visit_X_1.
We are now ready to discuss the last part of the arguments of visit_N_v. This last part consists of the bindings for visit_N_v computed in the earlier visits 1 … (v−1) to N. The results of visit_N_v consist of two parts. The first part consists of the synthesized attributes computed in VSS(p,v). The last part consists of the bindings computed in visit_N_v and used in subsequent visits to N. So visit_N_v computes (nov_N − v) bindings, one for each subsequent visit. The binding containing attributes and bindings used in visit_N_(v+i) but computed in visit_N_v is denoted by N_{v→v+i}.

We now turn to the visit functions for the visit subsequences VSS(p,v) and VSS(q,v) of the grammar in Figure 3.5. Attributes that are returned in a binding were shown boxed in the original presentation; in the example this concerns X.i and X.s. The first visit to N returns the synthesized attribute N.s and a binding N_{1→2} containing X.s and the binding X_{1→2}. Bindings could be implemented as a list, in which case visit_N_1 would look like:

visit_N_1 (p X) N.i = (N.s, N_{1→2})
    where X.i = N.i
          (X.s, X_{1→2}) = visit_X_1 X X.i
          N.s = X.s
          N_{1→2} = [X.s, X_{1→2}]

In the above definition (p X) denotes the first argument: a tree at which production p is applied, with one son, X. The second argument is the inherited attribute i of N. The function returns the synthesized attribute s and the binding N_{1→2} for the second visit to N. Note that N_{1→2} is explicitly defined in the where-clause of visit_N_1. In visit_N_2 the value of attribute X.s would have to be explicitly taken from N_{1→2} by a statement of the form

X.s = take N_{1→2} 1
where take l i takes the i-th element of list l. In order to avoid the explicit packing and unpacking of bindings into and from lists, so-called constructor names are used. Constructor names can be used to create an element of a certain type and in the pattern matching of function arguments. Constructor names are defined in a datatype definition. A suitable datatype definition for N_{1→2} is as follows:
data Type_N_{1→2} = MK_N_{1→2} Type_X.s Type_X_{1→2}
This definition also defines the constructor name MK_N_{1→2}, which is used to create an element of Type_N_{1→2}. Now visit_N_1 and visit_N_2 are defined as follows:

visit_N_1 (p X) N.i = (N.s, (MK_N_{1→2} X.s X_{1→2}))
    where X.i = N.i
          (X.s, X_{1→2}) = visit_X_1 X X.i
          N.s = X.s

visit_N_2 (p X) N.y (MK_N_{1→2} X.s X_{1→2}) = N.z
    where X.y = N.y
          X.z = visit_X_2 X X.y X_{1→2}
          N.z = X.z + X.s
Note the use of the constructor name MK_N_{1→2} for creating an element of Type_N_{1→2} in visit_N_1, and for the pattern matching in visit_N_2. The other visit functions have a similar structure.

visit_X_1 (q INT) X.i = (X.s, (MK_X_{1→2} X.i))
    where X.s = X.i

visit_X_2 (q INT) X.y (MK_X_{1→2} X.i) = X.z
    where X.z = X.y + X.i + INT.v
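Since attribute names such as N.i and constructor names such as MK_N_{1→2} are pretty-printed and not legal Gofer/Haskell identifiers, the functions above can also be read as the following directly runnable rendering (the flattened names and tree types are ours):

    data TreeN = P TreeX               -- production p
    data TreeX = Q Int                 -- production q; the Int is INT.v

    data BindN12 = MkN12 Int BindX12   -- carries X.s and the binding X_{1->2}
    data BindX12 = MkX12 Int           -- carries X.i

    visitN1 :: TreeN -> Int -> (Int, BindN12)
    visitN1 (P x) ni = (xs, MkN12 xs bx)                    -- N.s = X.s
      where (xs, bx) = visitX1 x ni                         -- X.i = N.i

    visitN2 :: TreeN -> Int -> BindN12 -> Int
    visitN2 (P x) ny (MkN12 xs bx) = visitX2 x ny bx + xs   -- N.z = X.z + X.s

    visitX1 :: TreeX -> Int -> (Int, BindX12)
    visitX1 (Q _) xi = (xi, MkX12 xi)                       -- X.s = X.i

    visitX2 :: TreeX -> Int -> BindX12 -> Int
    visitX2 (Q v) xy (MkX12 xi) = xy + xi + v               -- X.z = X.y + X.i + INT.v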
The order of definition and use in the where-clauses is chosen in such a way that the visit functions may also be implemented in an imperative language. We finish this paragraph with the remark that the above example is a strongly simplified one. In the grammar of Figure 3.5 there is only one production (p) with left-hand side nonterminal N. If there were another production s with left-hand side N, then the datatype definition for binding N_{1→2} would have been the following union:

data Type_N_{1→2} = MK_Np_{1→2} Type_X.s Type_X_{1→2}
                  | MK_Ns_{1→2} …  { the corresponding types of the attributes and bindings to be saved in s }
Furthermore, all occurrences of N_{1→2} and MK_N_{1→2} in the visit function definitions for production p would have been written as Np_{1→2} and MK_Np_{1→2}. This is the form used for the definition of visit functions in the next subsection.
3.5.3 The mapping VIS

The mapping VIS constructs a functional evaluator in Gofer for an OHAG with the help of so-called annotated visit sequences AVS(p) and annotated visit subsequences AVSS(p). Therefore, we first define annotated visit (sub)sequences. The visit functions and bindings are defined thereafter. Finally, the correctness of VIS is proven.
3.5.3.1 Annotated visit (sub)sequences

In annotated visit (sub)sequences, remarks are added to the original instructions of the visit (sub)sequences. These remarks will be used for defining the functional evaluator. In order to understand these remarks, the algorithm for computing visit sequences [Kas80, WG84, RT88] is discussed now. The algorithm partitions the attributes of a nonterminal into sets of inherited and synthesized attributes. These sets are called partitions and form one of the ingredients of the visit sequence computation. Let p be a production with left-hand side nonterminal N. One of the relations between the partitions I_1^N, S_1^N, ..., I_{nov_N}^N, S_{nov_N}^N and the visit subsequences VSS(p,1), ..., VSS(p,nov_N) is that at the end of each VSS(p,v), 1 ≤ v ≤ nov_N, the attributes from partitions I_1^N, S_1^N, ..., I_v^N, S_v^N are guaranteed to be computed (here I_j^N denotes inherited and S_j^N synthesized attributes). So after VSS(p,v−1), the attributes in I_v^N and S_v^N can safely be computed (i.e. they are ready for evaluation); partition I_v^N contains those inherited attributes which are needed in VSS(p,v) but were not available in VSS(p,1), ..., VSS(p,v−1). Thus the visit function that will compute the attributes in VSS(p,v) will have the inherited attributes of partition I_v^N amongst its parameters and will compute the synthesized attributes of partition S_v^N amongst its results. Because the visit functions will be defined solely upon the annotated visit sequences, these visit sequences are annotated with the attributes in the aforementioned partitions. Figure 3.6 shows the annotated visit sequences belonging to the grammar of Figure 3.5. The annotated visit subsequences are now defined as visit subsequences expanded with the following remarks:
At the beginning of each visit subsequence v, each inherited attribute i from partition I_v^N is shown with an Inh i remark.
At the end of visit subsequence v, each synthesized attribute s from partition S_v^N is shown with a Syn s remark.
The bindings which are needed, if any, are shown in Inhb and Synb remarks.
After each Visit instruction, the inherited attributes and bindings for that visit and the resulting synthesized attributes are shown by Inp, Inpb (for a binding), Out and Outb remarks.

All Def a instructions are followed by Use b remarks for all attributes b on which a depends.

    AVS(p) = AVSS(p,1) =
        Inh N.i ;
        Def X.i ; Use N.i ;
        Visit X,1 ; Inp X.i ; Out X.s ; Outb X^{1→2} ;
        Def N.s ; Use X.s ;
        Syn N.s ; Synb N^{1→2} ;
        VisitParent 1
      AVSS(p,2) =
        Inh N.y ; Inhb N^{1→2} ;
        Def X.y ; Use N.y ;
        Visit X,2 ; Inp X.y ; Inpb X^{1→2} ; Out X.z ;
        Def N.z ; Use X.z ; Use X.s ;
        Syn N.z ;
        VisitParent 2

    AVS(q) = AVSS(q,1) =
        Inh X.i ;
        Def X.s ; Use X.i ;
        Syn X.s ; Synb X^{1→2} ;
        VisitParent 1
      AVSS(q,2) =
        Inh X.y ; Inhb X^{1→2} ;
        Def X.z ; Use X.y ; Use X.i ; Use INT.v ;
        Syn X.z ;
        VisitParent 2

Figure 3.6: The annotated visit (sub)sequences for the grammar in Figure 3.5.
The advantage of this form of annotated visit sequences is that we now have all information and dependencies available for deriving the visit subsequence functions and the bindings.
3.5.3.2 Visit functions

We now turn to the definition of visit functions.
Definition 3.5.1 Let H be an OHAG. The mapping VIS constructs a set of Gofer functions (which will be called visit functions) for each nonterminal in the grammar H. The set of visit functions for nonterminal N consists of nov_N visit functions of the form visit_N_v, where 1 ≤ v ≤ nov_N (see Definition 3.5.5).
The first argument of visit_N_v is a tree with root N. Pattern matching on the first argument is used to decide which production is applied at the root of the tree.
The rest of the arguments are divided into two parts. The first part consists of the inherited attributes from I_v^N. The second part consists of the bindings N^{1→v}, ..., N^{(v−1)→v}. In the following definition of a binding, the name "son" refers to one of the right-hand side nonterminals of the production applied at the root of the tree that is passed to the visit function.
Definition 3.5.2 A binding N^{v→w} (1 ≤ v < w ≤ nov_N) contains those attributes and (bindings of sons) which are used in visit_N_w but were computed in visit_N_v.
Note that the production which is applied at the root of the tree which is passed as the first argument determines which attributes and bindings of sons are stored in N^{v→w}. Therefore, so-called production defined bindings are introduced. A production defined binding N_p^{v→w} contains those attributes and bindings needed by visit_N_w and computed in visit_N_v when applied to a tree with production p at the root (visit_N_v (p ...)). Actually, a binding N^{v→w} is nothing more than a container which may store the values of one of the sets N_{p_0}^{v→w}, ..., N_{p_{n−1}}^{v→w}, where p_0, ..., p_{n−1} are all productions with left-hand side N.
Definition 3.5.3 A production defined binding N_p^{v→w} contains the set of attributes and (bindings of sons) which are needed by visit_N_w and computed in visit_N_v when applied to a tree with production p at the root (visit_N_v (p ...)).
Definition 3.5.4 The type of binding N^{v→w} is defined as the composite type

    data Type_N^{v→w} = MK_N_{p_0}^{v→w} Type_b_{p_0,0}^{v→w} ... Type_b_{p_0,l−1}^{v→w}
                      | ...
                      | MK_N_{p_{n−1}}^{v→w} Type_b_{p_{n−1},0}^{v→w} ... Type_b_{p_{n−1},m−1}^{v→w}

where p_0, ..., p_{n−1} are all n productions with left-hand side N and Type_b_{q_i,j}^{v→w} are the types of the binding elements b_{q_i,j}^{v→w}. The MK_N_{q_i}^{v→w} are constructor names which are used to construct an element of type Type_N^{v→w}. Binding elements are attributes and bindings computed in visit_N_v that are also needed in visit_N_w; they will be defined in Definition 3.5.6. l and m are the numbers of binding elements in, respectively, N_{p_0}^{v→w} and N_{p_{n−1}}^{v→w}.

The results of a visit function consist of two parts: the synthesized attributes in S_v^N and the bindings N^{v→(v+1)}, ..., N^{v→nov_N}. In order to avoid explicit packing and unpacking of bindings, the constructor names will be used for pattern matching in the binding arguments of visit functions and for constructing bindings in the results of visit functions.
Definition 3.5.5 The visit functions are now defined as follows. For all nonterminals N in grammar H and for all productions p : N → ... X ... with annotated visit subsequences AVSS(p,1), ..., AVSS(p,nov_N), define
    visit_N_v (p ... X ...) i_{v,0}^N ... i_{v,c−1}^N
              (MK_N_p^{1→v} b_{p,0}^{1→v} ... b_{p,d−1}^{1→v})
              ...
              (MK_N_p^{(v−1)→v} b_{p,0}^{(v−1)→v} ... b_{p,e−1}^{(v−1)→v})
      = ( s_{v,0}^N, ..., s_{v,f−1}^N,
          (MK_N_p^{v→(v+1)} b_{p,0}^{v→(v+1)} ... b_{p,g−1}^{v→(v+1)}),
          ...,
          (MK_N_p^{v→nov_N} b_{p,0}^{v→nov_N} ... b_{p,h−1}^{v→nov_N}) )
      where VBODY(p,v)
Here
{i_{v,0}^N, ..., i_{v,c−1}^N} = {a | Inh a ∈ AVSS(p,v)} (which is I_v^N) and {s_{v,0}^N, ..., s_{v,f−1}^N} = {a | Syn a ∈ AVSS(p,v)} (which is S_v^N).

The body VBODY(p,v) contains definitions for the Def and Visit instructions in AVSS(p,v). For each Def instruction, VBODY(p,v) contains a corresponding defining equation in Gofer. Each Visit X,w instruction in AVSS(p,v) is translated into a Gofer equation of the form

    (s_{w,0}^X, ..., s_{w,k−1}^X, X^{w→(w+1)}, ..., X^{w→nov_X})
      = visit_X_w X i_{w,0}^X ... i_{w,l−1}^X X^{1→w} ... X^{(w−1)→w}
d, e, g and h are the numbers of binding elements in, respectively, N_p^{1→v}, N_p^{(v−1)→v}, N_p^{v→(v+1)} and N_p^{v→nov_N}. c and f are the numbers of elements in, respectively, I_v^N and S_v^N.
3.5.3.3 Bindings

The binding elements b_{p,i}^{v→w} which were used in Definition 3.5.4 are defined as follows.
Definition 3.5.6 The set of binding elements {b_{p,0}^{v→w}, ..., b_{p,n−1}^{v→w}} consists of those attributes and bindings computed in AVSS(p,v) and used in AVSS(p,w). They are defined as follows:

    {b_{p,0}^{v→w}, ..., b_{p,n−1}^{v→w}} = FREE(p,w) ∩ ALLDEF(p,v)
Here FREE(p,w) is the set of attributes and bindings which are used but not defined in AVSS(p,w), and ALLDEF(p,v) is the set of attributes and bindings which are defined in AVSS(p,v).
Definition 3.5.7 The definitions of ALLDEF(p,v) and FREE(p,v) are as follows (here "\" denotes set difference):

    FREE(p,v)   = USE(p,v) \ ALLDEF(p,v)
    USE(p,v)    = { a | Use a ∈ AVSS(p,v) ∨ Inp a ∈ AVSS(p,v) ∨ Inpb a ∈ AVSS(p,v) ∨ Syn a ∈ AVSS(p,v) }
                  ∪ { X | Visit X,i ∈ AVSS(p,v) }
    ALLDEF(p,v) = { a | Def a ∈ AVSS(p,v) ∨ Out a ∈ AVSS(p,v) ∨ Outb a ∈ AVSS(p,v) ∨ Inh a ∈ AVSS(p,v) }
Note that NTAs can be defined in a visit subsequence different from the one in which they are visited. This explains the occurrence of Visit in the definition of USE. Figure 3.7 shows the derivation of the bindings for the grammar in Figure 3.5 and the corresponding AVSS(p,v) in Figure 3.6; a small functional rendering of these definitions is given below.
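A hedged functional rendering of Definitions 3.5.6 and 3.5.7 in Haskell (ours, not part of the thesis's tooling; attributes and bindings are represented as plain strings, and the Visit-as-use case of Definition 3.5.7 is covered by the VisitSon remark):

    import qualified Data.Set as Set

    data Remark = Inh String | Syn String | Def String | Use String
                | Inp String | Out String | Inhb String | Synb String
                | Inpb String | Outb String | VisitSon String Int
                | VisitParent Int

    type AVSS = [Remark]

    useSet, allDef, free :: AVSS -> Set.Set String
    useSet avss = Set.fromList
      ([a | Use a <- avss] ++ [a | Inp a <- avss] ++ [a | Inpb a <- avss]
       ++ [a | Syn a <- avss] ++ [x | VisitSon x _ <- avss])
    allDef avss = Set.fromList
      ([a | Def a <- avss] ++ [a | Out a <- avss] ++ [a | Outb a <- avss]
       ++ [a | Inh a <- avss])
    free avss = useSet avss `Set.difference` allDef avss

    -- Definition 3.5.6: the elements of binding N_p^{v->w} are those
    -- used in subsequence w but defined in subsequence v.
    bindingElems :: AVSS -> AVSS -> Set.Set String
    bindingElems vSub wSub = free wSub `Set.intersection` allDef vSub

Applied to AVSS(p,1) and AVSS(p,2) of Figure 3.6, bindingElems yields {X^{1→2}, X.s}, in agreement with the derivation of Figure 3.7.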
3.5.3.4 Correctness of VIS The following property holds for the mapping VIS.
Theorem 3.5.1 Let H be a strongly terminating ordered higher order attribute grammar, and let S be a structure tree of H. The execution of the functional program VIS(H) with input S terminates.
Proof In this proof we follow the approach taken for the mapping CIRC in [Kui89, page 87]. First recall that a strongly terminating HAG is well-defined and that there will be only finite expansions of the tree during attribute evaluation (see Definition 2.2.9). The Gofer program VIS(H) contains two kinds of functions: the visit functions and the semantic functions.
The type of binding N^{1→2} in Figure 3.5 is as follows:

    data Type_N^{1→2} = MK_N_p^{1→2} Type_b_{p,0}^{1→2} ... Type_b_{p,n−1}^{1→2}

where

    {b_{p,0}^{1→2}, ..., b_{p,n−1}^{1→2}}
      = { definition of binding elements }
    FREE(p,2) ∩ ALLDEF(p,1)
      = { definition of FREE and ALLDEF }
    (USE(p,2) \ ALLDEF(p,2)) ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
      = { definition of USE and ALLDEF }
    ({N.y, X.y, X^{1→2}, X.z, X.s, N.z} \ {N.y, X.y, X.z, N.z}) ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
      = { definition of \ }
    {X^{1→2}, X.s} ∩ {N.i, X.i, X.s, X^{1→2}, N.s}
      = { definition of ∩ }
    {X^{1→2}, X.s}
Figure 3.7: The derivation of the bindings for the grammar in Figure 3.5.

Note that the visit functions never cause non-termination: they split their first argument, a finite structure tree, into smaller parts and pass these to the visit functions in their body. Because H is strongly terminating, this is a finite process. The semantic functions are strict by definition. Their execution does not terminate if they are called with a non-terminating argument. If their execution causes infinite recursion, then that is an error in H. So, to show that the execution of VIS(H) terminates, it must be shown that the semantic functions are always called with well-defined arguments. In order for the arguments to be well-defined, they must be computed and available before they are used in a semantic function call. Furthermore, the arguments should not cause infinite recursion.

First note that all arguments for a semantic function f in a visit function are computed before f is called, because a visit function is defined by visit sequences, which are constructed in such a way that all arguments to semantic functions are computable before a semantic function is called. Second, the arguments for a semantic function f computed in the body of visit function v will not only be computed but also be available before f is called, because an argument for f is either an inherited attribute parameter of v, an attribute computed in the body of v, or an attribute stored in a binding for v (see the definition of bindings). So all arguments to a semantic function are computed and available before the semantic function is called.

Each call of a semantic function in the body of a visit function corresponds to a piece of the dependency graph DT(S). Suppose that, during the execution of VIS(H) S,
function f is called. Let

    a = f ... b ... c ...

be the function definition that corresponds with that particular function call. Then DT(S) contains nodes corresponding to a, b and c (say α, β and γ); furthermore, DT(S) contains edges from β and γ to α. So, if the computation of VIS(H) S leads to an infinite sequence of function calls, then DT(S) must contain a cycle. This contradicts the assumption that H is well-defined.
□
3.5.4 Other mappings from AGs to functional programs

The idea of translating AGs into functions or procedures is not new. In [KS87, Kui89] the mappings SIM and CIRC are defined. The reader is referred to [Kui89, pages 94-95] for a comparison of the mappings described in [Jou83, Kat84, Tak87, Kui89]. Most of those mappings are variants of the mapping SIM.

The differences between the mappings SIM, CIRC and VIS are as follows. The mapping SIM constructs a single function for each synthesized attribute. For every synthesized attribute X.s of an AG, SIM(AG) contains a function eval_X.s, which takes as arguments a structure tree and all the inherited attributes of X on which X.s depends. The function eval_X.s is used to compute the values of the instances of X.s. The mapping CIRC translates each nonterminal X into a function eval_X. The first argument of eval_X represents the structure tree; the other arguments represent the inherited attributes of X. The result of eval_X is a tuple with one component corresponding to each synthesized attribute of X. In VIS, visit sequences are translated into visit functions: each nonterminal X is translated into n visit functions visit_X_v, where n is the number of visits to X. CIRC constructs lazy functional programs; SIM and VIS construct strict functional programs.

SIM and CIRC are used in [Kui89] to transform a functional program into a more efficient functional program. The other mappings described in [Jou83, Kat84, Tak87] are used to derive an evaluator for AGs. SIM constructs inefficient evaluators because attributes might be computed more than once. CIRC constructs more efficient evaluators than SIM. VIS is used to derive efficient incremental evaluators for AGs. In [Pug88] and [FH88, Chapter 19] an incremental functional evaluator à la SIM, based on function caching, is described. The difference with VIS is that VIS is capable of handling the higher order case efficiently because of the sharing of trees. Furthermore, VIS computes more attributes per visit function than SIM, and
the bindings allow several visits to a node without the need to recompute values computed in an earlier visit.
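To make the comparison concrete, the evaluators that SIM and CIRC would produce for the small grammar of Figure 3.5 can be sketched as follows (a hedged transcription in Haskell; the function names, the Int attribute types and the tree types are ours, and the thesis does not list these programs):

    data TreeN = P TreeX   -- production p : N -> X
    data TreeX = Q Int     -- production q : X -> INT

    -- SIM: one function per synthesized attribute; note that evalNz
    -- recomputes X.s, illustrating why SIM evaluators are inefficient.
    evalNs :: TreeN -> Int -> Int            -- N.s from N.i
    evalNs (P x) ni = evalXs x ni
    evalNz :: TreeN -> Int -> Int -> Int     -- N.z from N.i and N.y
    evalNz (P x) ni ny = evalXz x ni ny + evalXs x ni

    evalXs :: TreeX -> Int -> Int
    evalXs (Q _) xi = xi
    evalXz :: TreeX -> Int -> Int -> Int
    evalXz (Q v) xi xy = xy + xi + v

    -- CIRC: one function per nonterminal, returning a tuple with one
    -- component per synthesized attribute (laziness resolves the order).
    evalN :: TreeN -> Int -> Int -> (Int, Int)   -- (N.s, N.z)
    evalN (P x) ni ny = (xs, xz + xs)
      where (xs, xz) = evalX x ni ny
    evalX :: TreeX -> Int -> Int -> (Int, Int)   -- (X.s, X.z)
    evalX (Q v) xi xy = (xi, xy + xi + v)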
3.6 Incremental evaluation performance

In this section the performance with respect to incremental evaluation of the functional evaluator derived by VIS is discussed. We would like to prove that the derived incremental evaluator recomputes at most O(|Affected|) attributes. In the AG case, the set of affected attributes is the set of attribute instances that receive a new value as a result of a subtree replacement, plus the newly created attribute instances [RTD83]. If incremental AG-evaluators were used for HAGs, all attribute instances in trees derived from NTAs would be considered newly created attribute instances (and would thus belong to the set Affected) after a subtree replacement. In the definition of Affected for HAGs, the whole tree, including trees derived from NTAs, is compared with the tree before the subtree replacement; for the rest, the definition of Affected for HAGs is the same as for AGs. The O(|Affected|) wish for incremental evaluation can only be partly fulfilled; it will be shown that the worst-case bound is O(|Affected| + |paths_to_roots|). Here paths_to_roots is the set of all nodes on the path to the initial subtree modification plus the nodes on the paths to the root nodes of induced subtree modifications in trees derived from NTAs. The paths_to_roots part cannot be omitted, because the re-evaluation starts at the root of the tree and ends as soon as all replaced subtrees are either re-evaluated or found in the cache.
3.6.1 Definitions

Let VIS be the mapping from an OHAG to visit functions as discussed in the previous section. Let H be an OHAG. Let T be a shared (hash-consed) tree attributed by VIS(H)(T). Let T' be the shared tree after a subtree replacement at node new, and suppose T' was attributed by VIS(H)(T'). Furthermore, suppose that the size of the cache is large enough to cache all called functions.
Definition 3.6.1 Define the set Affected to be the set of attribute instances in the unshared version of tree T that receive a new value as a result of the subtree replacement at node new (as in Reps's discussion [RTD83]), plus the newly created attribute instances in the unshared version T'.
Definition 3.6.2 Define roots to be the following set of nodes in T': {new} ∪ {all root nodes of induced subtree replacements in trees derived from NTAs}.
Definition 3.6.3 Let path_to_root(r) be the set of all the nodes in T' that are an ancestor of r, including r itself.
Definition 3.6.4 Let paths_to_roots be the set containing all nodes from

    ∪_{r ∈ roots} path_to_root(r)
3.6.2 Bounds

First it is shown that the number of visit functions that need to be computed after a subtree replacement (Affected_Visits) is bounded by O(|Affected| + |paths_to_roots|). Because the number of semantic function calls (Affected_Applications) in a visit is bounded by a constant based on the size of the grammar, Affected_Applications is bounded by O(|Affected_Visits|), which is in turn bounded by O(|Affected| + |paths_to_roots|).
Lemma 3.6.1 Let Affected_Visits be the set of visits that need to be computed and will not be found in the cache when using VIS(H)(T') with function caching for visit functions and hash-consing for trees. Then |Affected_Visits| is O(|Affected| + |paths_to_roots|).

Proof Define the set Affected_Nodes to be the set of nodes X in T such that X has an attribute in Affected. Clearly, |Affected_Nodes| ≤ |Affected|.
Define Needed_Visits(T') to be the set of all visits needed to evaluate T'. Let root(v) denote the root of the subtree that is the first argument of visit function v. Since the number of visits to a node is bounded by a constant based on the size of the grammar, for all nodes r in T', |{v | v ∈ Needed_Visits(T') ∧ root(v) = r}| is bounded by a constant. The only visits which have to be computed are those that were not computed previously. Therefore,

    Affected_Visits ⊆ {v | v ∈ Needed_Visits(T') ∧ root(v) ∈ (Affected_Nodes ∪ paths_to_roots)}

Therefore, Affected_Visits is O(|Affected_Nodes| + |paths_to_roots|), which is O(|Affected| + |paths_to_roots|).
□
Theorem 3.6.1 Let Affected_Applications be the set of semantic function applications that need to be computed and will not be found in the cache when using VIS(H)(T') with function caching for visit functions and hash-consing for trees. Then |Affected_Applications| is O(|Affected| + |paths_to_roots|).

Proof Since the number of semantic function calls in a visit is bounded by a constant based on the size of the grammar, |Affected_Applications| is O(|Affected_Visits|). Using the previous lemma, the theorem holds.
□
3.7 Problems with HAGs solved

After a tree T is modified into T', T' shares all unmodified parts with T. To evaluate the attributes of T and T', the same visit function visit_R_1 is used, where R is the root nonterminal. Note that tree T' is completely rebuilt before visit_R_1 is called, and all parts in T' that are copies of parts in T are identified automatically by the hash-consing for trees. The incremental evaluator automatically skips unchanged parts of the tree because of cache hits of visit functions. Hash-consing for trees and bindings is used to achieve efficient caching, for which fast equality tests are essential. Because separate bindings for each visit are computed, it may happen, for example, that visit_N_1 and visit_N_4 are recomputed after a subtree replacement, while visit_N_2 and visit_N_3 are found in the cache and skipped. Some other advantages are illustrated in Figure 3.1, in which the following can be noted:
Multiple instances of the same (sub)tree, for example a multiply instantiated NTA, are shared by using hash-consing for trees (trees T2 and T2').

Those parts of an attributed tree derived from NTA1 and NTA2 which can be reused after NTA1 and NTA2 change value are identified automatically because of the hash-consing for trees (trees T3 and T4 in (b)). The same holds for a subtree modification in the initial parse tree (tree T1). Because trees T1, T3 and T4 are attributed the same in (a) and (b), they will be skipped after the subtree modification, and the amount of work which has to be done in (b) is O(|Affected_T2'| + |paths_to_roots|) steps, where paths_to_roots is the sum of the lengths of all paths from the root to all subtree modifications (NEW, X1 and X2). A sketch of hash-consing combined with visit function caching is given below.
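The following is a minimal sketch, in Haskell, of the two mechanisms relied on above: hash-consing, so that structurally equal trees share one identity and equality tests take constant time, and caching of visit function calls keyed on those identities. The Map-based tables and all names are our own simplification, not the thesis's prototype:

    import qualified Data.Map as Map

    -- Trees carry a unique identifier assigned at construction time.
    data Tree = Node { nodeId :: Int, prodName :: String, sons :: [Tree] }

    -- The hash-consing table maps (production, son identities) to the
    -- already built node; structurally equal trees thus share one identity.
    type ConsTable = (Map.Map (String, [Int]) Tree, Int)

    mkNode :: ConsTable -> String -> [Tree] -> (Tree, ConsTable)
    mkNode (tbl, next) p ss =
      case Map.lookup key tbl of
        Just t  -> (t, (tbl, next))          -- shared: the same node again
        Nothing -> (t', (Map.insert key t' tbl, next + 1))
      where key = (p, map nodeId ss)
            t'  = Node next p ss

    -- The visit function cache is keyed on the function name, the identity
    -- of the tree argument, and the attribute and binding arguments.
    type Cache = Map.Map (String, Int, [Int]) [Int]

    memoVisit :: Cache -> String -> (Tree -> [Int] -> [Int])
              -> Tree -> [Int] -> ([Int], Cache)
    memoVisit cache name f t args =
      case Map.lookup key cache of
        Just r  -> (r, cache)                -- cache hit: the visit is skipped
        Nothing -> (r', Map.insert key r' cache)
      where key = (name, nodeId t, args)
            r'  = f t args

Because nodeId identifies a subtree up to structural equality, a subtree that reappears unchanged after an edit produces the same cache key and hence a cache hit, which is exactly why unchanged parts of the tree are skipped.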
3.8 Pasting together visit functions

In the foregoing sections we have shown how an incremental evaluator may be based on concepts like hash-consing and function caching. Here we will elaborate on some possibilities for optimization. A detailed description of these optimizations can be found in [PSV92].
3.8.1 Skipping subtrees

An essential property of the construction of the bindings is that, when a visit function is called with its bindings, these bindings contain precisely the information that will actually be used in this visit and nothing more. This is a direct result of the fact that these bindings were constructed during earlier visits to the nodes, at which time it was known which productions had been applied and which dependencies actually occur in the subtrees. There is thus little room for improvement here. The first parameter of the visit functions, however, does leave room for improvement: always the complete tree is passed, and not only the subtree that will actually be traversed during this visit. In this way we might miss a cache hit when evaluating a changed tree. This effect is demonstrated in Figure 3.8: editing the shaded subtree has no influence on the outcome of pass b, and may only influence pass a.
Figure 3.8: Changes in an unvisited subtree

The following modification of our approach will take care of this optimization. When building the tree, we simultaneously compute those synthesized attributes of the tree which do not depend on any of the inherited attributes. In this process we also compute a set of functions which we return as synthesized attributes, representing the visit functions parameterized with that part of the tree which will be visited when they are called. This process consists of the following steps:
1. Every visit corresponds to a visit function definition. At those places where the visit subsequences contain visits to sons, a formal function is called. Each visit function thus has as many additional parameters as it contains calls to sons.

2. The synthesized attributes computed initially represent functions mimicking the calls to the subtrees. These functions are used to partially parameterize the visit function definitions associated with the production applied at the current node under construction, and the resulting applications are in turn passed to higher nodes via the synthesized attributes.
As a consequence of this approach, the top node of a tree is represented by a list of visit functions, all partially parameterized with the appropriate calls to their sons. Precisely those parts of the trees which are actually visited by these functions are thus encoded via the partial parameterization. If the function cache is extended in such a way as to be able to distinguish between such values, we do not have to build the trees anymore, and may simply use the visit functions as a representation, as sketched below.
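A hedged sketch of this representation in Haskell (ours; bindings are omitted and all attributes are Ints for brevity):

    -- A visit maps inherited attributes to synthesized attributes.
    type Visit = [Int] -> [Int]

    -- A node is represented by one closure per visit, each already applied
    -- to the closures of the sons it will actually traverse.
    data Rep = Rep { visit1 :: Visit, visit2 :: Visit }

    -- Visit function definitions take the sons' visits as extra
    -- (formal) parameters, one per call to a son...
    visitN1Def :: Visit -> Visit
    visitN1Def sonV1 inh = sonV1 inh
    visitN2Def :: Visit -> Visit
    visitN2Def sonV2 inh = sonV2 inh

    -- ...and building a node partially parameterizes these definitions with
    -- the sons' representations, so the result encodes exactly the parts of
    -- the tree each visit will touch.
    mkN :: Rep -> Rep
    mkN son = Rep { visit1 = visitN1Def (visit1 son)
                  , visit2 = visitN2Def (visit2 son) }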
3.8.2 Removing copy rules

As a final source of improvement we have a look at a more complicated case, where we have visits which pass through different, but not disjoint, parts of the subtree. An example of this is the case where we model a language which does not demand identifiers to be declared before they may be used. This naturally leads to a two-pass algorithm: one pass for constructing the environment and a second pass for actually compiling the statements. We will base our discussion on the tree in Figure 3.9. We have indicated the data flow associated with the computation of the environment with a closed line, and the data flow of the second pass, which actually computes the code, with a dashed line. Notice that the first line passes through all the declaration nodes, whereas the second line passes through all the statement nodes.

Suppose now that we change the upper statement in the tree, and thus construct a new root. If we apply the aforementioned procedure, we will discover that we do not have to redo the evaluation of the environment: the function computing this environment has not changed. The situation becomes more complicated if we add another statement after the first one. Strictly speaking, this does not change the environment either. However, the function computing the environment has changed, and will have to be evaluated anew. This situation may be prevented by noticing the following. The first visit to an L-node which has a statement as left son actually passes the environment attribute to its right son, visits this right son for the first time and passes the result up to its father. No computation is performed.
Figure 3.9: Removing copy rules. (The tree is a list of declarations and statements built with productions p : L → Decl L, q : L → Stat L and r : L → empty; a closed line marks the environment data flow through the Decl nodes, a dashed line marks the second-pass data flow through the Stat nodes.)
When writing this function, with the aforementioned transformation in mind, as a λ-term, we get λf.λx.f(x), where f represents the visit to the right son and x the environment attribute. When we partially parameterize this function with a function g, representing the visit to the right son, it rewrites to λx.g(x), which is equal to g. In this way copy chains may be short-circuited, and the number of cache hits may increase because more functions constructed this way become equal. Consider, as an example, the first-pass visit functions for the grammar of Figure 3.9:

    visit_L_1 (p D L) env = L.env
      where D.env = visit_D_1 D env
            L.env = visit_L_1 L D.env

    visit_L_1 (q S L) env = L.env
      where { S contains no declarations }
            L.env = visit_L_1 L env

The visit functions for production p may be short-circuited to

    visit_L_1 (p D (q S L1)) env = L.env
      where D.env = visit_D_1 D env
            { the copy rules for S may be skipped }
            L.env = visit_L_1 L1 D.env

    visit_L_1 (p D1 (p D2 L)) env = L.env
      where D1.env = visit_D_1 D1 env
            L.env  = visit_L_1 (p D2 L) D1.env

    visit_L_1 (p D (r empty)) env = L.env
      where L.env = visit_D_1 D env

These visit functions are merely meant to sketch the idea. In case L1 = (q S2 L2), we may short-circuit two statement nodes (and so on). This is what the aforementioned transformation is about.
We conclude by noticing that whether these optimisations are possible or not depends on the amount of effort one is willing to spend on analyzing the grammar, reordering attributes, and splitting up visits into smaller visits. The original visit sequences of [Kas80] were designed with the goal of minimizing the number of visits to each node. In the case of incremental evaluation, however, one's goals will be to maximize the number of independent computations and to maximize the number of cache hits.
Chapter 4

A HAG-machine and optimizations

This chapter is divided into eight sections. The first section describes the design dimensions and performance criteria for the HAG-evaluator strategy described in the previous chapter. The second section discusses static optimizations for bindings and visit functions and their effect on the static size of bindings in some "real" grammars. The third section discusses a general abstract HAG-machine (an abstract implementation of a HAG-evaluator) and the fourth section a space-for-time optimization for such a machine. In section five, implementation methods for an abstract HAG-machine are discussed. Section six presents a prototype HAG-machine in the functional language Gofer. The chapter ends with test results which give a limited indication of which of the static visit function optimizations might be optimal for dynamic cache behaviour. Finally, some purge strategies are compared.
4.1 Design dimensions and performance criteria

The following three design dimensions of the HAG-evaluator can be distinguished:

1. Binding design. Several optimizations for bindings are possible.

2. Visit function design. The visit functions were defined using Kastens' visit sequences. Other visit sequences are possible and may lead to different and possibly more efficient visit functions.

3. Cache design. Here several different cache organization, purging and garbage collection strategies are possible.

The following two performance criteria can be distinguished:

1. Size. The static and dynamic size of objects in a HAG-evaluator.
2. Time. The absolute and relative time used for an (incremental) evaluation.
The table in Figure 4.1 shows possible measurements with respect to the performance criteria.

                 Size                          Time
    Static         Dynamic          Absolute            Relative
    bindings       bindings         seconds needed      % needed calls
                   cache            for evaluation      % hits

Figure 4.1: Possible measurements

Figure 4.2 shows which measurements with respect to the design dimensions will be discussed in the rest of this chapter. We decided to look at the static size of bindings with respect to the binding and visit function optimizations because we wanted an indication of how many bindings occur in some "real" grammars. The decision to look at the relative time needed for an (incremental) evaluation was dictated by the type of prototype incremental evaluator we have built: the prototype we had in mind should allow us to experiment with the new visit function and binding concepts; the speed of incremental evaluation was a minor concern at that time.

    Design dimension    Measurement discussed
    Binding             static size of bindings
    Visit function      static size of bindings
    Cache               relative time in % of needed calls

Figure 4.2: Measurements versus design dimensions discussed in this chapter
4.2 Static optimizations

First, two optimizations for bindings are shown. Then, optimizations for visit functions are shown. Finally, the effect of those optimizations on the static size of bindings in some "real" grammars is shown.
4.2.1 Binding optimizations

The definition of bindings has been a very general one, in which no attention was paid to efficiency. Bindings were introduced for the transfer of context between any pair
of passes. In practice many of these bindings will always be empty. This is what the first optimization is about; because it is the most important optimization, it will be discussed in detail. The second optimization reduces the number of attribute values in bindings.
4.2.1.1 Removing empty bindings

First an example of bindings which are guaranteed to be always empty is shown. Then an algorithm for detecting bindings which are guaranteed to be always empty is discussed. The paragraph finishes with an example of bindings in a "real" grammar. Consider the following attributed tree fragment, in which a nonterminal N with son X is visited four times: the synthesized attribute X.s is computed in the first visit but first used only in the fourth.
Here X.s ∈ N^{1→4}, while N^{1→2} = ∅ and N^{1→3} = ∅.
In the example above, the bindings N^{1→2} and N^{1→3} are guaranteed to be always empty. These bindings can be removed from every visit function, thus saving space and time. Whether a binding will always be empty can be statically deduced from the attribute grammar, as follows. Let X be a nonterminal and let nov_X be the number of visits to X. Then the following (nov_X² − nov_X)/2 bindings will be computed for X:
    X^{1→2}   X^{1→3}   ...   X^{1→nov_X}
              X^{2→3}   ...   X^{2→nov_X}
                        ...
                              X^{(nov_X−1)→nov_X}
The contents of the bindings are computed in visit functions. Pattern matching on the first argument of a visit function is used to decide which production is applied at the root of the tree. So the attributes and bindings of sons saved in a binding of a visit function depend on which production is applied at the root of the tree. The
bindings defined for X in production p : X → ... are denoted by X_p^{v→w}. So the type of binding X^{v→w} is the union of all types of the production defined bindings X_{p_0}^{v→w}, ..., X_{p_{n−1}}^{v→w}, where p_0, ..., p_{n−1} are all productions with left-hand side X. Consequently, binding X^{v→w} is guaranteed to be always empty if all production defined bindings X_{p_i}^{v→w} (0 ≤ i < n) are guaranteed to be always empty. By definition, a binding X_{p_i}^{v→w} contains:
attribute(s): in that case X_{p_i}^{v→w} is not empty, and X^{v→w} is not guaranteed to be always empty;

binding(s) of son(s): in that case X_{p_i}^{v→w} is guaranteed to be always empty if the binding(s) of the son(s) are guaranteed to be always empty.
The above observation leads to the following algorithm to detect whether a binding X^{v→w} in a grammar HAG is guaranteed to be always empty:
Algorithm 4.1

1. Let G be a directed graph with nodes for all X^{v→w} in HAG and nodes for all attribute occurrences which may occur in a binding.

2. For each X^{v→w} in G and for all production defined bindings X_{p_i}^{v→w} (0 ≤ i < n), where p_0, ..., p_{n−1} are all productions with left-hand side X, construct the following edges in G:
   for each attribute a in the definition of X_{p_i}^{v→w}, construct the edge (X^{v→w}, a);
   for each binding Y^{s→t} in the definition of X_{p_i}^{v→w}, construct the edge (X^{v→w}, Y^{s→t}).

3. Compute the transitive closure of G.

end algorithm
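A hedged functional rendering of Algorithm 4.1 in Haskell (the graph representation and all names are ours). Reachability from a binding node stands in for the transitive closure of step 3: a binding is guaranteed to be always empty exactly when no attribute node is reachable from it.

    import qualified Data.Map as Map
    import qualified Data.Set as Set

    -- Nodes: bindings X^{v->w} and attribute occurrences.
    data GNode = Binding String Int Int     -- nonterminal, v, w
               | Attr String                -- attribute occurrence
      deriving (Eq, Ord, Show)

    type Graph = Map.Map GNode (Set.Set GNode)

    -- Depth-first reachability, standing in for the transitive closure.
    reachable :: Graph -> GNode -> Set.Set GNode
    reachable g start = go Set.empty [start]
      where
        go seen []     = seen
        go seen (n:ns)
          | n `Set.member` seen = go seen ns
          | otherwise =
              go (Set.insert n seen)
                 (Set.toList (Map.findWithDefault Set.empty n g) ++ ns)

    alwaysEmpty :: Graph -> GNode -> Bool
    alwaysEmpty g b = not (any isAttr (Set.toList (reachable g b)))
      where isAttr (Attr _) = True
            isAttr _        = False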
Now binding X^{v→w} is guaranteed to be always empty if there is no edge from X^{v→w} to any attribute in G. It is easy to see that the running time of this algorithm is polynomial in the size of the input grammar. The following example shows some bindings in a "real" grammar. To this end, we have built a tool which analyzes an SSL grammar (SSL stands for Synthesizer Specification Language, the language in which editors for the Synthesizer Generator [RT88] are specified) for the presence of bindings and detects which bindings are guaranteed to be always empty. The output consists of two parts. The first part shows the bindings needed per production. The second part reports which bindings are guaranteed to be always empty. An example of a binding report for
the supercombinator compiler grammar (see the last chapter) is shown in Figure 4.3 and Figure 4.4. Figure 4.3 shows the contents of the bindings in all productions with left-hand side nonterminal nexp. The information in Figure 4.3 is listed as follows:

    NONT N VISITS n
    PROD p : N -> ()
        v->w
    PROD q : N -> L M
        v->w
            attr
            BINDS_SONS
                L s->t

Such a listing is a textual representation of the binding occurrences. The first line states the nonterminal (N) under consideration, together with its number of visits (n). Then every production for the indicated nonterminal is listed (p and q here). Each production entry starts with a line describing the production (name, father and sons), followed by a list of bindings. Each binding entry starts with a line v->w giving the visit numbers of the binding, followed by either a list of attributes (attr) and a list of bindings for sons (BINDS_SONS), or nothing, to indicate that nothing has to be bound. The line L s->t states that binding N_q^{v→w} contains binding L^{s→t}.
Such a listing is a textual representation of the binding occurrences. The rst line states the nonterminal (N) under consideration, together with its number of visits (n). Then, every production for the indicated nonterminal is listed (p and q here). Each production entry starts with a line describing the production (name, father and sons), followed by a list of bindings. Each binding entry starts with a line v->w describing the visit numbers of the binding, followed by either a list of attributes (attr) and a list of bindings for sons (BINDS SONS) or nothing to indicate that nothing has to be bound. The line L s->t states that binding Nqv!w contains binding Ls!t . NONT nexp VISITS 2 PROD nEmpty : nexp$1 -> () 1->2 PROD nId : nexp$1 -> INT$1 1->2 PROD nApp : nexp$1 -> nexp$2 nexp$3 1->2 BINDS_SONS nexp$3 1->2 nexp$2 1->2 PROD nLam : nexp$1 -> INT$1 nexp$2 cexp$1 1->2 cexp$1.surrapp cexp$1.envout cexp$1 BINDS_SONS cexp$1 3->4 2->4 1->4 PROD nConst : nexp$1 -> CON$1 1->2
Figure 4.3: Generated binding information per production
The following can be noted in Figure 4.3:
The bindings in productions nEmpty, nId and nConst are empty.

The binding for production nApp contains only bindings of sons.

Figure 4.4 shows per nonterminal which bindings are guaranteed to be always empty (indicated by a *).

    exp    binds_1->2*
    nexp   binds_1->2
    cexp   binds_1->2*  binds_1->3*  binds_1->4*  binds_2->3  binds_2->4  binds_3->4
Figure 4.4: Generated binding information per nonterminal. Bindings which are guaranteed to be always empty are denoted by a *.

Note that the definition of the binding nexp^{1→2} in production nLam in Figure 4.3 contains, among others, cexp^{1→4}. But according to Figure 4.4, cexp^{1→4} is guaranteed to be always empty. So cexp^{1→4} can be removed from the definition of the binding nexp^{1→2} in production nLam, and from all other visit function and binding definitions.
4.2.1.2 Removing inherited attributes

The second binding optimization removes inherited attributes from bindings, as illustrated in the following example.
Here N.i ∈ N^{1→2} and

    VS(r) = VSS(r,1) = Def N.i ; Visit N,1 ; Def N.y ; Visit N,2 ; Def R.z ; VisitParent 1
Note that VS(r) is mapped onto a single visit function visit_R_1. Here N.i is bound, yet it is still available for the second visit to N, since the two visits to N occur in the same visit function visit_R_1. So N.i can be passed directly as an argument to the second visit to N and can be removed from N^{1→2}. Of course, all the visits to N should always
be in the same visit subsequence for this optimization to be valid. This optimization will not be used or discussed any further.
4.2.2 Visit function optimizations

The visit functions in the previous chapter were defined using Kastens' visit sequences [Kas80]. Kastens' algorithm for computing visit sequences consists of five steps. The first paragraph discusses Kastens' algorithm. The second paragraph discusses an optimization for visit functions which consists of altering step 3 of Kastens' algorithm. The third paragraph discusses an optimization for visit functions which can be achieved by altering step 5 of Kastens' algorithm.
4.2.2.1 Kastens' algorithm

For a detailed discussion of Kastens' algorithm the reader is referred to [Kas80, RT88]; a sketch of the algorithm, based on the version given in [RT88], is given here. Kastens' algorithm computes the visit sequences which were introduced in Chapter 2. In determining the next action to take at evaluation time, a visit sequence evaluator does not need to examine directly any of the dependencies that exist among attributes or attribute instances; this work has been done once and for all at construction time and is compiled into the visit sequences. Constructing a visit sequence evaluator involves finding all situations that can possibly occur during attribute evaluation and making an appropriate visit sequence for each production of the grammar. Kastens's method of constructing visit sequences is based on an analysis of attribute dependencies. The information gathered from this analysis is used to simulate possible run-time evaluation situations implicitly and to build visit sequences that work correctly for all situations that can arise. In particular, the construction method ensures that whenever a Def instruction is executed to evaluate some attribute instance, all the attribute's arguments will already have been given values. Kastens' algorithm consists of five distinct steps.
Algorithm 4.2

1. Step 1: Initialization of the TDP and TDS graphs.

2. Step 2: Computation of the dependence relations TDP and TDS.

3. Step 3: Compute nov_N and distribute the attributes in TDS(N) over partitions I_1^N, S_1^N, ..., I_{nov_N}^N, S_{nov_N}^N.
4. Step 4: Completion of the TDP graphs with edges from lower-numbered partition-set elements to higher-numbered elements.

5. Step 5: Create visit sequences from a topological sort of each TDP graph.
end algorithm

The notation TDP(p) (standing for "Transitive Dependencies in a Production") is used to denote the transitive dependencies in production p. Initially TDP(p) contains the dependencies between the attribute occurrences of production p in the grammar. We use the notation TDS(X) (standing for "Transitive Dependencies among a Symbol's attributes") to denote the covering dependence relation of nonterminal X. Initially all TDS(X) are empty. After step 2, the TDS(X) graphs are guaranteed to cover the actual transitive dependencies among the attributes of X that exist at any occurrence of X in any derivation tree. Note that the TDS relations are pessimistic, precomputed approximations of the actual dependence relations. Furthermore, all dependencies in the TDS(X) relations have been induced in the TDP(p) relations.

Step 3 distributes (with the help of the TDS(X) graphs) the attributes of each nonterminal X into alternating groups I_1^X, S_1^X, ..., I_{nov_X}^X, S_{nov_X}^X of inherited and synthesized attributes. The attributes in each group I_i^X, S_i^X are guaranteed to be computable in visit i to X. Furthermore, the sets I_i^X, S_i^X are maximized: as many attributes as possible will be computed during each visit.

The final two steps convert the partition information into visit sequences. The step that actually emits the visit sequences (the fifth and final step) is carried out by what is essentially a topological sort of each of the grammar's TDP graphs; if we were to sort the TDP graphs computed by step 2, there would be no guarantee that compatible visit sequences are generated for productions which may occur next to each other in the tree. The purpose of step 4 is to ensure that the visit sequences are compatible with the partitions computed in step 3. Thus, the fourth step adds additional edges between attribute occurrences in the grammar's TDP graphs that are in different partitions.

If any of the TDP relations is circular after step 1 or step 2, then the algorithm halts with failure. Failure after step 1 indicates a circularity in the equations of an individual production; failure after step 2 can indicate that the grammar is circular, but step 2 can also fail for some noncircular grammars. If all the TDP(p) graphs are acyclic after step 4, then the grammar is ordered. If any cycles are introduced by step 4, the algorithm halts with failure. This failure is known as a type 3 circularity.
4.2.2.2 Granularity of visit functions

In step 3 of Kastens' algorithm the attributes of each nonterminal N are distributed over partitions I_1^N, S_1^N, ..., I_{nov_N}^N, S_{nov_N}^N. As explained in Chapter 3, paragraph 3.5.3.1, visit function visit_N_v will compute the attributes defined in VSS(p,v) and will have the attributes in partition I_v^N amongst its parameters and the attributes in S_v^N amongst its results. In Kastens' algorithm the number of visits to a node is minimized and the size of the partitions is maximized. As a consequence, as many attributes as possible will be computed during each visit. Those attributes computed in a visit may very well be totally independent of each other. Consequently, Kastens' partitions might be split into independent parts to get better incremental performance. We have examined further partitioning in two different ways, resulting in a total of three levels of granularity of visit functions. First, the two ways of splitting Kastens' partitions are discussed with the help of an example. Finally, adaptations to step 3 of Kastens' algorithm are shown.
Kastens' visit functions  Consider the following production p, the corresponding Kastens visit sequence and the corresponding visit function. Suppose that the level of Id does not depend on any attribute (i.e., it is a constant).
    C  :: ENV envin → Int level, ENV envout, CODE comb
    Id :: → Int level

    C ::= p Id
        C.level  := Id.level
        C.envout := f C.envin
        C.comb   := g C.envin

    VS(p) = VSS(p,1) = Visit Id,1 ; Def C.level ; Def C.envout ; Def C.comb ; VisitParent 1
    visit_C_1 (p Id) C.envin = (C.level, C.envout, C.comb)
      where Id.level = visit_Id_1 Id
            C.level  = Id.level
            C.envout = f C.envin
            C.comb   = g C.envin
Suppose production p is often applied with the same Id in a tree. Then all these occurrences will be shared. To get the level of C, the visit function visit_C_1 will be called often with the same tree but a different inherited attribute envin. Because the level of C and Id does not depend on any inherited attribute, it will always be the same. So it would be better to have a single visit function which only computes the level (in order to get more cache hits). In other words, the partition of the synthesized attributes of C computed by visit_C_1 should be further sub-partitioned.
Single synthesized and disjoint fully connected visit functions  One approach is to derive one visit function for each synthesized attribute. The visit sequence and visit functions of our previous example would then look like this:

    VS(p) = VSS(p,1) ; VSS(p,2) ; VSS(p,3)
    VSS(p,1) = Visit Id,1 ; Def C.level ; VisitParent 1
    VSS(p,2) = Def C.envout ; VisitParent 2
    VSS(p,3) = Def C.comb ; VisitParent 3

    visit_C_1 (p Id) = C.level
      where Id.level = visit_Id_1 Id
            C.level  = Id.level

    visit_C_2 (p Id) C.envin = C.envout
      where C.envout = f C.envin

    visit_C_3 (p Id) C.envin = C.comb
      where C.comb = g C.envin
Now we have one separate visit function which computes the level. But the synthesized attributes of the last two visit functions both depend on the same inherited attribute, and might easily be put into one second visit function! Visit functions which are partitioned in this way will be called disjoint fully connected visit functions.
How to compute all levels of granularity  Recall that step 3 of Kastens' algorithm partitions the attributes of each nonterminal N into partitions I_1^N, S_1^N, ..., I_{nov_N}^N, S_{nov_N}^N. Single synthesized visit functions can be obtained by constraining |S_j^N| = 1 (1 ≤ j ≤ nov_N) during the computation of the partitions in step 3. Disjoint fully connected visit functions are obtained by splitting each I_v^N, S_v^N pair as follows.

Suppose nonterminal C has two inherited attributes (1, 2) and four synthesized attributes (3, 4, 5, 6). Let the transitive dependencies among the attributes of C that
exist at any occurrence of C in any derivation tree be as shown in Figure 4.5(a). The edges between the synthesized attributes ((3,4), (3,5) and (4,5)) are induced by dependencies throughout the tree. When Kastens' partitioning is used, all attributes of C are computed in one visit, visit_C_1 (as indicated by the circle in Figure 4.5(a)). The disjoint fully connected partitioning of Figure 4.5(b) is obtained from Figure 4.5(a) by clustering synthesized attributes which have common inherited attributes, in the following way:
(a) Kastens' partitioning: I1 = {1,2}, S1 = {3,4,5,6}; all six attributes are computed by visit_C_1.
(b) Disjoint fully connected partitions: I1 = {}, S1 = {3}; I2 = {1}, S2 = {4,5}; I3 = {2}, S3 = {6}; computed by visit_C_1, visit_C_2 and visit_C_3, respectively.
Figure 4.5: Kastens partitioning (a) and disjoint fully connected partitioning (b)
Algorithm 4.3

1. Let G be the dependency graph between the attributes, as shown in Figure 4.5(a).

2. Remove all edges between synthesized attributes in G.

3. Make all edges in G undirected.

4. Compute the transitive closure of G.

5. Add all edges removed in step 2.

6. Do a topological sort of the disjoint fully connected partitions in G and make them the new partitions (resulting in I1,S1, I2,S2 and I3,S3 in Figure 4.5(b)).

end algorithm
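A hedged Haskell sketch of the clustering in steps 2-4 (ours; attributes are numbered as in Figure 4.5, and a simple component merge replaces the explicit transitive closure):

    import Data.List (nub, partition)

    type Attr = Int
    type Dep  = (Attr, Attr)   -- (inherited, synthesized) dependency

    -- Cluster synthesized attributes that share inherited attributes by
    -- building connected components of the undirected dependency graph.
    components :: [Dep] -> [[Attr]]
    components = foldr insertDep []
      where
        insertDep (i, s) comps =
          let (touching, rest) = partition (\c -> i `elem` c || s `elem` c) comps
          in nub (i : s : concat touching) : rest

    -- Figure 4.5(a): dependencies 1 -> {4,5} and 2 -> {6} give
    --   components [(1,4),(1,5),(2,6)] == [[1,4,5],[2,6]]
    -- Attribute 3 depends on no inherited attribute and forms the
    -- partition I = {}, S = {3} of Figure 4.5(b) on its own.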
Many other ways of constructing partitions which are more fine-grained than Kastens' are possible. This is just one approach, which seems to give better incremental behaviour of the visit functions.
4.2.2.3 Greedy or just-in-time evaluation

Step 5 of Kastens' algorithm emits the visit sequences by what is essentially a topological sort of each of the grammar's TDP (Transitive Dependencies in a Production) graphs. Kastens' topological sort takes a greedy approach: compute attributes as soon as possible in a visit sequence, even if their first use is in a future visit. This greedy method, however, has consequences for the bindings, because attributes will be computed as soon as they can be computed and have to be stored in bindings if they are needed in future visits. Therefore, we have also implemented the opposite of greedy evaluation, which we call just-in-time evaluation. With this method, attributes are scheduled for computation just in time.
4.2.3 Effect on the amount of bindings in "real" grammars

This subsection shows the effect of the binding and visit function optimizations on the static size of bindings in three "real" grammars. The following three grammars were analyzed:

Super: the supercombinator compiler as explained in the last chapter.

Pascal: the Synthesizer Generator Pascal demo editor with static semantic checking.

Format: the Synthesizer Generator text formatting demo editor for right-justified, paginated text.
The table in Figure 4.6 is organized as follows. The figures for each grammar are shown in a single row. A row starts with the name of the grammar, together with the size of the grammar: the total number of nonterminals (nt) and the total number of productions (pr). The rest of the row is subdivided over the six different visit function evaluators w.r.t. granularity and greediness: KG (Kastens Greedy), KJ (Kastens Just in time), FG (Fully connected Greedy), FJ (Fully connected Just in time), SG (Single synthesized Greedy) and SJ (Single synthesized Just in time). For each visit function evaluator the table shows the total number of attribute occurrences which have to be stored in bindings (ba), the maximum number of visits to a nonterminal (mv), and, both before and after applying the removal of bindings which are guaranteed to be always empty, the total number of added synthesized binding attributes (sb) and the total number of nonterminals (nb) which have such attributes added, as discussed in Chapter 3, Section 3.5.2.
Consider, for example, the single synthesized grain non-greedy (SJ) visit functions in Figure 4.6 for the supercombinator compiler:
11 binding elements (ba) are responsible for all bindings. The maximum number of visits to a nonterminal (mv) is 4.
8 synthesized binding attributes (sb) have to be added to 3 nonterminals (nb). After removal of bindings which are always empty, 4 sb and 2 nb remain.
The following can be noted per grammar in the table of Figure 4.6:
Super
The Kastens visit functions (KG and KJ) of this grammar are single-visit, so there are no bindings. There are 3 visits to a nonterminal in disjoint fully connected (FG and FJ) visit functions. Closer inspection of the generated visit functions reveals (not shown in the table) that there is only one nonterminal cexp with 3 visits in the FG and FJ case. In the SG and SJ case there are 4 visits to cexp. So two synthesized attributes were clustered in the FG and FJ case.
Pascal
The maximum number of visits in the Kastens (KG and KJ) and the disjoint fully connected case (FG and FJ) is the same. The number of nonterminals with the maximum number of visits, however, is different. Closer inspection reveals (not shown in the table) that in the Kastens case there is only one nonterminal with three visits, but in the disjoint fully connected case there are nine nonterminals with three visits. In order to generate the disjoint fully connected and the single synthesized grain visit functions, several "type 3 circularities" were successfully removed. A "type 3 circularity" [RT88] indicates that the grammar is definitely not circular, but that there is a circularity induced by the dependencies that are added between partitions. Such circularities can be removed by adding extra dependencies. The removal of these "type 3 circularities" had no effect on the Kastens partitioning.
Format
In the Kastens (KG and KJ) and the disjoint fully connected (FG and FJ) case there is a maximum of two visits to a nonterminal, in both cases to one and the same nonterminal (not shown in the table). So here we see an example for which the disjoint fully connected partitioning did not work well. The explanation is that the attributes are too strongly connected with each other.
                             ba    mv    sb (nb) before    sb (nb) after
    Super (8 nt, 23 pr)
        KG                    0     1       0  (0)             0  (0)
        KJ                    0     1       0  (0)             0  (0)
        FG                   12     3       5  (3)             5  (3)
        FJ                    6     3       5  (3)             2  (2)
        SG                   18     4       8  (3)             7  (3)
        SJ                   11     4       8  (3)             4  (2)
    Pascal (79 nt, 203 pr)
        KG                   41     3      35 (33)            23 (22)
        KJ                   41     3      35 (33)            18 (17)
        FG                   49     3      57 (39)            34 (26)
        FJ                   50     3      57 (39)            24 (20)
        SG                  165     5      95 (46)            57 (33)
        SJ                  113     5      95 (46)            48 (27)
    Format (4 nt, 7 pr)
        KG                   12     2       1  (1)             1  (1)
        KJ                   12     2       1  (1)             1  (1)
        FG                   12     2       1  (1)             1  (1)
        FJ                   12     2       1  (1)             1  (1)
        SG                   69     9      75  (3)            23  (3)
        SJ                   61     9      75  (3)            23  (3)

(sb (nb): the number of added synthesized binding attributes and, in parentheses, the number of nonterminals with such attributes, before and after the removal of always-empty bindings.)
Figure 4.6: The effect of the static optimizations on the amount of bindings in several "real" grammars.

A general observation is that the greedy visit functions generate more attributes in bindings (ba) than the non-greedy versions. This was to be expected, because with greedy evaluation attributes are scheduled for computation as soon as they can be computed, and therefore have to be saved in bindings. Furthermore, note that the removal of bindings which are guaranteed to be always empty reduces the number of added synthesized binding attributes (sb) and the number of nonterminals which have such attributes added (nb) by up to a half.
4.3 An abstract HAG-machine This section discusses a general abstract implementation of a HAG-evaluator (a HAGmachine) based on memo functions and hash consing for trees and bindings, as proposed in the previous chapter. There are three reasons why an abstract machine is discussed here:
to give precise definitions for garbage collection and purging, which will be used later on,
to provide a framework for the discussion of a space-for-time saving method for the abstract HAG-machine, and
to provide a framework for understanding the prototype HAG-machine in Gofer.

For the rest of this section it is assumed that all trees and bindings will be hash-consed as described in Chapter 3. Memo-ization of functions is implemented in the same way.
4.3.1 Major data structures Five major data structures can be distinguished in our machine:
The visit functions will be evaluated on a stack.

A hash table which contains references to memo-ed tree constructor calls (tree nodes).

A hash table which contains references to memo-ed binding constructor calls.

A hash table which contains references to memo-ed function calls.

A heap will be used to store objects.
4.3.2 Objects

The following objects are distinguished:

1. Non-tree attribute values. They are stored on the stack and directly in bindings and memo-ed function calls.

2. Tree nodes. They are represented uniquely by a reference, built using hash-consing and stored in the heap.

3. Bindings. They are represented uniquely by a reference, built using hash-consing and stored in the heap. They contain non-tree attribute values and tree attribute values (tree nodes) as elements.

4. Memo-ed function calls. They are represented uniquely by a reference and stored in the heap. They contain a function name, its input parameters and the corresponding result.
4.3.3 Visit functions

The HAG-evaluator consists of a set of recursive visit functions which call each other. The results and arguments of visit functions can be any object, except memo-ed function call entries. Visit functions will be memo-ed. This means that all invocations of visit functions can be thought of as being encapsulated by the function memo, with signature
    memo : visit function → tree → inh. attrs → bindings → (syn. attrs, bindings)
As a side effect, memo creates memo-ed function call entries on the heap. The main loop of the HAG-machine is evaluated on the stack and is as follows:

    Shared_root := initial tree
    while true do
        Shared_root := user_edits(Shared_root)    { edit the old tree and construct the new one }
        results     := memo( visit_R_1, Shared_root, input parameters )
    od
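A hedged Haskell rendering of memo and this loop (all names and the IORef-based table of memo-ed calls are ours; the actual prototype of Section 4.6 is in Gofer and differs in detail):

    import Data.IORef
    import qualified Data.Map as Map

    -- Trees are hash-consed, so nodeId identifies a subtree uniquely.
    data Tree = Node { nodeId :: Int, sons :: [Tree] }

    type Key  = (String, Int, [Int])        -- function name, tree id, arguments
    type Heap = IORef (Map.Map Key [Int])   -- memo-ed function call entries

    memo :: Heap -> String -> (Tree -> [Int] -> [Int]) -> Tree -> [Int] -> IO [Int]
    memo heap name f t args = do
      tbl <- readIORef heap
      let key = (name, nodeId t, args)
      case Map.lookup key tbl of
        Just r  -> return r                               -- found on the heap
        Nothing -> do let r = f t args                    -- evaluate the visit
                      modifyIORef heap (Map.insert key r) -- new heap entry
                      return r

    mainLoop :: Heap -> (Tree -> IO Tree) -> (Tree -> [Int] -> [Int]) -> Tree -> IO ()
    mainLoop heap userEdits visitR1 tree = do
      tree'   <- userEdits tree            -- edit the old tree, build the new one
      results <- memo heap "visit_R_1" visitR1 tree' []
      print results
      mainLoop heap userEdits visitR1 tree'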
The stack will not only contain the visit functions; at the bottom of the stack there will also be a global variable called Shared_root, which contains the root of the parse tree. An example world with some inhabitants during attribute evaluation is shown in Figure 4.7.
Figure 4.7: A snapshot of the stack, heap and hash tables during attribute evaluation.
4.3.4 The lifetime of objects in the heap In this section we will discuss the lifetime of objects in the heap. The following two properties should hold in our machine:
Property 4.3.1 Objects on the heap which are referenced from the stack, the hash tables or the heap will not be deleted from the heap. Objects on the heap which are not referenced may be deleted from the heap at any time.
Property 4.3.2 References from the hash tables to heap objects may be deleted at any time.
Removing references from the hash tables will not cause objects which are essential for the attribute evaluation to be deleted, since they are referenced from the stack. Removing references from the hash tables, however, has an effect on the amount of tree and binding sharing and the amount of memo-hits in future attribute evaluations; both amounts are likely to decrease when references from any of the hash tables are deleted, thus resulting in more time-consuming re-evaluations.
4.3.5 Definition of purging and garbage collection

We are now ready to formalize the meaning of garbage collection and purging.
Definition 4.3.1 Garbage collection is the removal of heap objects which are not referenced (in order to create new heap space).
Definition 4.3.2 Purging is the removal of references from hash tables followed by garbage collection.
We will call the removal of references from the function call hash table function call purging. Tree purging and binding purging are defined in the same way. The performance and space consumption of our incremental evaluator depend heavily on having good purging strategies. Note that memo-ed visit function calls are thus far only reachable from the hash table of memo-ed function calls, and thus purging will indeed lead to the effective reclaiming of garbage cells.
4.4 A space for time optimization

This section discusses a space for time optimization (called the pruning optimization) for the abstract HAG-machine described in the previous section. First the pruning optimization is described. Then a criterion will be given for static detection of the applicability of the optimization.
4.4.1 The pruning optimization

The idea of the pruning optimization is as follows. Suppose the result of a memo-ed function call is a large tree. In order to save the memory occupied by the tree, the tree can be replaced by a reference to the memo-ed function call by which it was created. When the tree is needed again it may be recomputed by re-evaluation of the memo-ed function call.

Consider for example an incremental LaTeX editor with two screens: one for straight text input and the other one showing the formatted text. The formatted text is represented by a tree. Not all of the formatted text will be shown on the output screen. The parts of the formatted text which are not shown could be pruned with the pruning optimization in order to save memory. When the formatted text is needed again, it can be recomputed. The definition of pruning is as follows:
Definition 4.4.1 Pruning is the replacement of a reference to a tree by a reference to a memo-ed function call which created that tree.
Note that purging removes references from hash tables, whereas pruning removes a reference from inside the heap. An example of how the pruning optimization works is shown in Figure 4.8, where the following can be noted:
Figure 4.8: An example of the pruning optimization: replacement of a tree (pointed to by r) by a reference f to a memo-ed function call which computed that tree. Before (a) and after (b) the replacement.
The reference r (representing a tree) points to the same node in (a) and (b).
References from the original tree to its sons (s1, s2) will be cut after the replacement (b). As a result, a (possibly large) tree may be purged and collected from the heap.
The memo-ed function call becomes indirectly reachable via r and f from the stack in (b).
As soon as reference r is de-referenced, the memo-ed function call has to be re-invoked in order to recompute the tree; the situation of (a) is thus re-established.
The recomputation of a memo-ed function call only succeeds when the arguments of the function stay intact.
We will pay some attention to the last condition. In the example of Figure 4.8 the root of the tree to be pruned is not reachable from the arguments of the memo-ed function call. If the root is part of the arguments, however, then the pruning is not possible since it would destroy the arguments of the memo-ed function call. This leads to the following condition for the pruning optimization to be applicable:
Condition 4.4.1 If there are no references from the arguments of the memo-ed visit function entry to the root of the tree to be replaced, then that tree can be replaced by a reference to the memo-ed visit function.

Note that this method can also be used for other objects (like bindings) computed by visit functions. In order to detect whether the root of the result tree is part of the arguments, the arguments can be tested for the presence of the root.
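As an illustration, the pruning mechanism can be sketched in Haskell as follows (Tree, Call and all function names are our own stand-ins, not part of the abstract machine):

    -- A reference holds either the tree itself or the memo-ed call that
    -- can rebuild it.  Pruning is only safe when the call's arguments
    -- stay intact (Condition 4.4.1).
    data Tree    = Tip | Fork Tree Tree            -- stand-in tree type
    data Call    = Call ([Tree] -> Tree) [Tree]    -- function and its arguments
    data TreeRef = Materialized Tree | Pruned Call

    prune :: Call -> TreeRef -> TreeRef
    prune call _ = Pruned call        -- drop the tree, keep only the call

    force :: TreeRef -> Tree          -- de-referencing re-invokes the call
    force (Materialized t)       = t
    force (Pruned (Call f args)) = f args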
4.4.2 Static detection

The root of a result tree can't be part of an argument if the root cell is always constructed by a constructor function which is guaranteed never to be used during the construction of an argument. In order to guarantee this statically we approximate this condition by computing the sets of all possibly occurring constructor functions in the arguments and the result. If these two sets are disjoint then it is safe to prune the result.

As an example consider the visit function visit_STAT :: STAT → BOX which translates statements to boxes. All productions for the nonterminals STAT and BOX are given in Figure 4.9. For convenience the names of the productions will be used as constructor function names. There are three constructor functions which can be applied at the root of the result of visit_STAT:
    STAT ::= statseq STAT STAT
           | ifstat EXP STAT STAT
           | assign ID EXP
    BOX ::= hconc BOX INT BOX
          | vconc BOX INT BOX
          | create BOX STRING
Figure 4.9: Productions for STAT and BOX.

    RootConstructors(BOX) = {hconc, vconc, create}
The constructors which can be used during the construction of any argument can be computed with the following equation:

    Constructors(STAT) = {statseq, ifstat, assign} ∪ Constructors(EXP) ∪ Constructors(ID)

If RootConstructors(BOX) and Constructors(STAT) are disjoint then the result of visit_STAT is guaranteed to be always prunable. We expect this property to hold especially when the computation actually describes a pass-like structure, i.e. where a large data-structure is computed out of earlier computed data-structures.
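The disjointness test itself is a simple set computation. The following Haskell sketch spells it out for the STAT/BOX example (the EXP and ID constructor sets are left empty here and would in reality be computed from their productions):

    import qualified Data.Set as Set

    type Ctor = String

    ctorsEXP, ctorsID :: Set.Set Ctor
    ctorsEXP = Set.empty   -- to be filled in from the EXP productions
    ctorsID  = Set.empty   -- to be filled in from the ID productions

    -- all constructors possibly occurring in a STAT argument
    ctorsSTAT :: Set.Set Ctor
    ctorsSTAT = Set.fromList ["statseq", "ifstat", "assign"]
                  `Set.union` ctorsEXP `Set.union` ctorsID

    -- constructors that may label the root of the BOX result
    rootCtorsBOX :: Set.Set Ctor
    rootCtorsBOX = Set.fromList ["hconc", "vconc", "create"]

    -- the result of visit_STAT is always prunable if the sets are disjoint
    prunableSTAT :: Bool
    prunableSTAT = Set.null (ctorsSTAT `Set.intersection` rootCtorsBOX)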
4.5 Implementation methods for the HAG-machine

The following two subsections discuss garbage collection methods and purging methods for the HAG-machine.
4.5.1 Garbage collection methods

There are three old and simple algorithms for garbage collection upon which many improvements have been based:

1. Reference counting. Each object has a count which indicates how many references there are to that object. When the count reaches zero, the object can be removed from the heap. This approach is applicable here because we don't have cyclic dependencies.

2. Mark scan collection. Here all references from the stack are followed and each referenced object is marked non-removable. Next, all removable objects are deleted from the heap. See [McC60] for more details.
3. Stop and copy collection. Here all references from the stack are followed and each referenced object is copied to a second heap. Then the old heap is destroyed. For more details and improvements see [FY69, App89, BW88b].

Stop and copy collection does not work directly here. The reason is that hash consing and memo-ing both use addresses for the calculation of a hash index. After a copy, objects are relocated to new addresses, and the hash consing won't work anymore. This problem can be solved as follows. In the original hash-consing algorithm the addresses of the objects are used for calculating the hash-index and testing equality of objects. Instead of the address of an object, a unique tag stored with the object could be used. In that way the references to the objects become transparent for the hash-consing, and stop and copy collection can be applied. All these methods can be used to implement the garbage collection in the HAG-machine.
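The tag solution can be sketched as follows in Haskell (an illustration with invented names, not the thesis's implementation): every cell carries a tag assigned once at allocation, and the hash-cons table is keyed on the tags of the fields rather than on machine addresses, so a copying collector may move cells freely.

    import Data.IORef
    import qualified Data.Map as Map

    type Tag = Int                 -- unique per cell, stable across copies

    data Cell = Cell { tag :: Tag, hd :: Tag, tl :: Tag }

    type ConsTable = IORef (Map.Map (Tag, Tag) Cell, Tag)

    hconsTagged :: ConsTable -> Tag -> Tag -> IO Cell
    hconsTagged table h t = do
      (m, next) <- readIORef table
      case Map.lookup (h, t) m of
        Just c  -> return c        -- a cell with identical fields exists: share it
        Nothing -> do              -- otherwise allocate it under a fresh tag
          let c = Cell next h t
          writeIORef table (Map.insert (h, t) c m, next + 1)
          return c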
4.5.2 Purging methods

A central question in the implementation of a (visit) function caching system is what purging strategy to employ. Earlier work on function caching generally leaves open the question of what purging strategy to employ, relying on the user to explicitly purge items from the cache, or proposes a strategy such as LRU (Least Recently Used) without any analysis of the appropriateness of that strategy. Hilden [Hil76] examined a number of purging schemes experimentally for a specific function and noted that "some intuitively promising policy variants do not seem to work as well as their competitors, and conversely". Pugh [Pug88] describes a formal model that allows the potential of a function cache to be described. This model is then used to design an algorithm for maintaining a function cache. Although this algorithm will choose the best entry to be eliminated, it is mainly of theoretical interest because it assumes the sequence of future function calls to be known and doesn't care about overhead. From this algorithm a practical cache replacement strategy is derived that performs better than currently used strategies.

[Pug88, page 28] compares function caching with paging; deciding which elements to purge from a function cache bears some similarities to deciding which element to purge from a disk or memory cache. However, two basic differences limit the applicability of disk and memory caching schemes for function caching in general and for HAG caches in particular:
The cost to recompute an entry not in the function cache varies, based on both the inherent complexity of the function call and on the other contents of the cache.
The frequency of use of an entry in the function cache depends on what else is in the cache.
For a start, we will examine the strategies LRU, FIFO and LIFO for several "real" grammars in the last section of this chapter. The problem of finding a good purging strategy is a topic for future research.
4.6 A prototype HAG-machine in Gofer

This subsection discusses the instantiation of a HAG-machine in the functional programming environment Gofer [Jon91]. The language supported by Gofer is both syntactically and semantically similar to that of the functional programming language Haskell [FHPJea92]. Some features common to both include:
Non-strict semantics (lazy evaluation)
Higher order functions
Pattern matching

Gofer evaluates expressions using a technique sometimes described as "lazy evaluation", which means that no expression is evaluated until its value is actually needed. We have considered two alternatives, one for simulating and one for implementing a HAG-machine in Gofer:
Simulate function caching. The cache simulation can simply consist of tracing the function calls. After a run of the evaluator the "cache" can be analyzed. If a certain function call occurs more than once this means a cache hit has occurred.
Extend the Gofer implementation with memo functions.

The first alternative works only for small grammars because the function call trace uses far more memory and time than available. Therefore we have chosen the second alternative, allowing us to experiment with several different HAG-evaluators and purging strategies. The extension of Gofer with memo functions was done by [vD92]. The rest of this subsection is organized as follows: first the differences between full and lazy memo functions are explained. Then the lazy memo implementation in Gofer is discussed. Finally, the implementation of a HAG-machine in memo-extended Gofer is discussed.
4.6.1 Full and lazy memo functions

The following introduction to full and lazy memo functions is based on [FH88, Chapter 19]. Other references can be found there. The concept of function memoization was originally introduced by [Mic68], and operates by replacing certain functions by corresponding memo functions. A memo function is like an ordinary function except that it "remembers" some or all of the arguments it has been applied to, together with the corresponding results computed at these occasions. Ordinary memo functions, which we call full memo functions, are required to reuse previously computed results whenever they are applied to arguments equal to previous ones. Lazy memo functions, however, need only do so if they are applied to arguments which are identical to previous ones, that is, to arguments stored in the same place in memory. Two objects are therefore identical if

1. they are stored at the same address, i.e. are accessed by the same pointer;

2. they are equal atomic values, e.g. integers, characters, booleans etc.

Lazy memo functions were introduced with the intention of being used in lazy implementations of functional languages, where the arguments no longer need to be completely evaluated, only to WHNF (Weak Head Normal Form). An important feature of lazy memoization is the way it handles cyclic structures, although this feature will not be used in the Gofer HAG-machine.

To end this discussion on lazy memo functions we will show how full memoization can be achieved by lazy memo functions. The key to this is to ensure that the test for identity becomes equivalent to the test for equality. This is already the case for atoms, and would also be the case if all data-structures were stored uniquely. This means that if any pair of data structures are the same, whether or not they are arguments of memo functions, they must share the same locations in storage. We can define full memo functions in terms of lazy ones by this approach, using a "hashing cons" (see also Figure 3.3). A hashing cons (hcons) is the same as the constructor function cons, but does not allocate a new cell if one already exists with identical head and tail fields. Of course, a hashing version can be defined for any constructor function, but we will restrict our discussion to the list constructor cons for simplicity. We can easily define hcons as a lazy memo function; we shall use Gofer [Jon91] notation but will annotate the declaration with memo to indicate that the function is to be memoized:
    memo
    hcons :: a -> [a] -> [a]
    hcons a b = a : b
Now, using hcons, we can define the function unique that makes a unique copy of an object:

    unique :: [a] -> [a]
    unique (a:b) = hcons a (unique b)
    unique []    = []
Thus, if a and b are two equal structures, unique(a) and unique(b) are identical, which follows easily by structural induction, the claim being true by definition for atomic a and b. Of course, this scheme incurs the same penalties as any other that implements full memo functions, namely complete evaluation of arguments, inefficiency in the comparison of argument values (unique is a recursive function), and increased complexity in managing the memo table.
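Schematically, then, full memoization is recovered from lazy memoization by routing every argument through unique before the identity-based lookup; in the memo-extended Gofer notation of the next subsection this is simply (a sketch):

    -- identity on unique representatives coincides with structural equality
    fullmemo f x = memo f (unique x)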
4.6.2 Lazy memo functions in Gofer

The common way to indicate that a function has to be memoized is to annotate the function definition with the keyword memo. Mainly for ease of implementation another solution was chosen, in the form of extending Gofer with a primitive built-in function. The primitive function memo has two arguments: a function and an argument to that function. It has the signature (in Gofer notation):

    memo :: (a -> b) -> a -> b
The call memo f x

1. evaluates both f and x to weak head normal form;

2. if f was already applied to x (x is identical to a previous argument) then the memoized value is returned, else the call (f x) is evaluated to weak head normal form, and the result is stored and returned.

The following example shows a memoized version of the Fibonacci function in Gofer:

    mfib 0 = 1
    mfib 1 = 1
    mfib n = (memo mfib (n-1)) + (memo mfib (n-2))
A call to mfib n with n > 1, however, will result in at least two calls to (memo mfib) because the top-level application of mfib isn't memoized. A fully memoized version of the function can be achieved by defining

    memofib = memo mfib
and using memofib in the top-level application.

In the current implementation only integers and characters are considered to be atomic. All other objects have to be memoized, as described earlier, in order to ensure that equally structured objects are identical. In the current implementation the cache is organized as a hash table with a list of function/result entries (cache entries) at each index; the memo-ed function calls and their results are stored there. The function name is used for the hashing to an index. Three purging strategies are implemented on the list of cache entries at each index: LRU, FIFO and LIFO. Purging takes place when the total number of cache entries in the cache exceeds a user-settable purge limit. The mark scan garbage collection in Gofer was adapted to handle the cache properly. The next subsection discusses an implementation of a HAG-evaluator with the use of lazy memo functions in Gofer.
4.6.3 A Gofer HAG-machine

The HAG-evaluator consists of definitions for visit functions, tree constructor functions, binding constructor functions, and semantic functions. The visit functions will be memo-ed. All tree constructor functions and binding constructor functions will be hash-consed with the help of memo functions. Furthermore, all non-integer and non-character values (integers and characters are the only atomic objects) will be hash-consed. The following two alternatives for memoing functions with more than one argument were considered (both are sketched below):

1. f x y can be memo-ed by memo (memo f x) y.

2. f x y can be memo-ed by (memo f) (tpl2 x y), where tpl2 is a memo-ed tuple constructor and the definition of f x y becomes f (x,y).

We have taken the latter approach because this allows us to read off the hits on f directly, which was not possible when using the first alternative.
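In the memo-extended Gofer notation the two alternatives look as follows (a sketch; memo2a and memo2b are our own auxiliary names):

    -- alternative 1: memoize stage by stage; hits are spread over the
    -- partial applications
    memo2a f x y = memo (memo f x) y

    -- alternative 2 (the one chosen): f is defined on pairs, f (x,y) = ...,
    -- and tpl2 is a memo-ed tuple constructor, so equal argument pairs
    -- are identical and all hits show up on f itself
    memo2b f x y = memo f (tpl2 x y)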
By hash-consing all data-structures, and thus implicitly realising the effects of the function unique, we have already converted all visit functions into their strict counterparts. By prefixing all semantic functions by the Gofer built-in operator strict, we have finally succeeded in converting an attribute evaluator which essentially depended on lazy semantics into one with strict semantics. This evaluator models equivalent implementations in more conventional languages like Pascal or C.
4.7 Tests with the prototype HAG-machine

Here we show some results of tests with the Gofer prototype HAG-machine. In order to test grammars of non-trivial size we have built a tool which generates the six different Gofer HAG-evaluators (KG, KJ, FG, FJ, SG and SJ) from an SSL-grammar. Many tests are possible with the Gofer prototype. The results of the tests shown here serve only as a limited indication of how such evaluators behave. No general conclusions should be drawn from these tests.

The generated evaluators have as input a parse tree and as output the display unparsing as it would be shown in the Synthesizer Generator. The Gofer memo implementation will show the hits and misses for the visit, tree and binding function calls after each evaluation. Furthermore, we have a function test_hits which takes as arguments the type of HAG-evaluator to be used, two slightly different abstract syntax trees, the purging strategy and the cache size. The cache size denotes the maximum number of cache entries in the cache. When this number is exceeded purging will take place. The call

    test_hits ev_type T1 T2 purge_type cache_size

results in four figures (here vcalls(T) and ccalls(T) denote respectively the number of visit function calls and the number of tree and binding constructor calls needed to evaluate T, and vcalls_nocache(T) denotes the number of visit function calls needed to evaluate T with a cache size of 0):
the percentage of visit function calls needed for evaluating T2 (thereby using cache entries generated by the evaluation of T1) after evaluating T1,

    100 × vcalls(T2 after T1) / vcalls_nocache(T2)

the percentage of constructor calls needed for evaluating T2 after evaluating T1,

    100 × ccalls(T2 after T1) / ccalls_nocache(T2)

the percentage of visit function calls needed for evaluating T2 only (from scratch),

    100 × vcalls(T2) / vcalls_nocache(T2)

the total number of visit function, tree, binding and memo-tupling calls (or, in other words, the total misses) in evaluating T2 after evaluating T1.
The most interesting figures are the "percentage of needed visit function calls", because saving such a call means skipping a visit to a (possibly) large subtree.
4.7.1 Visit function optimizations versus cache behaviour

In this paragraph we are interested in the incremental behaviour of the six HAG-evaluators. Therefore we have tested the supercombinator compiler grammar. In order to get an idea of the performance of the HAG-evaluators we have performed 30 subtree replacements on abstract syntax trees for the supercombinator compiler. No purge strategy and an infinite (in practice large enough) cache was used. Suppose R is the set which contains the 30 pairs of abstract syntax trees (the 30 subtree replacements); then Figure 4.10 shows for each evaluator type (ev_type ∈ {KG, KJ, FG, FJ, SG, SJ}) the average percentages of the results of all calls to test_hits in the formula:
    ∀(T1,T2) ∈ R : test_hits ev_type T1 T2 none ∞
The following can be noted in Figure 4.10:
The FG (Fully connected Greedy) HAG-evaluator has the greatest reduction in percentage of visit function calls of all evaluators. FG (36%) is a factor of 2 better in reduction of percentage of visit function calls than the KG (78%) evaluator. This is because it uses the lowest percentage of visit function calls in the non-incremental case (50%).
The Greedy versions of the F and S evaluators both use a smaller percentage of visit function calls than the Just in time versions. There is no difference between the KG and KJ case because both are single visit.
A possible explanation for the better performance of the Greedy F and S evaluators is that many attributes are computed by non-injective functions. So, early computation of attributes might lead to the same results as previous values and, consequently, more visit functions will be called with the same arguments.
(Key to Figure 4.10: K = Kastens, F = Fully connected, S = Single synthesized; G = Greedy, J = Just in time. The vertical axis shows the average needed calls in %, from 0 to 100, for the evaluators KG, KJ, FG, FJ, SG and SJ.)
Figure 4.10: The average percentage of needed calls for 30 subtree replacements in several HAG-evaluators for the supercombinator compiler. The black bars show the average needed percentage of visit function calls after a subtree replacement. The white bars show the average needed percentage of visit function calls for the attribution of the final tree only. The dashed bars show the average needed percentage of tree and binding constructor calls for the incremental case.
4.7.2 Purge methods versus cache behaviour

In this paragraph we are interested in the incremental behaviour of the purge strategies LRU, FIFO and LIFO. The same set of 30 subtree replacements as in the previous paragraph was taken. The HAG-evaluator FG was taken for the whole test. Suppose R is the set which contains the 30 pairs of abstract syntax trees (the 30 subtree replacements); then Figure 4.11 shows one line for each purge type (purge_type ∈ {LRU, FIFO, LIFO}). Each line was obtained by measuring at several different cache sizes (cache_size ∈ {0, 50, 150, ..., 3500}). Each thus obtained point shows the average total number of all needed calls (or, in other words, all misses) of the results of all calls to test_hits in the formula:
    ∀(T1,T2) ∈ R : test_hits FG T1 T2 purge_type cache_size
The following can be noted in Figure 4.11:
For cache sizes less than 1000 and greater than 1600 the strategies LRU and FIFO seem to be better than LIFO. Between 1000 and 1600 LIFO seems to be better than LRU and FIFO.
The total number of average needed total calls is 3500. This explains why all three curves become flat for a cache size near 3000 and higher.
4.8 Future work and conclusions

The following questions remain open:

In [Pug88] a mathematical prediction model for a function cache is described. From this model a practical purging strategy algorithm is derived. Can a practical purging strategy for our HAG-evaluator be derived in the same way? Are there other, better, purging strategies?
Is the space for time pruning optimization really necessary, and will it be possible to implement it efficiently?
What is the general behaviour of the HAG-evaluator? The results of the tests give only a limited indication of the behaviour of the HAG-evaluator. No general conclusions can be drawn from these results. More grammars should be tested in order to draw general conclusions.
(Figure 4.11 plots the average total needed calls, from 0 to 3500, against the cache size, from 0 to 3000, for the purge strategies LRU, FIFO and LIFO.)
Figure 4.11: A comparison of LRU, FIFO and LIFO purge strategies.
How do our HAG-machine and optimizations compare in practice with the techniques used in the Synthesizer Generator [RT88]? In order to get a fair comparison, the HAG-evaluator should be implemented in a fast imperative language, like C or Pascal, which is straightforward since our visit functions are strict.
We have shown the design dimensions of the HAG-evaluator described in the previous chapter. Several optimizations were shown and implemented, and the effect of the optimizations on the static and dynamic parts of a HAG-evaluator was shown. Furthermore, a HAG-machine (a general abstract implementation of a HAG-evaluator) was proposed. Several implementation models and optimizations were discussed, of which the new space for time optimization makes it possible to discard large intermediate results and recompute them when they are needed again. Then, a prototype HAG-machine in the functional language Gofer (extended with memo functions) was discussed. A tool was designed to translate some large "real" SSL-grammars into Gofer. The chapter ended with the results of some tests.
Chapter 5 Applications

This chapter discusses two HAGs. The first section discusses a prototype program transformation system which was developed with the SG in four man-months. This is very fast compared with the development time of other program transformation systems. The prototype supports the construction and manipulation of equational proofs, possibly interspersed with text. Its intended use is in writing papers on algorithm design, automated checking of the derivation, and providing mechanical help during the derivation. The editor supports on-line definition of tree transformations (so-called dynamic transformations); they can be inserted and deleted during an edit-session, which is currently not supported by the SG. The whole prototype, including the dynamic transformations, was written as an attribute grammar. The second section discusses a compiler for supercombinators and is an example of the use of higher order attribute grammars.
5.1 The BMF-editor

This section describes a prototype program transformation system made in four man-months with the attribute grammar based Synthesizer Generator (SG) [RTD83]. The prototype transformation system (the BMF-editor) supports the interactive derivation of equational proofs in the Bird-Meertens formalism (BMF) [Bir87, Mee86]. Doing a derivation in BMF boils down to repeatedly applying transformations to BMF-formulas. For a BMF-editor to be of practical use, the user should be able to add transformations which are derived during the development, so they can be reused further on in the derivation. The transformations supported by the SG, however, can only be entered at editor-specification time. Dynamic transformations can be entered and deleted during the edit-session. Furthermore, the applicability and direction of applicability of a dynamic transformation on a formula is indicated and updated incrementally. The dynamic transformations are implemented with an attribute grammar.

The CSG proof editor [RA84] is a proof-checking editor where the inference rules are embedded in the editor as an attribute grammar. The editor keeps the user informed of errors and inconsistencies in a proof by re-examining the proof's constraints after each modification. In the attribute grammar based Interactive Proof Editor (IPE) [Rit88] the applicability of a dynamic transformation can be shown on demand but not incrementally.

The use of an attribute grammar based system like the SG was the key to the relatively easy and fast development of the BMF-editor. First, because the SG generates a user interface and environment for free. Second, because BMF-formulas, dynamic transformations and the derivation itself are represented easily by attribute grammars. Third, because all the incremental algorithms in the BMF-editor are generated for free without any explicit programming.

The functionality of the BMF-editor lies somewhere between a full-fledged program transformation system and a computer supported system for equational or formal reasoning. The construction time of program transformation systems like the PROSPECTRA system [KBHG+87], the KIDS system, the TAMPR system and the CIP-S system (for an overview see [PS83]) was considerably longer because almost all these systems were totally written by hand without using any tools. The construction time of computer supported systems for formal reasoning like LCF, NuPRL, the Boyer-Moore theorem prover and the CSG proof editor (for an overview see [Lin88]) was in most cases also considerably longer for the same reasons.

The complete BMF-editor, including the dynamic transformations, consists of 3700 lines of pure SSL (the attribute grammar specification language of the SG), without using any non-standard SSL constructions. Therefore, the system is easily portable to any machine capable of running the SG. The whole BMF-editor was written by Aswin van den Berg [vdB90]. For a great part it was this exercise which prompted the development of HAGs, since HAGs could have been helpful for implementing parts of the BMF-editor. During the time of development of the BMF-editor, however, HAGs were not yet implemented in the SG. Fortunately, the SG provided facilities to simulate the effects of HAGs. Such simulations were, however, hard to write and understand, which made the implementation of some parts of the BMF-editor a tedious process.

The rest of this section is organized as follows. Subsection 5.1.1 introduces BMF and shows a sample derivation in BMF. Subsection 5.1.2 discusses the components, the look and feel, the abstract syntax and the dynamic transformations of the BMF-editor. A large example of a derivation with the editor is presented at the end of Subsection 5.1.2. Further suggestions for improving the editor are discussed in Subsection 5.1.3. Finally, the conclusions are presented in Subsection 5.1.4.
5.1.1 The Bird-Meertens Formalism (BMF)

BMF is a lucid proof style based upon equational reasoning. A derivation in BMF starts off from an obviously correct, but possibly very inefficient algorithm which is transformed into an efficient algorithm, by making small, correctness-preserving transformation steps using a library of rules. Each transformation step rewrites (part of) a formula by another formula.

For a BMF-editor to be of practical use it should be possible to intersperse text together with the development of the program. This is similar to the WEB-system described in [Knu84]. The difference with the WEB-system is that we want to derive programs from specifications using small correctness-preserving transformations, instead of using a stepwise refinement approach. By using a transformation system which contains a library of rules, it is possible to verify and steer our derivation, thereby overcoming the proof obligation still present in the WEB-system. Furthermore, as in the WEB-system, it should be possible to filter the final program out of the file containing the text and the derivation. Just transforming would then be the same as writing articles in the system without writing text. Because we believe that proofs (or derivations) have to be engineered by a human rather than by the computer, we insist on manual operation. Therefore, the program transformation system can be considered to be a specialized editor.
5.1.1.1 Some basic BMF

Here we present some basic BMF. In the following subsection we use this in a small derivation. This short introduction was inspired by [Bir87]. All operators work on lists, lists of lists, or elements of lists (integers or lists). Lists are finite sequences of values of the same type. Enumerated lists will be denoted using square brackets. The primitive operation on lists is concatenation, denoted by the sign ++. For example:

    [1] ++ [2] ++ [1] = [1; 2; 1]

The operator / (pronounced "reduce") takes a binary operator on its left and a list on its right and "puts" the operator between the elements of the list. For example,

    ++/ [[1]; [2]; [1]] = [1] ++ [2] ++ [1]

Binary operators can be sectioned. For example, (⊕ 1) denotes the function

    (⊕ 1) 2 = 2 ⊕ 1

The brackets here are essential and should not be omitted.
The operator * (pronounced "map") takes a function on its left and a list on its right and applies the function to all elements of its list. For example,

    (plus 1) * [1; 2; 1] = [(plus 1) 1; (plus 1) 2; (plus 1) 1]

Function application associates to the left; function composition is denoted by a centralized dot (·) which has higher priority than application.
5.1.1.2 A sample derivation

The following transformations are used in the forthcoming derivation:

    lif == (plus 1)* · ++/        {Definition of lif}
    F* · ++/ == ++/ · F**         {Map promotion}

The first rule defines the function lif, which concatenates all sublists of the list and then increments all elements of the list by one. The map promotion rule states that first concatenating lists and then mapping F is the same as first mapping F to all the sublists of the list and then concatenating the results. Here F is a free variable, which can be bound to a BMF-formula.

The following (short) derivation states that the function lif can be computed by first concatenating all sublist(s) of the list (++/) and then incrementing all elements of the resulting list ((plus 1)*), or by first incrementing all the elements of the sublist(s) of the list ((plus 1)**) and then concatenating the result (++/). The names of the applied transformation rules are shown between braces.

    lif
      =  {Definition of lif}
         (plus 1)* · ++/
      =  {Map promotion}
         ++/ · (plus 1)**

In each transformation step the selected transformation is applied to the selected term. For example, in the second step of the sample derivation the map promotion rule is the selected transformation and (plus 1)* · ++/ is the selected term. Note, however, that the selected term is not necessarily the complete term.
5.1.2 The BMF-editor

The BMF-editor supports the components used in the sample derivation. First these components will be discussed. Then the appearance to the user, the possible user actions, the abstract syntax of BMF-formulas and the system, and, finally, the dynamic transformations will be discussed.

A library of rules (the dynamic transformations) is supported and adding newly derived rules to this library is straightforward. The direction in which (a subset of all) transformations are applicable on a newly selected (part of a) BMF-formula is updated incrementally and shown directly on the screen. Just as in a written derivation, the system keeps track of the history of the derivation. Furthermore, it is possible to start a (different) subderivation anywhere in the tree. Therefore, a forest of derivations is supported, thus facilitating a trial and error approach to algorithm development. Because a typical BMF-notation uses many non-ascii symbols, it has been made possible to select an arbitrary notation (e.g. LaTeX) as unparsing for the internal representation of a BMF-formula. For this purpose, the editor maintains an editable list of displaybindings.
5.1.2.1 Appearance to the user
Figure 5.1: The Base View and Display Bindings View of the sample derivation in the BMF-editor; the Display Bindings in the Base View are hidden.

The editor displays the definitions of the dynamic transformations and the derivation in almost the same order as in the sample derivation. Transformations are shown as two BMF-formulae separated by an ==-sign. A transformation is preceded by its name. The direction in which a transformation may be applied to a BMF-formula is denoted by < and > signs in the ==-sign. The selected transformation and the selected term are shown between the dynamic transformations and the forest of derivations. Nodes in the derivation tree are labeled with BMF-formulae; the edges of the tree are marked with justifications. A justification is a reference to a transformation in the list of dynamic transformations. At all times only one path in the derivation tree is displayed. Left and right branches are indicated by ^-symbols. A displaybinding is shown as the internal representation of the BMF-formula followed by the unparsing. The dynamic transformations, the selected transformation and term, the derivation and the displaybindings are shown on the Base View and main window. The dynamic transformations and the displaybindings in the Base View can be hidden by the user. Beside the Base View, various other views on the main window are possible. There is one global cursor for all views. The following other views are available:
Transformations View
Displays all dynamic transformations.
Applicable Transformations View
Displays all the transformations that are applicable on a subterm of a selected term.
Transformable Terms View
Displays all (sub)terms in the whole derivation on which a selected transformation is applicable. These terms are shown together with the possible results of the transforming.
DisplayBindings View
Displays all displaybindings.
Figure 5.1 shows the Base View and Display Bindings View of the sample derivation in the BMF-editor; the Display Bindings in the Base View are hidden.
5.1.2.2 User actions

A dynamic transformation can be inserted and deleted by edit-operations. A BMF-formula can be entered by structure editing or by typing the internal representation of a BMF-formula. There are shortcuts for frequently used BMF-constructions. For example, f is parsed correctly. We will explain how to apply a transformation by doing the second transformation (map promotion) of the sample derivation. Commands to the system are given through built-in commands (SG-transformations); these will be indicated in boldface in the sequel of this section.

Before applying a transformation the user must duplicate (dup) the last BMF-formula in the derivation in order to keep the history of the derivation. Unfortunately, this must be done manually because the built-in SG-transformations do not allow modifying a tree which is not rooted by the node where the current cursor in the structure-tree is located. Then, the BMF-formula to be transformed is selected with the mouse and the select command. Now the system suggests which transformations are possible in the Transformations View or Applicable Transformations View. Because there is one global cursor for all views, clicking on one of the transformations in the Transformations View selects the corresponding transformation in the Base View. Selecting a dynamic transformation is done in the same way as selecting the term to be transformed. Both selections are shown as the selected transformation and the selected term. Figure 5.2 illustrates the situation before applying the map promotion rule. Next, the transformation can be applied by giving the do transform command. Figure 5.1 illustrates the situation after the transformation.

Several improvements on this scheme are implemented: A set of dynamic transformations can be selected with the mouse and the select and add select commands. Then, the system suggests which BMF-formulae in the derivation can be transformed with the selected transformations by showing them in the Transformable Terms View. Clicking on a result in the Transformable Terms View automatically selects the transformable term in the Base View (the highlighted parts in Figure 5.2); then the do transform command can be given. In case more transformations are possible, the user is asked to choose one. Analogously, a set of terms can be selected. The Transformations and Applicable Transformations Views display all applicable transformations on this set. Then the user can choose which transformation should be applied.

Other available commands are:
simplify
Simplify a BMF-formula (including removal of redundant brackets).
new right, new left, right and left
Focus on the (new) subderivation on the right or left and continue with a (dierent) subderivation.
comment
Insert text between derivation steps.
Figure 5.2: The sample derivation before applying map promotion and after duplication of the last BMF-formula in order to keep the history of the derivation. Note the various Views.

A displaybinding can be entered by giving ascii-symbols or their integer-values and choosing a suitable (LaTeX) font using SG-transformations. Parts of the dynamic transformations, the derivation and the displaybindings can be saved and loaded with the built-in save and load facilities of the SG.
5.1.2.3 The abstract syntax

We have chosen a compact and uniform abstract syntax for BMF-formulae. The compact representation of BMF-formulas was necessary to minimize the attribution rules for the pattern-matching and program-variable binding in the BMF-formulae. There is only one representation for BMF-formulae containing operators. For example, a + b + c is represented as (+; [a; b; c]): the infix operator followed by a list of operands. All operators in BMF are represented by infix operators in the grammar. In BMF three types of operators can be distinguished: prefix, postfix and infix operators. The prefix application f x can be seen as the infix application f preapplic x
where preapplic is the infix operator that applies its left operand to its right operand. Analogously, the postapplic infix operator can be defined. There is no difference between operands and operators; they are both represented by TERMs. A TERM is described by the following production rules:

    TERM     ::= TERMCONST
               | TERMVAR
               | ( TERM , [ TERMLIST ] )
    TERMLIST ::= NOTERM
               | TERM , TERMLIST
A TERM can be a standard-term (preapplic, postapplic, composition, map, reduce and list) or a user-defined term, both described by TERMCONST, or a program-variable matching any term (TERMVAR). Program-variables start with an uppercase letter, standard and user-defined terms with a lowercase letter. Associated with each TERM are fixed priorities. The terms composition, map and reduce denote the corresponding notion in BMF. The last term, list, is used to represent the lists of BMF. As an example, the internal representation of ++/ is:

    (postapplic; [++; /])

In order to achieve the correct unparsing of this simple representation into BMF-notation, special unparsing rules for the standard terms are defined. For example:

    (preapplic; [f; x])     is unparsed as    f x
    (postapplic; [f; *])    is unparsed as    f*
    (·; [f; g; h])          is unparsed as    f · g · h
    (list; [1; 2; 1])       is unparsed as    [1; 2; 1]
The root-production of the system is now as follows:

    BMF-editor ::= TRANSLIST DERIVATION DISPLAYLIST

TRANSLIST represents the list of dynamic transformations, DISPLAYLIST represents the editable list of displaybindings of terms. A dynamic transformation, named Label, is described by the following production:

    TRANS ::= { Label } TERM == TERM
A derivation is a list of terms separated by =-signs and the names of the transformations applied:

    DERIVATION ::= TERM
                 | TERM = { Label } DERIVATION
In the actual implementation a more complicated grammar is used for the tree-structure of derivations and for the possibility to add comments in derivations.
5.1.2.4 Dynamic transformations

Transformations in the SG can be defined only at editor-specification time. Dynamic transformations can be entered and deleted at editor-run-time. Just as for standard SG-transformations, the applicability of a dynamic transformation is computed incrementally. In the PROSPECTRA project [KBHG+87] a brute force approach was taken: after adding a new transformation the complete PROSPECTRA Ada/Anna subset editor was regenerated. Our prototype emulates dynamic transformations using standard SSL attribute computation. This emulation will be explained hereafter.

As was said in Subsection 5.1.2.3, a dynamic transformation consists of a name (Label) and a left-hand side and right-hand side pattern (TERMs). A dynamic transformation is applicable on term T if the left-hand side or the right-hand side matches with term T. For example, the dynamic transformation

    F* · ++/ == ++/ · F**    {Map promotion}
is applicable to the term

    (plus 1)* · ++/

which then can be transformed into

    ++/ · (plus 1)**

Note that the program-variable F is bound to (plus 1).
The applicability test and actual application of a dynamic transformation to a term proceeds in four phases: pattern-matching, program-variable binding (unification), computation of the transformed term and replacement of the old term by the transformed term. Pattern-matching, program-variable binding and computation of the transformed term are done by attribution inside terms. The replacement of the old term by the transformed term is carried out by activating the SG-transformation do transform (see also Subsection 5.1.2.2). The first three phases (pattern-matching, program-variable binding and computation of the transformed term) require both the selected transformation and the selected term. Bringing these together in an attribute grammar can be done in two complementary ways: either the term to be transformed is inherited by the dynamic transformation, or the dynamic transformation is inherited by the term to be transformed. Both ways are depicted in Figure 5.3. The first way is used to compute the applicability direction: the selected term is an inherited attribute of the selected transformation. The second way is used to apply the selected transformation to the selected term: the selected transformation is an inherited attribute of the selected term. Also the Transformable Terms View is implemented in this way.
Figure 5.3: Two complementary ways of matching, binding of program-variables and computation of the transformed term.

In order to keep the pattern-matching simple we do not take the associativity of operators into account. So the TERM 1 ⊕ H (represented as (⊕; [1; H])) does not match with the TERM 1 ⊕ b ⊕ c (represented as (⊕; [1; b; c])). As a result, the match-time is linear in the size of the tree. Furthermore, a program-variable can be bound only once to another term.
Pattern-matching and computation of bindings use the inherited attribute pat and synthesized attributes applic and bindings of TERM . A TERM (the pattern-TERM ) is given as an inherited attribute to the TERM it should match (the match-TERM ). A short description of each attribute is given.
pat
    This attribute is used to distribute the pattern-TERM over the tree representing the match-TERM. Every node in this tree inherits that part of the pattern-TERM it should match.

applic
    This boolean attribute is used to synthesize whether the pattern-TERM matches. The top-most applic attribute in the tree representing the match-TERM is true if all patterns in this tree match and there are no conflicting bindings.

bindings
    This attribute contains the list of program-variable bindings.
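For illustration, the match and binding phases correspond to the following first-order matcher (a Haskell sketch with our own names; in the editor this computation is carried out by the pat, applic and bindings attributes):

    data Term = Const String [Term]     -- standard or user-defined term
              | Var String              -- program-variable
              deriving Eq

    type Bindings = [(String, Term)]

    -- one binding per program-variable, no associativity: linear time
    match :: Term -> Term -> Maybe Bindings -> Maybe Bindings
    match _ _ Nothing = Nothing
    match (Var v) t (Just bs) =
      case lookup v bs of
        Nothing           -> Just ((v, t) : bs)  -- first binding of v
        Just t' | t' == t -> Just bs             -- consistent with earlier binding
        _                 -> Nothing             -- conflicting bindings
    match (Const c ps) (Const c' ts) mbs
      | c == c' && length ps == length ts
          = foldl (\acc (p, t) -> match p t acc) mbs (zip ps ts)
    match _ _ _ = Nothing

A transformation with left-hand side pattern lhs is then applicable on a term T when match lhs T (Just []) succeeds.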
5.1.2.5 A large example

This example, taken from [Bir87], shows some steps in the derivation of an O(n) algorithm for the mss problem. The mss problem is to compute the maximum of the sums of all segments of a given sequence of (possibly negative) numbers. This example illustrates the use of where-abstraction and conditions in the BMF-editor. The conditions are tabulated and automatically instantiated but not checked by the editor. First some definitions necessary to define mss are given. The function segs returns a list of all segments of a list. For example,

    segs [1; 2; 3] = [[]; [1]; [1; 2]; [2]; [1; 2; 3]; [2; 3]; [3]]
The maximum operator is denoted by ↑, for example

    2 ↑ 4 ↑ 3 = 4

Now mss can be defined as follows:
    mss = ↑/ · (+/)* · segs

Direct evaluation of the right-hand side of this equation requires O(n^3) steps on a list of length n. There are O(n^2) segments and each can be summed in O(n) steps, giving O(n^3) steps in all.
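In Haskell-like notation the specification reads as follows (our transliteration; this segs produces the empty segment several times, which is harmless for mss since the empty segment just contributes the sum 0):

    import Data.List (inits, tails)

    segs :: [Int] -> [[Int]]
    segs = concatMap inits . tails     -- all contiguous segments

    -- mss = ↑/ · (+/)* · segs
    mss :: [Int] -> Int
    mss = maximum . map sum . segs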
Without further explanation of the applied transformation rules we illustrate three situations in the derivation of a linear time algorithm for the mss problem. Figure 5.4 shows the start of the derivation together with all necessary displaybindings and transformations. Figure 5.5 illustrates the situation before applying Horner's rule. In Figure 5.6 the whole derivation is shown; note the instantiation of the where-abstraction and the conditions after applying Horner's rule.
Figure 5.4: The definition of mss and all necessary transformations and displaybindings for the derivation of a linear time algorithm for mss.

The last formula ↑/ · (⊕ →/ e) in Figure 5.6 is a maximum reduce composed with a left-accumulation. Left accumulation is expressed with the operator →/. For example,
    (⊕ →/ e) [a1; a2; ...; an] = [e; e ⊕ a1; ...; ((e ⊕ a1) ⊕ a2) ⊕ ... ⊕ an]

The maximum reduce composed with the left-accumulation can easily be translated into the following loop in an imperative language. Using hopefully straightforward notation, the value ↑/ · (⊕ →/ e) is the result delivered by the following imperative program (a ⊕ b = (a + b) ↑ 0):
    int a, b, t;
    a := 0; t := 0;
    for b in x do
        a := max(a+b, 0);
        t := max(t, a)
    od;
    return t
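In the same Haskell-like notation the loop is a one-liner, with scanl playing the role of the left-accumulate (⊕ →/ 0) and maximum the role of ↑/ (again our transliteration, not the thesis's code):

    mssLinear :: [Int] -> Int
    mssLinear = maximum . scanl step 0   -- runs in linear time
      where step a b = max (a + b) 0     -- a ⊕ b = (a + b) ↑ 0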
Figure 5.5: The situation before applying Horner's rule.
Figure 5.6: The whole derivation of a linear time algorithm for the mss problem. Note the instantiation of the where-abstraction and the conditions after applying Horner's rule.
5.1.3 Further suggestions

In a future version it should be possible to generate a LaTeX document by combining the comments and the derivation. Also program-code (for example Gofer) might be generated from the derivation. A first attempt at implementing both features has already been made using the same technique as was used for the displaybindings. Incremental type checking and consistency checking of the derivation (for example after deletion of a transformation) should be performed. The dynamic transformations now only use pattern-matching. The dynamic transformations could easily be extended to conditional and parameterized dynamic transformations (see also [San88]). At edit-time, some complexity-measure of an algorithm should be indicated and updated incrementally.
5.1.4 Conclusion

A prototype program transformation system for BMF has been developed in four man-months with the attribute grammar based SG. The BMF-editor was written by Aswin van den Berg [vdB90]. The use of an attribute grammar based system has significantly sped up the building of such a complex system. Part of the motivation for extending AGs with higher order attributes stems from the tedious process of implementing certain parts of the BMF-editor without HAGs. Dynamic transformations, which provide insertion and deletion of a transformation during an edit-session, are a great help for making derivations in an interactive program transformation system. Dynamic transformations are particularly useful because their applicability can be indicated and updated incrementally.
5.2 A compiler for supercombinators

In this section, taken from [SV91, Juu92, PJ91] (which all describe a HAG for compiling supercombinators), we will give a description of the translation of a λ-expression into supercombinator form. The purpose of this section is to serve as an example of the use of higher order attribute grammars. The SSL-grammar used for testing in Chapter 4 was taken from [Juu92].
In implementing the λ-calculus, one of the basic mechanisms which has to be provided for is β-reduction, informally defined as a substitution of the parameter in the body of a function by the argument expression. In the formal semantics of the λ-calculus this substitution is defined as a string replacement. It will be obvious that implementing this string replacement as such is undesirable and inefficient. We easily recognise the following disadvantages:

1. the basic steps of the interpreter are not of more or less equal granularity

2. the resulting string may contain many common subexpressions which, when evaluated, all result in the same value

3. large parts of the body may be copied and submitted to the substitution process, which are not further reduced in the future but instead are being discarded because of the rewriting of an if-then-else reduction rule

4. because substitutions may define the value of global variables of λ-expressions defined in the body of a function, the value of these bodies may change during the evaluation process. It is thus almost impossible to generate efficient code which will perform the copying and substitution for this inner λ-expression.

The second of these disadvantages may be solved by employing graph reduction instead of string reduction. Common sub-expressions may be shared in this representation. To remedy the other three problems, [Tur79] shows how any lambda-expression may be compiled into an equivalent expression consisting of SKI-combinators and standard functions only. In the resulting implementation the expressions are copied and substituted "by need" by applying the simple reduction rules associated with these combinators. Although the resulting implementation, using graph reduction, is very elegant, it leads to an explosion in the number of combinator occurrences and thus of basic reduction steps. In [Hug82] supercombinators are introduced; although the first and third problem are not solved, its advantages in solving the fourth problem are such that it is still considered an attractive approach.

In this section we will describe a compiler for converting lambda-expressions completely into supercombinator code in terms of higher order attribute grammars. The algorithm is based on [Hug82]. The basic idea of a supercombinator is to define, for each function which refers to global variables, an equivalent function to which the global variables are passed explicitly. The resulting function is called a combinator, because it does not contain any free variables any more. At the reduction all the global variables and the actual argument are substituted in a single step. Because the code of the function may be considered an invariant of the reduction process, it is possible to generate machine code for it, which takes care of the construction of the graph and the substitution process.
The situation has then become fairly similar to the conventional stack implementations of procedural languages, where the entire context is being passed (usually called the static link) and the appropriate global values are selected from that context by indexing instructions. The main difference is that not the entire environment is passed, but only those parts which are explicitly used in the body of the function. As a further optimisation, subexpressions of the body which do not depend on the parameter of the function are abstracted and passed as an extra argument. As a consequence their evaluation may be shared between several invocations of the same function.
5.2.1 Lambda expressions

As an example consider the lambda expression

    f = [λx : [λy : ⊕ ([λz : z (x y y) (z (σ y) y)] x) 7]]

In this expression ⊕, σ and 7 are constant functions, e.g. the add and successor operation, and the number 7. Note that

    f ⊕ a = ⊕ ([λz : z (⊕ a a) (z (σ a) a)] ⊕) 7 = ⊕ (⊕ (⊕ a a) (⊕ (σ a) a)) 7

Expression f may be thought of as a tree. This mapping is one to one since we assume application (·) to be left-associative. The corresponding abstract syntax tree, in linear notation, has the form

    lop x (lop y (lap(lap(lco(⊕)
                          lap(lop(z lap(lap(lid(z)
                                            lap(lap(lid(x) lid(y)) lid(y)))
                                        lap(lap(lid(z) lap(lco(σ) lid(y))) lid(y))))
                              lid(x)))
                      lco(7))))

where we use the following definition for the type LEXP representing lambda-expressions:

    LEXP ::= lop ID LEXP      {λ-introduction}
           | lap LEXP LEXP    {function application}
           | lid ID           {identifier occurrence}
           | lco ID           {constant occurrence}

The type ID is a standard type, representing identifiers. Another standard type is INT; it is used to represent natural numbers. In order to model the binding process we will introduce a mapping from trees labeled with identifiers (ID) to trees labeled with naturals (INT) instead:
    NEXP ::= nop INT NEXP
           | nap NEXP NEXP
           | nid INT
           | nco ID
In this conversion, identifiers are replaced by a number indicating the "nesting depth" of the bound variable. Hence x, y, and z from our example will be substituted by 1, 2, and 3 respectively. Constants are simply copied. Although this mapping could be formulated in any "modern" functional language, we are striving for a higher order attribute grammar, so this is a good point to start from. The nonterminal LEXP will have two attributes. The first, an inherited one, will contain the environment, i.e. the bound variables found so far, associated with their nesting level. A list l of ID's with index determination l⁻¹ suits our needs (note that [x, y, z]⁻¹(x) = 1). The second attribute, a synthesized one, returns the "number tree" of the above given type NEXP.

    LEXP :: [ID] env → NEXP nexp
    LEXP ::= lop ID LEXP
               ¬(ID ∈ LEXP₀.env)
               LEXP₁.env := LEXP₀.env ++ [ID]
               LEXP₀.nexp := nop ((LEXP₁.env)⁻¹(ID)) LEXP₁.nexp
           | lap LEXP LEXP
               LEXP₁.env := LEXP₀.env; LEXP₂.env := LEXP₀.env
               LEXP₀.nexp := nap LEXP₁.nexp LEXP₂.nexp
           | lid ID
               ID ∈ LEXP₀.env
               LEXP₀.nexp := nid ((LEXP₀.env)⁻¹(ID))
           | lco ID
               LEXP₀.nexp := nco ID
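For comparison, the same numbering can be written directly as a function. The following Haskell sketch is our own rendering (the constructor names Lop, Lap, Lid, Lco and Nop, Nap, Nid, Nco are assumed counterparts of lop, lap, lid, lco and nop, nap, nid, nco; the well-formedness conditions of the grammar are presupposed rather than checked):

    data LEXP = Lop String LEXP | Lap LEXP LEXP | Lid String | Lco String
    data NEXP = Nop Int NEXP | Nap NEXP NEXP | Nid Int | Nco String
      deriving Show

    -- env⁻¹: position of an identifier in the environment, counting from 1
    index :: [String] -> String -> Int
    index env x = length (takeWhile (/= x) env) + 1

    -- the env/nexp attribution as a function; assumes binders are fresh
    -- and every identifier occurrence is bound
    number :: [String] -> LEXP -> NEXP
    number env (Lop x b) = Nop (index env' x) (number env' b) where env' = env ++ [x]
    number env (Lap f a) = Nap (number env f) (number env a)
    number env (Lid x)   = Nid (index env x)
    number env (Lco c)   = Nco c

Here number [] e plays the role of the START production introduced next.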
Since we will follow the convention that the start symbol of a (higher order) attribute grammar cannot have inherited attributes, we introduce an extra nonterminal START:

    START :: → NEXP nexp
    START ::= root LEXP
                LEXP.env := [ ]
                START.nexp := LEXP.nexp
The lambda expression we gave at the start of this subsection "returns" the following attribute:
    nop 1 (nop 2 (nap(nap(nco(⊕) nap(nop(3 nap(nap(nid(3) nap(nap(nid(1) nid(2)) nid(2))) nap(nap(nid(3) nap(nco(σ) nid(2))) nid(2)))) nid(1))) nco(7)))
5.2.2 Supercombinators

Before starting to generate supercombinator code we would like to stress that it is easier to derive supercombinator code from NEXP-shaped expressions than from LEXP-shaped expressions. Thus, the supercombinator code generator attributes the NEXP-tree, not the LEXP-tree. This is where higher order attribute grammars come into use for the first time: the generated NEXP tree is substituted for a nonterminal attribute.

    START :: → CEXP cexp
    START ::= root LEXP NEXP
                LEXP.env := [ ]
                NEXP := LEXP.nexp
                START.cexp := NEXP.cexp
The nonterminal NEXP has a synthesized attribute of type CEXP. This type, representing supercombinator code, is defined as

    CEXP ::= cop [INT] CEXP
           | cap CEXP CEXP
           | cid INT
           | cco ID

As may be seen from the above definition, combinators generally have multiple parameters. With cop [3, 1, 2] E we denote a combinator with three dummies. In standard notation this would be written as [λ3λ1λ2 : E], which is equivalent to [λ3 : [λ1 : [λ2 : E]]]. Let us have a closer look at the expression e = [λz : z (x y y) (z (σ y) y)], which is a subexpression of our previous example. Any subexpression of (the body of) e that does not contain the bound variable (z) is called free. So x, y, σ, x y, σ y, and x y y are free expressions. Such expressions can be abstracted out, an example being f = [λ1λ2λ3λ4 : 4 (1 2) (4 3 2)] (x y) y (σ y).
This transformation from e to f improves the program since, for example, x y only needs to be evaluated once, rather than every time f is called. Of course f is not optimal yet: the best result emerges when all maximal free expressions are abstracted out.
Figure 5.7: The paths (nodes) from the root to the tips containing the current dummy are indicated by thick lines (shaded circles), thus clearly isolating the maximal free expressions.

As may be seen from Figure 5.7, x y y, σ y, and y are maximal free expressions. In order to generate the supercombinator for e, each maximal free expression is replaced by some dummy. We reserve the index "0" for the actual parameter introduced by the λ. Numbering the maximal free expressions of [λz : z (x y y) (z (σ y) y)] from left to right, x y y becomes dummy 1, σ y becomes dummy 2, and y becomes dummy 3.
Hence we find as a possible supercombinator Σ = [λ1λ2λ3λ0 : 0 1 (0 2 3)] with bindings {1 ↦ x y y, 2 ↦ σ y, 3 ↦ y}, so that e equals Σ (x y y) (σ y) y. We will now describe an algorithm which finds all maximal free expressions. We could associate a boolean with each expression, indicating the presence of the current parameter in the expression. This attribution then depends on that parameter, so if we were interested in the maximal free expressions of the surrounding expression, we would have to recalculate these attributes. We use another approach instead: a level is associated with each expression, indicating the nesting depth of the most local variable occurring in that expression.
If this depth equals the nesting depth of the current parameter, the expression contains this parameter as a subexpression and hence is not free. Since we substituted all identifiers in LEXP by a unique number indicating their depth, the level of an expression is simply the maximum of all numbers occurring in that expression.
    CEXP :: → INT level
    CEXP ::= cop [INT] CEXP
               CEXP₀.level := 0
           | cap CEXP CEXP
               CEXP₀.level := CEXP₁.level ↑ CEXP₂.level
           | cid INT
               CEXP₀.level := INT
           | cco ID
               CEXP₀.level := 0
Combinators and constants form a special group: they contain no free variables, so their level is set to 0, the "most global level", which is the unit element of ↑ (maximum). On the other hand, there is no need to abstract out expressions of level 0, since they are irreducible. They form the basis of the functional programming environment.
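Read functionally, the level attribution is just a fold over the expression. A small Haskell sketch (our own; the CEXP constructors are assumed counterparts of cop, cap, cid, cco) makes this explicit:

    data CEXP = Cop [Int] CEXP | Cap CEXP CEXP | Cid Int | Cco String
      deriving (Eq, Show)

    -- level: nesting depth of the most local variable in the expression;
    -- 0 (the unit of max) for combinators and constants
    level :: CEXP -> Int
    level (Cop _ _) = 0
    level (Cap f a) = level f `max` level a
    level (Cid n)   = n
    level (Cco _)   = 0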
As a next step, let us concentrate on generating the bindings. A binding is a pair n ↦ c with n ∈ INT and c ∈ CEXP. Since no variable may be bound more than once, we need to know which variables are already bound when we need a new binding. So, we introduce an "environment-in" (initially empty) and an "environment-out" (returning all maximal free subexpressions).
    CEXP :: INT n {INT ↦ CEXP} bin → INT level {INT ↦ CEXP} bout CEXP cexp
    CEXP ::= cop [INT] CEXP
               CEXP₀.level := 0; CEXP₀.bout := CEXP₀.bin
               CEXP₀.cexp := CEXP₀
           | cap CEXP CEXP
               CEXP₁.n := CEXP₀.n; CEXP₂.n := CEXP₀.n
               CEXP₀.level := CEXP₁.level ↑ CEXP₂.level
               if (CEXP₀.level = CEXP₀.n) ∨ (CEXP₀.level = 0)
               then CEXP₁.bin := CEXP₀.bin
                    CEXP₂.bin := CEXP₁.bout; CEXP₀.bout := CEXP₂.bout
                    CEXP₀.cexp := cap CEXP₁.cexp CEXP₂.cexp
               else CEXP₀.bout := CEXP₀.bin ⊔ {|CEXP₀.bin| + 1 ↦ CEXP₀}
                    CEXP₀.cexp := cid ((CEXP₀.bout)⁻¹(CEXP₀))
           | cid INT
               CEXP₀.level := INT   { CEXP₀.level > 0 }
               if (CEXP₀.level = CEXP₀.n)
               then CEXP₀.bout := CEXP₀.bin; CEXP₀.cexp := cid 0
               else CEXP₀.bout := CEXP₀.bin ⊔ {|CEXP₀.bin| + 1 ↦ CEXP₀}
                    CEXP₀.cexp := cid ((CEXP₀.bout)⁻¹(CEXP₀))
           | cco ID
               CEXP₀.level := 0; CEXP₀.bout := CEXP₀.bin
               CEXP₀.cexp := CEXP₀
Since we are not interested in the body of a combinator, we leave out the attributes of CEXP₁ in cop [INT] CEXP₁. The operator ⊔ is defined as follows:
    S ⊔ {n ↦ c} := if c ∈ range(S) then S else S ∪ {n ↦ c}

thus performing common-subexpression optimisation. This ensures that the bindings generated for the body of [λy : y x x] are {1 ↦ x} instead of {1 ↦ x, 2 ↦ x}. The final addition is devoted to generating the combinator body itself. Each time a subexpression c generates a binding n ↦ c, expression c is replaced by a reference to the newly introduced variable: cid n.
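One possible functional reading of this binding pass, reusing the CEXP type and level function from the sketch above and representing the finite map of bindings as an association list, is the following (our own sketch, not the thesis code):

    type Bindings = [(Int, CEXP)]

    -- the operator ⊔, including the common-subexpression check
    extend :: Bindings -> CEXP -> Bindings
    extend s c | c `elem` map snd s = s
               | otherwise          = s ++ [(length s + 1, c)]

    -- bind n bin e: thread the bindings through e, replacing every
    -- maximal free expression by a reference to its dummy; n is the
    -- level of the current parameter
    bind :: Int -> Bindings -> CEXP -> (Bindings, CEXP)
    bind n bin e
      | level e == n || level e == 0 = case e of
          Cap f a -> let (b1, f') = bind n bin f
                         (b2, a') = bind n b1 a
                     in  (b2, Cap f' a')
          Cid _   -> (bin, Cid 0)          -- the parameter itself
          _       -> (bin, e)              -- combinator or constant
      | otherwise =                         -- a maximal free expression
          let bout = extend bin e
              i    = head [k | (k, c) <- bout, c == e]
          in  (bout, Cid i)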
5.2.3 Compiling

So far we described properties of the supercombinator code. Now we are ready to discuss the actual compilation of NEXP to CEXP. In order to achieve this, we already
extended NEXP with a synthesized attribute of type CEXP. This attribute will contain the supercombinator code of the underlying NEXP expression. Compilation of nap, nid, and nco is straightforward; nop still requires some work, because the applications to the abstracted expressions have to be computed. In the case of a nop INT NEXP, we must eliminate the λ and introduce a combinator. Hence we must determine the combinator body and bindings of the compiled body c. This simply means that we have to attribute expression c! Therefore we introduce a nonterminal attribute:

    NEXP :: → CEXP cexp
    NEXP ::= nop INT NEXP CEXP
               CEXP := NEXP₁.cexp
               CEXP.n := INT; CEXP.bin := {}
               NEXP₀.cexp := fold (cop (π₁(a) ++ [0]) CEXP.cexp) (π₂(a))
                             where a = tolist CEXP.bout
           | nap NEXP NEXP
               NEXP₀.cexp := cap NEXP₁.cexp NEXP₂.cexp
           | nid INT
               NEXP₀.cexp := cid INT
           | nco ID
               NEXP₀.cexp := cco ID
where "tolist" converts a set of bindings to a list of bindings, and

    fold :: CEXP → [CEXP] → CEXP
    fold c [ ]        = c
    fold c (m ++ [a]) = cap (fold c m) a

    π₁ :: [INT ↦ CEXP] → [INT]
    π₁ [ ]             = [ ]
    π₁ (o ++ [n ↦ c]) = π₁(o) ++ [n]

    π₂ :: [INT ↦ CEXP] → [CEXP]
    π₂ [ ]             = [ ]
    π₂ (o ++ [n ↦ c]) = π₂(o) ++ [c]

The function "tolist" that converts a set to a list offers a lot of freedom: we may pick any order we want. We may exploit this freedom to generate better code: order the expressions in such a way that their levels are ascending. Since application is left-associative, this results in the largest maximal free expressions for the surrounding expression.
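Putting the pieces together, the whole translation can be sketched in Haskell as follows (again our own rendering, reusing the NEXP and CEXP types and the level and bind functions sketched earlier; sortOn (level . snd) plays the role of tolist with ascending levels, and foldl Cap, map fst and map snd correspond to fold, π₁ and π₂):

    import Data.List (sortOn)

    -- compile a numbered lambda term to supercombinator code
    compile :: NEXP -> CEXP
    compile (Nap f a) = Cap (compile f) (compile a)
    compile (Nid n)   = Cid n
    compile (Nco c)   = Cco c
    compile (Nop n b) =
      let c          = compile b                  -- attribute the body first
          (bout, c') = bind n [] c                -- bindings + combinator body
          a          = sortOn (level . snd) bout  -- tolist, levels ascending
      in  foldl Cap (Cop (map fst a ++ [0]) c') (map snd a)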
Chapter 6

Conclusions and future work

This chapter discusses some conclusions and suggestions for future research. The conclusions are presented first.
6.1 Conclusions

Chapter 2 defines a class of ordered HAGs for which efficient evaluation algorithms can be generated and presents an efficient algorithm for testing whether a HAG is a member of a sufficiently large subclass of ordered HAGs. Finally, Chapter 2 shows that pure HAGs, which have only tree building rules and copy rules as semantic functions, have expressive power equivalent to Turing machines. Pure AGs do not have this power.

By now, HAGs are implemented in the SG. The creators of the SG stated in [TC90] that "The recently formalized concept of HAGs provides a basis for addressing the limitations of the (normal) first-order AGs" and "We adopt this terminology, as well as the idea, which we had independently hit upon in order to get around the limitations ...". The SG is no longer an academic product. In September 1990 the company GrammaTech was founded for the purpose of offering continuing support, maintenance, and development of the SG on a commercial basis. Currently more than 320 sites in 23 countries have licensed the SG. SG release 3.5 (September 1991) and higher supports HAGs.

Chapter 3 shows that conventional incremental AG-evaluators cannot be extended straightforwardly to HAGs without losing their optimal incremental behaviour. Therefore, a new incremental evaluation algorithm for (H)AGs was introduced which handles the higher order case efficiently. Our algorithm is the first in which attributes are no longer stored in the tree, but in a memoization table. There is thus no longer the necessity to have
much memory available for incremental AG-based systems. Another interesting new aspect of our algorithm is that the amount of memory trades off directly against speed: much memory means fast incremental evaluation, and little memory means slow incremental evaluation.

The whole prototype program transformation system (the BMF-editor) discussed in Chapter 5 was written as an AG in four man-months. It shows that an AG-based approach significantly speeds up the development time of such complex systems. Part of the motivation for the development of HAGs stems from the tedious process of implementing some parts of the BMF-editor without HAGs. At the time the BMF-editor was developed, the SG did not support HAGs. The SG did, however, provide facilities to simulate the effects of HAGs; such simulations were hard to write and understand. Furthermore, the prototype supports dynamic transformations, which are transformations that can be entered and deleted during an edit session. The applicability and direction of applicability of a dynamic transformation on a formula is indicated and updated incrementally. One of the main reasons for the relatively short development time and the successful implementation of dynamic transformations is that the algorithm needed for the incremental evaluation is generated automatically.
6.2 Future work

6.2.1 HAGs and editing environments

This thesis did not discuss the practical problems which arise when HAGs are implemented in language-based environment generators like the SG. These problems, possible solutions and open questions are addressed in [TC90]. One of the main problems lies in an apparent contradiction between the desire to define parts of the derivation tree via attribute equations on the one hand, and the wish to modify these parts manually on the other.
6.2.2 The new incremental evaluator

There is a certain efficiency problem which is inherent in the use of (H)AGs. The problem is that (H)AGs have strict local dependencies among attribute values. Consequently, attributed trees have a large number of attribute values that must be updated. In contrast to (H)AGs, imperative methods for implementing the static semantics of a language can, by using auxiliary data structures to record nonlocal dependencies in the tree, skip over arbitrarily large sections of the tree. Attribute-updating algorithms would visit them node by node.
In the last section of Chapter 3 a sketch is given of some improvements for the new incremental evaluator. One of these improvements is a method for eliminating copy rules. This might solve the above-mentioned efficiency problems with (H)AGs and is a topic for future research. Chapter 4 introduces a HAG-machine (an abstract implementation of the HAG-evaluator described in Chapter 3). Furthermore, several cache organization, purging, and garbage collection strategies for this machine are introduced. At the end of Chapter 4 some tests are carried out with a prototype HAG-machine in the functional language Gofer. The results of these tests give only a limited indication of the incremental behaviour of this prototype implementation. It is not clear what the best cache organization, purging, and garbage collection strategies are. Finding good strategies is a topic for future research.
6.2.3 The BMF-editor

Several possible improvements for the BMF-editor discussed in Chapter 5 are given next. First, it should be possible to generate a LaTeX document by combining the comments and the derivation. Also, program code (for example Gofer) could be generated from the derivation. Finally, at edit time, some complexity measure of an algorithm might be suggested and updated incrementally.

In closing, after four years of work I dare to say that we have accomplished most of our goals: we defined a new formalism and a new, promising, incremental evaluation strategy.
References

[App89] Andrew W. Appel. Simple generational garbage collection and fast allocation. Software - Practice and Experience, 19(2):171-183, 1989.

[B+76] J.W. Backus et al. Modified report on the algorithmic language Algol 60. The Computer Journal, 19(4), 1976.

[BC85] G.M. Beshers and R.H. Campbell. Maintained and constructor attributes. In ACM SIGPLAN '85 Symposium on Language Issues in Programming Environments, pages 121-131, Seattle, Washington, June 25-28 1985.

[BFHP89] B. Backlund, P. Forslund, O. Hagsand, and B. Pehrson. Generation of graphic language oriented design environments. In 9th IFIP International Symposium on Protocol Specification, Testing and Verification. Twente University, April 1989.

[BHK89] J.A. Bergstra, J. Heering, and P. Klint. Algebraic Specification. ACM Press Frontier Series. The ACM Press in co-operation with Addison-Wesley, 1989.

[Bir84] Richard S. Bird. The promotion and accumulation strategies in transformational programming. TOPLAS, 6(4):487-504, 1984.

[Bir87] R. Bird. An introduction to the theory of lists. In M. Broy, editor, Logic of Programming and Calculi of Discrete Design. NATO ASI Series Vol. F.36, Springer-Verlag, 1987.

[BW88a] Richard Bird and Philip Wadler. Introduction to Functional Programming. International Series in Computer Science. Prentice Hall, 1988.

[BW88b] Hans-Juergen Boehm and Mark Weiser. Garbage collection in an uncooperative environment. Software - Practice and Experience, 18(9):807-820, 1988.

[CU77] J. Craig Cleaveland and Robert C. Uzgalis. Grammars for Programming Languages. Elsevier North-Holland Inc., New York, 1977.

[FH88] Anthony J. Field and Peter G. Harrison. Functional Programming. International Computer Science Series. Addison-Wesley Publishing Company Inc., Wokingham, England, 1988.

[FHPJea92] J.F. Fasel, P. Hudak, S. Peyton-Jones, and P. Wadler et al. Special issue on the functional programming language Haskell. SIGPLAN Notices, 27(5), May 1992.

[FY69] Robert R. Fenichel and Jerome C. Yochelson. A LISP garbage collector for virtual-memory computer systems. Communications of the ACM, 12(11):611-612, 1969.

[FZ89] P. Franchi-Zannettacci. Attribute specifications for graphical interface generation. In G.X. Ritter, editor, Eleventh IFIP World Computer Congress, pages 149-155, New York, August 1989. Information Processing 89, Elsevier North-Holland Inc.

[GG84] Harald Ganzinger and Robert Giegerich. Attribute Coupled Grammars. In B. Lorho, editor, SIGPLAN Notices, pages 157-170, 1984.

[Hen91] P.R.H. Hendriks. Implementation of Modular Algebraic Specifications. PhD thesis, University of Amsterdam, 1991.

[HHKR89] J. Heering, P.R.H. Hendriks, P. Klint, and J. Rekers. The syntax definition formalism SDF - reference manual. SIGPLAN Notices, 24(11):43-75, 1989.

[Hil76] J. Hilden. Elimination of recursive calls using a small table of "randomly" selected function values. BIT, 8(1):60-73, 1976.

[HK88] Scott E. Hudson and Roger King. Semantic feedback in the Higgens UIMS. IEEE Transactions on Software Engineering, 14(8):1188-1206, August 1988.

[Hoo86] R. Hoover. Dynamically Bypassing Copy Rule Chains in Attribute Grammars. In Proceedings of the 13th ACM Symposium on Principles of Programming Languages, pages 14-25, St. Petersburg, FL, January 13-15 1986.

[HU79] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley Publishing Company Inc., 1979.

[Hug82] R. J. M. Hughes. Super-combinators: A New Implementation Method for Applicative Languages. In Proceedings of the ACM Symposium on Lisp and Functional Programming, pages 1-10, Pittsburgh, 1982.
[Hug85] R. J. M. Hughes. Lazy Memo-functions. In Proceedings Conference on Functional Programming and Computer Architecture, pages 129-146, Nancy, 1985. Springer-Verlag.

[HW+91] Paul Hudak, Phil Wadler, et al. Report on the programming language Haskell, a non-strict purely functional language (version 1.1). Technical report, Yale University/Glasgow University, August 1991.

[JF85] G.F. Johnson and C.N. Fischer. A metalanguage and system for nonlocal incremental attribute evaluation in language-based editors. In Twelfth ACM Symposium on Principles of Programming Languages, pages 141-151, January 1985.

[Joh87] Thomas Johnsson. Attribute Grammars as a Functional Programming Paradigm. Springer-Verlag, pages 154-173, 1987.

[Jon91] Mark P. Jones. Introduction to Gofer 2.20. Oxford PRG, November 1991.

[Jou83] Martin Jourdan. An efficient recursive evaluator for strongly noncircular attribute grammars. Rapports de Recherche 235, INRIA, October 1983.

[JPJ+90] M. Jourdan, D. Parigot, C. Julie, O. Durin, and C. Le Bellec. Design, Implementation and Evaluation of the FNC-2 Attribute Grammar System. In ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 209-222, June 1990.

[Juu92] Ben Juurlink. On the efficient incremental evaluation of a HAG for generating supercombinator code. Department of Computer Science, Utrecht University, Project INF/VER-92-02, 1992.

[Kas80] Uwe Kastens. Ordered Attributed Grammars. Acta Informatica, 13:229-256, 1980.

[Kat84] T. Katayama. Translation of attribute grammars into procedures. TOPLAS, 6(3):345-369, July 1984.

[KBHG+87] B. Krieg-Brückner, B. Hoffmann, H. Ganzinger, M. Broy, R. Wilhelm, U. Möncke, B. Weisberger, A. McGettrick, I.G. Campbell, and G. Winterstein. PROgram development by SPECification and TRAnsformation. In ESPRIT Conference 86. North-Holland, 1987.

[Knu68] D. E. Knuth. Semantics of context-free languages. Math. Syst. Theory, 2(2):127-145, 1968.

[Knu71] D. E. Knuth. Semantics of context-free languages (correction). Math. Syst. Theory, 5(1):95-96, 1971.

[Knu84] D.E. Knuth. Literate programming. The Computer Journal, 27, 1984.

[Kos91] C.H.A. Koster. Affix Grammars for Programming Languages. In H. Alblas and B. Melichar, editors, Attribute Grammars, Applications and Systems, International Summer School SAGA, Lecture Notes in Computer Science 545, pages 358-373. Springer-Verlag, June 1991.

[KS87] M.F. Kuiper and S.D. Swierstra. Using Attribute Grammars to Derive Efficient Functional Programs. In Computing Science in the Netherlands, CSN '87, SION, Amsterdam, November 1987. Stichting Mathematisch Centrum.

[Kui89] Matthijs F. Kuiper. Parallel Attribute Evaluation. PhD thesis, Department of Computer Science, Utrecht University, 1989.

[Lin88] P.A. Lindsay. A survey of mechanical support for formal reasoning. Software Engineering Journal, pages 3-27, January 1988.

[LMOW88] P. Lipps, U. Möncke, M. Olk, and R. Wilhelm. Attribute (re)evaluation in OPTRAN. Acta Informatica, 26:218-239, 1988.

[M+86] James H. Morris et al. Andrew: A distributed personal computing environment. Communications of the ACM, 29(3):184-201, March 1986.

[MAK88] Robert N. Moll, Michael A. Arbib, and A.J. Kfoury. An Introduction to Formal Language Theory. Springer-Verlag, 1988.

[McC60] John McCarthy. Recursive functions of symbolic expressions and their computation by machine. Communications of the ACM, 3(1):184-195, 1960.

[Mee86] L.G.L.T. Meertens. Algorithmics - towards programming as a mathematical activity. In J.W. de Bakker, M. Hazewinkel, and J.K. Lenstra, editors, CWI Symposium on Mathematics and Computer Science, pages 289-334. CWI Monographs Vol. 1, 1986.

[Mic68] Donald Michie. "Memo" Functions and Machine Learning. Nature, 218:19-22, April 1968.

[Pfr86] M. Pfreundschuh. A Model for Building Modular Systems Based on Attribute Grammars. PhD thesis, The University of Iowa, 1986.

[PJ91] Maarten Pennings and Ben Juurlink. Generating Supercombinator code using Higher Order Attribute Grammars. Unpublished, May 1991.
[PK82] Robert Paige and Shaye Koenig. Finite differencing of computable expressions. TOPLAS, 4(3):402-454, 1982.

[PS83] H. Partsch and R. Steinbrüggen. Program Transformation Systems. Computing Surveys, 15(3):199-236, September 1983.

[PSV92] Maarten Pennings, S. Doaitse Swierstra, and Harald H. Vogt. Using cached functions and constructors for incremental attribute evaluation. In Programming Language Implementation and Logic Programming, 4th International Symposium, PLILP '92, Lecture Notes in Computer Science 631, pages 130-144, Leuven, Belgium, August 26-28 1992. Springer-Verlag.

[Pug88] William W. Pugh. Incremental Computation and the Incremental Evaluation of Functional Programs. PhD thesis, Tech. Rep. 88-936, Department of Computer Science, Cornell University, Ithaca, N.Y., August 1988.

[RA84] T. Reps and B. Alpern. Interactive proof checking. In 11th Annual ACM Symposium on Principles Of Programming Languages, 1984.

[Rep82] Tom Reps. Generating Language-Based Environments. PhD thesis, Tech. Rep. 82-514, Department of Computer Science, Cornell University, Ithaca, N.Y., August 1982.

[Rit88] Brian Ritchie. The Design and Implementation of an Interactive Proof Editor. PhD thesis, Technical Report CSF-57-88, Department of Computer Science, University of Edinburgh, October 1988.

[RT87] Thomas Reps and Tim Teitelbaum. Language Processing in Program Editors. IEEE Computer, pages 29-40, November 1987.

[RT88] Tom Reps and Tim Teitelbaum. The Synthesizer Generator: A System for Constructing Language-Based Editors. Springer-Verlag, NY, 1988.

[RTD83] Tom Reps, Tim Teitelbaum, and Alan Demers. Incremental Context-Dependent Analysis for Language-Based Editors. TOPLAS, 5(3):449-477, July 1983.

[San88] R.G. Santos. Conditional and parameterized transformations in CSG. Technical Report S.1.5.C2-SN-2.0, PROSPECTRA Study Note, 1988.

[SDB84] M. Schwartz, N. Delisle, and V. Begwani. Incremental compilation in Magpie. In ACM SIGPLAN '84 Symposium on Compiler Construction, pages 121-131, Montreal, Canada, June 20-22 1984.

[SL78] J.M. Spitzen and K.N. Levitt. An example of hierarchical design and proof. Communications of the ACM, 21(12):1064-1075, 1978.

[SV91] Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Grammars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Applications and Systems, International Summer School SAGA, Lecture Notes in Computer Science 545, pages 256-296, Prague, Czechoslovakia, June 1991. Springer-Verlag.

[Tak87] Masato Takeichi. Partial parametrization eliminates multiple traversals of data structures. Acta Informatica, 24:57-77, 1987.

[TC90] Tim Teitelbaum and R. Chapman. Higher-Order Attribute Grammars and Editing Environments. In ACM SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 197-208, White Plains, New York, June 1990.

[Tur79] David A. Turner. A New Implementation Technique for Applicative Languages. Software - Practice and Experience, pages 31-49, 1979.

[Tur85] David A. Turner. Miranda: A non-strict functional language with polymorphic types. In J. Jouannaud, editor, Functional Programming Languages and Computer Architecture, pages 1-16. Springer-Verlag, 1985.

[vD92] Leen van Dalen. Incremental evaluation through memoization. Master's thesis, Department of Computer Science, Utrecht University, INF/SCR-92-29, 1992.

[vdB90] Aswin A. van den Berg. Attribute Grammar Based Transformation Systems. Master's thesis, Department of Computer Science, Utrecht University, INF/SCR-90-16, June 1990.

[vdB92] M.G.J. van den Brand. PREGMATIC, A Generator For Incremental Programming Environments. PhD thesis, Katholieke Universiteit Nijmegen, November 1992.

[vdM91] E.A. van der Meulen. Fine-grain incremental implementation of algebraic specifications. Technical Report CS-R9159, Centrum voor Wiskunde en Informatica (CWI), Amsterdam, 1991.

[VSK89] Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131-145, Portland, Oregon, June 1989.

[VvBF90] Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101-115, Paris, France, September 19-21 1990. Springer-Verlag.

[vWMP+75] A. van Wijngaarden, B.J. Mailloux, J.E.L. Peck, C.H.A. Koster, M. Sintzoff, C.H. Lindsey, L.G.L.T. Meertens, and R.G. Fisker. Revised report on the Algorithmic Language Algol 68. Acta Informatica 5, pages 1-236, 1975.

[WG84] W.M. Waite and G. Goos. Compiler Construction. Springer-Verlag, 1984.

[Yeh83] D. Yeh. On incremental evaluation of ordered attributed grammars. BIT, pages 308-320, 1983.
Bibliography

Preliminary versions of parts of this thesis were published in the following articles.

Doaitse Swierstra and Harald Vogt. Higher Order Attribute Grammars = a merge between functional and object oriented programming. Technical Report 90-12, Department of Computer Science, Utrecht University, 1990.

Doaitse Swierstra and Harald H. Vogt. Higher Order Attribute Grammars. In H. Alblas and B. Melichar, editors, Attribute Grammars, Applications and Systems, International Summer School SAGA, Lecture Notes in Computer Science 545, pages 256-296, Prague, Czechoslovakia, June 1991. Springer-Verlag.

Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Higher Order Attribute Grammars. In ACM SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 131-145, Portland, Oregon, June 1989.

Harald H. Vogt, S. Doaitse Swierstra, and Matthys F. Kuiper. Efficient incremental evaluation of higher order attribute grammars. In J. Maluszynski and M. Wirsing, editors, Programming Language Implementation and Logic Programming, 3rd International Symposium, PLILP '91, Lecture Notes in Computer Science 528, pages 231-242, Passau, Germany, August 26-28 1991. Springer-Verlag.

Harald H. Vogt, Aswin v.d. Berg, and Arend Freije. Rapid development of a program transformation system with attribute grammars and dynamic transformations. In Attribute Grammars and their Applications, International Conference WAGA, Lecture Notes in Computer Science 461, pages 101-115, Paris, France, September 19-21 1990. Springer-Verlag.
Samenvatting (Summary in Dutch)

Computers are programmed by means of a programming language. A compiler translates a program, written by a human in a so-called "higher programming language", into machine instructions which a computer can execute directly. People prefer not to program in machine instructions, because these are far removed from the concepts of the problem to be solved. Attribute grammars are used to describe a (higher) programming language. In an attribute grammar a program is represented by a (parse) tree. Such a tree consists of interconnected nodes, and the nodes contain attributes. The computation of the attributes is described by the attribute grammar.

For quite some time it has been possible to generate, automatically from an attribute grammar, a compiler for the programming language it describes. Such a compiler builds the parse tree of a program and subsequently computes the attributes. If no attributes with an erroneous value are computed, the program is correct; the compiler output, the list of machine instructions, is then available in one of the attributes. A compiler is a typical example of a traditionally non-interactive program. For about ten years it has also been possible to generate an interactive "compiler" automatically from an attribute grammar. Such an incremental system checks for errors while a program is being typed in, and can also compute the machine instructions during typing. Meanwhile, attribute grammars are applied far beyond compiler construction, and it is possible to generate interactive systems such as calculators, spreadsheets, layout processors, proof verifiers and program transformation systems.

In normal attribute grammars the shape of the parse tree is completely determined by the input text. This thesis treats a new extension of attribute grammars with which the strict separation between tree and attributes in normal attribute grammars can be lifted. This new extension is called higher order attribute grammars; they are defined in Chapter 2. In a higher order attribute grammar the parse tree can be extended with a piece of tree computed in an attribute. After the parse tree has been extended with a new piece of tree, the attributes in the new piece of tree can in turn be computed.

The advantage of higher order attribute grammars is that they have more descriptive power than normal attribute grammars. Multi-pass compilers, for example, are easy to describe with higher order attribute grammars but hard with normal attribute grammars. More examples can be found in Chapter 1.

Furthermore, Chapter 3 of this thesis treats a new incremental evaluation method for (higher order) attribute grammars. In all incremental evaluation methods existing so far, the entire tree with attributes is stored in memory. In the new incremental evaluation method the attributes are no longer stored in memory, but in a cache. The larger the cache, the faster the incremental evaluation. It is thus no longer necessary to have a large amount of memory available for incremental systems based on attribute grammars. Our method is the first that makes this possible, and it could make incremental systems more easily applicable in practice.

The following topics are also addressed in this thesis:

Chapter 1 gives an informal introduction to, and a formal definition of, normal attribute grammars.

In Chapter 2 a class of ordered higher order attribute grammars is defined for which efficient evaluation algorithms can be generated. Furthermore, an efficient method is given for testing whether a higher order attribute grammar falls within that class. Finally, it is shown that pure higher order attribute grammars (without external semantic functions) possess the same computational power as Turing machines. Pure normal attribute grammars do not possess that power.

Chapter 4 treats an abstract machine for the new incremental evaluation method of Chapter 3. A number of optimisations and implementation techniques for this machine are also treated. The chapter closes with the results of tests with a prototype machine written in the functional language Gofer. The results of these tests are encouraging, but give only little indication of the general behaviour of the new incremental evaluation method. Choosing the right implementation techniques requires further research.

Two applications of higher order attribute grammars are treated in Chapter 5. We mention only the first here. It concerns a prototype program transformation system, the BMF-editor, built with the attribute-grammar-based system called the Synthesizer Generator (SG). The SG is a generator with which incremental systems can be generated from attribute grammars. Unfortunately the SG did not support higher order attribute grammars at the time, but via a detour we nevertheless succeeded in implementing this construction. This exercise demonstrates the value of (higher order) attribute grammars for the implementation of program transformation systems. Meanwhile, higher order attribute grammars have been implemented in the SG. The creators of the SG reacted to higher order attribute grammars as follows [TC90]: "The recently formalized concept of higher order attribute grammars provides a basis for addressing the limitations of normal attribute grammars" and "We adopt this terminology, as well as the idea behind it ...". The SG is no longer an academic product. In September 1990 the company GrammaTech was founded with the aim of continuing the support, maintenance and development of the SG on a commercial basis. SG release 3.5 (September 1991) and higher provides higher order attribute grammars.
Curriculum Vitae

Harald Heinz Vogt

8 May 1965: born in Rotterdam.

1977-1983: Gymnasium-β, Thorbecke Scholengemeenschap, Utrecht.

1983-1988: studied Computer Science at Utrecht University.

1988-1992: research trainee (Onderzoeker In Opleiding, OIO) employed by the Netherlands Organization for Scientific Research (NWO) in the NFI project "Specification and Transformation Of Programs" (STOP), project number NF 63/62-518.
Acknowledgements

First of all, I would like to thank my promotor Doaitse Swierstra for the stimulating discussions and for showing me new ways of looking at existing things. Furthermore, Doaitse provided a nice work environment and was a pleasant fellow traveler on our travels through foreign parts of the world. This research would not have been possible without the help of numerous people. I want to thank them all, in particular: Matthijs Kuiper, for placing the LRC processor at my disposal; this enabled me to implement the algorithms and to obtain the test results discussed in Chapter 4. Maarten Pennings, who was a pleasant roommate during the last year of my work on this thesis; he provided many suggestions for improvement and was always willing to listen. The members of the reviewing committee, Prof. Dr F.E.J. Kruseman Aretz, Prof. Dr J. van Leeuwen and Prof. L. Meertens, for reviewing my thesis. All persons who have commented on previous versions of this text, especially Doaitse Swierstra and Maarten Pennings. All students who contributed to this thesis; I am especially grateful to Aswin van den Berg, who did a marvelous piece of work constructing the BMF-editor discussed in Chapter 5. Afterwards, Aswin joined the Synthesizer crew at Cornell University and helped implement higher order attribute grammars in the Synthesizer Generator. Finally, I would like to thank my family and friends for their interest and support.