INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE
Integrating Natural Semantics and Attribute Grammars: the Minotaur System Isabelle Attali , Didier Parigot
N˚ 2339 Septembre 1994
PROGRAMME 2 Calcul symbolique, programmation et génie logiciel
ISSN 0249-6399
Integrating Natural Semantics and Attribute Grammars: the Minotaur System
Isabelle Attali, Didier Parigot
Programme 2 | Calcul symbolique, programmation et génie logiciel
Projets CROAP et ChLoE
Rapport de recherche n° 2339 | Septembre 1994 | 16 pages
Abstract: This paper describes the principles and the functionalities of the Minotaur system. Minotaur is a generic interactive environment based on the integration of the Centaur system and the FNC-2 system, two systems widely used to specify syntax and semantics of programming languages and generate efficient semantic tools from these specifications. We show how Attribute Grammars techniques can be adequate for evaluation of a quite large subclass of Natural Semantics specifications, including specifications of an arithmetic calculator, a tree transformation, a type-checker for an Algol-like language, ... For this subclass of Natural Semantics specifications, the Minotaur system automatically generates an incremental and efficient (in time and memory) evaluator which gives Natural Semantics an industrial-strength implementation.
Key-words: Specifications, Natural Semantics, Attribute Grammars, Programming Environments
Natural Semantics + Attribute Grammars = the Minotaur System
Résumé: This report describes the principles and the functionalities of the Minotaur system. Minotaur is a generic interactive environment based on the integration of the Centaur and FNC-2 systems, two systems widely used for the specification of programming languages (syntax and semantics) and for the generation, from these specifications, of powerful semantic tools. We show how techniques from Attribute Grammars apply to the evaluation of a large class of Natural Semantics specifications, including for example a calculator, a tree transformation, a type-checker for an Algol-like language, ... For this class of Natural Semantics specifications, the Minotaur system automatically generates an incremental and efficient (in time and memory) evaluator, which gives Natural Semantics an execution scheme suited to real-size applications.
Key-words: Specifications, Natural Semantics, Attribute Grammars, Programming Environments
1 Introduction
This paper describes the principles and the functionalities of the Minotaur system. Minotaur is a generic interactive environment based on the integration of the Centaur system [9, 19] and the FNC-2 system [23, 24, 22], two systems widely used to specify syntax and semantics of programming languages and generate efficient semantic tools from these specifications. The Centaur system is dedicated to graphical interactive tools for program interpretation using Natural Semantics [27] and the FNC-2 system is devoted to compilation and transformation applications via Attribute Grammars (AGs) [30]. Natural Semantics and Attribute Grammars are well-known methods for the specification of semantics on an abstract syntax. Both formalisms are expressive, declarative, and are straightforwardly executable (the reader could refer to [13, 14, 18, 27, 30, 38, 39] for more details about these methods). However, these two frameworks are used in different contexts, depending on the nature of the application: Attribute Grammars have proved to be a useful formalism for static program analysis (type-checking, translations) while Natural Semantics, in addition, deals with program interpretation. This major difference (static vs dynamic semantics) is due to the evaluation processes on a given abstract syntax tree. In Attribute Grammars, this tree is decorated with attribute values in a static and deterministic way, provided there is no circularity in the dependency graph between attributes, but the tree itself is constant during the computations. These evaluation aspects have been widely investigated and the main results consist in varied strategies depending on the class of the AG (see a complete bibliography in [13], and a survey in [1]). For instance, Strongly Non Circular (SNC) AGs are usually implemented in a recursive tree-walk [12, 20, 17] and Ordered AGs are evaluated in an iterative algorithm based on visit sequences [28].
All these strategies have been designed with two main goals: (1) provide a reasonable memory cost (since the number of attributes is usually very large) and (2) provide an incremental re-evaluation after a tree transformation. As a consequence, most modern attribute systems (see Synthesizer Generator, FNC-2, Mercury, Mjølner/Orm) include powerful memory management and incremental evaluation [17, 25, 29, 32, 33, 35, 39] and AGs have been repeatedly and successfully used in real applications such as construction of editors, translators, and compilers. On the other hand, the framework of Natural Semantics lacks an industrial-strength implementation despite a large theoretical background (Natural Deduction [18] and Structural Operational Semantics [38]). Natural Semantics rules define a logic and are used as a proof-theoretic tool to prove theorems within that logic, building proof trees in a recursive top-down strategy involving unification. This kind of execution model is close to the tree-walk strategies in AGs except that the proof tree in Natural Semantics is not always directly related to the input tree. Moreover, proof tree building is not implemented in a satisfactory manner regarding memory storage and incrementality features: to turn Natural Semantics definitions into executable code, we chose to translate Typol rules (our computer version of Natural Semantics in the Centaur system [14]) into Prolog clauses, taking advantage of the similarity of Prolog variables and variables in inference rules. Considering our experiments in the Centaur system, this implementation is not adequate for four reasons:
1. the operational semantics of Prolog (depth-first and left-to-right execution strategy) is a constraint on the Typol formalism;
2. dynamic type checking (using possibly delayed constraints) is required in Prolog to safely evaluate Typol specifications;
3. the memory storage in Prolog is not optimal for Typol evaluation (due to the management of backtracking) and we cannot handle real-size applications (or if we want to do so, we need to pollute the specification with cuts);
4. the proofs in Prolog are not built in an incremental manner and it is not efficient to re-build a complete proof tree for a semantic treatment during the editing process.
Given these observations, our goal is to characterize a subclass of Natural Semantics powerful enough to describe an extensive collection of semantics (both in static and dynamic domains) and efficient enough to go a step further than toy examples and manage real-size applications. It has been proven that, for any specification in the addressed subclass, there exists an Attribute Grammar which evaluates it [5, 2]. We propose here a practical application of these theoretical results in order to provide an incremental and optimized implementation of Natural Semantics definitions (in this subclass) via an automatically generated attribute evaluator. A similar approach has been discussed in [41], but with a severely restricted subclass of Typol programs, called UI-TYPOL, where unification is not allowed. Prolog implementation
of Natural Semantics was also discussed and criticized in [37], and a new language (RML) and implementation strategy for Natural Semantics were suggested via a Continuation Passing Style representation to C code. To reach our goal, we had to design (or use) tools such as: a membership test to determine whether a given Typol specification can be evaluated by an AG; a translator from Natural Semantics to Attribute Grammars; an efficient attribute evaluator generator; an interactive setting for the generated evaluators. Our solution results in the Minotaur system: a combination of Centaur and FNC-2 which addresses real-size semantic applications. Minotaur uses the FNC-2 processors at generation-time to build attribute evaluators from semantic specifications and the Centaur components for syntactic features and user interface at run-time. This combination has been possible because both Centaur and FNC-2 are open systems: on one hand, Centaur provides a collection of accessible tools such as a kernel for the representation and manipulation of abstract syntax trees, primitives for the man-machine interface and for communication; on the other hand, FNC-2 is flexible enough to generate an attribute evaluator acting on abstract syntax trees coming from Centaur. This kind of connection is very difficult in closed attribute systems because they have their own internal representation for trees and then duplication of data structures and data exchanges are mandatory for interaction.
We have now used the Minotaur system for several applications, and positive experimental results suggest that we have reached our goal: an efficient evaluation of powerful semantic specifications in terms of time, memory storage and incrementality features. The next section presents Natural Semantics and its implementation in the Centaur system. Section 3 is dedicated to Attribute Grammars and the FNC-2 system. In Section 4, we compare both frameworks, discuss the expressive power of the considered subclass for attribute evaluation, and outline the translation scheme from Natural Semantics to Attribute Grammars. Section 5 briefly describes the architecture of the Minotaur system. In Section 6, we establish the validity of our approach on an example of practical interest and we evaluate the performance of our system (in terms of memory and time). Finally, Section 7 concludes the paper, and suggests new directions for the future.
2 Natural Semantics in the Centaur System
Semantic aspects in the Centaur system are handled with Natural Semantics [27] and its implementation, the Typol formalism [14]. The general idea of a semantic definition in Natural Semantics is to provide axioms and inference rules that characterize semantic behaviors to be defined on language constructs. Within Natural Semantics, a semantic definition is identified with a logic and reasoning with the language is proving theorems within that logic. Axioms and rules are used as a proof-theoretic tool to generate new facts (proof trees) from existing facts in a non-deterministic manner due to the relational presentation of the formalism. Thus there can be several proof trees for the same fact. In the Typol formalism, inference rules indicate how a sequent (the conclusion of the rule) expressing some relation between some hypothesis and some property on an abstract syntax term, the subject, may be deduced from other sequents or predicates (called premises). Typol rules are of the form:
$$\frac{H_1 \vdash T_1 : S_1 \quad \cdots \quad H_n \vdash T_n : S_n}{H \vdash T : S} \quad (r)$$
The numerator can also contain predicates (for auxiliary computation) of the form:
$$\mathrm{pred}(\gamma_1, \dots, \gamma_l \to \delta_1, \dots, \delta_m)$$
Variables may occur anywhere in a rule and allow this rule to be instantiated during a proof. The terms $H, H_i$ are called inherited positions, and the terms $S, S_i$ synthesized positions. Notice that positions $H, H_1, \dots, H_n, S, S_1, \dots, S_n$ may be tuples (as in Figure 1). The sequents $H \vdash T : S$ and $H_i \vdash T_i : S_i$ are strongly typed and must be declared with judgements, which make it possible to perform type inference
on variables. Valid abstract syntax patterns (the subjects) belong to some formalism (namely an order-sorted algebra composed of operators and sorts). Other variables can also be typed with predefined types such as string or integer. From a Typol rule, we define the set input(r) of input positions, composed of the positions $H$, $S_1, \dots, S_n$ and the positions $\delta_1, \dots, \delta_m$ that follow the arrow in predicates, and the set output(r) of output positions, composed of the positions $S$, $H_1, \dots, H_n$ and the positions $\gamma_1, \dots, \gamma_l$ that precede the arrow in predicates. These two sets are needed to define the flow of information in a proof tree. Roughly speaking, input positions are computed by the outer context and used in rule r to compute the output positions, which are then transmitted to the outer context. This kind of flow ensures that all output positions are computable. A typical Typol rule coming from the specification developed in Section 6.1 (see Figure 5) is shown below:
$$\frac{\begin{array}{c} \varphi \vdash \mathrm{EXP}_1 : \alpha_1, \sigma_1, \varepsilon_1 \qquad \varphi \vdash \mathrm{EXP}_2 : \alpha_2, \sigma_2, \varepsilon_2 \\ \mathrm{append\_arcs}(\alpha_1, \alpha_2 \to \alpha_3) \quad \mathrm{append\_states}(\sigma_1, \sigma_2 \to \sigma_3) \quad \mathrm{or}(\varepsilon_1, \varepsilon_2 \to \varepsilon_3) \end{array}}{\varphi \vdash \mathrm{plus}(\mathrm{EXP}_1, \mathrm{EXP}_2) : \alpha_3, \sigma_3, \varepsilon_3}$$
Figure 1: A Typol rule
In this rule, the subject is a plus expression defining a language recognized by some automaton to be defined. Automata are characterized by three components: a set of arcs ($\alpha$), a set of initial states ($\sigma$), and a boolean ($\varepsilon$) expressing whether or not the empty string belongs to the generated language. The $\varphi$ variable denotes the set of states following the current state. The resulting automaton for a plus of two sub-expressions is inductively computed from the two sub-automata: the resulting set of arcs is the union of the two sets of arcs, the resulting set of initial states is the union of the two sets of initial states, and the empty string belongs to the resulting language if it belongs to one of the two languages. In Natural Semantics, proofs are done using structural induction on subjects since the initial goal to prove contains a complete abstract syntax term (e.g. is this program well-typed?). This property links the given abstract syntax term and the resulting proof tree for the initial goal. Depending on some properties of the Typol rules (see Section 4), the resulting proof tree can be bigger than the given abstract syntax term (and even infinite); this makes it possible to express Dynamic Semantics in Natural Semantics. In the setting of interactive programming environments, incrementality is considered as a major goal for efficiency and user-friendliness reasons (see for instance [6, 32, 33, 39, 43]). The Centaur system also enhances interactivity in syntactical aspects. Parsing is incremental: any sentence of the object language corresponding to a valid sort can be parsed and an abstract syntax subterm is constructed. Pretty-printing of abstract syntax trees is also incremental so the user is not annoyed with screen redisplays during the editing process.
However, semantical aspects in the Centaur system cannot be handled in an incremental manner and a call to a type-checker or a translator on the whole program has to be done on request after this program has been edited and (even slightly) modified. As in [3, 4], we consider the current implementation in Prolog (with unification and backtracking) as an obstacle to any attempt towards incrementality. Moreover, our applications in Typol are more or less limited to toy examples due to overflows in the Prolog stack.
3 Attribute Grammars in the FNC-2 System
Since Knuth's initial paper [30], Attribute Grammars have been widely used in translation, compiler-compiler techniques and definitions for programming languages. In this section, we recall some basic notations about Attribute Grammars and we present the major features of the FNC-2 system, an AG processing system; we focus more particularly on its specification language Olga and the advantages of the generated attribute evaluators. An Attribute Grammar is an abstract syntax (a signature over a set $\mathcal{P}$ of phyla and a set $\mathcal{O}$ of operators) augmented with semantic definitions dealing with two disjoint finite sets of symbols (attributes): INH and SYN. With each phylum $X \in \mathcal{P}$, we associate two disjoint finite sets of symbols: inherited and synthesized attributes ($\mathrm{INH}(X)$, $\mathrm{SYN}(X)$). For each operator $p \in \mathcal{O}$, $p : X_0 \to X_1 \dots X_n$, semantic definitions describe local dependencies between the values of attributes and define $\mathrm{SYN}(X_0)$ and $\mathrm{INH}(X_i)$ (output attributes) in terms of $\mathrm{INH}(X_0)$ and $\mathrm{SYN}(X_i)$ (input attributes).
Evaluating an AG with respect to an abstract syntax tree can be viewed as decorating the nodes in the tree with the values of attributes. The major area of active research in AGs is the design of automatically generated efficient attribute evaluators (see [13] for an annotated bibliography). For this purpose, different subclasses (based on partial orders between attributes) have been introduced, and associated membership tests have been developed (e.g. OAG [28], l-ordered [8], SNC [12, 20, 17]). With these subclasses, efficient (optimized, incremental) evaluators can be automatically generated by computing at generation time an evaluation order on attributes. The specificity of the FNC-2 system [23, 24, 22] is to consider that an Attribute Grammar specifies, and an attribute evaluator implements, an attributed-tree to attributed-tree mapping. In this scheme, the intermediate trees are described by abstract syntaxes extended with attribute declarations. In the FNC-2 system, Attribute Grammars are written in Olga [21] and specify the computations performed by one or more passes, according to some input and output data (attributed abstract trees). The Olga formalism was designed for the description of all aspects of an AG. This, of course, involves constructs to declare attributes, access them in semantic rules, but also constructs to describe pure calculations without resorting to an external language. Moreover, the Olga formalism has been extended with the notion of pattern-based tree attribution [15]: each operator is expressed as a pattern with variables. A typical Olga rule coming from the specification developed in Section 6.1 (see Figure 6) is shown below:
where plus -> EXP1 EXP2 use
    $epsilon := or($epsilon(EXP1), $epsilon(EXP2));
    $alpha := append_arcs($alpha(EXP1), $alpha(EXP2));
    $sigma := append_states($sigma(EXP1), $sigma(EXP2));
    $phi(EXP1) := $phi;
    $phi(EXP2) := $phi;
end where ;
Figure 2: An Olga rule
The major advantages of the FNC-2 system are:
- its expressive power: the accepted class of attribute grammars is the SNC class, which is large enough to cover usual needs for static semantics in programming languages;
- the efficiency of the generated evaluators: the generated evaluators are deterministic, since they are based on a total evaluation order on each sort of the abstract syntax, thanks to a transformation from the SNC subclass to the l-ordered one (see [24, 22, 35] for more details);
- the flexibility of the generated evaluators: from the same specification, the generated evaluators can be alternatively implemented with or without memory optimization, in an exhaustive or incremental manner, in a sequential or parallel manner, and in C or Lisp.
The FNC-2 system provides a very fine static analysis of the lifetime of each attribute instance, which in turn makes it possible to determine the most efficient way to store it. This is possible because of the necessity to have a statically-determinable total evaluation order to produce visit-sequence-based evaluators (as described in [29]). In [25], we give exact conditions to decide whether a temporary attribute can be stored in a global variable rather than in a stack (all temporary attributes can be, at worst, stored in a stack) and whether a non-temporary attribute can be stored in a global variable. Finally, the evaluator generator is able to produce incremental attribute evaluators (see [1, 39] for related work). Our method, presented in [24, 22, 35], is based on the subclass of SNC AGs called Doubly Non-Circular (DNC) [17]. This class is however larger than the l-ordered class, and our SNC to l-ordered transformation makes it possible to actually use this method for any SNC AG. Thus, we can generate an evaluator which is able to start at any node in the tree. Moreover, a set of "semantic control" functions makes it possible to limit the re-evaluation process to affected instances.
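To make the idea of a statically computed evaluation order concrete, the following minimal Python sketch orders the attribute occurrences of the plus production of Figure 2. It is only an illustration and not the FNC-2 algorithm: the string names of the occurrences, the `PLUS_DEPENDENCIES` table and the `evaluation_order` helper are assumptions of this sketch, and so is the worst-case hypothesis that every synthesized attribute of a child may depend on the child's inherited `phi`.

```python
from graphlib import TopologicalSorter, CycleError

# Dependencies for the `plus` production: each key is an attribute
# occurrence, each value the set of occurrences it needs (hypothetical
# string encoding chosen for this sketch).
PLUS_DEPENDENCIES = {
    # Semantic rules of the production (read off the Olga rule).
    "phi(EXP1)": {"phi"},
    "phi(EXP2)": {"phi"},
    "alpha": {"alpha(EXP1)", "alpha(EXP2)"},
    "sigma": {"sigma(EXP1)", "sigma(EXP2)"},
    "epsilon": {"epsilon(EXP1)", "epsilon(EXP2)"},
    # Assumed worst case: what a child synthesizes may need its phi.
    "alpha(EXP1)": {"phi(EXP1)"},
    "sigma(EXP1)": {"phi(EXP1)"},
    "epsilon(EXP1)": {"phi(EXP1)"},
    "alpha(EXP2)": {"phi(EXP2)"},
    "sigma(EXP2)": {"phi(EXP2)"},
    "epsilon(EXP2)": {"phi(EXP2)"},
}


def evaluation_order(dependencies):
    """Return one total order compatible with the dependencies.

    Raises graphlib.CycleError when the dependencies are circular,
    which is the situation a non-circularity test must exclude.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(evaluation_order(PLUS_DEPENDENCIES))
```

Because the order is computed once, before any tree is available, an evaluator built on it never has to search for the next attribute to compute at run-time.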
4 From Natural Semantics to Attribute Grammars
Relationship between Natural Semantics and Attribute Grammars has been discussed in [2, 5]: a subclass of Typol programs has been defined and a translation scheme from this subclass to Attribute Grammars has been developed in order to provide an optimized implementation of Typol programs, without unification when possible. To avoid the re-definition of this subclass, we briefly compare both approaches in terms of style, expressiveness, and execution process. This comparison will naturally raise the main restrictions required on a Typol program to get an attribute evaluation.
4.1 Comparison of the two formalisms
From the presentation point of view, Natural Semantics specifications are expressed in a relational style as opposed to a functional style for Attribute Grammars. As for their relative expressive powers, Natural Semantics can deal with dynamic semantics whereas Attribute Grammars are limited to static semantics (type-checking, translations). On one hand, the result of an Attribute Grammar is the input abstract syntax tree decorated with attribute values; on the other hand, the result of a proof in Natural Semantics is a proof tree which is not (in the general case) isomorphic to the input abstract syntax tree, subject of the goal to prove. Moreover, the logical framework of Natural Semantics provides unification and non-determinism which are outside the scope of Attribute Grammars: most strategies for evaluation are based on a static order for the computation of all attributes and are totally deterministic. Note that unification per se is not always a problem: we show on our example in Section 6.1 how a least fix-point (solved with unification) expressed in a Natural Semantics specification naturally disappears in an Attribute Grammar. From the execution point of view, unification makes a big difference between Natural Semantics and Attribute Grammars. The proof tree is built with structural induction in a single recursive pass over the original abstract syntax tree and unification propagates the remaining computations on non-instantiated variables. We distinguish two kinds of unifications during the proof process: bottom-up unifications due to the tupling of several attributes in a single sequent (as in rule (6) of Figure 5) and left-right unifications due to common variables in input attributes (see rule (5) in Figure 3). In the former case, according to some property on the dependency graph between attributes, propagation of unification can be implemented with attribute evaluation and the single pass over the tree can be decomposed into separate passes (see [3] for more details).
Finally, during the execution process in attribute evaluators, there is no notion of failure: the generation of an evaluator ensures that any evaluation will be successfully achieved. This is not the case with Natural Semantics.
4.2 Conditions for translation
Now, we can enumerate the conditions necessary to check whether a given Natural Semantics specification can be evaluated with attributes:
1. no missing definition: all variables occurring in output positions also occur in input positions;
2. no missing rule: there must be an applicable rule for each abstract syntax operator;
3. determinism: only one applicable rule for each abstract syntax operator;
4. no constraint: input positions are reduced to variables;
5. no link: no common variables between input positions;
6. no dynamic rule: subjects of premises are strictly subterms of the subject of the rule.
The reader can find violations of these constraints illustrated in Figure 3. Missing definition and missing rule (conditions (1) and (2)) may lead to a failure during the proof; therefore these two conditions are mandatory for the translation. Condition (3) involves non-determinism which is not acceptable in Attribute Grammars. However, we can express fake non-determinism, a situation where more than one rule can apply (pattern-matching on the conclusion), but only one will be successfully proved (failures in the premises for others). This fake non-determinism is based on exclusive rules (as rules (cond1) and (cond2)) with a conditional constructor. (A similar approach for the translation from Typol to Coq is discussed in [42].)
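$$\frac{e \vdash T_1 : a}{e \vdash p(T_1, T_2) : b} \quad (1)$$
$$\frac{e \vdash T_1 : a}{e \vdash p(T_1, T_2) : f(a)} \quad (3) \qquad \frac{e \vdash T_2 : a}{e \vdash p(T_1, T_2) : f(a)} \quad (3')$$
$$\frac{e \vdash T_1 : f(a) \quad e \vdash T_2 : b}{e \vdash p(T_1, T_2) : b} \quad (4)$$
$$\frac{e \vdash T_1 : a \quad e \vdash T_2 : a}{e \vdash p(T_1, T_2) : f(a)} \quad (5)$$
$$\frac{e \vdash T_1 : \mathrm{true} \quad e \vdash T_2 : e' \quad e' \vdash p(T_1, T_2) : a}{e \vdash p(T_1, T_2) : a} \quad (6)$$
Figure 3: Hopeless Typol rules for Attribute Evaluation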
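$$\frac{e \vdash E : \mathrm{true} \quad e \vdash T_1 : a}{e \vdash \mathrm{cond}(E, T_1, T_2) : a} \quad (\mathrm{cond1})$$
$$\frac{e \vdash E : \mathrm{false} \quad e \vdash T_2 : a}{e \vdash \mathrm{cond}(E, T_1, T_2) : a} \quad (\mathrm{cond2})$$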
Conditions (4) and (5) involve left-right unifications which may also lead to a failure during the proof. However, rules (4) and (5) can be reformulated in (4') and (5') in such a way that unification is not expressed in the rule but in an auxiliary predicate unify.
$$\frac{e \vdash T_1 : a' \quad \mathrm{unify}(a', f(a)) \quad e \vdash T_2 : b}{e \vdash p(T_1, T_2) : b} \quad (4')$$
$$\frac{e \vdash T_1 : a \quad e \vdash T_2 : a' \quad \mathrm{unify}(a, a')}{e \vdash p(T_1, T_2) : f(a)} \quad (5')$$
Even though an equivalent unify predicate can be provided within the attribute evaluator, we still have the problem of failure management in the generated evaluator. This problem is not solved in a satisfactory manner for the moment. That is why conditions (4) and (5) are mandatory. Finally, condition (6) will also be absolutely necessary as long as the associated attribute evaluator works on a constant abstract syntax tree.
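The rule-local conditions (1), (4), (5) and (6) lend themselves to a mechanical check. The Python sketch below is a simplified illustration under stated assumptions, not the membership test of the system: the `Sequent` and `Rule` classes, the encoding of terms (a variable is a string, a constructor application is a tuple whose first element is its name) and the `violations` function are hypothetical, predicates and tuple positions are left out, and conditions (2) and (3), which concern the whole rule set, are not covered.

```python
from dataclasses import dataclass


def variables(term):
    """Variables of a term: a str is a variable, a tuple is (name, args...)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(variables(arg) for arg in term[1:]))


def strict_subterms(term):
    """All proper subterms of a term, in left-to-right order."""
    if isinstance(term, str):
        return []
    found = []
    for arg in term[1:]:
        found.append(arg)
        found.extend(strict_subterms(arg))
    return found


@dataclass
class Sequent:
    inherited: object    # term in the H position
    subject: object      # abstract syntax pattern
    synthesized: object  # term in the S position


@dataclass
class Rule:
    premises: list
    conclusion: Sequent


def violations(rule):
    """Return the numbers of the rule-local conditions that the rule breaks."""
    inputs = [rule.conclusion.inherited] + [p.synthesized for p in rule.premises]
    outputs = [rule.conclusion.synthesized] + [p.inherited for p in rule.premises]
    found = []
    defined = set().union(*(variables(t) for t in inputs))
    used = set().union(*(variables(t) for t in outputs))
    if not used <= defined:                      # (1) missing definition
        found.append(1)
    if any(not isinstance(t, str) for t in inputs):  # (4) constraint
        found.append(4)
    seen, linked = set(), False
    for term in inputs:                          # (5) link between inputs
        if variables(term) & seen:
            linked = True
        seen |= variables(term)
    if linked:
        found.append(5)
    allowed = strict_subterms(rule.conclusion.subject)
    if any(p.subject not in allowed for p in rule.premises):  # (6) dynamic rule
        found.append(6)
    return found


P = ("p", "T1", "T2")
RULE_1 = Rule([Sequent("e", "T1", "a")], Sequent("e", P, "b"))
RULE_4 = Rule([Sequent("e", "T1", ("f", "a")), Sequent("e", "T2", "b")],
              Sequent("e", P, "b"))
RULE_5 = Rule([Sequent("e", "T1", "a"), Sequent("e", "T2", "a")],
              Sequent("e", P, ("f", "a")))
```

With this encoding, `violations(RULE_1)`, `violations(RULE_4)` and `violations(RULE_5)` report conditions 1, 4 and 5 respectively for the corresponding rules of Figure 3.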
4.3 Translation scheme
Given a Natural Semantics specification in which all the conditions presented above are verified, we can apply our translation scheme in order to generate an AG definition, and then an attribute evaluator. We briefly outline the main principles of the translation scheme:
- from judgement declarations we generate attribute declarations:
  - each position induces a kind of attribute (inherited or synthesized);
  - each position generates a new name (according to the profile of the judgement);
  - each position gives its type to the attribute;
  - the sort of the subject defines the sort the attribute is defined on.
- from each rule we generate a production rule and its semantic rules:
  - the production rule is the subject of the rule;
  - a semantic rule is generated from each output position in the rule, according to the correspondence between attribute names and positions in the rule.
Notice that the relational style of Typol sequents and predicates has to be converted into a functional style.
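The second part of the scheme, one semantic rule per output position, can be sketched in a few lines of Python. This is an illustrative approximation only: the dictionary encoding of a rule, the variable names (`a1`, `s1`, ...), and the `translate_rule` function are assumptions of this sketch, positions are restricted to variables, and each predicate is assumed to have a single result.

```python
def translate_rule(rule, inherited_name, synthesized_names):
    """Return Olga-like semantic rules (as strings) for one Typol-like rule."""
    # Input positions tell where the value of each variable comes from.
    source = {rule["conclusion"]["inh"]: f"${inherited_name}"}
    for premise in rule["premises"]:
        for name, variable in zip(synthesized_names, premise["syn"]):
            source[variable] = f"${name}({premise['subject']})"

    def expression(variable):
        # Relational to functional: a predicate result becomes a call.
        if variable in source:
            return source[variable]
        for predicate in rule["predicates"]:
            if predicate["result"] == variable:
                arguments = ", ".join(expression(a) for a in predicate["args"])
                return f"{predicate['name']}({arguments})"
        raise ValueError(f"missing definition for {variable}")

    lines = []
    # One semantic rule per output position.
    for name, variable in zip(synthesized_names, rule["conclusion"]["syn"]):
        lines.append(f"${name} := {expression(variable)};")
    for premise in rule["premises"]:
        target = f"${inherited_name}({premise['subject']})"
        lines.append(f"{target} := {expression(premise['inh'])};")
    return lines


# The plus rule of Figure 1, with ASCII variable names.
PLUS_RULE = {
    "conclusion": {"inh": "phi", "syn": ["a3", "s3", "e3"]},
    "premises": [
        {"inh": "phi", "subject": "EXP1", "syn": ["a1", "s1", "e1"]},
        {"inh": "phi", "subject": "EXP2", "syn": ["a2", "s2", "e2"]},
    ],
    "predicates": [
        {"name": "append_arcs", "args": ["a1", "a2"], "result": "a3"},
        {"name": "append_states", "args": ["s1", "s2"], "result": "s3"},
        {"name": "or", "args": ["e1", "e2"], "result": "e3"},
    ],
}

if __name__ == "__main__":
    for line in translate_rule(PLUS_RULE, "phi", ["alpha", "sigma", "epsilon"]):
        print(line)
```

On the plus rule, the generated lines coincide, up to ordering, with the semantic rules of the Olga rule of Figure 2.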
5 The Minotaur System
The architecture of the Minotaur system (see Figure 4) is based on components coming from Centaur, FNC-2 and also new components especially designed for integration. From FNC-2, we use the evaluator generator at generation-time. From Centaur, we take the Virtual Tree Processor (VTP) [31] for manipulation of formalisms and abstract syntax trees, the syntactic part (structure editors, incremental parsers and pretty-printers) and also the user-interface (menus, buttons, ...). To make Centaur and FNC-2 cooperate, it has been necessary to build new components such as:
- an additional type-checker (TC2) for Typol specifications to check when a given Typol program can be translated into an Attribute Grammar;
- a new code-generator (CG2) for Typol specifications generating an Olga specification instead of a set of Prolog clauses;
- a new back-end to the FNC-2 evaluator generator which produces an attribute evaluator implemented in Le Lisp [11], with VTP primitives.
In the following, we briefly present the Minotaur components, including the specification level, the generation level and the run-time level.
[Figure 4 (diagram): specifications written in Metal (syntax spec.), Typol (NS spec.) and Olga (AG spec.); generation-time components Compiler, TC1, TC2, CG1, CG2, FE, EG, CG-VTP and CG-C producing the formalism, Prolog clauses and the VTP and C evaluators; at run-time the evaluators turn an AST into a decorated AST.]
Figure 4: The Minotaur architecture
5.1 The specification level
This level is sketched in Figure 4 with bold arrows on round boxes (square boxes are used for system components). Syntactic specifications are written in Metal [26]. Semantic specifications can be written in Typol or in Olga, using nice interactive programming environments.
5.2 The generation level
This level is illustrated in the upper part of Figure 4. A formalism is created from Metal specifications; this formalism is manipulated by both Typol and Olga. When semantic specifications are written in Typol, the designer can choose between two execution processes: the former using Prolog (via TC1 and CG1), the latter using attribute evaluation via the additional type-checker (TC2) (to check if a given
Typol program can be translated into an Attribute Grammar) and the new code-generator (CG2) (to produce an Olga source). The translator from Typol to Olga, written in Typol, takes as input a well-typed Typol program, decorated by the usual type-checker (TC1) with type information, and name analysis. The additional type-checker builds an intermediate structure (representing input and output positions) and performs verifications of the conditions presented in Section 4.2. From this intermediate structure, the code generator produces an Olga abstract syntax tree. From Olga specifications (either generated from Typol specifications or written by hand), the evaluator generator produces an attribute evaluator. The front-end (FE) analyses the Olga specification, the evaluator generator (EG) produces an abstract evaluator. This abstract evaluator is based on only two instructions: eval of a semantic rule and visit of a node in the tree, which makes it independent of any implementation language. Different back-ends generate either a VTP code (CG-VTP) or a C code (CG-C).
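The two-instruction abstract evaluator can be pictured with a small interpreter. The Python sketch below is an assumption-laden illustration and not the code produced by FNC-2: the `Node` class, the `run` interpreter, the `SEQUENCES` table and the semantic-rule helpers are hypothetical, only the `phi` and `sigma` attributes of the automaton example are handled, and each node is visited once, whereas real visit sequences may visit a node several times.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    operator: str
    children: list = field(default_factory=list)
    label: str = ""
    attributes: dict = field(default_factory=dict)


def run(node, sequences):
    """Interpret the instruction sequence attached to the node's operator."""
    for instruction, argument in sequences[node.operator]:
        if instruction == "eval":      # evaluate one semantic rule
            argument(node)
        elif instruction == "visit":   # visit the child with this index
            run(node.children[argument], sequences)
        else:
            raise ValueError(f"unknown instruction {instruction}")


def copy_phi(index):
    """Semantic rule $phi(child) := $phi for the child at `index`."""
    def rule(node):
        node.children[index].attributes["phi"] = node.attributes["phi"]
    return rule


def union_sigma(node):
    """Semantic rule $sigma := append_states($sigma(EXP1), $sigma(EXP2))."""
    left, right = node.children
    node.attributes["sigma"] = left.attributes["sigma"] | right.attributes["sigma"]


def label_sigma(node):
    """A label contributes itself as the only initial state."""
    node.attributes["sigma"] = frozenset({node.label})


SEQUENCES = {
    "plus": [("eval", copy_phi(0)), ("visit", 0),
             ("eval", copy_phi(1)), ("visit", 1),
             ("eval", union_sigma)],
    "label": [("eval", label_sigma)],
}

if __name__ == "__main__":
    tree = Node("plus", [Node("label", label="a"), Node("label", label="b")])
    tree.attributes["phi"] = frozenset({"end"})
    run(tree, SEQUENCES)
    print(sorted(tree.attributes["sigma"]))
```

Since the interpreter only needs `eval` and `visit`, the same sequences could in principle be rendered in any target language, which is the independence property stated above.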
5.3 The run-time level
This level is illustrated in the lower part of Figure 4. The usual implementation of Typol programs is based on the interpretation, via a Prolog engine, of generated Prolog clauses. A communication protocol is provided between the VTP kernel and the Prolog engine and makes it possible to supply a nice debugging environment for execution of Typol programs. However, this execution process requires efficient coercion primitives between VTP trees and Prolog terms. The Minotaur system provides two other implementations for Typol specifications via attribute evaluation. The VTP implementation consists in a set of Lisp functions (one for each sort of the formalism) which manipulate VTP abstract syntax trees via primitives for navigation, creation of new trees, management of attributes, and error handling. Attribute evaluators implemented in VTP are used in an interactive setting: all trees are VTP trees visualized in structure editors. In exhaustive evaluators, attributes are stored in the abstract syntax tree. In the context of an optimized memory management evaluator, attributes are not hooked in the abstract syntax tree but are stored, when possible, in global variables or stacks. The incrementality facility of the produced evaluators is easily handled since the system keeps track of all modifications performed during editing. Finally, attribute evaluators implemented in C are off-line processors: they take a program (an ASCII file) as input and produce a decorated tree as result; they manipulate their own trees with their own primitives. This solution, once the semantic definition is completely validated, provides efficiency (both in time and memory). Hence, the Minotaur system can be considered as a production tool.
6 Evaluation of performance
In this section, we walk through the development of an application with the Minotaur system and discuss the resulting benefits at the specification level as well as the run-time level. Our application deals with languages generated from regular expressions and seems a convincing example to us both in terms of expressiveness of the specifications (in Typol and Olga) and efficiency of the generated evaluators (using Prolog or Lisp via attribute evaluation).
6.1 Specifying a sound example
This example describes how to build, for a given regular marked expression $E$, a finite automaton $A$ that recognizes the language $L(E)$ generated by $E$. The syntax of regular expressions over a set of input symbols is as usual:

axiom ::= E end      (end is a special symbol for the termination state)
E ::= I | ε | E + E | E.E | E*

According to the following definitions and the algorithm proposed in [7], the automaton which recognizes $L(E)$ is characterized by three components: a set of initial states ($\sigma(E)$), a set of transitions ($\alpha_E(a)$) for all states $a$, and an indicator ($\varepsilon(E)$). These components are defined as follows (on the right-hand side of the last definition, $\varepsilon$ denotes the empty word):

\[
\begin{aligned}
\sigma(E) &= \{\, a \mid av \in L(E) \,\} \\
\varphi_E(a) &= \{\, b \mid uabv \in L(E) \,\} \\
\alpha_E(a) &= \{\, a \to \varphi_E(a) \,\} \\
\varepsilon(E) &= (\varepsilon \in L(E))
\end{aligned}
\]
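As a small worked illustration (not taken from the report), consider the marked expression $E = a.b^{*}$ and read the definitions with respect to the words of the whole axiom $E\ \mathit{end}$, so that the termination state can appear among the followers. The choice of expression is an assumption made for this example; the symbols $\sigma$, $\varphi$, $\alpha$ and $\varepsilon$ follow the attribute names used in Figure 6.

```latex
\begin{align*}
L(E\ \mathit{end}) &= \{\, a\, b^{n}\, \mathit{end} \mid n \ge 0 \,\} \\
\sigma(E) &= \{a\} \\
\varphi_E(a) &= \{b, \mathit{end}\}, \qquad \varphi_E(b) = \{b, \mathit{end}\} \\
\alpha_E(a) &= \{\, a \to \{b, \mathit{end}\} \,\}, \qquad
\alpha_E(b) = \{\, b \to \{b, \mathit{end}\} \,\} \\
\varepsilon(E) &= \mathit{false} \quad \text{(the empty word is not in } L(E)\text{)}
\end{align*}
```

The only initial state is $a$ because every word starts with $a$; both $a$ and $b$ can be followed either by another $b$ or by the end marker.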
We give in Figure 5 the Typol specification for this construction.

program build_automaton is
  use regul;
  use automaton;
  import append_arcs(ARCS, ARCS -> ARCS),
         append_states(STATES, STATES -> STATES),
         compose(STATES, IS_EMPTY, STATES -> STATES) from aux;

  judgement |- AXIOM : AUTOMATON ;
  judgement STATES |- EXP : ARCS, STATES, IS_EMPTY ;

  {end} |- E : α, σ, ε
  ------------------------------------------------------------------ (1)
  |- axiom(E) : automaton(α, σ, ε) ;

  φ |- label I : {I -> φ}, {I}, false ;                                (2)

  φ |- ε : ∅, ∅, true ;                                                (3)

  φ |- EXP1 : α1, σ1, ε1      φ |- EXP2 : α2, σ2, ε2
  append_arcs(α1, α2 -> α3)   append_states(σ1, σ2 -> σ3)   or(ε1, ε2 -> ε3)
  ------------------------------------------------------------------ (4)
  φ |- plus(EXP1, EXP2) : α3, σ3, ε3 ;

  φ |- EXP2 : α2, σ2, ε2      compose(σ2, ε2, φ -> φ')      φ' |- EXP1 : α1, σ1, ε1
  append_arcs(α1, α2 -> α3)   compose(σ1, ε1, σ2 -> σ3)     and(ε1, ε2 -> ε3)
  ------------------------------------------------------------------ (5)
  φ |- prod(EXP1, EXP2) : α3, σ3, ε3 ;

  φ' |- EXP : α, σ, ε         append_states(σ, φ -> φ')
  ------------------------------------------------------------------ (6)
  φ |- star(EXP) : α, σ, true ;

end build_automaton;
Figure 5: From regular expressions to automata: the Typol version

The Typol specification describes how to build, in a single pass over the given regular expression, three synthesized attributes ($\alpha$, $\sigma$, $\varepsilon$) using one inherited attribute ($\varphi$). The abstract syntax of expressions is imported via the use construct. The append_arcs (resp. append_states) predicate denotes concatenation of arcs (resp. states). The and and or predicates denote the usual logical operators. The compose predicate is a conditional concatenation depending on the boolean value of its second argument. In the specification, a least fix-point is expressed in rule (6): $\sigma$ is both an input position (synthesized position of the first premise) and an output position (parameter of the append_states predicate). This is handled in Prolog with a bottom-up unification. The fix-point is due to the tupling of several synthesized attributes and not to a circularity in the dependency graph between attributes, which means that an Attribute Grammar without least fix-point can express the same construction. We give in Figure 6 the Attribute Grammar produced from the Typol specification of Figure 5, assuming renaming of attributes for clarity reasons. The mapping between attributed trees is expressed within the header of the AG (input is an expression, output is an automaton). Auxiliary computations are done in functions instead of predicates.
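The report does not list the contents of the aux module. The following Python sketch shows one plausible reading of its three operations, written as functions rather than predicates. Representing state sets as lists, and the exact behaviour of `compose` (append the third argument only when the second is true), are assumptions inferred from the description above and from the way `compose` is used for the prod operator.

```python
from typing import List, Tuple

State = str
Arc = Tuple[State, Tuple[State, ...]]  # (source state, successor states)


def append_states(s1: List[State], s2: List[State]) -> List[State]:
    """Concatenation of two lists of states."""
    return s1 + s2


def append_arcs(a1: List[Arc], a2: List[Arc]) -> List[Arc]:
    """Concatenation of two lists of arcs."""
    return a1 + a2


def compose(s1: List[State], is_empty: bool, s2: List[State]) -> List[State]:
    """Conditional concatenation: s2 is appended only when is_empty is true."""
    return s1 + s2 if is_empty else list(s1)


# prod(EXP1, EXP2) with EXP1 = a and EXP2 = b*, in the context phi = [end]:
# EXP2 can be empty, so the followers of EXP1 also include phi.
phi_exp1 = compose(["b"], True, ["end"])
# EXP1 cannot be empty, so the initial states of the product are those of EXP1.
sigma_prod = compose(["a"], False, ["b"])
```

With this reading, `compose` captures the two places where emptiness matters in a product: what may follow the left operand, and which states may start the whole product.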
6.2 Running this example
From the Typol specification given in Figure 5, since all conditions presented in Section 4.2 are satisfied, one can automatically generate a Prolog source or an Attribute Grammar (similar to the one presented in Figure 6, assuming α-conversion). During the usual execution in Prolog, unification is used to build the automaton in a single pass over the input expression, while unification is not necessary with evaluation of attributes in several passes over
attribute grammar build_automaton (regul) : (automaton) is
  from aux import append_arcs, append_states, compose;

  attribute synthesized $epsilon (EXP) : IS_EMPTY;
            synthesized $sigma (EXP) : STATES;
            inherited $phi (EXP) : STATES;
            synthesized $alpha (EXP) : ARCS;
            synthesized $automaton(AXIOM) : AUTOMATON;

  where axiom -> E use
    $automaton := automaton($alpha(E), $sigma(E), $epsilon(E));
    $phi(E) := states(state("end"));
  end where ;

  where label -> I use
    $epsilon := false;
    $alpha := arcs(pair(state(I), $phi));
    $sigma := states(state(I));
  end where ;

  where empty -> use
    $epsilon := true;
    $alpha := arcs();
    $sigma := states();
  end where ;

  where plus -> EXP1 EXP2 use
    $epsilon := or($epsilon(EXP1), $epsilon(EXP2));
    $alpha := append_arcs($alpha(EXP1), $alpha(EXP2));
    $sigma := append_states($sigma(EXP1), $sigma(EXP2));
    $phi(EXP1) := $phi;
    $phi(EXP2) := $phi;
  end where ;

  where prod -> EXP1 EXP2 use
    $epsilon := and($epsilon(EXP1), $epsilon(EXP2));
    $alpha := append_arcs($alpha(EXP1), $alpha(EXP2));
    $phi(EXP2) := $phi;
    $phi(EXP1) := compose($sigma(EXP2), $epsilon(EXP2), $phi);
    $sigma := compose($sigma(EXP1), $epsilon(EXP1), $sigma(EXP2));
  end where ;

  where star -> EXP use
    $epsilon := true;
    $sigma := $sigma(EXP);
    $phi(EXP) := append_states($sigma(EXP), $phi);
    $alpha := $alpha(EXP);
  end where ;
end grammar ;
Figure 6: From regular expressions to automata: the Olga version
the expression: during a first bottom-up pass, $epsilon and $sigma are computed; then, in a pre-order traversal, $alpha is computed, depending on the value of the inherited attribute $phi. Two kinds of evaluators can be produced: one with optimized memory management, the other with incremental facilities. On the one hand, memory management is often effective since, in our example, the $alpha and $sigma attributes can be stored in stacks (however, $epsilon and $phi have to be hooked in the tree itself). On the other hand, the incremental evaluator can be called automatically after changes in the regular expression (due to editing), and a minimal re-computation of the automaton occurs. Thanks to the incremental facilities of the pretty-printing process, visualization of the changes in the produced automaton is quick and smooth.
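The two-pass schedule described above can be sketched in Python as follows. This is a hand-written illustration of the semantic rules of Figure 6, not the evaluator produced by FNC-2: the class names, the functions `pass1`, `pass2` and `build_automaton`, and the choice to hook $epsilon and $sigma on every node are assumptions made for the example, and the list concatenations are inlined instead of calling the aux functions.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

END = "end"  # termination state supplied by the axiom rule


@dataclass
class Label:
    name: str


@dataclass
class Empty:
    pass


@dataclass
class Plus:
    left: "Exp"
    right: "Exp"


@dataclass
class Prod:
    left: "Exp"
    right: "Exp"


@dataclass
class Star:
    body: "Exp"


Exp = Union[Label, Empty, Plus, Prod, Star]
Arc = Tuple[str, Tuple[str, ...]]


def pass1(e: Exp) -> None:
    """Bottom-up pass: decorate each node with epsilon and sigma."""
    if isinstance(e, Label):
        e.epsilon, e.sigma = False, [e.name]
    elif isinstance(e, Empty):
        e.epsilon, e.sigma = True, []
    elif isinstance(e, Plus):
        pass1(e.left)
        pass1(e.right)
        e.epsilon = e.left.epsilon or e.right.epsilon
        e.sigma = e.left.sigma + e.right.sigma
    elif isinstance(e, Prod):
        pass1(e.left)
        pass1(e.right)
        e.epsilon = e.left.epsilon and e.right.epsilon
        # compose: initial states of the right operand count only if left may be empty
        e.sigma = e.left.sigma + (e.right.sigma if e.left.epsilon else [])
    elif isinstance(e, Star):
        pass1(e.body)
        e.epsilon, e.sigma = True, e.body.sigma
    else:
        raise TypeError(f"unknown node: {e!r}")


def pass2(e: Exp, phi: List[str]) -> List[Arc]:
    """Pre-order pass: push phi down and collect alpha (the list of arcs)."""
    if isinstance(e, Label):
        return [(e.name, tuple(phi))]
    if isinstance(e, Empty):
        return []
    if isinstance(e, Plus):
        return pass2(e.left, phi) + pass2(e.right, phi)
    if isinstance(e, Prod):
        # compose: phi reaches the left operand only if the right one may be empty
        phi_left = e.right.sigma + (phi if e.right.epsilon else [])
        return pass2(e.left, phi_left) + pass2(e.right, phi)
    if isinstance(e, Star):
        # sigma is already known from pass1, so no fix-point is needed here
        return pass2(e.body, e.body.sigma + phi)
    raise TypeError(f"unknown node: {e!r}")


def build_automaton(e: Exp) -> Tuple[List[Arc], List[str], bool]:
    """Axiom rule: run both passes with phi = [end]; return (alpha, sigma, epsilon)."""
    pass1(e)
    return pass2(e, [END]), e.sigma, e.epsilon
```

For instance, `build_automaton(Prod(Label("a"), Star(Label("b"))))` yields the arcs from `a` and from `b` to the successor states `b` and `end`, the initial states `["a"]`, and `False` for the emptiness indicator. The star case shows why several passes remove the need for unification: the value pushed down as $phi depends on $sigma of the same subtree, which the first pass has already computed.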
6.3 Measuring the benefits of Minotaur
We show in Table 1 comparative results for three applications specified in Typol and Olga and evaluated in Prolog and in Lisp via attribute evaluation. For all three applications, implementation via attribute evaluation is better in time and memory than proof using Prolog interpretation. The incrementality benefit of attribute evaluation does not appear in Table 1. The specifications used for this comparison are the example presented above for building an automaton from a regular marked expression (regul), a tree transformation (tree), and a type-checker for an Algol-like language (tc). Our experimental results come from multiple executions of the code generated from the Typol and Olga specifications on large inputs (regular expressions, trees, programs). The first column summarizes the speedup between Prolog execution (using MU-Prolog [34]) and Lisp/VTP execution for the same input data. This speedup is at least 3 for the attribute evaluator in Lisp. The second column shows the space ratio between evaluation in Prolog and in Lisp. Prolog interpretation consumes considerably more memory than attribute evaluation in Lisp, probably due to backtracking management, even though Prolog is based on structure sharing. However, one could argue that a more efficient Prolog interpreter (or compiler) could reduce the gap; this remains to be tested. Also, hacking the generated Prolog code (with cuts) could give better results for Prolog but is not satisfactory from our point of view. The last three columns deal with memory optimizations in the generated attribute evaluators. Some attributes can be stored in variables (at least 20 %), or in stacks (about 40 %), or in the tree (40 % in the regul application, none in the two others).

Examples   Time ratio   Memory ratio   Memory optimizations (%)
                                       variables   stacks   tree
regul          15            26            20         40      40
tree            3            28            60         40       0
tc             10            45            52         48       0
Table 1: Comparative Results
7 Conclusion and Future Work
The construction of the Minotaur system would not have been possible without two open systems like Centaur and FNC-2. Considering Centaur as a toolkit for manipulating (editing, pretty-printing, ...) abstract syntax trees and tailoring FNC-2 to generate attribute evaluators acting on these abstract syntax trees was a very convenient and fruitful approach. Our first experiments suggest that, for a large class of semantic problems (an arithmetic calculator, a tree transformation, a type-checker for an Algol-like language, ...), the Minotaur system provides two powerful formalisms for semantic description (together with user-friendly programming environments) and an incremental and efficient (in time and memory) evaluation. The performance comparison shows that interactive attribute evaluation is faster, and consumes considerably less memory, than Prolog interpretation. Moreover, incremental re-evaluation is a major advantage of attribute evaluation. All these results give Natural Semantics an industrial-strength implementation. However, for the moment, there is no debugging environment for attribute evaluation equivalent to the one supplied for the Prolog implementation. During the design phase of an application, this is not a big drawback: the designer develops semantic specifications using small examples with the Prolog interpretation (and the associated debugger); then, once the specification is debugged,
the next step is to generate an efficient interactive evaluator. This step is another validation of the specification at the run-time level. Finally, the Minotaur system can be used as a production tool to generate efficient stand-alone code (in C) which can be distributed independently of the system itself. For the future, we are looking for a graphical application involving attributes, since visual applications naturally require incrementality (see [10, 22] for related work). We want to design a pretty-printer (implemented with an incremental attribute evaluator) and connect it to an incremental graphic formatter (another tool of the Centaur system). Secondly, we wish to increase the expressive power of the Typol subclass which can be evaluated by attributes. The first means is to weaken some conditions for the translation from Typol into Olga (thanks to specific constructs of the Olga language), namely conditions (4) (no constraint) and (3) (determinism), so as to admit specifications which introduce only fake non-determinism and only local backtracking during the proof. Another requirement to reach this goal is to extend the Attribute Grammars paradigm towards Natural Semantics in order to handle Dynamic Semantics (condition (6), no dynamic rule). A first condition is to store all attributes outside the tree: we think of using a technique similar to the one presented in [36], using "bindings" or "cactus stacks" to store non-temporary attributes. In this context, the current implementation does support successive assignments of a given attribute instance, thanks to an attribute storage external to the tree; the tree and the attribute values together drive the computations. It remains to extend the Olga formalism with the expression of dynamic computations. Moreover, from our incremental evaluation of the DNC subclass, we also investigate a new approach for composition of evaluators [40] in the spirit of the Composable Attribute Grammars of [16].
This research direction will eventually yield interesting evaluation results regarding modularity, incrementality, lazy evaluation, and circular computations.
References
[1] Alblas H. "Attribute Evaluation Methods", Proc. of "The International Summer School on Attribute Grammars, Applications & Systems", LNCS 545, 1991.
[2] Attali I. "Compilation de programmes Typol par attributs sémantiques", Doctoral thesis, University of Nice, 1989.
[3] Attali I. and Chazarain J. "Functional Evaluation of Natural Semantics Specifications", Proc. of WAGA, LNCS 461, Paris, 1990.
[4] Attali I., Chazarain J., Gilette S. "Incremental Evaluation of Natural Semantic Specifications", Proc. of PLILP'92, LNCS 631, Leuven, 1992.
[5] Attali I. & Franchi-Zannettacci P. "Unification-free Execution of Typol Programs by Semantic Attribute Evaluation", Proceedings Fifth Int. Conf. Symp. on Logic Programming, Seattle, 1988, MIT Press.
[6] Ballance R., Graham S., Van De Vanter M. "The Pan Language-Based Editing System for Integrated Development Environments", ACM SIGSOFT'90 Fourth Symp. on Software Development Environments, Irvine, 1990.
[7] Berry G., Sethi R. "From regular expressions to deterministic automata", TCS 48, 1986.
[8] Bochmann G. "Semantic evaluation from left to right", CACM 19, 2, 55-62, 1976.
[9] Borras P. et al. "CENTAUR: the system", in SIGSOFT'88, Third Annual Symposium on Software Development Environments, Boston, 1988.
[10] Chabrier B., Franchi-Zannettacci P. & Lextrait V. "GIGAS: a Graphical Interface Generator by Attribute Specification", Le Génie logiciel et ses applications, Toulouse, 1988.
[11] Chailloux J., Devin M., Hullot J. M. "Le Lisp, a portable and efficient Lisp system", Proc. ACM Symp. on Lisp and Functional Programming, Austin, Texas, 1984.
[12] Courcelle B. & Franchi-Zannettacci P. "Attribute Grammars and Recursive Program Schemes", TCS 17, 163, 1980.
[13] Deransart P., Jourdan M., & Lorho B. "Attribute Grammars: Definitions, Systems and Bibliography", LNCS 323, Springer-Verlag, 1988.
[14] Despeyroux T. "Typol: a formalism to implement Natural Semantics", INRIA research report 94, 1988.
[15] Farnum C. "Pattern-based tree attribution", Proc. of Symp. POPL'92, Albuquerque, 1992.
[16] Farrow R., Marlowe T. J., and Yellin D. M. "Composable attribute grammar: Support for modularity in translator design and implementation", Proc. of Symp. POPL'92, Albuquerque, 1992.
[17] Filè G. "Classical and Incremental Attribute Evaluation by Means of Recursive Procedures", TCS 53, 1, 1987.
[18] Gentzen G. "Investigation into Logical Deduction", Thesis 1935, reprinted in "The collected papers of Gerhard Gentzen", E. Szabo, North-Holland, Amsterdam, 1969.
[19] Jacobs I., ed., "The Centaur 1.2 Manual", INRIA, March 1992.
[20] Jourdan M. "Strongly Non-Circular Attribute Grammars and their recursive evaluation", ACM Sigplan Symp. on Compiler Construction, Montreal, Sigplan Notices 19, 6, 1984.
[21] Jourdan M., Lebellec C. & Parigot D. "The OLGA Attribute Grammar Description Language: Design, Implementation and Evaluation", Proc. of WAGA, LNCS 461, 1990.
[22] Jourdan M. & Parigot D. "Internals and Externals of the FNC-2 Attribute Grammar System", Proc. of "The International Summer School on Attribute Grammars, Applications & Systems", LNCS 545, 1991.
[23] Jourdan M., Parigot D. "The FNC-2 system user's guide and reference manual", INRIA, 1993.
[24] Jourdan M., Parigot D., Julié C., Durin O. & Lebellec C. "Design, Implementation and Evaluation of the FNC-2 Attribute Grammar System", Proc. of the ACM SIGPLAN Conf. PLDI, 1990.
[25] Julié C. & Parigot D. "Space Optimisation in the FNC-2 Attribute Grammar System", Proc. of WAGA, LNCS 461, 1990.
[26] Kahn G., Lang B., & Mélèse B. "Metal: a formalism to specify formalisms", Science of Computer Programming, vol. 3, North-Holland, 1983.
[27] Kahn G. "Natural Semantics", Proc. of Symp. on Theoretical Aspects of Computer Science, Passau, Germany, LNCS 247, 1987.
[28] Kastens U. "Ordered Attribute Grammars", Acta Informatica 13, 1980.
[29] Kastens U. "Implementation of Visit-Oriented Attribute Evaluators", Proc. of "The International Summer School on Attribute Grammars, Applications & Systems", LNCS 545, 1991.
[30] Knuth D. E. "Semantics of Context-Free Languages", Math. Syst. Theory 2, 1968.
[31] Lang B. "The Virtual Tree Processor", in Generation of Interactive Programming Environments, intermediate report, J. Heering, J. Sidi, A. Verhoog (eds), CWI Report CS-R8620, Amsterdam, 1986.
[32] Magnusson B. et al. "An overview of the Mjølner/Orm Environment: Incremental Language and Software Development", Proc. TOOLS'90, Paris, 1990.
[33] Micallef J. "Incremental attribute evaluation for multi-user semantics-based editors", PhD Thesis, Columbia University, 1991.
[34] Naish L. "The MU-Prolog 3.2 Reference Manual", Technical Report 85/11, Dpt of Computer Science, University of Melbourne, 1985.
[35] Parigot D. "Transformation, Évaluation incrémentale et Optimisation des Grammaires Attribuées : le système FNC-2", Doctoral Thesis, Orsay, 1988.
[36] Pennings M., Swierstra D., Vogt H. "Using Cached Functions and Constructors for Incremental Attribute Evaluation", Proc. of PLILP'92, LNCS 631, Leuven, 1992.
[37] Pettersson M. "RML - A new language and implementation for Natural Semantics", Proc. of PLILP'94, Madrid, 1994.
[38] Plotkin G. D. "A Structural Approach to Operational Semantics", Report DAIMI FN-19, Computer Science Department, Aarhus University, Aarhus, Denmark, 1981.
[39] Reps T. "Generating Language based Environments", M.I.T. Press, Cambridge, Mass., 1984.
[40] Roussel R., Parigot D., and Jourdan M. "Coupling Evaluators for Attribute Coupled Grammars", Proc. of the International Conference on Compiler Construction CC'94, Edinburgh, 1994.
[41] Tavernini V. E. "Translating Natural Semantic Specifications to Attribute Grammars", Master's Thesis, University of Illinois, Urbana-Champaign, 1987.
[42] Terrasse D. "Translation from Typol to Coq", INRIA Research Report, in preparation.
[43] Van Der Meulen E. "Incremental rewriting", PhD Thesis, University of Amsterdam, 1994.