CENTAUR: Towards a \software tool box" for

0 downloads 0 Views 173KB Size Report
In Centaur we have extended the Le Lisp virtual graphic system to t .... Typically, trees will be constructed by a parser and modi ed by interactive editing. .... vtp classes and primitives can be found in the vtp reference manual (see 15]). ..... 4] Cl ement D., J. Incerpi, \Graphical Objects: Geometry, Graphics, and Behavior", 3rd.
CENTAUR: Towards a \software tooly box" for Programming Environments Dominique Clement SEMA-GROUP, Sophia{Antipolis Janet Incerpi, Gilles Kahn INRIA, Sophia{Antipolis FRANCE 1. Introduction The Centaur system is a generic interactive environment parameterized by the syntax and the semantics of programming languages. When provided with the description of a particular programming language {including its syntax and semantics{ it produces a language speci c environment. The resulting environment consists of a structure editor, an interpreter/debugger, and other tools, together with a uniform graphic interface. The Centaur system is based on results obtained in many branches of computer science, starting from work on semantics in the early 60's leading to work on syntax-directed editors and evaluation mechanisms in the early 70's. The Centaur system was designed to allow experimentation with already existing formal speci cation techniques as well as to support new research on the de nition of speci cation formalisms. From the user's viewpoint, Centaur is a deductive database. Data are programs in various languages and rules are formal speci cations, e.g., syntax and semantic speci cations. Information is either prede ned, as in the type of system objects, or computed using rules on given data, as in the type of an identi er in a program. We give the logical breakdown of the system, i.e., its architecture and its main components. Next we describe in more detail two of these components, namely the kernel and the manmachine interface. Then we present some examples of Centaur's current use. 2. Logical Breakdown of the system The architecture of the Centaur environment generator is based on a decomposition into three components: { a kernel for the representation and manipulation of structured objects, { several compilers to support both the syntactic and the semantic aspects of languages { a man-machine interface that takes care of all interactive communication between the system and its users. The kernel plays a central part in this architecture because it is intended to be used, directly or indirectly, by all other components of the system. Every structured object is represented within the system as an abstract syntax tree that is manipulated with primitives of the kernel. 2.1. The kernel The purpose of Centaur's kernel is to support symbolic manipulation of structured documents. By symbolic manipulation we mean destructive manipulation, typically used while y

This research is partially supported under ESPRIT, Project 348. 1

editing documents, as well as evaluation, more common in semantic processing. The kernel of the Centaur system is made of two specialized components: { an abstract machine that handles syntactic aspects, called the Virtual Tree Processor (vtp)[18]. { a logical machine that handles semantic aspects, i.e., evaluations of all kinds. The vtp is used for de ning formalism and supports modular construction of abstract syntaxes. It de nes standardized tree manipulation primitives, (i.e. construction, navigation, and destructive manipulation), a generalized notion of location, and an extendibility mechanism. Finally vtp objects can be saved on persistent storage. At this stage the vtp is implemented in Le Lisp [3], the logical machine is a Prolog interpreter [19]. These two components are intended to be used as co-routines. They are connected through an interface that contains two classes of primitives: { control primitives to call Prolog from within the vtp and to call the vtp from within Prolog, { transformation primitives to perform coercions between vtp trees and Prolog terms. The underlying universe being Unix, the interface between the vtp and Prolog is implemented in C. 2.2. The Compilers To generate an interactive programming environment for a given language one must provide speci cations for its concrete syntax, abstract syntax, and semantics. These speci cations are themselves written in appropriate formalisms. 2.2.1. Specifying syntax The speci cations of concrete and abstract syntax, together with their relationship, are presently written in either METAL [17], a formalism developed for the MENTOR system [8] [9], or in SDF, [11], a new speci cation language designed and implemented for the Centaur system. A METAL speci cation is a collection of grammar rules, with annotations that specify what abstract syntax trees should be synthesized. A SDF speci cation merges all syntactic aspects of a language de nition within a single framework. Both the METAL and the SDF compilers create a vtp formalism and input for a parser generator, Yacc [14] for METAL and an incremental parser generator for SDF [12]. Pretty-printing of abstract trees is de ned in the PPML formalism [20]. A PPML speci cation is a collection of unparsing rules of the form \pattern ?! format". The PPML compiler generates a Lisp program which calls primitives of an incremental Abstract Display Machine. This provides a standardized protocol for displaying structured objects and is useful for mixed-language environments. 2.2.2. Specifying semantics To derive systematically a language-speci c interactive programming environment it is necessary to go beyond syntax and introduce semantic speci cations. For example: { from a speci cation of static semantics, one wants to derive a type checker and contextdependent interactions, { from a speci cation of dynamic semantics, one wants to derive an interpreter and a symbolic debugger. Several formalims could be used to express the semantics of programming languages. We have experimented with a formalism based on Natural Deduction that we call Natural Semantics [16]. A semantic speci cation is written in TYPOL [6], the computer version of Natural Semantics. Other experiments have been carried out using ASF [1], an algebraic speci cation formalism. Both formalisms are type-checked and compiled into executable form. At the present 2

time, speci cations are compiled into Prolog. 2.3. Man-Machine Interface The design of the man-machine interface presupposes that the user is interacting with Centaur via a personal workstation, using a keyboard, a mouse, and a high resolution screen. But graphic interfaces are far from standardized and it is necessary to isolate the programmer from system variations: Le Lisp includes a virtual multi-window graphic system together with an extension mechanism. Already many di erent implementations of the Le Lisp graphic interface have been used: initially using software from Brown University [21], then using the X10 version of the X-Windows graphic software package from MIT, and more recently using the X11 version of X-Windows [10]. In Centaur we have extended the Le Lisp virtual graphic system to t with the X11 interclient protocol. For a user viewpoint, Centaur is a multi-window system which is menu-driven and can be embedded in the user's favorite window manager (for window manipulations such as move, resize, etc.). Using system objects like the Centaur editor the user can manipulate structures and run special processors and translators. Here we believe it is important to disconnect the man-machine interface from the system and to keep the man-machine interface open-ended: { a set of graphical interactors has to be available which must be extensible. { graphic objects are not usually independent, thus a network of dependencies should be maintained automatically to assist the user in keeping a consistent screen. The Centaur interface uses graphical objects [4] such as buttons, menus, popup menus, menubars, scrollbars, etc. But Centaur also provides a framework to de ne specialized graphical objects, for example, a customized editor for displaying Centaur structures. This is achieved by separating graphics from behavior. A set of Le Lisp primitives, the Aida [7] package, is used to describe the geometry of graphical objects. These include basic constructors such as string, point, rectangle, box, etc., and geometric combinators such as row, column, overlay, etc. We see behavior of objects as reactive systems that change state in response to input events (from the mouse and keyboard for example) and also generate output events. The behavior of all objects is written in Esterel [5], a special purpose language for real-time applications [2]. 3. The Virtual Tree Processor The Virtual Tree Processor (vtp) de nes and eciently implements a protocol supporting creation, manipulation, and persistent storage of abstract syntax trees. As a protocol, the vtp is a collection of abstract data types, called classes. Certain functions that are not logically necessary (e.g., iterators) are included to obtain a far more ecient implementation. The protocol is open in that it provides a method for creating new types of objects to enrich the vtp. The speci cation of the various data types yields a natural modularity to the vtp. Another level of modularity results from following an object-oriented approach. The types of objects are hierarchically organized. The main advantage of this architecture is to allow extending the system without any alterations to existing code, as long as the type hierarchy is extended. 3.1. Main concepts of the vtp The abstract concepts de ned and implemented in the vtp are motivated by a number of earlier experiments on syntax-directed document manipulation, in particular with the MENTOR [8] system. The vtp is organized around very few concepts. A formalism embodies the notion of an abstract syntax. A tree belongs to one formalism, but it may have subtrees in other formalisms. Contexts materialize the idea of one or several locations in a tree. Annotations may be hooked to abstract syntax trees to retain additional information without modifying their structure. 3.1.1. Constructing a formalism An abstract syntax is represented in the vtp by an object in the class formalism. A formalism is composed of objects in the classes operator and phylum, where a phylum is a set 3

of operators. Primitives are available to create a new formalism and populate it with operators and phyla. When creating an operator, its arity and the phyla of its operands must be provided. Operators come in three varieties: operators of xed arity, list operators where the list can be empty, and list operators where the list cannot be empty. It is possible to iterate over all operators in a formalism, all phyla in a formalism, and all operators in a phylum. When all the phyla and all the operators of a formalism have been given, the formalism is declared complete. At this time a number of global calculations are performed that result in faster tree traversals and more compact tree representations in persistent storage. The formalism has a persistent representation: it may be saved in a le and read in at a later time. The natural way to create a new formalism is to use a speci cation language. A compiler for the speci cation language will then use the primitives provided by the vtp to create the formalism. For example, both the language METAL [17] and the language SDF [11] are currently used to specify simultaneously the concrete and abstract syntax of a language. It would be possible to experiment with other syntax speci cation languages without changing the rest of the system because all other components of Centaur are only interfaced with vtp objects. 3.1.2. Trees With an operator and the appropriate number of subtrees one can create a new tree. The tree is then said to belong to the formalism of its top operator. The vtp provides primitives to navigate in a tree (such as, down, left, right, up, etc.) and to modify a tree (such as, replacing a subtree by another one, inserting and deleting elements in lists, etc.). All modi cations fail if they are incorrect with respect to the abstract syntax. Syntax checking is strictly local and very fast. It is possible to iterate eciently over all subtrees of a tree. Finally, trees have a representation in persistent storage. Typically, trees will be constructed by a parser and modi ed by interactive editing. The vtp de nes the protocol that any parser must use to create trees, but it is not committed to any particular kind of parser. Likewise, it provides the primitives to build an interactive structure editor but it does not enforce a speci c mode of interaction. 3.1.3. Pattern Matching The vtp includes primitives based on the use of tree pattern matching, such as, simple pattern matching, global search of a tree pattern, and application of a function to all occurrences of a tree pattern. Here, tree pattern has the usual meaning, i.e., a linear tree expression containing metavariables. A vtp metavariable is primarily de ned by a name and a matching class. The name of a metavariable is immaterial, except in pattern matching primitives. The class of a metavariable de nes the class of values that can be matched. In the current version of the vtp only two classes can be used in metavariables, ftreeg and fsublistg. The class fsublistg denotes a sequence of adjacent sons within a tree, i.e., a metavariable of class fsublistg matches any sequence of adjacent sons in a tree headed by a list operator. Within a given formalism, a vtp metavariable is considered as \equal" to any tree which belongs to that formalism. But it is possible to restrict the domain of values for matching equality by associating a constraint to a metavariable. Here a constraint is expressed as a pair of a vtp class and a value in that class. For example to restrict matching equality on trees labelled with a particular operator, one associates to the metavariable a constraint on the foperatorg class with that operator as value. A tree is an instance of a tree pattern if they are equal (as trees), except for occurrences of metavariables in the pattern (where each metavariable must be \equal" to the corresponding subtree). The vtp provides primitives for creating prede ned tree patterns for all operators of a formalism, also called schema. For example, the \if" schema in a Pascal-like language is a tree labelled with the \if" operator and with three sons, respectively, a metavariable for the conditional, a metavariable for the \then", and a metavariable for the \else". The name of each metavariable is computed from the operator argument's phyla (e.g., EXP, STATEMENTS1, 4

STATEMENTS2). These names are immaterial, but they must be di erent. 3.1.4. Contexts An essential aspect of the vtp is that it supports imperative manipulation of formal documents. Thus the abstract representation of a document is not a recursive structure that is built and processed solely in a purely applicative way. Rather it is a kind of database to explore and modify. As a result it is frequently necessary to keep references to places of interest in the manipulated structure. Contexts refer to locations rather than sub-objects. A context value is thus a place within a mutable object, where a component of that mutable object may be stored. The vtp includes an abstract notion of context to designate (multi-)sets of subtrees within a given tree. This is based on paths [15], a generalized notion of occurrences. A path can be read as the list of moves to perform to go from some enclosing tree to each of the designated subtrees. Paths are occurrences. But, at each position in a tree structure (after some moves are performed), a path can split at a \crossroad", where each branch heads towards other designated subtree(s). The vtp provides path constructors, e.g., to build a path designating a tree within a larger tree, and path manipulation primitives, e.g., to merge paths within a \multiple" path. So far, there are two kinds of paths in Centaur, namely \selection paths" and \modi cation paths". The former denotes multi-sets of locations within tree structures; the latter designates the subparts of a given tree structure where some destructive modi cation (change, delete, insert) has occurred (this is with respect to a given starting date and a given starting version of the tree). Typically, a selection path is associated with the \root" of a tree by creating a selection. As mentioned above, modi cation paths are used to memorize tree modi cations. All Centaur \tools" (pretty-printers, compilers, etc.) take a tree as input. When a tool is incremental, it is useful to provide the set of modi cations made on the tree since last run of the tool. A modi cation is a pair of a tool and a modi cation path. Every time a tree is modi ed, the change is accumulated in each tool's modi cation path. 3.1.5. Annotations Annotations are a more formal and structured view of the old concept of property lists, originally introduced in Lisp. The idea is to attach to objects a list of independently managed tagged values, where each tag, or property, characterizes the role of the value. In the VTP, the role of an annotation must be declared by rst de ning a decor. For example, one can de ne a decor called \pragma" that can be used to attach string annotations to nodes of abstract syntax trees representing Pascal programs. The de nition of a decor includes its name (e.g., pragma), the type of the annotation itself (e.g., string) and the type of objects that may be decorated by annotations in this decor (e.g., Pascal programs). In the vtp, the role of decors and annotations has been generalized. It is possible to de ne decors and annotations for other classes of objects than trees. For a given class of objects, new decors can be dynamically created, and the corresponding annotations can be attached to the objects in this class. Decors are essentially a way to de ne dynamically and independently new elds or properties. They are used as a modularity device to extend the functionality of vtp classes without interacting with their already existing semantics. Most applications need a general mechanism to decorate abstract representations. Using annotations for document organization and manipulation is discussed at length in [8]. One important application is to use annotations to store attributes, in the precise sense of attribute grammars [22]. Also annotations are used to dynamically extend the abstract syntax of a formalism, for example to handle comments and formal assertions. Navigation primitives (e.g., \down", \left", \right") can be used on annotated structures, by providing a decor name (e.g., pragma) instead of a rank. 5

3.2. Organization of the vtp The vtp is organized as a collection of classes (in the Simula 67 or Smalltalk sense). A vtp class is characterized by a unique class name, an abstract domain of values, a speci cation of an internal representation (in memory) of the values belonging to the class, and a collection of primitive functions and procedures operating on the values of the class. When specifying the vtp, we have chosen not to use concepts such as subclassing and inheritance which were not yet cleanly de ned and implemented in the available version of Le Lisp. They are, however, useful concepts for specifying the vtp since they o er a means of factorizing common aspects of di erent classes. Thus the concept of class schema was introduced for the purpose of the organization of the vtp. A class schema is abstractly similar to a class, the main di erence being the absence of a speci c implementation model. A class schema may have several implementation models available at the same time, each corresponding to a di erent class. A class is said to be an instance of a class schema if it is an implementation model of the class schema. Here, an implementation model of a class schema only means an abstract data type that has all the primitive operations of the class schema, with their semantics (informally) de ned in English. Following the usually accepted terminology, we shall say that an instance of a class schema, i.e., a vtp class, inherits all primitives and properties of the class schema. 3.2.1. The class schema hierarchy We only describe the class schema hierachy, giving some class schema signatures. A complete description of the vtp classes and primitives can be found in the vtp reference manual (see [15]). The class schema universal is a superclass of all vtp classes and class schema, i.e., the primitives de ned in that class schema are assumed to exist for all vtp classes. This class schema can be described by the following signature: signature UNIVERSAL = sig type universal equal : universal * universal copy : universal write : universal save : name * universal restore : name end

-> -> -> -> ->

universal universal universal universal universal

The functions save and restore translate objects between memory and persistent storage. The class schema arborescent is used to specify tree structures. It includes navigation primitives, modi cation primitives, and pattern matching primitives. In particular, its signature contains navigation primitives: signature ARBORESCENT = sig type arborecsent down : arborecsent left : arborecsent right : arborecsent up : arborecsent ... end

* * * *

path path path path

-> -> -> ->

6

arborecsent arborecsent arborecsent arborecsent

where a \path" value is either a rank number or a decor name. This class schema makes few assumptions about the syntactic constraints that must be satis ed when building or modifying arborescent structures. This permits two di erent tree classes, one with syntactic constraints and one without. 3.2.2. Current Implementation The vtp classes could have been formally speci ed by any appropriate mathematical technique (e.g., abstract algebras), but we have not attempted to do this for initial versions of the vtp. Only informal descriptions in English are given, followed by the actual Le Lisp implementation. The primitives provided for each class are sometimes redundant. This is intended both as a convenience for the user of the vtp and as a simple means of achieving better code eciency. A property of many of the vtp classes is that the objects they de ne can be saved on persistent storage. Persistency is obviously necessary for saving objects for later use. In the current version of the vtp persistent strorage is achieved by giving a persistent storage representation for all classes that one wants to save. This external representation of the values belonging to a class is part of the speci cation of that class in order to preserve compatibility between distinct versions and/or implementations of the vtp. However, knowledge of these representations is needed only by vtp implementors, and by vtp users who de ne new classes with persistent storage representation. Calls to vtp functions may fail, for example, when requesting a computation that is not meaningful for the supplied data (e.g., requesting the rst son of an atomic node of a tree), or when requesting a modi cation of the manipulated structures that would cause them to become inconsistent (e.g., a tree transformation not consistent with the syntax of the represented language). The signalling and handling of such failures is done via Le Lisp exception primitives such as tag, exit, lock, and others. A failure tag is associated to each such failure of the vtp functions. The major drawback of the current implementation of the vtp is a direct result of the absence of syntax and type structure. In particular there is no assistance with respect to syntax or type correctness. More importantly there is no way to enforce that vtp classes are well-typed instances of their class schemas (except for the unreliable good will of the implementors). The current version of the vtp does not support modular composition of abstract syntaxes, but only extensions of a given formalism. The current implementation uses the exception mechanism provided by the Le Lisp system, leading to low-level primitives and no clear organization of exceptions. Although object-oriented principles are used for specifying the vtp, all vtp classes are implemented without using subclassing and inheritance. When type tags are needed at run-time, objects are explicitly encapsulated in tagval objects, where a tagval object is a pair of any object together with a descriptor of the class it belongs to. This means that tags must be explicitly handled by vtp users. Finally, only persistent storage of trees (and of classes used in trees) are currently implemented. In particular graphs or dags cannot be saved on les using the vtp \save" primitive. However it is always possible to use Le Lisp saving facilities, but possibly without compatibility between versions. 3.3. Discussion The vtp is a stable and trustworthy component of Centaur . The vtp symbolic path mechanism provides a powerful communication mechanism between Centaur tools, provided that they work on structures with an abstract representation. For example a type-checker can generate a path denoting error occurrences within a structure, which can be used by an interface component, typically the Centaur structure editor, to give a graphical representation to these errors. Note that the type-checker and the interface component do not have to share the same internal representation of the abstract structure. In particular, Centaur paths can be used to 7

make the vtp cooperate with external tools. Within the above mentioned example, the interface component does not have to be integrated within Centaur , provided that symbolic paths can be given to that component. In other words, vtp paths can be used as symbolic pointers. The performance analysis of the vtp is generally dicult, because various processors use the vtp in fairly di erent ways. Ineciencies often point to an inadequate choice of the primitive functions for some class. This has motivated adding several iterators to the vtp. Translating between computational and persistent storage must be made faster and should produce even more compact representations. 3.4. Future Work Because the vtp has proved to be a very useful component for building several systems, in particular the Centaur system, we have recently started new activity devoted to the design and the implementation of an extended version of the vtp, a Virtual Abstract Structure Processor (vasp). This activity includes a more formal de nition of the class system, a better organization of exceptions, and modular composition of abstract syntaxes. To achieve these goals we aim at using recently developed speci cation techniques. At least we expect to describe class schema as signatures and classes as structures providing typechecked implementations of these signatures. We want to give formal semantics of vasp primitives. Finally, we expect to have several implementations of vasp. The main advantage of the notion of persistency provided by the vtp is to warrant compatibility between versions, as long as class coding representations are invariant. However, this is achieved at the cost of explicit descriptions of the representations used. We feel this is not satisfactory compared with other approaches such as dynamic typing. But dynamic typing techniques are dicult to use with di erent versions and implementations. Persistency is a current subject of research. 4. The Man-Machine Interface The Centaur system has a multi-window, menu-driven user interface. The user interface is based on the virtual bitmap of Le Lisp. Centaur can be used interactively on color or monochrome bitmaps. Our two primary goals for the Centaur man-machine interface are: i) that the interface is disconnected from the system itself and ii) that the interface is open-ended. This implies that we provide a set of graphical interactors that is extensible. Another consideration, within the framework of Centaur, is language-speci c environments. That is, a developer of an environment under Centaur may want to customize the user interface in a language-dependent fashion. In the following section we present the various layers used in the user interface. Next we describe the current implementation and issues that it has raised. Finally, we nish with a discussion of where we are headed with the man-machine interface. 4.1. Organization and Architecture We assume that a user is interacting with Centaur via a workstation with a high resolution bitmap and a mouse. To build a man-machine interface for a system requires the following elements: i) a low-level system to create and manipulate windows, ii) a library of graphic objects such as buttons, menus, etc., and iii) a higher level system to help connect graphic objects for interactive applications. In this section we look at the di erent aspects of the Centaur user interface. These include a virtual window system, a toolkit for graphical objects, and some specialized objects we developed for Centaur. Finally, we address the issues of language-speci c environments. 4.1.1. Virtual Primitives As we mentioned above, at the base of a man-machine interface is a low-level system which manipulates windows and handles keyboard and mouse input. Le Lisp [3] provides such a virtual 8

multi-windowing system. This virtual bitmap system has been implemented on a number of windowing systems, such as Sunviews [23], X (versions 10 and 11) [10], and MS-windows. This virtual system provides the basic graphic primitives. Primitives exist for displaying text and for manipulating fonts and bitmaps. Also there are drawing primitives for rectangles (resp. lled rectangles), circles (resp. lled circles), etc. At this virtual bitmap layer, there are primitives for manipulating windows and subwindows. Windows can be created, modi ed, moved, resized, and destroyed. Note that the virtual bitmap is responsible for communicating with the underlying window system. It maintains a Le Lisp window structure which is consistent with the underlying external system's window structure. Communication with the underlying system also exists for input events (e.g., mouse and keyboard input) and window-related events (e.g., resizing or exposing windows). The virtual bitmap has an event structure and maintains a queue of events. There exist low-level primitives for manipulating this event queue. 4.1.2. A Basic Toolkit Using a library of graphic objects such as buttons, menus, scrollbars, text-editable areas, one can build the necessary components that make up a user interface. Commands can be associated to buttons, menus, or menubars. Dialog boxes can be assembled to query the user for information. As we mentioned above, it is important that this library is extensible, so that new objects can be de ned. We use graphical objects provided by a system called gfxobj [4]. This gives a library of standard graphical objects and a framework for de ning specialized objects. To de ne the graphical objects we specify separately their appearance and their behavior. For the appearance or geometry of the objects we use an object-oriented approach. The objects, which are organized in a hierarchy, can be combined using geometric combinators to make compound objects from existing ones. This hierarchy is based on an object language de ned in the Aida package [7]. This package, which is a set of Le Lisp primitives, includes basic constructors, such as string, box, icon, etc., and geometric combinators, such as row, column, overlay, etc. Using these primitives we can describe the \look" of various objects. Aida provides as well a motor for handling the events that Le Lisp receives. This motor, or event manager, sends the various events to the graphical objects. For the behavior, we view graphical objects as reactive systems that change state in response to input events and generate output events. To de ne the behavior for the graphical objects we use Esterel [2], a synchronous language designed to program reactive systems. At a general level, an Esterel program is a collection of processes that communicate via broadcast signals. An external interface describes how a module communicates with its surroundings. While an Esterel program contains parallel constructs it compiles into a nite state automaton, thus the resulting code is very ecient. The resulting automaton is hooked to the various input and output events. (The interested reader should see [5] for more details.) Note that from the gfxobj system, one can build the various components of the interface. There are trigger and trill buttons, pop-up and pulldown menus, menubars, scrollbars, scrollable objects, etc. The object-oriented approach makes it possible to add new graphical objects, or even new geometrical combinators. 4.1.3. System Objects As with any sophiscated application, for Centaur we need some specialized graphical objects. These are graphical objects that no standard toolkits provide. One such object is the Centaur structure editor; another object is an editor-view, which is a special container object. In this section we describe these two graphical objects and their use in Centaur. The Centaur structure editor, also called ctedit, is a graphical object that is used to display a textual representation of the abstract trees of Centaur. Note that, like most editors, a bu er is associated to a ctedit. Typically this bu er is larger than the ctedit. Thus, what is visible through the ctedit's window is some portion of the tree. Again like most editors, there 9

are primitives for positioning oneself in the bu er. Finally, what is displayed is a data structure representing the decompilation of the abstract syntax tree. This data structure permits the selection of regions, which correspond to subexpressions. Thus, in addition to displaying the visible portion of the bu er, the ctedit also displays selections that are to be highlighted. The editor-view is a graphical object that can be used as a container for another objecty . The idea behind the editor-view is that one has the viewee and one would like to access its various functionalities interactively, say for example, through a menu. The editor-view is a compound object that is an ensemble of buttons, menus, scrollbars, etc., and it is possible to attach the viewee's functionalities to these subcomponents. The appearance of the editor-view, of course, depends on what kind of subcomponents one wants. However, we restrict subcomponents to appear above and/or to the left of the viewee. Also it is possible to have scrollbars for the viewee. The subcomponents and their layout must be supplied to create the editor-view. For example, perhaps one wants a row of buttons above the viewee, a column of radio buttons to the left, etc. Finally, notice that the editor-view has no behavior which is why we refer to it as a container. The subcomponents have their own behaviors as does, we suppose, the viewee. Note that the editor-view's main purpose is to facilitate the user in customizing his Centaur environment. This can be done by setting up his own environment for interacting with a ctedit used as a viewee. Or perhaps, he has a special purpose object that he would like to put in an editor-view. 4.1.4. Designing Language-Speci c Environments The purpose of Centaur is to develop interactive programming environments for di erent languages. In this section we look at the kinds of issues that have been raised with regards to the user interface. Of course, the system of graphical objects and the specialized Centaur editors are available for building a user interface. Within Centaur, several environments exist for the various languages used to specify the syntactic and semantic aspects of a target language. Since these environments must access language-dependent functionalities, we provide a generic base that can be modi ed dynamically. In particular, we have created a standard Centaur editor-view, called ctview. This is an incarnation of an editor-view which contains a ctedit and has certain generic functionalities available. For example, the ctview contains two scrollbars and a menubar above the ctedit. The menubar gives access to generic le operations (e.g., read and write), display operations (e.g., page width and level of detail), and edition manipulations (e.g., cut and paste). Since in Centaur one can open editors on several di erent formalisms at any one time, it is when a new formalism is read into a ctview that we automatically allow for language-speci c changes. Here the developer for a language environment can change the look of the window: add buttons or a menu to the left of the ctedit, remove or add a pulldown to the menubar, remove the menubar, etc. This allows the developer to put in place the interactors that will be used for calling various processors, searching by language-dependent constructs, and primitive debugging (e.g., stepping and break-points). For dialog boxes and the such, these are created using the gfxobj primitives when the appropriate processors are called. Note that much of what is possible for the language environment developer is also desirable by the simple user. Perhaps, he wants to customize a bit the generic look of the ctview; that is, he would like some generic navigation primitives for walking through trees always accessible through a menu. The above is only concerned with what happens basically in one ctview, however, more generally a processor can open several ctviews or other graphical objects which must communicate. For example, after running a compiler on a program perhaps a list of errors and warnings appear in a separate graphical object on the screen. In this case, a mechanism is needed to associate

y In the following discussion, we refer to the object contained within an editor-view as the

\viewee" to avoid confusion.

10

this object with the originating ctview. We provide a primitve method to link graphical objects. 4.2. Current Implementation While the previous section gave an idea of the organization of the man-machine interface within Centaur, here we talk about the current implementation and the problems encountered. First, we address the implementation of the various layers that go into building interfaces. Next, we address the customization problems encountered when a user or developer wants to change the interface. Finally, we talk about the interconnecting of graphical objects. 4.2.1. The Various Levels By taking advantage of the Le Lisp virtual bitmap, the man-machine interface requires only that Le Lisp has been ported; the Le Lisp virtual bitmap has been ported to a relatively large number of windowing systems. One could claim that Xwindows is the answer but through the development stage of Centaur we have had versions running under Sunviews, X10, and now nally X11. Note that the virtual bitmap also permits extensions if they are necessary for the underlying window system. Perhaps here we should talk, as well, about window managers. Centaur does not provide a window manager, nor does Le Lisp. One calls Centaur from within the user's favorite, or default, window manager. Operations on the windows themselves, such as resize, move, and iconify, are handled by the window manager. We suppose that the window manager, or underlying window system, informs Le Lisp about these modi cations. The gfxobj system and the Aida object language are implemented in Le Lisp. Using the Le Lisp object hierarchy, one can send messages to the various objects. A graphical object has an associated Le Lisp structure. This includes an image, which knows how to display itself, and, possibly, a window. The Esterel used to describe the behavior of the graphical objects compiles into various languages (C, Le Lisp, Ada); we use Le Lisp automata. The connection to the graphical objects themselves is a mapping of Le Lisp events, as managed by the Aida motor, to the corresponding automaton signals. The automata are very ecient, the performance of the graphical objects is good. 4.2.2. Customizations The system objects, editor-view and ctedit, are the basic objects through which one interactively manipulates the abstract syntax trees in Centaur. The editor-view, in fact, provides a certain level of customization for the look of interface, allowing one to easily combine menus, buttons, etc. with a ctedit. This is true both for the user or the environment developer. The user wants to customize the overall interface for his Centaur, while the environment developer changes are more dynamic in nature. (In the sense that, like in emacs, the switching of modes, which is dependent on the bu er contents, is dynamic.) Currently, we provide primitive mechanisms to change the interface dynamically. This is done on a language-speci c basis. When one loads a program in a language into the standard Centaur editor-view, we clear, if necessary, the editor-view's previous language environment and load in the new language environment. Thus permitting the developer to attach the necessary functionalities. In addition, the evironment developer has all the standard primitives for creating dialogs and additional system objects at his disposal. The current system works reasonably well for the type of environments that Centaur is designed for. Various environments have been designed using these primitives. The main problem is that this speci cation is all done by program: calling primitives to create the graphical objects and other primitives that dynamically change the editor-view. Also the linking of the graphical objects together so that they can communicate is also done via Le Lisp primitives. (We return this in the following section.) 4.2.3. Interconnecting Objects Graphical objects are not usually independent. Processors may create graphical objects that are used to interact with the Centaur structure editor. In fact, the environment developer may 11

have multiple editors in communication. Perhaps, a compiler creates an editor-view containing all the error messages and warnings, or an interpreter displays the current values of the variables during the execution of a program. This kind of interconnection between objects is possible using \links". A link simply stores in a graphical object a value associated to a certain key. This primitive mechanism is rather ad hoc. We use two special methods for gfxobj to attach and detach links. However, it is currently up to the environment developer to manage this linking and unlinking by using the appropriate primitives. 4.3. Future Work Having described the current state of the Centaur man-machine interface, we present directions for future work on the user interface. Since the work on the gfxobj system was completed, many more toolkits have come into existence. These are, more or less based, on the same principles: a reasonable sampling of widgets, portability, and extensibility. Currently, we are not considering a change of course with respect to the widgets. However, we have experimented with incorporating X11 toolkit widgets within Centaur; the developers of Aida have also looked into similar strategies. For the Centaur user interface, we want to focus our attention more on the problems of dependencies and communication between objects. In particular, how the components of a dialog interact, and, similarly, how the dialogs interact with one another. This seems only natural since the building blocks are now in place and we know the various diculties in generating some environments ourselves. Somewhat lacking, in all this, are tools for developing interactively the interface one wants to use within Centaur. Clearly, one could imagine a Centaur environment for gfxobj where one constructs necessary interactors, although such tools exist and can be used independently of Centaur. For example, there exists an interactive editor for Aida, called Masai [13], which generates the corresponding Aida graphical objects. We are experimenting in this area as well, however, this is not a major concern. 5. Using CENTAUR as a tool box To a certain extent we use Centaur as a tool box for developing new components and new compilers. Here the notions of abstract syntaxes and generalized locations are of primary importance because they provide a uniform and somewhat implementation independent framework for memorizing information on objects. An example is the recently integrated SDF compiler. A less obvious example is the design and development of the graphical objects system which was done independently of Centaur and later used for de ning and implementing the Centaur user interface. This was made possible because of a clear separation between functionalities and graphics. But Centaur is also used as a tool box in projects where external software tools are connected to Centaur. Firstly the syntax part of Centaur, i.e., its vtp kernel and syntax compilers, is used for developing an interactive environment for vectorizing FORTRAN programs. This is based on coupling an already existing program vectorizer to a Centaur generated FORTRAN environment. One of the major achievements will be a bi-directional communication between user programs and their vectorized forms allowing future developments of interactive vectorizers, symbolic debuggers, etc. Here the Centaur technology provides the vtp and its generalized notion of location together with its tools for building a user interface. Secondly both the syntax and the semantic aspects of Centaur are used for connecting a Centaur generated ADA environment with external ADA tools (in particular an ADA compiler and an ADA library manager). This work shares a lot of similarities with the FORTRAN environment but it also includes using Centaur for \specifying" the static semantics of ADA and for de ning a formal communication protocol based on attributed syntax trees. Here the notion of persistent storage is central. 12

6. Conclusion We believe that Centaur o ers interesting features for a software tool box, in particular its architecture, the vtp, the generalized notion of location, and its graphical objects system. But the current form of persistency provided by the vtp is quite insucient when the communication of data between di erent software tools is needed. Here a more general mechanism based on the notion of \dynamic" types would be useful. Modular construction of formalisms and dynamic extendibility of structures are very important during developing phase. The graphical objects system is a rst step towards integrating results of work on \building user interfaces" with systems like Centaur. Here much more work is necessary in particular for expressing what we call the \logical control" of the system.

Acknowledgements

The paper describes the work of several teams involved in the Esprit GIPE project: the Croap group led by Gilles Kahn at INRIA, Sophia Antipolis and led by Bernard Lang at INRIA, Rocquencourt; the CWI team led by Paul Klint in Amsterdam; the Sema Group team led by Dominique Clement in Sophia Antipolis.

REFERENCES [1] Bergstra J.A., J. Heering, P. Klint \ASF { an algebraic speci cation formalism", Report CS-R8705, Centre for Mathematics and Computer Science, Amsterdam, 1987. [2] Berry G., P. Couronne, G. Gonthier, \Synchronous Programming of reactive systems: an introduction to ESTEREL", in Proceedings of the First France-Japan Symposium on Arti cial Intelligence and Computer Science, Tokyo October 1986, North-Holland. [3] Chailloux J., et al. \LeLisp v15.22:Le Manuel de Reference", ILOG, France, 1989. [4] Clement D., J. Incerpi, \Graphical Objects: Geometry, Graphics, and Behavior", 3rd Annual Report, Esprit Project 348, 1987. [5] Clement D., J. Incerpi, \Specifying the Behavior of Graphical Objects Using Esterel", Proceedings of TAPSOFT'89 Colloquim on Current Issues in Programming Languages, Barcelona, March 1989. (Also appears as INRIA Rapport de Rercherche No. 836, April 1988.) [6] Despeyroux T., \Executable Speci cation of Static Semantics", in Semantics of Data Types, Lecture Notes in Computer Science, Vol. 173, Springer-Verlag, June 1984. [7] Devin M., et al., \Aida: environnement de developpement d'applications", ILOG, France, 1987. [8] Donzeau-Gouge V., G. Kahn, G. Huet, B. Lang, J-J. Levy, \Programming environments based on Structured Editors: the MENTOR experience", in Interactive Programming Environments, D.R. Barstow, H.E. Shrobe and E. Sandewall (Eds.), McGraw-Hill, 1984. [9] Donzeau-Gouge V., G. Kahn, B. Lang, B. Melese, E. Morcos, \Outline of a tool for document manipulation", Proceedings of IFIP Congress 1983, Paris, North-Holland. [10] Gettys J., R. Newman, R.S. Scheifler, \Xlib { C Language X Interface, Protocol Version 11", MIT project Athena, February 1987. [11] Heering J., P.R.H. Hendriks, P. Klint, J. Rekers, \SDF Reference Manual", Centre for Mathematics and Computer Science, Amsterdam, 1987. [12] Heering J., P. Klint, J. Rekers, \Incremental generation of parsers", Centre for Mathematics and Computer Science, Amsterdam, 1987. 13

[13] Ing B., \Masai Version 1.0 The User and Reference Manuals", ILOG, France, 1989. [14] Johnson S.C. \Yacc: Yet Another Compiler Compiler", Unix Programmer's Manual, 7th Edition, Bell Telephone Laboratories, January 1979. [15] Kahn G., et al, \The Centaur 0.9 Documentation", June 1989. [16] Kahn G., \Natural Semantics", Proceedings of STACS 1987, Lecture Notes in Computer Science, Vol. 247, Springer-Verlag, March 1987. [17] Kahn G., B. Lang, B. Melese, \METAL: a formalism to specify formalisms", Science of Computer Programming, Vol. 3, pp. 151-188, North-Holland, 1983. [18] Lang B. \The Virtual Tree Processor" in Generation of Interactive Programming Environments, Intermediate report, J. Heering, J. Sidi, A. Verhoog (Eds.), CWI Report CS-R8620, Amsterdam, May 1986. [19] Naish L., \The MU-Prolog 3.2 Reference Manual", Technical Report 85/11, Department of Computer Science, University of Melbourne, October 1985. [20] Morcos-Chounet E., A. Conchon, \PPML, A general formalism to specify prettyprinting", Proceedings of IFIP Congress 1986, Dublin, North-Holland. [21] Pato J.N., S.P. Reiss, M.H. Brown, \The Brown Workstation Environment", Brown University CS-84-03, October 1983. [22] Reps T., \Generating Language Based Environments", Ph.D. Thesis, Tech-Report 82-514, Cornell University, Ithaca NY, August 1982. [23] Sun View Programmer's Guide. Sun Microsystems, Inc. Mountain View, California, February 1986.

14

Table of Contents

1. Introduction . . . . . . . . . . . . . . 2. Logical Breakdown of the system . . . . . . . 2.1. The kernel . . . . . . . . . . . . 2.2. The Compilers . . . . . . . . . . . 2.2.1. Specifying syntax . . . . . . . 2.2.2. Specifying semantics . . . . . . 2.3. Man-Machine Interface . . . . . . . . 3. The Virtual Tree Processor . . . . . . . . . 3.1. Main concepts of the vtp . . . . . . . . 3.1.1. Constructing a formalism . . . . . 3.1.2. Trees . . . . . . . . . . . 3.1.3. Pattern Matching . . . . . . . 3.1.4. Contexts . . . . . . . . . . 3.1.5. Annotations . . . . . . . . . 3.2. Organization of the vtp . . . . . . . . 3.2.1. The class schema hierarchy . . . . . 3.2.2. Current Implementation . . . . . 3.3. Discussion . . . . . . . . . . . . 3.4. Future Work . . . . . . . . . . . 4. The Man-Machine Interface . . . . . . . . . 4.1. Organization and Architecture . . . . . . 4.1.1. Virtual Primitives . . . . . . . 4.1.2. A Basic Toolkit . . . . . . . . 4.1.3. System Objects . . . . . . . . 4.1.4. Designing Language-Speci c Environments 4.2. Current Implementation . . . . . . . . 4.2.1. The Various Levels . . . . . . . 4.2.2. Customizations . . . . . . . . 4.2.3. Interconnecting Objects . . . . . . 4.3. Future Work . . . . . . . . . . . 5. Using CENTAUR as a tool box . . . . . . . . 6. Conclusion . . . . . . . . . . . . . .

i

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . . . . . .

1 1 1 2 2 2 3 3 3 3 4 4 5 5 6 6 7 7 8 8 8 8 9 9 10 11 11 11 11 12 12 13