The Pan Language-Based Editing System - ACM Digital Library

10 downloads 0 Views 2MB Size Report
Jacob. Butcher,. Bruce. Forstall,. Mark. Hastings, and. Darrin. Lane have all made substantial contributions. ..... 1980. 45. MEnINA-MORA,. R., AND FEILER, P. H..
The Pan Language-Based ROBERTA. University

Editing System

BALLANCE of New Mexico

and SUSAN

L. GRAHAM

University

and MICHAEL

L. VAN DE VANTER

of California

Powerful editing systems for developing complex software documents are difficult to engineer. Besides requiring efficient incremental algorithms and complex data structures, such editors must accommodate flexible editing styles, provide a consistent, coherent, and powerful user interface, support individual variations and projectwide configurations, maintain a sharable database of information concerning the documents being edited, and integrate smoothly with the other tools in the environment. Pan is a language-based editing and browsing system that exhibits these characteristics. This paper surveys the design and engineering of Pan, paying particular attention to a number of issues that pervade the system: incremental checking and analysis, information retention in the presence of change, tolerance for errors and anomalies, and extension facilities. Categories

and Subject Descriptors:

D.2.2 [Software

Engineering]:

editors; [Software Engineeringl: Coding–program ming Environments; H.5,2 [Information Interfaces

General

Terms: Design,

Documentation,

D.2.6

Tools and Techniques;

[Software

Engineeringl:

and Presentation]:

D.2.3

Program-

User Interfaces

Languages

Additional Key Words and Phrases: Coherent user interfaces, colander, contextual constraint, extension facilities, grammatical abstraction, incremental checking and analysis algorithms, interactive programming environment, Ladle, logic programming, logical constraint grammar, Pan, reason maintenance, syntax-recognizing editor, tolerance for errors and anomalies

1. INTRODUCTION Languages

and documents

addition to the natural developers use a variety

play

a significant

language used of more formal

role in software

for written languages

development.

In

human communication, to describe both software

This research was sponsored in part by the Defense Advanced Research Projects Agency (DoD), monitored by Space and Naval Warfare Systems Command under contracts NOO039-84-C-O089 and NOO039-88-C-0292, by IBM under IBM research contract 564516, by a gift from Apple Computer, Inc., and by the State of California MICRO Fellowship Program. A preliminary version of this paper was presented at the Fourth ACM SIGSOFT Symposium on Software Development Environments, December 1990. Authors’ addresses: R. A. Ballance, Department of Computer Science, University of New Mexico, Albuquerque, NM 87131; S. L. Graham and M. L. Van De Vanter, Computer Science Division (EECS), University of California, Berkeley, CA 94720. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. @ 1992 ACM 1049-331 X/92/O100-O095 $01.50 ACM

Transactions

on Software

Engineering

and Methodology,

Vol

1, No

1, January

1992, Pages

95-127

96

R. A. Ballance et al

.

products and the processes Some examples are design documentation languages, languages

for scripts,

often

contain

tions.

For example,

into

ways

schemata,

embedded

and complex The Panz

by which they are developed and maintained. languages, specification languages, structuredprogramming languages, and numerous small

“little

many

and mail

messages,

languages”

that

subroutine

argument sequences. editing and browsing

to exploit

libraries

Furthermore,

impose

define

programs

their

own

minilanguages

1 system

language-based

originated

technology

from

an investigation

to provide

more

support for software The design of Pan

developers and for the documents with which rests on the premise that the bridge between

and

their

will

that itor

provide a generalization to support interactive

software

convenfor long

be intelligent

editing

interfaces,

integrated they work. developers

namely,

of the services of a traditional browsing, manipulation, and

interfaces

interactive modification

edof

documents. The current supports techniques, related

implementation

ongoing

is

The

—Pan

incrementally

documents

that

users

and

mouse-based in the spirit

and maintains

can be shared

with

mix

text-

and language-oriented

text

editing

session can

mechanisms

may

involve

be added

to

for other

and chosen

continuing

system

that

is

[57].

of information

about

manipulations

in the

tools.

field;

Pan

editing

of Emacs

a collection

other

were for

It

analysis

methods,

prototype

as a platform

editing

languages

Extension

multiple-font, builds

program-viewing of this

tool

prototype.

language-based

can freely

same visual —A single

advanced

and extensible

functional

description,

characteristics

as a usable

a multiwindow,

customizable

—New

design,

functional

leverage

fully

–Pan

in language

user-interface

areas.

for maximum research. —Pan

of Pan [9, 21] is a fully

research

is completely

unrestricted.

multiple

languages.

Pan

writing

by

language

services

language

descriptions.

are also provided.

New language description techniques were developed for Pan. Grammatical abstraction establishes formal correspondence between the concrete (parsing) syntax and the abstract syntax constraint gram mars are an adaptation maintenance

for the

specification

and

[6, 71. Information gathered during a memory-resident logic database incrementally as documents change.

for each language of logic programming enforcement

constraint (available

[8, 12]. Logical and consistency

of contextual

constraints

enforcement is retained in to other tools) and revised

Related issues, including support for novice programmers and learning environments, support for program execution, graphical display and editing, and the design and implementation of a persistent database were deferred.

1 Libraries for window systems often have these kinds of interfaces, 2 Why “Pan”? In the Greek pantheon, Pan is the god of trees and forests Also, the prefix “pan-” connotes ‘‘applylng to all, ‘‘ in this instance referring to the multilingual text- and structure-oriented approach adopted for this system Finally, since an editor 1s one of the most frequently used tools in a programmer’s toolbox, the alluslon to the lowly, ubiquitous kitchen utensil is apt. ACM Transactions

on Software Engineering

and Methodology,

Vol

1, No 1, January

1992

The Pan Language-Based This

paper

surveys

reviews

the interactions seemingly sions

the

goals

the implementation

and

of the

of Pan’s

design

components

and

decisions

pervade

the term

and,

the

underlying

.

for

The discussion

and components,

strategies

elsewhere [7, 8, 66]. Throughout the paper

design

Pan prototype.

of the technologies

simple

early

Editing System

language-based

Pan

Detailed

have

indicates

and

emphasizes

in particular,

system.

technology

97

been that

how discus-

presented

one or more

of the facilities mation derived

provided by the system make use of language-specific inforfrom the documents known to the system. In the context of

this

term

paper,

entire

the

collection

one or more

system

of services

documents

Pan

project

editing

/browsing

system)

are used to browse,

encompasses

manipulate,

the

and modify

interactively.

2. LANGUAGE-BASED The

(or that

ENVIRONMENTS

was motivated

by a vision

of language-based

browsing

and

editing systems as the primary interface between people and integrated environments containing the documents they manage. Positioned this way, between users, tools, and documents (Figure 1), such a system is uniquely situated to gather and present information about documents for the benefit of its users, 2.1

The Needs

The

developers

enced

of Developers who

professionals.

read

and

They

are

write

software-related

proficient

with

guages and are usually skilled at authoring Like all users, they may need extra support languages or documents. As Winograd [68] and

Goldberg

[25] argue,

and complex that developers spend modifying, and adapting documents

the

first

place.

the

acquisition

successful present at hand.

activities

and

exploitation

interactive

complex

about

to provide

software

kinds

environment

systems

and

lan-

languages. unfamiliar have

of subtasks,

become

in particular,

of information must

documents

such services,

are experi-

tools

far more time reading, underthan they do creating them in

a variety

of many

development

information

In order

involve

primary

documents in those when confronted with

so large standing,

These

documents

their

suitable

the system

[29,

recognize,

for the particular must

41].

gather,

maintain

A and

task a model

of the syntactic, static semantic, and contextual properties of the document. At the same time, there must be minimal disruption of services in midtask, when documents may be incomplete and inconsistent. An editing interface must not compromise flexibility and power, even for safety

or learnability.

how and when

users

For may

example,

manipulate

many

language-based

documents

syntactic correctness. But both contrary arguments suggest that experienced developers generally will on a natural and convenient mode of text-based system must services.

not

“do

too

much”

[46]

in

ACM TransactIons on Software Engineering

the

editors

as text, in order

form

restrict

to maintain

[67, 691 and experience not tolerate restrictions interaction. Likewise, a of gratuitous

and Methodology,

intrusive

Vol. 1, No. 1, January 1992.

98

R A. Ballance et al

.

Editing

System SystemSexvwes

Editing Interface

Undem.tand

~

Users

* Mampulate Modlficauon

Retieve Actwe Document Repository

f

Fig. 1

Finally, tions

Editing

the

way

languages

are

2.2

The Nature

The documents overlapping aspects.

Some

Old

and

sometimes

and

their

of Software managed

over

time;

project-specific)

languages

services

to the environment.

are

invented must

projects come

revised for

evolve

and

specific

and and

convengo.

extended; needs.

Even new

Develop-

or be abandoned.

Documents by software

of information

structure

changes and

change.

adopted

webs

work

personal,

themselves

environments

and system services in relation

developers

(community-wide,

languages ment

interface

developers

having

is syntactic,

many for

constitute structural,

example,

richly

connected,

as well

as textual,

paragraphs

in

reference

manuals, data definitions in design documents, and statements in programming languages. Other structure stems from the content of the documents. Some document underlying formal

structure can be derived languages, for example,

automatically the connection

from knowledge of between a figure

and references to it in a book, the relationships between declarations, definitions, and uses of variables in a computer program, the call graph of a program, and the relationships among grammatical units defined by a formal syntax. In other cases, important structure is independent of language, for instance, hyperlinks in a hypertext system and the user-imposed hierarchy supported by outline processors. Such structure cannot be inferred and must be retained when provided by developers. The editors ED3 [601 and Tioga [631 and the noninteractive WEB [381 are examples of systems that support user-supplied structure. Some documents are encoded in more than one language, in which case important relationships may cross language boundaries. The Mentor programming environment [201 supports such nesting of languages. Although formal syntax is often presumed to be the most useful structure for the purpose of document display and user structure are at least as important. For example,

interaction, other kinds of language-based formatting

ACM Transactions on Software Engineering and Methodology, Vol 1, No 1, January 1992

The Pan Language-Based (prettyprinting),

which

is intended

traditionally

is based

more

if it is sensitive

helpful

well

as to local

versus within

to scopes,

types,

to distinctions

even

documents.

and

Although

the

piler,

type

a global

produced

by

correctness

interface other

can

checker,

tools

In this tion

facilities

mechanisms, 3.1

OF THE

section

organization,

Design

must that

as

as “mainline”

user tasks

be broad

program

also

provide

require

information

in

plans

F%

are

subject [42,

56]

doand

also

text

other

tool.

be made

useful

Conversely,

visible

to a cominformation

through

an

of programs might exploit addition, as the formatting

enriched

if

user-supplied

editing

measureexample

information

is

SYSTEM

we summarize the

information

or an auditing

should

system services and used widely.

3. DESIGN

relationships,

such

in such a system are economical only when information. The computation that verifies a

interface. For example, helpful views ment results and version history. In suggests, preserved

clef-use

of access to the

information sense

is considerably

and different

forms

99

of Integration

Complex, expensive analyses many tools share the resulting document’s

users

different

and

.

of a document,

Formatting

and

it need not be deep (in the [54] are deep) to be useful.

The Benefits

syntax.

code. Different

uses of structure

main, clichis

to aid user understanding

on surface

conventions

“error-handling”

different

2.3

only

Editing System

the major

editing

than

and services

interface,

language

design

strategies

and those

description.

are described

for Pan,

extension

Language-based

in Sections

the system

and customizatechnology,

4–6.

Strategies

Our vision of the role number of pervasive

to be played by and surprisingly

Pan led to the interdependent

adoption of a small strategies for its

design. Text-based are designed existing

interface. around

working

visual presentation Studies by Baecker value

of typography

The text familiar

environments.

editing

models,

services

and user

to encourage

High-quality

smooth

typography

interfaces

of Pan

integration is

used

into for

the

of text, a feature often not realized in editing interfaces. and Marcus [4] and by Oman and Cook [48] suggest the for program

presentation,

Language description. Pan supports many languages, driven by a description of each. The descriptive medium is largely declarative. It supports the definition of languages, but also includes tions, user interaction, and language-specific Syntax recognition. To present one that also supports language-based as are Babel [31], the Saga editor system is one in which the user syntactic structure by analysis. In

specifications services.

for usage

conven-

the

appearance of a “smart text editor,” interaction, Pan is syntax-recognizing, [37], and SRE [111. A syntax-recognizing provides text and the system infers the contrast, we call systems like the Cornell

ACM Transactions on Software Engineering and Methodology, Vol. 1, No. 1, January 1992.

100

R A. Ballance et al

.

Program

Synthesizer

[61], Mentor

[20], and Gandalf

[261 syntax-directed

since

the primary mode of editing in those systems is template based. The syntaxrecognizing approach does not preclude a user interface that simulates syntax-directed editing. A simple prototype has convinced us that syntax-directed Similarly,

editing

most

editing. mation,

can be provided

syntax-directed

easily

editing

in a syntax-recognizing

systems

support

some

As a by-product of syntax recognition, all language-oriented including the primary internal tree representation shared with

analyses,

is derived

originally

Incrementality.

from

Maintaining

a textual

full

service

during

editing

information be revised as documents change. update algorithms rather than recomputation.

hances

the

tional

retention

of previously

be user-supplied

or otherwise

efficiency

willing

Tolerance mitted

syntax

During

recognition,

and

document

(some

of which computa-

Users

are seldom

functionality. text-oriented

are

that

is organized approach en-

maintains

growth.

the unrestricted documents

demands

Pan This

information

on speed, even for enhanced

for variance.

by

determined nonrecoverable)

in the face of collective

to compromise

text inforother

representation.

derived around may

editor.

localized

most

often

editing ill-formed

perwith

respect to the underlying language definition. Maintaining full service demands that no more restrictions be placed on the user in this situation than a standard text editor does in the presence of spelling errors. To emphasize the distinction between this approach and those adopted by many language-based editors, we refer to variances rather than to the traditional term language errors. This approach acknowledges that experienced users often introduce variances deliberately while working toward a desired result; users should not be penalized by the system’s failure to understand the process [43]. Coherent guage

error”

which

the

user interface. to the details

The

informative

shift

of emphasis

“variance”

of language-based

from

is only

technology

the preemptive

one example

and

“lan-

of ways

implementation

in

should

be concealed. Following the view that Pan is an interface between user and document, language-oriented interaction is organized around a conceptual model of document structure, tuned for each language to be convenient and natural. Users are offered a variety of services while hiding representational complexity. Extensibility

exploit

rich

internal

data

tion to variations research platform,

customization. Pan is designed for convenient adaptaamong users, projects (group behavior), and sites. As a it accommodates extension and evolution [40]. Language

description

one aspect

3.2

System

and

that

is only

of Pan’s

extensibility.

Organization

Pan’s implementation supports users in three categories: and language description authors, 3 The classes of users

clients,

customizers,

can be characterized

3 It is the role of the user that is the issue here, All but the most naive client users customize their environments in simple ways, blurring the boundary between clients and customizers ACM Transactions

on Software Engineering

and Methodology,

Vol

1, No 1, January

1992

The

Pan

Language-Basecl Editing System

.

101

Exfr.n.lwn&

cu.ft0Ru2ar1011 Lbrmws

rl

/

. --+’: Users

; ACOW j Docwwnf ! RepOsllOry

Fig.2.

by the

tasks

they

System oganization: Users’ perspective.

perform

and

by the

aspects of the system (Figure 2). For clients, Pan is the interface Section 3.3 describes the lar, the rich text editing text.

Clients

libraries;

on

this

task

the

they

browsing

and

for

need editing

about

various

documents.

surface characteristics of this interface, in particuand presentation facilities available in every con-

use libraries

about them. Customizers,

knowledge

and language other

requires

descriptions

hand,

augment

expertise

ranging

but

extension from

need to know and

the shallow

little

customization (e.g.,

adding

key bindings) to the deep (e.g., adding a new kind of directory editor). Section 3.4 describes some of the mechanisms that support customization. Language description authors must be familiar with language processing and

with

the

techniques

used

by

Pan’s

document

analyzer,

described

in

Sections 4 and 5. A language description contains information about syntactic structure and context-sensitive constraints. Constraints usually include the static semantic rules of the language, but can also include site-specific or project-specific restrictions such as naming conventions. As part

of the emphasis

on coherent

tions also specify user interaction (described in Section 6) and may tiple descriptions viding a different

for a single underlying interface for a different

view, the author is a user-interface user-interface specification. 3.3

Text-Based

Editing

6 of Section

ACM Transactions

[661, language

descrip-

language may coexist, each proclass of users. In this broader

designer

and the language

description

a

Interface

Superficially, Pan appears to clients multiple window text editor in the sors. Figure

user interfaces

with generic language-based services add new language-specific services. Mul-

6 shows

as a convenient spirit of Bravo

a sample

on Software Engineering

editing

bit-mapped, mouse-based, [39] and its many succes session.

and Methodology, Vol

But

Pan is also an

1, No 1, January 1992

102

R A, Ballance et al,

.

editor that happens to be extremely and the local working environment:

knowledgeable about document structure languages in use, local conventions, and

perhaps the user’s own personal working habits. The same text-oriented services are provided for every or not it is written in a language that description) to analyze. Users familiar between

the

two

editors

smoothed

by

comparable text services. Generalized tion, extension, and self-documentation In contrast to many syntax-directed text

without

ever

giving

Pan with

document,

whether

is prepared (by prior language Emacs [571 find the transition

compatible

key

bindings

[211 and

undo, kill-rings, text-filling, customizaare among Pan’s standard services. editors, one might use Pan for editing But

at any time

one may choose to broaden the dialogue with Pan and to exploit (maintained by Pan) about the document. Pan can be directed

a thought

information to use this

information

to guide

to highlight

the textual

display,

editing

to its other

actions,

to present

capabilities.

to configure

answers

services are described in more detail in Section Two general (and configurable) mechanisms of information small glyph change

about documents: near the top right

shape,

concerning modified,

change

color,

and selectively

to queries,

and more.

6. enable

flags and visual text of a window; it may or all

three

in

the

in the display each

and varying

3.4

by Pan’s

document, in height),

highlighting. document)

persistent

display

response

to a change

of state

the document. This document state might be whether it has been whether it may be modified, whether particular services are en-

characters for

of these

attributes. A flag is a appear and disappear,

abled, whether certain kinds of inconsistency have been other property that can be represented as an editor variable A visual text attribute may be applied independently map

Some

The

choice

user’s

is underlined,

Customlzahon

services:

specifying of ink

current

choice

up to eight text

independent

color,

of font fonts,

and choice

selection

(shared

of any other

detected, or any (Section 3.4). to any group of

(from

a configurable

proportionally

spaced

of background

color for

by all

windows

on a

attributes.

and Extension

Pan’s services are built upon a rich and flexible base, designed for experimentation with document analysis techniques and the design of editing interfaces. The language-based mechanisms described in this paper use the infrastructure, as do a few experimental services that are not yet languagebased: a browsing interface to the file system, a hypertext-like browser for UNIX4 man pages, and an elaborate internal help and documentation system that can be configured for each category of user. Although some of Pan’s customization and extension facilities are declarative, others require programming. to Pan’s implementation language rather than inventing a separate can be abused,

4 UNIX

it has not proved

is a trademark

ACM Transactions

of AT&T

Unlike Emacs, we chose to provide access (Common LISP) and its run-time system, extension language. Although that access to be a problem

in practice.

Pan’s

extension

Bell Laboratories.

on Software Engineering

and Methodology,

Vol. 1, No, 1, January

1992.

The Pan Language-Based language general Pan’s

is embedded

in

mechanisms, current

all

services

Common

LISP

and

is supported

integrated

with

the

on-line

are implemented

Variables. In Pan, bindings per document

Editing System

in the extension

editor variables are instance, per document

by

help

scoped dynamically, type, and globally.

keyboard bindings, hooks, and window

which

fine

scoping

at a relatively

the scopes in which Function the

and

extension

options.

command

syntax

dynamically automatic

exception

that

may

be

level, including are built using

menus, character configuration, all of

New

variables

may be

be bound.

system.

New

are

functions

session,

and macros

along

with

may

declaratively

(called

commands)

that

are

bindable

automatically

integrated

into

the

run-time

mechanism.

Command

that

the

directs

arguments

command

handling.

Pan

signaled

(along

three

a message)

specified to

prompters,

distinguishes with

are

dispatcher

(by user selection, by configurable type checking and error recovery.

Generic

allowing This form

syntax that supports options such as for active values, and restrictions on

an interactive

functions

menus

of

Functions and macros may be defined at automatic integration with Pan’s internal

and its undo during

Those

and

dispatch

declarative

tions

definition.

system

keystrokes

may

implying

at any time

specified to

macro level,

documentation be defined

the variable

granularity.

of

Many

language.

variables, including user options, classes, flags, font maps, color maps, respect

103

a number

system.

of scoping permits the organization of services at each language-specific modes. Many fundamental mechanisms

defined at any time using a declarative predicates for type checking, notifiers

.

with

collect

or by default)

categories without

a

values with

of excepconcern

for

context: announcement, warning, and error. Response to Pan exceptions is context-sensitive, implemented by dynamically bound exception handlers [641. For example, loaded. Language descriptions

may

an error might arise while descriptions can be loaded be preloaded

the error terminates may be automatically case displays the tively terminating

when

the build with loaded during message, beeps, the command

a new

a language description is being in any of three contexts. First, Pan

a logged a session

is being

built.

In this

case,

message. Second, descriptions with Pan. The handler in this

and resets the that triggered

command language

dispatcher, loading.

effecThird,

descriptions may be loaded directly from within a Common LISP debugging loop. In this case, a recursive call to the debugger preserves the stack for inspection. Extension-level primitives guard their internal state and respect conventions for signaling exceptions; exception handling and still be robust. 4. DOCUMENT

simple

extension

code can ignore

ANALYSIS

Document analysis in Pan relies on two components: Ladle (Language Description Language) [12] and Colander (Constraint Language and Interpreter) [6]. Ladle manages incremental lexical and syntactic analysis; it includes both an off-line preprocessor that generates language-specific tables ACM Transactions

on Software

Engineering

and

Methodology,

Vol.

1, No

1, January

1992

104

R, A. Ballance et al

.

and a run-time analyzer that revises to reflect textual changes. Colander tal checking

of contextual

off-line preprocessor tion 6) coordinates

Pan’s internal document manages the specification

constraints.

Like

Ladle,

Colander

Language

Description

A Pan language of Pan’s

three

(1) Lexical

description

syntactic

and

both

an

interface (Secaccessible to

in a bit more some important

detail, design

Processing

components:

and

language

includes

and a run-time component. The editing analysis and makes derived information

users and client programs. This section describes each of Ladle and Colander discusses how the two cooperate, and finally examines issues that cross all component boundaries. 4.1

representation and incremen-

define

contains

declarative

information

for use by each

5 data,

used

an internal

by

Ladle,

describe

the

syntax

tree-structured

representation

context-sensitive

constraints,

of the (Section

4.2). (2) The but

Colander

portion

not limited

5). This specification contextual-constraint use. (3) User-interface guage Figure

Multiple

of the language

may also direct that certain checking be stored and made configure

the

editing

including,

(Sections

4.3 and

data derived during available for general irlterface

for the

lan-

6.1 and 6.2).

3 illustrates Pan

semantics

specifications

(Sections

the run-time time.

specifies

to, the static

the system,

Pan language

flow for

of information either

descriptions

from

preloading could

a language or dynamic

be written

description loading

for a single

to

at run

language,

suited for different users and different tasks. The primary motive would be provide different collections of services of the kind described in Section However, one could also experiment with different presentation styles, ternate internal representations (abstract syntaxes), or different styles Colander language

description. descriptions

be managed A related components semantic, as

One area of active research concerns the so that multiple views of a single abstract

effectively. area of research among language well as syntactic,

layering syntax

to 6. alof

of can

concerns sharing structures and description descriptions. Many languages share similar concepts. The reuse of portions of the language

description simplifies the language possible to share data and establish different languages.

description linkages

writer’s task and makes among documents written

it in

5 In the current implementation, each document must be composed using a single language Our architecture and algorithms support documents composed from multiple languages, but the current implementation does not. ACM TransactIons on Software Engmeermg

and Methodology,

Vol

1, No 1. January

1992

The Pan Language-Based

Editing System

User lnterfac9

.

105

F==%

ConflguranonIniormatlon

Internal Tree Spw!fcaoon

Lexer Tables

ran

‘C

Caller

Fig.3.

4.2

Ladle

An

abstract

grammar, ing the implicitly lates

syntax which

Language

is described

also

semantically defines the

document

specifies

lexical

regular quote

Ladle

using

tree-structured

relevant structures terms in which the

an augmented

context-free

representation.

By defin-

of the language, the grammar rest of Pan accesses and manipu-

description

of syntactic structures is to be supported, Ladle to convert textual representations to and vice versa:

may include that

Bracketing

both

regular

is, expressions

with

can be either

–The grammar for the abstract syntax productions necessary to disambiguate

nested

expressions paired

and bracketed

delimiters

punctuation. productions

Ladle constructs and the grammar

directives tune the Ladle syntactic error-recovery during parsing. These directives also have important interface (Section 6.1).

Internally,

Ladle

manipulates

two

such

context-free

grammars:

Butcher

has recast this work in terms of grammatical

ACM TransactIons on Software Engineering

a for

mechanisms effects on the

one describing

the abstract syntax and the other used to construct parse tables. The must be related by grammatical abstraction [8], a relation ensuring

6 More recently,

as

or simple.

is augmented by specifying those the original (abstract) grammar or

to incorporate additional keywords and full parsing grammar from the additional the abstract syntax. —Optional invoked editing

processing

components.

expressions, marks.

Grammar

description

to the

When text-oriented editing additional information enables tree-structured representations –The

I ex,l-Al

expansion

two the

[12]

and Methodology, Vol. 1, No 1, January 1992

106

R A. Ballance et al

.

1)

(sfmt)



(Zf-sf??rt) “,”

2)

(lf-strnf)

-

(If-part)

(else-par-i)

3)

( tf-pari)



if (ezpr)

then

4)

(eke-part)

-

c I

‘~)

Concrete

1’)

(Sfmt)

.,/ -)

l“)

else (sirnts)

Grammar

G’I



If

(ezpr)

then

(stints)

I

f

(ezpr)

then

(stints)

Abstract

(Strnt)

Grammar

– I

~tl)

.-ibstract Fig. 4.

(strnts)

else

(sfrnt.s)

“,”

@l

(erpr)

(sfrnts)

(ezpr)

(stints)

Grammar

Grammatical

“,”

(strnts)

~

abstraction

following: (1) The abstract syntax represents a less complex version of the concrete syntax, but structures of the abstract syntax correspond to structures of the concrete syntax in a well-defined way. Singleton derivation steps and nonterminals

needed

and tokens structure

of the

primarily

concrete

need not be explicit

(2) Efficient incremental abstract to concrete special procedures abstract is triggered

for parsing

syntax

that

can be suppressed.

can be inferred

in the abstract

entire

to parse

Keywords the

abstract

syntax.

transformations from concrete to abstract and from can be generated automatically; no action routines or are necessary. The transformation from concrete to directly by actions of the parser.

(3) The transformation from concrete to abstract vant information about a concrete derivation abstract representation. This property allows cations

from

documents

incrementally

is reversible, so that relecan be recovered from its the system to parse modifi-

without

having

to

maintain

the

tree.

(4) The relationship between the two descriptions is declarative and statically verifiable so that developers can modify either syntax description independently. This approach allows a high degree of control over both the structure of an internal representation and the behavior of the system during

parsing.

Grammatical abstraction is structural; it does not use semantic information to identify corresponding structures. Two examples of grammatical abstraction appear in Figure 4. The concrete grammar GI describes the syntax of conditional statements. The fragments ~1 and @l are both allowable (but different) grammatical abstractions from the fragment GI. ACM Transactions

on Software Engineering

and Methodology,

Vol

1, No 1, .January 1992

The Pan Language-Based The

Ladle

preprocessor

nal tree representation parsing

and

modified

error

recovery.

LALR(l)

To date,

generates as well

parser

syntactic

thetables

as auxiliary A

descriptions

needed

tables

standard

generator

Editing System to describe

needed

lexical

during

analyzer

are also invoked, have

been written

. the

inter-

incremental

generator

as shown

and

in Figure

for Modula-2,

a 3.

for Pascal,

for Ada, for Colander, and for Ladle’s own language description Descriptions are being developed for a variety of other languages, C, C++, and FIDIL [28]. 4.3

107

language. including

Colander

Colander

supports

constraints. tion

the

such

as name

include

binding

extralingual structure. naming conventions, within

description

Constraints

or among

and

incremental

nonstructural

rules

and

type

checking

of contextual

aspects

of a language

consistency

rules,

defini-

as well

as

Examples of the latter include site- or project-specific design constraints, and complex, nonlocal linkages

documents.

Our approach is based on the notion of logical constraint grammars. In a logical constraint grammar, a context-free grammar is used as a base. Contextual constraints are expressed by annotating productions in the base grammar with goals written in a logic programming language. 7An incremental

evaluator

in order

monitors

to maintain

To date,

logical

semantics

changes

to the document

consistency constraint

aspects of design semantics, information. Other problems straint

grammars

ager,

and

have

languages,

include

and that

the

The

been

including

kinds

information

used to define

Modula-2,

of analyses

Colander

performed

a compiler,

compiler

the

to express

to describe and maintain can be expressed using

[44] or Microscope [2]. itself has three subcomponents: an evaluator.

and the derived

them.

grammars

of programming

Masterscope Colander

between

some

prettyprinting logical and conby tools

such

a consistency

generates

static

the

as

man-

code used

by

the evaluators as well as the run-time tables required for consistency maintenance. The consistency manager, a simple reason maintenance system [22, 551, invokes collects presents 4.4

the

Colander

Document

Textual

the

to

(reattempt

maintained

in more

a goal.

by the

The

consistency

evaluator, manager.

in

turn,

section

5

detail.

Processing

changes

and parsing. lexemes with

evaluator

information

are incorporated

Ladle’s incremental an underlying text

into

an internal

lexical stream,

of the lexical stream. The lexical analyzer for use by the incremental parser.

analyzer updating maintains

tree

in two phases:

a summary

of changes

7 Logical constraint grammars should not be confused with corzstrczmt logic programming generalization of logic programming. s The compiler uses Pan to parse language descriptions. This involution is one example Pan is used to support itself. ACM Transactions

on Software Engineering

lexical

synchronizes a stream of only the changed portions

and Methodology,

[15], a

of how

Vol. 1, No 1, January 1992

108

.

Ladle’s tion in scratch,

R. A, Ballance et al. incremental

of the tree. incremental path

parser

revises

the tree-structured

representa-

It uses a variant of an algorithm by Jalili and Gallier parsing, the algorithm first “unzips” the internal

between

concludes

the

by

portion. When reconstituted. For

LALR(l)

response to lexical changes. This parser can create a tree from but in response to lexical changes, it need only modify affected areas

the

root

and

the

incorporating unzipping

benefit

leftmost

changes and zipping

of Colander

and

changed

and

up, subtrees other

area.

“zipping

The

up”

are broken

clients,

Ladle

[33]. During tree along a algorithm

the

unzipped

apart

and then

classifies

tree

nodes

after each parse: newly created, deleted from tree, reconstituted, and unchanged. Semantic values associated with reconstituted nodes are retained, even though their Pan’s distinction semantics)

reflects

techniques. “typedef’

4.5 Since

analysis

problem

problems Information subtrees

may require syntax and

a division

It creates

and semantic these

annotations between

common

problems must

in the

within

Pan

updating. contextual

to

in practice

be intertwined “C” language. is currently

almost for

constraints all

languages

static

description

in which

parsing

[231, for example, the well-known Research into general solutions to under

way.

Retention may

be heavily

annotated

(both

by tools

changes to the internal tree must loss. A simplistic implementation

be minimized to avoid of the incremental

would

every

destroy

(or

language

and then

recreate

subtree

between

and by users),

actual

needless information parsing algorithm a changed

area

and

the tree root. Widely shared data often appear close to the root of the tree. The loss of semantic annotations on those nodes would cost lengthy and often unnecessary

recomputation,

so efficient

incremental

constraint

checking

by

Colander depends on Ladle’s reuse (either physical or virtual) of these nodes. Ladle uses an effective heuristic for reconstituting unzipped nodes. The parser keeps a stack of “divided” tree nodes. When new nodes are needed, they are taken from this stack if possible. A node is reused when it represents when either

the same production in the abstract syntax as in its previous its leftmost child is unchanged between parses. The heuristic by not

reusing

a node

or by reusing

a node

in a different

use and can err (absolute)

position within the syntax tree. Neither case causes major difficulties, although either may entail extra computation during incremental constraint checking. In practice, the heuristic for reuse succeeds in the most critical cases, preserving portions of the internal tree close to the root. 4.6

Tolerance,

Recovery,

and Variances

Documents are most often incomplete and ill-formed during editing sessions. To maintain full service to the user, Pan’s analysis mechanisms treat such problems as uariances, not as errors, and make every effort to treat them as interesting but relatively normal occurrences. Pan’s internal document representations are automatically extended to admit variances and to retain as ACM TransactIons on Software Engineering

and Methodology,

Vol

1, No 1, January

1992

The Pan Language-Based much

information

as possible

how the editing Lexical

interface

analysis

ever there

normally

lexical

ing, and document

analysis

During

parsing,

Sections

information

since error

characters,

leading

109

6.1 and 6.2 describe about

tokens

.

variances.

are generated

to parser-based

when-

syntactic

vari -

fails, a lexical variance is signaled. For instance, an may lead to a lexical variance. A variance detected inhibits

all information is preserved.

for syntactic the abstract

presence.

available

succeeds,

are unexpected

antes, If lexical analysis unterminated comment during

in their

makes

Editing System

both

that

Pan

parsing

existed

and contextual-constraint

prior

uses a simple,

to the

effective,

error recovery. The structures syntax. Directives in the Ladle

attempt

panic

mode

check-

to reanalyze

the

mechanism

[17]

to which it recovers portion of language

are those of descriptions

tune the recovery mechanism for each language. The presence of a syntactic variance is marked in the internal tree by an error sub tree annotated with an appropriate

error

message.

and subtrees that strategy is similar tions

on

the

The children

of an error

subtree

are the lexemes

were skipped over during the recovery, to that used in the Saga editor [37]. Any

subtrees

within

the

error

subtree

are

This recovery extant annota-

preserved,

including

annotations created by Colander. Contextual constraints within an error subtree are not attempted by Colander. When the user corrects the variance, prior annotations can immediately be reused. This is just a special case of the general information retention problem. Contextual-constraint checking can proceed in the presence of syntactic variances; any constraints within error subtrees are simply ignored. Unsatisfied

contextual

constraints

annotations

on offending

5. LOGICAL

CONSTRAINT

form

another

kind

of variance,

resulting

in

nodes.

GRAMMARS

A great deal of Pan’s analytical power, as well as its potential for future extension, derives from the adaptation of logic programming and consistency maintenance,

as introduced

in Section

cal foundations of logical constraint particular language and implementation, Logic

programming

and maintenance

is a natural

of contextual

4,3,

This

section

reviews

grammars [6, 8] and Colander, in more

paradigm

constraints.

for the First,

the theoreti-

describes detail.

specification,

context-sensitive

Pan’s

checking, aspects

of

formal languages are often described informally using natural language that approximates logical structure. Translation of these descriptions into clausal logic is relatively straightforward. Second, the act of checking contextual constraints an be viewed as satisfying the constraints relative to some collection of information. (In pass-oriented applications like compilers, this collection presence

of information is represented by a symbol of a logic programming language implies the

inference engine and conventional constraint as a shared repository engine. ACM Transactions

table. ) Finally, the existence of both an

a logic database. The extensibility of Pan, beyond checking, derives from the presence of the database and from the generality of the logic-based inference

on Software

Engineering

and

Methodology,

Vol.

1, No.

1, January

1992

110

R. A. Ballance et al

.

A logical

constraint

grammar

is a context-free

(LCG)

grammar

symbols and productions have been annotated with logic-programming language, that specify constraints satisfied using An eualuator

erated by G. Goals are fication, as in PROLOG. executing

the

goals

goals initialize language. The syntactic

that

structures

whenever proved.

all

are

the data evaluator

present

of the

independent

successfully

are

is considered

A document

of any

in a gen-

search based on unidescription begins by

syntactic

structure.

These

of the documents written in a given to satisfy all goals associated with

the document.

in

goals

expressed on the language

backtracking for an LCG

shared by all then attempts

G in which

goals,

well

The evaluator proved

formed

or

stops processing

further

no

whenever

goals

can be

all of its associated

goals have succeeded. As in unrestricted attribute grammars, circularities among the goals in an LCG description can arise. It is left to the writer of an LCG description to remedy

circularities.

Other well-known specification formalisms for include attribute grammars [18, 531, action routines [51, and natural mechanism for An

tree.

semantics [341. Attribute defining attribute (property)

attribute

value

at a subtree

language implementation [36, 451, context relations

grammars provide a declarative in a syntax values at subtrees a function

is

of the

attribute

values

on other values defined at neighboring subtrees; they may also depend an attribute grammar itself defined at the current subtree. Unfortunately, usually specifies only the attributes and their interrelationships. A separate formalism, usually a general-purpose programming language, must be used to specify the semantic functions and the data types. Yet much of the specification resides interesting information in an attribute-grammar-based

in the code that separate

implements

formalism

description.

hinders

Another

way to make

the

semantic

functions

the

analysis

of interactions

drawback

information

to attribute

in the

and data is that

grammars

attribute

values

types.

within

available

Using

a

a language

there

is no easy

to other

tools

[321.

most algorithms for incremental evaluation of attribute grammars Finally, system can unduly constrain the kind of manipulations that an interactive support [66]. are arbitrary procedures associated with nodes in a syntax Action routines tree. An action routine is invoked whenever a particular event concerning its

associated

node occurs.

The actual

events

that

can cause action

Thus, action invoked depend on the implementation. contextual and powerful mechanism for implementing but

they

require

a

developer

to

deal

with

all

in the

PSG

system

routines

routines

form

constraint

aspects

of

to be

a simple checking,

incrementality

explicitly. Context

relations,

composing by

its

into

context a correct

“still-possible the cates

relation. (larger) attribute

evaluation that

as used

fragments

there

ACM Transactions

of other

of syntax

trees.

A fragment fragment.

the

context

values

valid

valid have

An

empty

assignment

if it not

of values Vol

means

for

is summarized can

be embedded

summarizes

that

and Methodology,

a novel

fragment

relation

constraints.

on Software Engineering

[5], provide

program

is considered A context

values,”

is no possible

Each

the

been

context

set

ruled

of all out

relation

to the

1, No 1, January

attributes 1992

by

indiof

The Pan Language-Based the

fragment,

Snelting tions,

like

such

and

[5]

therefore

indicates

their

incremental

describes attribute

grammars,

as name-resolution

Natural tured

[34]

tion

an

abstract

characterize

axioms

and

deduction

[50]. in

rules

Typol

in

[19,

34],

proved

the problem

to

attribute

grammar by

program

will

baum’s adopts

not

Like

only

contextual

rather

than via

allowed

to

the

the

a given

can

language

be locally

the

the

is and

rules

of the to

be

in

description,

determined.

ACM Transactions

An

LCG

like

extend

to the

Typol

to

an

attribute

values

modifiable

incre-

Hoover

attribute

table

for

and

Teitel

grammars,

a -

one

descriptions

implementation relations,

a

in

the

of

made

allow

one

the

are

an

content

for

LCGS

within LCG of the

available

setting, evaluator.

automatically description logic

to

of

vehicles

a database

document

author

be

to

how

is

database.

users

via

the

6.

satisfying

between

of contexts

to

symbol

information

in

can

how

from

entire

context

and

Section

shows

operational

as

The

access

also

the

of information

structure

by transforming

weaknesses.

used

structures

interactions flow

their

Attali

providing

necessarily

the

mecha-

PROLOG-based incremental.

and

primarily

Like

database

For

cases,

[10],

of axioms

in Typol

not

of derived

interactions.

Evaluation

other

be

interaction

Incremental

In

could

and

5.1

Many

are

among

not

adopting

also

LCGS

use

discussed

goals.

Centaur

descriptive

it was

techniques

by

techniques.

facilities

other

which

equation

transformation

are

unless

and

flow

in

proving

determine in

original

evaluation, care,

but

and

both

information

support

the

without

strengths

database specify

to

elegant

programs

While

Moreover,

creation

on the

reduced to

The

~ttali

Typol

as a unit

interactions

handled The

their

an

evaluation.

grammar

that,

description

Nonlocal

of

grammars,

on the

natural

single

thereby

attribute

constraints

higher-level to focus

of

on

is to compile

grammar,

incremental

adopted.

attribute

rules

collection

based

used

evaluation

attribute

achieves means

are

is used

resulting

since

of incremental an

be treated

[30]

strucdescrip-

inference

The

is a collection

provide

inefficient

semantics.

the

This

is

strategy the

attribute

dynamic

manipulated mentally.

and

language

constraints.

evaluation

approach

on Plotkin’s

logic

language

alone

was

into

partial

a

description

and

Typol

incremental

implement

details,

a language

syntax.

matching

of contextual

[31 addresses for

target

evaluation rules

and

however,

methods

and rela-

goal,

kinds

definition

with

description

general

implementation,

a Typol

description

axioms

identified the

Bahlke Context

based

with abstract

A Typol

PROLOG

many

their

semantics,

the

pattern

semantic

semantics

for

of

111

case.

a PROLOG

Natural nism

given

The

into into

about Tree

by

algorithm.

formalism

natural

in

paper

o

formalism.

together

is

semantics.

rules.

description

rules

the

on natural

inference

syntax

logic.

any

In

structures

Reasoning

that

apply

based

the

inference

theorems

is a logic-based [49].

The

many

to a separate

semantics

specifies

that

error. analysis

relegate

rules,

semantics

operational

an

Editing System

goals

or other

data

evaluator

on Software Engineering

goals

some act

requires

through from

may

use,

the

one but

satisfying

logic

subtree does

database. to another

not

require,

and Methodology, Vol. 1, No. 1, January 1992.

112

R. A. Ballance et al

.

knowledge system the

of dependencies

takes

performance

manager

goals

have

ency

is restored

crucial

simpler

ing

[58].

a datum

whose

satisfaction

is

and

whose

removed,

ways

shadowing the

consistency

absence

memory

the

of

intensive. collections

situations

after

using

the

removed

consist-

and

continues

until

database

which consist-

modifications

of data

from

the

dependency-directed

data

are

used

to

manager

additions

Holes

is

database, backtrack-

satisfy

retries

the

all

used

Holes

in

are

of using for

each

goal.

those

goals

a goal.

all

goals

of the

does

not

Thus, they

developed:

representing

satisfying

holes

absence

were

for

data

When that

computationally

a datum.

whose

database

a means

reattempts

datum.

data

to

provide

was

searched

the

be

on it.

strategy

are

in which

is detected,

must

Removal

consistency

manager

The

retried

which

database

that

a simple

y is suspected.

handle

rules.

from

the

sparse

to

arising

manager,

evaluation

is handled

the

LCG

system,

information

data

be

records

of an

by the

derived

inconsistency

evaluation.

cases,

absence

is filled,

to

depended

different

holes

two

evaluator

When Two

goals

user

a consistency

an

Incremental

incremental

of the The

their

by

derived

a circularity

of

to efficient

and

When

if the employed

be enhanced.

detected

which

or until

selection

the

on

system.

to be reattempted.

Careful

can

are

determines

Naturally, strategies

documents

changes

maintenance

ency

goals.

evaluation

evaluator

between

incremental

reason

of the

of the

Inconsistencies from

between

advantage

a hole

depended

efficient,

scale

holes

well

are

represent

but

when

best

would

many

suited

for

normally

be

present.

Shadowing LCG

rules

description,

again

when

storage

a datum

but

situations static

more in

analyzer

5.2

Making

to the

computed which

database.

than

holes,

populate

descriptions

specifying

many

static

The

of an

attempted require

better

database. the

be

rules

them

relieves

analysis

must

Shadowing making the

and

by goals

presence

authors

less

suited

for of a

of language

details.

Programming

LCGS practical

Partitioned lections of

rules, determine

sparsely

simplifies from

of logic

to

is added data

LCGS and Logic

model

help

computation

which

descriptions

inference

are

that

required

programming

several

modifications

to the

basic

PROLOG

[59]:

database. data tuples.

The logic Collections

database is explicitly structured and data tuples are first-class

into

col-

objects:

dynamically. Collections are created and They can be created and destroyed as a side effect of satisfying a goal. Data data tuples are added to collections but cannot contain unbound logical variables. tuples can refer to collections Partitioning

the

of an incremental tion

to

example,

logic

database

evaluator

express

directly

collections

while the

can

be

into

collections

allowing

partitionings used

to

the

improves author

often represent

found scopes

the

performance

of a language in in

a

programming

language.

ACM

TransactIons

on

Software

Engmeermg

and

Methodology,

Vol

1, No

descrip-

languages.

1, January

1992

For

The Pan Language-Based Distinction

between

base

must

be

dure

clauses.

procedures Only

This can

that

or

nently

associated

change until

its

to

information

The tree

being

maintained.

for

Ergo

separating language relative

system

owning

tuple

the

data-

of

proce-

of goals

and

in the

The

itself

database

its

Ownership

in

This

zero

owned

are

and

may identity

the

to

changes

or more

consistency in

associated

that of

by

perma-

a collection

by

document

has

are

existence

is used

underlying

database.

Collections

tuples

retains

a document

the

be explicit.

edited.

subtree.

the

in

to

the

collec-

subtree

are

a subtree

are

abstract

syntax

from

to

among

on

contexts

access

from

the

is to propagate

tree,

similar

abstract

syntax

information

to the

locally

methods

developed

goals the

must

in

an

LCG

essentials

of

contain

the

helps each.

the

author

Goals

are

information

of

a

defined

necessary

to

goals.

goals.

so standard

the

focus

their

PROLOG

The ordering programming

data tuples and retract

of assert

use

[47].

contexts;

among

is to provide

A secondary

context

or disprove

use

in

heads

collection from

will

being

collection

a context

of the the

Ordering

5.3

for

description

orderings

entire

the

contexts. All goals associated with to the subtree’s cent exts. The contexts

use

subtrees

the

removed

appear

as

its

collections.

among

fied,

or

can

113

dynamically.

primary

satisfy

an

and data

in

subtree

called

to other

the

appear

database

is destroyed.

changes

relative

determined

the

subtree

Each

that

can that

document

their

but

of tuples,

evaluated

the

relate

Contexts. tions

added logic

collection

with

owning

be

to the

in

repeatedly,

manager

assures

may

subtrees

Terms

that

.

analyzed.g

changes

Every

more

data.

terms

limitation

terms all

Ownership. one

from

be statically

ground

assures

code and

distinct

Editing System

in

the

in

an

among goals techniques

database LCG

may

differs

is

not

from

formally rely on

not

that apply. their

In

use

speciknown

particular,

in

PROLOG.

Example

Figure name

5 shows

a very

be defined

before

or importation; notation

names

resembles

simple it are

LCG

is used, used

in

Colander

the

for

a language

Names generic

are “uses”

language.

that

defined and

An

requires

by

either

that

in procedure

identifier

each

declarations calls.

prefixed

by

The

“$”

syntax tree; “$$” denotes a node in the abstract denotes the node associated with the goal currently being satisfied. Identifiers prefixed by “?” denote is denoted by ??, The logical variables; the anonymous logical variable notation

respect can

(?Form, ?Collection) to the given collection.

be used to represent

sections

of the LCG

indicates that ?Form is to be evaluated is a special kind of collection An “entity”

linguistic

define

the facts

objects

such

as variables.

and procedures

The first

with that three

used in the specification.

9 Extending the analysis to handle the dynamic addition of new goals or clauses to the logic program is straightforward in principle but costly in practice. It requires a new level of dependency-directed backtracking relating the inputs and outputs of the static analysis. ACM Transactions

on Software Engineering

and

Methodology,

Vol.

1, No

1, January

1992.

114

R. A. Ballance et al

.

Definition-Specific

Data Tuples:

declared ( ?Narne, imported

?Enttty)

/.

Fact repmsent]ng

b,nd,ng

?Ent~ty)

/+

Fact reprmentlng

,mportatjon

J,

Fact representing

the next outer

/*

Entity

( ~Name,

enclosing-scope

( ‘Scope)

type-of ( ‘Type-mark)

propert.v


< lookup ( ?For-rn), ~Scope >





, I < enclosing-scope ( $’Scopel),




!Name

representing tbe type entlt.v—either “Id” or “proc”

of an

Definition-Specific Procedures: ~Scope < visible (’2Name, ~Entzty),

of

~$cope

>

>,

>

Procedures:

assert (< 7Tuple, ?Col[ectzoa context ( ?LL,c, ‘~sc(]p? ) new-entity

(

>)

‘2Ent2ty)

new-cOntext*

( ?C’onte2t)

~Tuple to ~CollectIoJt

/.

.4dds

/.

B]nds

:?Scope

/. Binds

‘2Enizty

/+

‘2C’ontext

Binds

to

un)que to

string- name( ~Loc,

‘7;Vame)

/+

Suc.eeds

I*

B!nds

,f and

a new

lf

and

cc)nte.~t

to

?Fo~m

tostring

falls

nan]e

*/

*/

~Contezt

only

~.h’eme

of ‘~Loc

marker

propagates

not ( 7Form)

,/

context

to s,ngle

of

al~ subtrees

*/

*/ QLOC

*/

Grammar: (~Oc~rne~~)- (~ef)’ (USC)’ (clef )

“DEF”

-

id

context ($$, ‘Scope), string-name

($id,

‘7Name),

not (< vmble ( ‘2Nume, “Invalld new-entity

‘v~), @,Scope>)

redeclaration

of

assert ( < type-of (“id”),

2Enttty

assert (< declared (~Name, (clef )

-

(clef )

+

this

scope”

YS’cope >), to

( ‘2h’eu-conteH

assert (< enclosing-scope

Create

/*

),

( ‘Contezt),

new, context

2Nero-cOntezt

“DEF’

for

above

*/

to chddren

./

Id as a procedure

./

and propagate

>) /+

~

~oals

(we)”

($$, ‘Canfezt),

new-cOntext*

(use)

,

>),

~Entzty),

/* S]m,lar

“IMPORT id “PROC’) Id (clef)” context

‘2.Vame m

( 7Enf2trJ),

‘tTSE”

Declare

id

,- context ($$, ?. Scope), string-name($ld, ‘zName), < lookup (visible ( ~Name, ~EntIty)), “NO Identlfler

named

< type-of (’old”), — “CALL”

(use)

The

goal

using

context

the

procedure

ing

to that

name

the

not term

creates ACM

a new

TransactIons

Fragment

with and

string-name. and

entity

in

an error

that

Software

the

will Engineering

>

not

be located” declared

production the

The

[(clef)

and

goal

then

will the

Methodology,

If

checks

+/

accesses identifier

other

no

binding

the

declared,

1, January

bind-

is present,

Otherwise,

being 1, No.

first

of the

that

be created.

Vol.

id]

name

another

variable

as a procedure

grammar

+ “DEF”

(string)

scope.

message represent

id lS declared

constraint

external

current

,

as an Identlfler”

that

Check

of a simple logical

the

binds

occurs

fails,

on

“ ~Name /.

associated

current

>

id

Fig. 5.

the

2Enttty

~Scope

‘~,vame could

1992

sets

goal the

The Pan Language-Based type of the entity to ‘id’, scope. to the current The

(lookup(visible(

subgoal

goal associated name

of the

with

[(use)

identifier

string-name lookup term

and adds a fact that ?Name,

~ “USE”

appearing

id],

in the

Editing System

will

represent ?Scope),

searches

for a binding Once

115

the new binding

?Entity)), production.

.

appearing

in

the

involving

again,

the

the

primitive

is used to get the actual external name of the identifier. The implements nested block structure by recursively traversing

outward through nested scopes, evaluating the procedure visible scope. The use of the cut operator (“ !‘’) in lookup ensures that

within each only the first

solution

lookup

to ?Form

will

an error message the entity located

be investigated.

If the subgoal

involving

fails,

is generated. If it succeeds, then ?Entity will be bound to in the search; the type-of property can then be validated.

In the goal associated with [( de~) + “PROC” id (clef) *( use) *], the built-in function new-context* is used to create and propagate a new collection to all of the direct descendants in the internal primitive that allows greater control over created

tree. which

Colander also subtrees inherit

collection.

One simple extension to this example is to maintain program by adding a goal associated with the production that

adds

facts

Database graph

provides a the newly

of the

form

could

then

triggers

“calls(?Procl,

?Proc2)”

be used to update

a call-graph of the [(use)+’’CALL” id] into

some

a graphical

view

collection. of the

call

incrementally.

5.4

An Implementation:

The

Colander

Colander

language,

one of Pan’s

formalisms

for

language

embodies the LCG approach. The language and its run-time the basic approach in ways that improve either the efficiency tion, the usability of the description language, or both. Pass-structured a syntactic

evaluation.

structure

establish

into

the context

Colander two

classes:

used by that

partitions those

structure

goals

the

goals

whose

description,

support extend of the descrip-

associated

primary

or by its substructures,

with

use is to and those

goals whose primary use is to express a contextual constraint. The classification of goals is indicated explicitly in a Colander description. Goals in the first class are called first-pass goals. They are evaluated during a top-down preorder walk of the internal tree. Goals in the second class are called second-pass goals. They are evaluated after the first-pass goals and, in the current implementation, are evaluated after the first-pass goals of the subtree’s children have been evaluated. For example, the first goal associated with

the production

goal,

while

[( def ) ~ “PROC”

the second

general, goals that will be second-pass

goal on that

establish goals.

id

(clef )*( use)*]

production

contexts

should

could

should

be a first-pass

be a second pass goal.

be first-pass

goals;

all

In

others

Colander distinguishes three Multiple kinds of collections and data tuples. kinds of collections, each holding its own kinds of data. Datapools are collections of facts. Datapools are used to aggregate facts that can or should ACM

Transactions

on Software

Engineering

and

Methodology,

Vol.

1, No

1, January

1992.

116

R. A. Ballance et al

.

be treated as a single unit. There can be multiple instances For example, each scope in a program might be represented datapool

containing

together An

with

data

entity

described

is

facts

about

relating

that

a collection

language.

Entities

the

declarations

appearing

scope to the other

that

can

of the same fact. using a separate

be used

in

that

scope,

scopes in the program.

to

represent

objects

can be used to hold the attributes

of the

of a particular

object such as a variable, a procedure, or a paragraph. Information about entities is represented using entity properties. A property is a named value associated with a collection. Properties are single-valued; unlike facts, it is not possible for a collection to contain more than one property value with a given

property

name.

Subtrees can also be considered as collections Subtrees are created and destroyed by Ladle Structural

information

about

the

internal

holding during

tree

subtree syntactic

is represented

properties. analysis.

using

subtree

properties. Maintained

subtree

properties.

Main tained

tributes in an attribute grammar. dures associated with the grammar

subtree

properties

are like

value of a maintained property is computed relative to the subtree the property is being defined and can depend on any other values including the properties of the parent, child, or immediate sibling Maintained maintained

at-

Their values are defined by local proceproduction that defines the subtree. The for which available, subtrees.

properties are calculated on demand. Once the value property is computed, it is stored in the logic database.

of a The

consistency

manager

of a

maintained

property

A typical

is then

responsible

for

recomputing

the

value

as necessary.

use for maintained

subtree

properties

is in determining

the type

of an expression. The value of the type would be stored in the maintained property, and the subtree-specific procedure would define how to compute the type as a function of the subtree’s subexpressions. Although an attribute grammar can be emulated directly by an LCG using only maintained subtree properties, it is far more efficient to use the database and the context for moving values through the tree. In most attribute grammar descriptions, inherited attributes either summarize relatively local structural table”

information containing

about nonlocal

the

internal

information.

tree

or else consist

Colander

subsumes

of a “symbol the

“symbol

table” into the database along with other information about the tree. Local structural information can be passed like inherited attributes by creating and propagating new context datapools containing that information. Synthesized values propagate from leaves toward the root, as in attribute grammars; these appear frequently in Colander descriptions as maintained properties. Client

properties.

A client

property

is a subtree

property

that

is neither

structural property nor a maintained property. Client properties are usually manipulated by programs or components via a client interface, although they can appear in a Colander description. Client properties are not under consistency maintenance unless they are declared and used within the language ACM

TransactIons

on

Software

Engineering

and

Methodolo~,

Vol

1, No.

1, January

1992

a

The Pan Language-Based description.

One

prototype information

way

in

which

we

have

of a context-based prettyprinter. to subtree nodes using client

used

Editing System

client

.

properties

The prettyprinter properties.

117

is in

attaches

the

its own

Database triggers. Colander provides triggers that are activated when data are added or removed from the database. Triggers provide a uniform mechanism for implementing notifier functions used by clients. They are used internally

as well,

for example,

to implement

shadowing

rules.

Messages to the user. Any term appearing in a goal or a procedure can be suffixed with a message. When a goal fails during evaluation, message associated with the most described in Section 6.2.

recent

term

to fail

is captured

body the

and used as

The internal tree used in Pan allows subtrees with an Special primitives. arbitrary number of children called sequence nodes. Colander provides several functions for mapping goals over the children of a sequence subtree. Colander also provides two special functions that interact with the consistency manager. The function all-solutions can be used to calculate all solutions terminates.

It

changed, matching obtain

is

A typical

reevaluated

of the

whenever

the

use for all-solutions

a particular all

query.

bindings

is like Prolog’s bagof operator. It a goal, assuming that the goal

to

For to

set

of solutions

is to gather

example,

all-solutions

a particular

name

might

up all of the data in

might order

to

have tuples

be used

to

discriminate

among the possible definitions of an overloaded operator. The function notever is a form of negation that is monitored by the consistency manager. If a new solution arises that might cause the not to fail, then the goal containing the notever will be retried. 6.

USER SERVICES

Pan’s

ultimate

some

of the

services.

technical

A more

guage-based 6.1

purpose

problems

thorough

technology

Document

is to assist

its intended associated

treatment

appears

users. with

of

Pan’s

elsewhere

[661.

This

section

providing

approach

discusses

coherent

user

to delivering

lan-

Models

User and system must communicate about what is being edited. Languagebased editors often present a document model based implicitly on internal representations, and the abstract syntax tree is sometimes proposed as a “natural” model for user interaction. In practice, however, the design of a tree representation (the abstract grammar in Pan) is strongly influenced by internal

clients

of the

data

(e. g., a Colander

specification).

These

influences

are unrelated to the way users understand document structure. Pan decouples internal representation from user interaction. The language description mechanism provides a loose framework in which the author of each description is expected to design a model of document structure that is appropriate for the language, its intended users, and their tasks. Alternate descriptions for a single underlying language can provide different services ACM

Transactions

on Software

Engineering

and

Methodology,

Vol.

1, No

1, January

1992

118

R, A. Ballance et al.

.

based on different models, suitable for different users and tasks. For ple, one might provide different operand classes and more elaborate handling

for novices

for reengineering This framework document

than

for

experienced

users,

than for authoring. is based on two assumptions

structure:

they

think

in terms

or different

about

is neither

an

Operand

“operator”

nor

classes.

representation

Pan’s

named

derived The

relationship and

specification.

and

are

not

similar one

(anonymous)

it

internal

document

are arbitrary,

navigation,

highlighting,

They

possibly

form

the basis

and editing.

and the

current

Class

database

of

not by

the

an

any

model.

In

places

that

classes and

class the

are

its

user-ori[35]

from

Synthesizer

but

specification

independent

of

by

Garlan’s

unparsing

and

definition.

With

by predicates

may

be created

name,

option

the

specifies on the

Several

tactk

a few

on nodes at any

defining daemons

time

built-in of the using

predicate,

exceptions,l” internal

Operto

under-

complementary [24]

and

a declarative and

to be invoked

to Reiss’s

that

One

or after

are

A

new

specifies

particularly

any

the useful

structural

selection

class.

language-independent

and

Error’11

classes

‘Unsatisfied

contextual-constraint

are

predefine.

Constraint’

variances,

For

denote

respectively.

the

example,

ones

such

as

‘Expression’,

‘Syn-

sets of syntactic

‘Language

to be the union of ‘Syntactic Error’ and ‘Unsatisfied appropriate when the distinction would be irrelevant

structural

classes

representation. syntax

options.

before

operand

tree

the user. The class ‘Query Result’ permits perusal by the most recent database query. Each language description adds language-specific

Error’,

de-

(constraint’, or confusing

is to

of a set of nodes returned classes,

‘Statement’,

starting

with

‘Declaration’,

10The classes ‘Character’, ‘Word, and ‘Line’ are defined in terms of the text stream, ‘ Lexeme’ in terms of the lexical stream; all offer services similar to the structural classes, 11Although ‘syntax errors’ are Just a kind of variance in the Pan system, to users they designated ACM

is

is embedded

access

is

a

there

determined.

controlling

for

users

Generator, [52],

dynamically by

is phyla

hidden

mechanism

and

being

documents

exemplified

operand of operators

place”

than

of

use

operand

“resting

of resting

approach as

in

rather

“views”

and

to the

structural

class

define

operators

[511.

Class

and

of of

a “statement”;

hiding

classes components.

tree

scheme

display

defined

fined more

of the

structure,

views

based

for

Operand

in contrast

is achieved

classes

lying

class

part

unparsing

document

class

between

Operators

effect

the

and

mechanism

based on class definition

many-to-many,

tree

in

is just

instead

terminology

information.

ented

only

specific

of document

selection,

is dynamic,

understand

components

the

services



class.

collections

for structure-oriented membership

primary

operand

is the

overlapping,

a “subtree.

how people

of structural

trees, and they think of those components in particular languages. To most users, a “statement”

query

examerror

by their

Transactions

on

traditional

name.

Software

Engineering

and

Methodology,

Vol.

1, No

1, January

1992

and

and are

The Pan Language-Based ‘Procedure’.

Since

statement

might

Finer

distinctions

classes

may

be in both are

the

overlap,

a node

‘Statement’

possible.

In

Editing System representing

and ‘Syntax

one

a malformed

Error’

description,

119

.

classes.

‘Language

Error’

contains all syntactic variances plus those unsatisfied constraints concerning the language definition; other unsatisfied constraints are in the class ‘St ylistic Violation’. Diagnostics. with tates

Each

The user requests Pan’s

other

statements,

more

services

description

associates

diagnostic

editing

when tion

take

messages

unsatisfied

constraints,

and

in

information

panel

flags

that

appear

user control. This of language-based

this

only

section editing

Visual

components. identifiers,

it might

when

is exploited

enhancements.

document keywords,

Pan

has been derived.

that

information

and under discussion

Presentation

Malformed

sometimes

even

as statements. different.

Information

6. Derived

ticular gories:

services.

of variances.

is so fundamental

specifies

optional deferring

appropriate

notice

with

language-based

Figure

by invoking

no particular

within malformed blocks can still be treated and well-formed documents need not appear much

Using Derived

Text

information

statements

statements Ill-formed 6.2

language

potential variances. The presence of variances in a document precipino special action other than the appearance of designated panel flags.

text

not be apparent Pan’s

default

is the

through

case,

as shown

in

specific

services,

all

describes a few to Section 6.4.

attributes

draw

For example, font shifts comments, and unanalyzed

at all

configura-

such

services,

attention

reveal text.

to par-

lexical cateThis use of

typography contributes significantly to program readability in our ence, but only when the font selection is tuned for each language. Prettyprinting reindents documents to reveal syntactic structure.

experi-

advanced

seman-

tically

forms

driven

Structural varying results ances.

of prettyprinting

elision,

for program

are under

highlighting

documents,

More

including

development.

may be designated

for text

in particular

categories,

the color of both text and background. Useful examples include the of the most recent database query and different categories of variAlthough

highlighting

sages, experienced once attention The operand

carries

programmers

is drawn level.

often

less information

than

diagnose

simple

variances

at a glance,

of generic

Cursor-To-First,

Each

Pan

has a current

operand

level, chosen

window

commands

like

most menu-based

services, equivalent

by the language description input mode that modulates

Cursor-Forward,

Cursor-To-Last,

Cursor-In,

Select, and Delete. Search-Forward, commands and never affects text-oriented 12As with

mes-

to them.

by the user from a menu of classes specified level is a weak Figure 6). 12 The operand operation

diagnostic

Cursor-Backward, Cursor-Out,

The operand editing.

keyboard

ACM TransactIons on Software Engineering

(see the

commands

level

affects

Cursorno other

are available.

and Methodology, Vol. 1, No 1, January 1992.

120

.

R. A. Ballance et al.

... ........,,,,,.,...............................

ACM

Transactions

on

Soft\vare

Engineering

and

Methodology,

Vol

l,No

... .,,,

1, January

1992

The Pan Language-Based Structural

navigation.

The

operand

level

enables

to be in that selecting

class.

only

Operand

classes

specification

with the

each

Inconsistency

Any

situation

arises

the

user

ately

after

the

textual

easily

understood

a treelike

during

it

is

usually

are

to

selected.

of each

walks,

tree

configured

announce

the

Structural

variance

of information

two.

The

text,

is

text-oriented

what

however,

as

kind

the from

with cases,

variances

diagnostic

one

derived

during

various

(by

diagnostic

navigation

in

then

turn.

and Reanalysis

between is

perform

after-daemon)

member and

where

inconsistency

commands

with

appropriate

location

6.3

formation

navigation

navi-

a press of the left component defined

in the class.

associated

of an

associated supports

Generic

components

121

.

language-specific

For example, when the current level is ‘Statement’, button selects the “nearest” (based on a heuristic)

gation. mouse

For

no

by

when

example,

One

derived

font

panel

Colander

in-

of the

problem

may

disagree immedi-

a comment.

almost

Designated

for the

invites where

be incorrect

into

remain

users.

‘D’

and

aspect

would

of a statement

experienced

another

information

shifts

enhancements

syntax

from

approach,

exception.

transformation

for

is derived

syntax-recognizing

editing,

sees.

presentation

glyph

In

correct flags

(in

database)

most

in

ways

Figure

6,

appear

pale

inconsistency.

To avoid that

more

serious

language-based

information

are

analysis

consistent.

(almost)

although

it

explored,

where

more

user

incremental

delay.

invoke

analysis this

policy

Possible

state

takes

of

place

eration

the

does

relaxations

language-based

only

that

document when

triggers

might

document

requested,

Nothing

without

Mixed-Mode approach

to language-based

The

should

in the

document

A dual invoked automatic

information

at any

time

Pan prerequisite.

reanalysis, TransactIons

to edit

being

work

on

an

the

Incremental

analysis

by invocation

ensures on Software

is to broaden

textually

it should

aspect cursor.

ACM

user,

are

understands

explicitly

or

policy

by

of

genan

op-

invocation

typing

an

frequent

of entire

analysis

user.

editing

be able

presentation;

without

this

Since

the

Editing

Pan’s

derived

implicitly

either reanalysis

to the

them.

user

user

during it.

restrict

usefully

trade-offs.

derived

attempting

prevents a Pan user from Pan is designed to encourage

analysis.

it cost-effective

judge

ensures

a command

not of

services

can

automatic

Analyze-Changes. by making

and

such before

almost correct basis using inconsistent information. Pan’s lazy reanalysis policy presumes that the eral

reanalysis policy when text and

place

the

succeeds,

cause

automatic only

takes

Should

always

may

Pan’s

confusion, interaction

Pan triggers

inconsistency,

6.4

Editing System

and

at any

be equally at any

derived

Engineering

possible

not

and

‘LO narrow

at any

to edit

place

in terms

of

place.

commands, textTwo mechanisms that

options, time

or structure-oriented, make this work.

information and

Methodology,

is consistent Vol.

1, No.

may be The first, with

1, January

the 1992.

122

R. A. Ballance et al.

.

text

for any operations

that

require

The

it.

second

Pan’s

is

dual

aspect

edit

cursor.

Pan’s box. as

edit

cursor

It may it

also

does

colored

in

Any aspect:

or

user

the

selects

use

by

user

by

the

beginning. the

appropriate

aspect,

then

mechanism

pointing

cursor”

in the

one

is

used

when

the

mouse.

This

[62]

by providing

with problem

prototype

Cut,

achieves

the

persists

text

incremental

from

effect

the

analysis

by

a struc-

moving The

the

will

mod-

with

clipboard. until

reanalysis

inserts

invoked

same

into

component

implementation

when

the

and

automatic

Paste

it.

is free

the

analysis

already

time

the

next

remove

clipboard.

before

If the

equivalent

sequence

violates

the

internal

reanalysis,

it

derives

Complex

operations

may

any

context

is

structural

and

identifiers

could

be

perspective, intent,

fail

Another

cost on

to copy into

a

the

General approach

of

Pan’s

concerns to

this

and

approach the

well

definition

is

confuse

can

users;

hides. For example, list of a procedure

Although

the

conceptually

may

Pan. approach

this

two

related

dictate

lists

of

from

a

incompatible

problem

involve

inter-

guessing

the

in

textual

document

call. and

diag-

variances,

guarantees

Pan parameter

formal

underlyare

language

modification

identically solutions

it

the

of reasons

procedure

avoided

hand,

since

kinds

other

information,

other

structural

the

implementation

an

annotations

for

textually

representations.

the

the

Problems

handle

derived

representation,

parser. for direct

reasonable paste it

seem

On

anyway.

that

discarding

slightly.

into

succeed

mechanisms By

internal

Paste

and

operations

same

Pan’s mechanisms

built

it might definition

the

to continue.

of

Cut

inspired

precisely

increases

nal

6), stream

definition,

formedness

user’s

same

the

subsequent

language

user’s

the

directly.

deleted

a structurally

editing

by

by

selects

quickly.

When

the

location extended

also

uses

structural

commands

because

can

information

nosed

at its

location

no

Figure of

of the

appropriate,

ing

in

out

invisible

commands

cursor

component

(revealed

cursor

has

structure

(as

text

is

text

a cursor

inverted

component,

a ‘Statement’

at

structural

cursor

versus

No user

document

representation it

cursor’s

“point

editing.

selection

but

the

is

as an

structural

simultaneously.

internal

associated

the

displayed

to some

the

requires

If

location,

cursor

positions that

text

the

behaviors

tural

the

Setting

a structural

resolves

Simple

and

structure.

from

design

ify

6 where

operation

text

a textual corresponding

shading).

textually

inferred

both

Figure

editing

has

a location

background

component

the

always

have

components

is the

during

loss

of any Cut

structural

nonderivable

and

Paste.

We

by the following (as yet unimplemented) expedithis can be repaired cut both text and structure, paste text, reanalyze, and copy nonderivable

believe ent: data

only

Other oriented rich Some

when

the

new

language-based editing

repository services

ACM Transactions

structure

is isomorphic

operations.

The

lies in an open-ended of information locate on

and

Software

to

diagnose Engineering

ultimate

collection

assist

users

variances, and

to the

Methodology,

old.

advantage

of services in

commonly

as described Vol.

1, No

that

of languagedraw

performed above.

This

1, January

1992.

upon tasks. section

a

The Pan Language-Based describes mented that

other by

the

make

user

them

One

examples, generic

need

not

few

forms

Searching

in

one

language

variable,

and

language-based

other one

query

is

Result” part

for

the

user

variances.

123

are

imple-

and

second

representations

is all

that

to depend two

For

similar. of

particular

programs,

words

drawing

using

patterns

upon

enclosed

variable

definitions

different

variables.

in

renames

variable

the

database

same

even (where

language

of Pan’s

(lexemes),

have

replace-

natural

words

Pan’s

replacement

replacement

against

in

links,

structure,

variant

a The

during

want

on textual

desired)

variant

that

textual

One

information

logically

not

example,

variable. database

patterns.

only

Another

for

sometimes

whole-word

is

itself

the

supports than

example,

longer

to specify

language.

rather

current

“Query is defined as to

hypertext-like in

users

of this similar

of the

services,

through

But

results

level

query

recorded

For

interface

of a specified

Pan

editors,

matching.

are

matches

declaration and

structure

is difficult

command

text

The an

components

The

is textual

of a programming

operand

other

navigation

on language

substrings

documents

of

uses

by

the

supports

language

expression

set

editors

information.

pointing.

with

may

to the

all

available

components.

example

powerful

regular

replacing

an

by

text

derived

and

associated

and

cursor

any

made

user

the

underlying

Like

the

the

ordinary

identifies

description

moves

the

the

by

upon

are

Text

through

on

ment

services

internal

declaration

queries

language that

draw

the

to navigate

analysis.

when

that

components,

complex

supported

can

and

by

ment

of

locates

which

command

based

the

command

used

command defined

first

language-specific

aware

of query

highlighted,

of the

latter

be

Pan

example,

to the

emphasizing

and

.

work.

of the

search.

simple

combining

Editing System

by

instances

in

to avoid

lexical

name

vision:

current

replace-

as defined

renaming

but

that

are

7. RETROSPECTIVE Pan has limitations

with

techniques

are

mentation

supports

may

span

provides tions. for

aimed

system.

addressing

for

but

Ongoing

research

ects near

completion

within in

and

which

one

will

create

the

visual

and

a

in

successor

system

presentasupport

the

supported

impleanalysis

environments,

editing, not

the

single

language;

learning

are

a

generating

and

database

project,

use

the to

of

a

current

Pan,

is

issues.

projects

are

include

the techniques, including program viewing [65], into and investigations techniques [231. With the exception of in Pan are all textual

only

description

languages;

document;

flexibility

display

development of these

of formal per

programmers graphical

Ensemble

long-range class

language

desired

novice

software some

one

of the

execution,

The

to our

a particular

documents,

part

Support program

respect

only

multiple only

persistent

at

use

the

the

to

a simple are

leverage

gained

of

advanced

development

of Colander development

ways

and

using

the

Pan.

Proj -

user-interaction

to specify

of

strengthen

noneditable all closely

ACM Transactions on Software Engineering

from

and control user-centered language descriptions, Pan’s language description

new

tree display, the presentations coupled to the concrete syntax

and Methodology, Vol. 1, No. 1, January 1992.

124

R. A Ballance et al

.

descriptions

of the

approach

in

(1) much

richer

from

mappings of

the

the

(3)

VORTEX

proved

and

is generalizing

descriptions

approach

to

Pan’s

implementation on Natural

and

experience

gained

14];

to

a wide

range

being

based

grammars, for

expressing

of languages,

queries

however,

this

situation

for

higher-level

Semantics

the

of

media–text,

documents.

constraint

vehicle

presentations,

on

and

effective

remedying

[13,

viewing

compound

of logical

structure,

heavily

system and

video;

for

to be quite

Complete

based

editing

support

The notation has

project

document

building

document

of

sound,

integrated

Ensemble

among

appearance,

extension

graphics,

The

ways:

specification

(2)

documents.

three

is

rapidly

to

use

the

semantic

on clausal

logic,

against

the

become

verbose.

LGG

database. One

mechanism

descriptions,

as

such

an

as those

[341.

ACKNOWLEDGMENTS

Pan

is the

work

Jacob

Butcher,

made

substantial

port

of

Yuval

of many Bruce

Bill

Scherlis and

Finally,

with

whom

best

we we

ideas

have the

appear

played

wish

to

role

helpful

the

over

somewhere

the

within

Ghristina

Darrin

as

well.

We

have

of

several

sup-

grateful drafts

to

of this

“editor

years.

all

and

are

on earlier

community

past

Black,

Lane

suggestions,

comments

acknowledge

interacted

authors. and

encouragement,

a major for

the

Hastings,

The

reviewers

have

besides

Mark

contributions.

Peduel

paper.

individuals

Forstall,

people”

Many

of their

Pan.

REFERENCES 1. ACM.

Proceedings

SIGPLAN

2 AMBRAS)J., IEEE

Softw.

3 ATTALI,

1,

mentation Notes 4.

in

of the

AND O’DAY, Compiling

V.

M

Program Science,

,

Microscope:

TYPOL

Computer R.

SIGPLAN

SIGOA

Symposium

on Text

Manipulation.

A

knowledge-based

programming

environment

1988), 50-58.

5, 3 (May

an c1Logic

BAECIiER,

ACM

(ACM) 16, 6 (June 1981).

Not.

with

attribute

grammars

m zng, P. Deransart, vol.

348.

AND MARCUS,

A.

Program

In

B.

Lorho,

and

J.

mmg

Springer-Verlag,

New

York,

Human

and

Typography

Factors

Languages

Malusznski, 1988,

Imple-

Eds.

pp.

252-272,

for

More

Lecture

Readable

ACM Press, New York, 1990 5. BAHLKE, R., AND SNELTINC+, G. The PSG system: From formal language definitions to interactive programming environments ACM Trans. Program Lang. Syst 8, 4 (Ott 1986)> 547-576. Syntactic and semantic checking in language-based editing systems, 6. BALLANCE, R. A. Programs.

Ph

D.

1989. 7

dissertation,

as Tech.

Rep.

R. A,,

applications. Furukawa,

In Proceedings Of the 8th Ed. MIT Press, Cambridge,

tal

syntax

R

AND GRAHAM,

S L.

Incremental

of Cahfornia,

consistency

Berkeley,

maintenance

Internatzon al Conference Mass , 1991, pp 895-909.

in a language-based Language

Univ

editor

Design

and

S. L.

on Logic

Grammatical

In Proceedings Implementation.

for interactive Programming.

abstraction of the SIGPLAN

on

Software

Engineering

and

Methodology,

Vol

1. No

1, January

K

and incremen-

SIGPLAN Not.

88 Confer-

(ACM)

1988), 185-198.

Transactions

Dec.

89/548.)

A., BUTCHER, J,, AND GRAHAM,

analysls

ence on Programming

(July

Division—EECS,

UCB/CSD

BALLANCE,

8. BALLANCE,

ACM

Science

Computer

(Available

1992.

23, 7

The Pan Language-Based 9. BALLANCE, Tech.

R. A.,

Rep.

Berkely,

VAN

Mar.

88/409,

CENTAUR:

nia,

In[27],

F. J., HOLT,

Exper.

12, BUTCHER,

15,5

J,

Science

S, L.

The

125

architecture

Division—EECS,

Univ.

of Pan

of

I.

California,

S, G.

SRE–Asyntax-recognizing

editor.

Softw.

1985),489-497.

Master’s

Nov.

G., LANG, B., AND PASCUAL, V

14-24.

R. C., AND ZAKY,

(May

Ladle.

Berkeley,

pp.

1989.

thesis,

Computer

(Available

13. CHEN, P., AND HARRISON, M. A. 21,

AND GRAHAM,

D., DESPEYROUX, T., INGERPI, J., KAHN,

Thesystem.

11. BUDINSKY,

L.,

Computer

.

1988.

10. BORRAS, P., CLtiMENT,

Pratt.

DE VAiW~ER, M.

UCB/CSD

Editing System

Science

as Tech,

Multiple

Rep.

Division—EECS,

UCB/CSD

representation

Univ.

of Califor-

89/519)

document

development.

Computer

1 (Jan. 1988), 15-31.

14. CHEN, P., COKER, J., HARRISON, M. A., MCCARRELL, J,, AND PROCTER, S. The VORTEX of the 2nd European Conference on TEX document preparation environment. In Proceedings for Scientific 236.

15. COHEN,

J.

Documentation,

Springer-Verlag, J.

New

Constraint

Desarm6nien,

York,

logic

1986,

Ed.,

Lecture

Notes

in Computer

Science,

vol.

pp. 45-54.

programming

languages.

Commun.

ACM

33,

7 (July

1990),

52-68. 16. CONRADI, ments.

R., DIDRIKSEN,

Lecture

Notes

17, CORBETT, R. P. Science

Static

Science,

D.,

vol.

and compiler

Univ.

of California,

M.,

AND LORHO,

EDS.

244.

Aduanced

Programming

Springer-Verlag.

error

recovery.

Berkeley,

New

Ph.D.

June

Environ-

York,

1986.

dissertation,

1985.

(Available

Computer as Tech.

Rep.

85/251.)

18. DERANSART,

P., JOURDAN,

Lecture

and Bibliography.

1988. 19. DESPEYROUX, T. G. Kahn, 20,

AND WANVIK,

semantics

Division–EECS,

UCB/CSD

173.

T. M.j

in Computer

Executable

D. B. MacQueen,

Springer-Verlag,

D. R. Barstow,

York,

V., HUET,

editors: H.

specification

Attribute

1984,

G., KAHN,

The MENTOR E. Shrobe,

of static

semantics.

Eds.

Definitions,

Systems,

Lecture

In Semantics

Notes

New York, of Data

in Computer

Types,

Science,

vol.

pp. 215-233. G., AND LANG, B.

experience.

and

Grammars:

Science, VO1.323. Springer-Verlag,

and G. D. Plotkin,

New

DONZEAU-GOUGE, on structured

B.

Notes in Computer

E.

In

Sandewall,

Programming

Interactive Eds.

environments

Programming

McGraw-Hill,

based

Environments,

New

York,

1984,

pp.

128-140. 21.

DOWNS, L. M., Tech.

Rep.

Berkeley, 22.

AND VAN

UCB/CSD

Nov.

DOYLE, J.

maintenance

24.

GARLAN, D.

Palo with

Flexible

unparsing

Softw.

Programmer Eng.

28.

(Boston,

HILFINGER, Symbolic

student

Workshop,

An

P.

N.,

AND

introduction Univ.

for of

COLELLA,

Pa.,

pp. 97-138.

and

mechanisms Berkeley,

G. M.

Olson,

editing

D.

IEEE

SoftuJ.

Gandalfl

in Pan. Nov.

environment.

Univ.,

users.

California,

Software

B. L. Webber Master’s

thesis,

1991.

Tech.

Pittsburgh,

4, 5 (Sept.

Rep. CMU-CS-85-

Pa., Apr.

1985.

1987), 62-70.

development

environments.

IEEE

1986), 1117-1127. 88: 3rd ACM,

P.

Fidil: to

Symposium

New A

York, language

Scientific

programmers. S. Sheppard,

and

on Soflware

Development

Envi-

1988. for

scientific

Computing,

D. A., AND SCHULTZ, A. C.

professional

Intelligence,

description

1988).

Applwations

1989,

pp. 496-516.

1981,

of California,

SIGSOFT

28-30,

in Artificial

Calif.,

Univ.

12 (Dec.

ACM

Nov.

HOLT, R. W., BOEHM-DAVIS, for

4.0:

Division—EECS,

In Readings

Carnegie-Mellon

Compatat~on:

Philadelphia, 29.

I version

language

as reader.

SE-12,

27. HENDERSON, P., ED, ronments

Pan Science

in a structure

Science,

26. HA.BERMANN, A. N., AND NOTKIN, Trans.

L.

Alto,

Division–EECS,

of Computer

GOLDBERG, A.

system.

Tioga,

Experience

Science

129, Dept. 25.

Eds.

FORSTALL, B. T. Computer

M.

Computer

1991.

A truth

and N. J. Nilsson, 23.

DE VANTER, 91/659,

Mental

In

Empirical

E.

Soloway,

R.

representations

Studies Eds.

programming.

Grossman,

Ed.

of programs

of Programmers:

Ablex,

Norwood,

In SIAM,

Second

N. J.,

1987,

p, 33, 30. HOOVER, R., AND TEITELBAUM, attribute tion.

grammars,

SIGPLAN

Not.

T.

In Proceedings

Efficient of the

incremental SIGPLAN

evaluation 86 Symposium

of aggregate

values

on Compiler

Construc-

in

(ACM) 21, 7 (July 1986), 39-50.

ACM Transactions on Software Engineering and Methodolo~, Vol. 1, No 1, January 1992

126

R. A. Ballance et al.

-

31. HORTON, M. R, Ph.D 32.

33. JALILI, ACM

ACM

F., mm

Trans.

New

34. KAHN,

G.

KAHN,

Sci.

KAISER,

G. E.

B.,

P. A

38. KNUTH, LANG, B.

41.

LETOVSKY,

SoftuJ.

44,

AND

45

L.

Trans.

46. NEAL, 1987

Palo

of syntax

E.

D. A.

R.,

program

Alto SE-7,

dissertation,

based

Umv.

Dept.

of

on an incremental

of Illinois

at Urbana-

1984), 97-111.

[161, PP

In

47-51. In

Norwood,

plans

and

factors

for error.

Center,

P. H.

5 (Sept.

Empirical

N J., 1986,

program

Studzes

of

pp. 58-79

comprehension.

IEEE

In

User

A.

Centered

Norman

and

System S.

Deszgn:

W.

New

Draper,

Eds.

in

An

an

interactive

Alto,

Calif.,

Tech

Rep.

incremental

environment. 1980.

programming

environment

IEEE

1981), 472-481.

(Toronto,

R.

D.

Palo

m computing

Proceedings

AND COOEL C. G

Computer

D.

A structural

Science

PRAWITZ, holm,

systems

Apr.

The

Ergo

5-9,

and

attribute style

graphical

1987).

ACM,

system.

1s more

interfaces.

New

York,

In

1987,

CHI

+ GI

pp. 99-102.

In [27], pp. 110-120

than

cosmetic.

Commun.

ACM

33, 5

D.

Natural

REISS, S. P.

of

Software

York,

52. REPS, T., tion.

approach

Aarhus

program

development

1984,

the

ACM

semantics

Denmark,

Tech.

Sept.

Rep.

DAIMI

FN-19,

Wlksell,

Stock-

1981.

Theoretic

Stady.

Almquist

with

PECAN

program

and

Enuwonments

T.

T.,

editors

SMITH, B.,

1988),

SE-10,

57. STALLMAN,

R. M

In [1], pp.

147-156

Trans.

Transactions

Univ., A

Generator Ithaca,

Apr.

Lang.

The programmers

Reference

N. Y.,

Incremental

Program.

G., EDS.

Intelligence. 5 (Sept. EMACS:

Ellis

K.

Reason Norwood,

23-25,

on

1984).

ACM,

Syst.

apprentice:

Manual,

Second

Edi-

1987.

context

dependent

5, 3 (July

1983),

A research

analysis

for

449-477.

overview

Computer

Empmical

Artificial on Software

Maintenance

Systems

Chichester,

studies

and

Their

Applications

1988.

of programming

knowledge.

IEEE

Trans.

display

editor.

1984), 595-609. The

extensible,

STEELE, G. L., JR., AND SUSSMAN, G. J. of Technology

Pa.,

system.

Symposium

10-25.

SOLOWAY, E., AND EHRLICH, Eng.

development

Engineering

(Pittsburgh,

Syntheswer

Cornell

AND DEMERS,

ACM

AND KELLEHER,

in Artificial

The

Science,

RICH, C., AND WATERS, R. C. 21, 11 (Nov.

Software

pp. 30-41.

of Computer based

SIGSOFT/SIGPLAN

Development

AND TEITELBAUM,

Dept.

language

Softw.

Aarhusj

A Proof

REPS, T , TEITELB~UM,

Series

to operational

Umv.,

Deduct~on:

Graphical

Proceedings

New

Dept.,

1965.

Practical

ACM

for-

1990), 506-520

49, PLOT~IN,

58.

specify

1985.

comprehension Ablex,

analysis

Research

AND FEILER,

Human

Conference

(May

56.

to

pp. 411-432.

Typographic

55.

25-27).

1978.

Interaction,

F.

54

May editor

editors.

Eds.

Designing

OMAN,P.,

53.

, Jan.

formalism

Ph.D.

Pa.,

J. 27, 2 (May

Delocalized

NORD, R. L., AND PFENNING,

In

A

Science,

Alto,

program

and S. Iyengar,

48.

51.

of the 9th Annual N.M

1987.

environments.

directed in

47

50

and

151-188.

Pittsburgh,

Computer

processes

Global

Palo Eng.

L. R.

Feb.

of Computer

Manual.

N .J., 1986,

M.

Softtu.

Proceedings

A language-oriented

Human-Computer

Xerox

MEnINA-MORA,

In

Metal:

E

1983),

editing

Dept.

SOLOWAY,

Hillsdale,

SSL-80-1,

1981.

on relations

1986), 41-49

on

MASINTER,

based

(Albuquerque,

601, INRIA,

Univ.,

editor:

Users

43, LEWIS, C., AND NORMAN, Erlbaum,

structure

usefulness

3, 3 (May

Perspectwes

parsers.

AND MORCOS,

programming

Cognitive

S.,

capabilities.

Berkeley,

1986), 577-608.

Languages

3, 2 (Aug

SAGA

E. Soloway

LETOVSKY,

Rep.

B.,

dissertation,

Bravo

On the

Programmers,

Tech.

for

Literate

S.

detection

environments

8, 4 (Oct.

friendly

error

of Califorma,

1985.

39. LAMPSON, B. W. 40.

Syst.

Budding

Program.

The

Dec.

D. E.

Lang.

Carnegie-Mellon

C.

static Univ.

editing

of Programming

MtiLhsE,

Ph.D.

with

196-206.

Semantics

parser.

Champaign,

pp.

Comput.

Science,

37. KIRSLIS,

H.

semantics.

LANG,

malisms

LR(l)

1982,

Natural

G.,

J

editor

Divmion-EECS,

Generating

T

on Principles

York,

Computer

42,

Science

Program.

GALLIER,

Symposium

ACM,

36.

of a multi-language

Computer

HORWITZ, S., AND TEITELBAUM, attributes.

35.

Design

dissertation,

Intelligence Engineering

customizable,

Constraints.

Laboratory,

AI Memo

Cambridge,

and Methodology,

self-documenting

Vol

502, Massachusetts

Mass., 1, No

Nov.

1978.

1, January

1992

Institute

59.

STERLING, L., AND SHAPIRO, E. Press,

60,

Cambridge,

Mass.,

STROMFORS,O.

Editing

The Pan Language-Based

Editing System

The ArtofProlog:

Programming

Aduanced

.

127

Techniques.

MIT

1986.

large programs

39-46. 61. TEITELBAUM, T., AND REPS, T.

The

using

Cornell

astructure-oriented program

text editor.

synthesizer:

In [161, pp.

A syntax-directed

pro-

ACM24,9 (Sept. 1981),563-573. 62. TEITELBAUM,T., REPS,T., ANDHORWITZ,S. Thewhyand wherefore of the Cornell program synthesizer. In[l], pp. 8-16. (Mar. 1985). 63. TEITELMAN,W, Atour through Cedar. IEEE Trans. Softw. Eng. SE-llj3 64. VAN DE VANTER, M. L. Error management and debugging in Pan I. Tech. Rep. UCB/CSD 89/554, Computer Science Division–EECS, Univ. of California, Berkeley, Dec. 1989. design for language-based editing systems. Ph.D. 65. VAN DE VANTER, M. L. User interface gramming

environment.

dissertation,

Commzm.

Computer

Science

Division–EECS,

Univ.

of California,

Berkeley.

To be pub-

lished. 66. VAN DE VANTER, language-based 67.

WATERS, Not.

R. C.

(ACM)

WooD,

Rece,ved

Program

R. A.,

Int.

editors

J.

AND GRAHAM, Man-Mach.

should

Beyond programming

S. R.

Z–The

January

1991;

ACM

systems.

not

S. L.

Stud.

abandon

text

Coherent

user

interfaces

for

(1992). To appear. oriented

commands.

SIGPLAN

17,7 (July 1982), 39-46.

68. WINOGRAD, T. 69.

M. L., BALLANCE, editing

95% program

revised

Transactions

and

languages. Conzmun. In [11, pp. 1-7.

accepted

on Software

ACM

22,7

(July

1979),

391-401.

editor.

September

Engineering

and

1991

Methodology,

Vol.

1, No.

1, January

1992.