SAX: Generating Hypertext from SADT models

Nicola Cancedda
Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", via Salaria 113 - 00198 Roma
[email protected]
Gjertrud Kamstrup, Emanuele Pianta
IRST - Istituto per la Ricerca Scientifica e Tecnologica, Loc. Pantè di Povo - 38050 Trento
[email protected],
[email protected]
Ettore Pietrosanti
Finsiel SpA, via G. Bona - 00100 Roma
e.pietrosanti@finsiel.it
Abstract
Conceptual model validation is crucial in the information system development process, but it may be difficult to accomplish if the domain expert is not acquainted with the formal language used by the analyst. In this paper we present SAX, a system that generates hypertext descriptions of conceptual models designed with the SADT methodology. The combination of natural language and hypertext significantly lowers the communicative barrier between the analyst and the domain expert, thus increasing validation effectiveness. The application of hybrid techniques for text generation guarantees an optimal trade-off between robustness and portability across domains on one side and text fluency on the other.
Keywords: Conceptual Model Validation, Hypertext Generation, Flexible Templates
1 Introduction

The development of complex information systems usually relies on established analysis and design methodologies. These methodologies make frequent use of formalized or semi-formalized graphic languages, especially in the conceptual analysis phase: E-R diagrams, DFD, SADT and Petri nets, just to mention some of the best known. While these languages strongly contribute to increasing analysis reliability, they may turn into an obstacle whenever a domain expert is called upon to validate the resulting model. The expert may not know the analysis formalism, nor be expected to learn it, in order to interpret the model correctly and possibly detect subtle faults in a large description. In such a situation the analyst should provide the domain expert with additional, comprehensible documentation. Textual natural-language descriptions of the model are often suitable for this purpose. The motivation behind our work is that valuable analyst time could be saved if these texts were produced automatically. Furthermore, the analyst herself may more easily detect flaws in the model by reading a natural language description of it. Thus automatic generation of model descriptions can increase the accuracy of the conceptual modeling process.

In particular, our work focused on the problem of automatically generating a hypertextual description of models designed with the SADT methodology¹ [Marca and McGowan 1988]. In this paper we present SAX (SAdt eXplain), a system that automatically generates hypertext descriptions of any SADT model. The input to the system is the SADT model representation adopted by the "DAFNE Tools", a CASE system which automates the DAFNE Methodology², a flexible model for software production which comprises methods, techniques and tools supporting all levels of the software development process. The user can tailor the hypertext generation through parameters that select the global presentation strategy and the way specific parts of the diagram are described.

The SAX architecture is shown in fig. 1. It includes three main components. The Text Planner is responsible for content selection, global textual organization and sentence-level template-based phrasing. The Linguistic Realizer performs morphological synthesis and phonological adjustment, while the HTML Writer translates a hypertext plan into an HTML document. SAX is implemented in Prolog and runs in the Windows environment. A first version of the system is currently under beta-testing. The system is sponsored by Finsiel SpA and jointly realized by IRST, Finsiel SpA and the Computer Science Department of the University of Rome "La Sapienza".

The SAX application scenario involves different types of users. We will adopt the following convention to refer to each of them:

author: the person who writes the templates and the schemas to be used for generating the texts
analyst: the person who uses the system to generate a hypertextual description of the model (usually, but not necessarily, the person who designs the SADT models)
reader: the person who reads the final texts (may be the analyst herself or the domain expert)

¹ Structured Analysis and Design Technique is a trademark and copyright of Softech Inc.
² DAFNE™ is a copyright and registered trademark (© 1982, 1983, 1986, 1987, 1989-1996) of Data and Functions Networking, Finsiel Group.
[Figure 1: The SAX architecture. Elements shown: SADT model, User Interface, Labels, Communicative schemata and Templates, Text Planner, Label processor, Syntactic module, Morphological analyzer/generator, Linguistic Realizer, HTML Writer, Hypertext.]
1.1 The SADT language
A SADT model is a collection of diagrams, organized in a tree structure, used to describe an activity at different levels of abstraction: the root diagram accounts for the whole activity and specifies its interactions with the external world, while the other diagrams describe how the activity is decomposed into sub-activities at an increasing level of detail. Each diagram is composed of boxes, representing activities, which are connected by arrows, representing flows of materials, data and information (see fig. 2). The objects represented by the arrows may play different roles with regard to the activities. Possible roles are:
input: an object consumed during the activity or transformed in order to produce an output;
control: an object (usually a source of information) exerting some kind of influence on the activity;
output: the result of an activity;
mechanism: an entity (a person, an organization, an instrument etc.) performing the actions, or under whose responsibility the activity is carried out.
The role of an arrow is indicated by the box edge through which the arrow goes in or out: inputs go into the left box edge while outputs go out of the right one, controls go into the top edge and mechanisms go into the bottom one. Every box and arrow is given a label, i.e. a natural language expression informally describing an activity (boxes) or a flow (arrows). Box labels are constrained to be infinitive verb phrases while arrow labels should be noun phrases. No other constraints on the linguistic form of the labels are imposed. Telegraphic style is frequently encountered (i.e. articles and prepositions may be missing); acronyms and abbreviations are also quite frequent. As labels must be short, both for readability and for layout purposes, a glossary is included in the model. Each glossary entry provides a more detailed explanation either of the activity represented by a box or of the content of an arrow. A box representing a complex activity can be "exploded" into a diagram, thus obtaining a tree structure. Leaf-boxes can be annotated with activation rules specifying what inputs and controls are needed to obtain a certain box output.
[Figure 2: A SADT diagram. The diagram describes the activity "GESTIRE LA PRODUZIONE DI BENI INDUSTRIALI", decomposed into the boxes "Definire prodotti e obiettivi di produzione", "Fabbricare prodotti di qualità" and "Gestire rapporti clienti / fornitori". Controls (C1-C4): Strategie aziendali, Normativa, Caratteristiche stabilimento, Budget. Inputs (I1, I2): Ordini, Materiali. Outputs are numbered O1-O4; mechanisms are labelled TEC, VEN, DIR, PER, CQU, APP, AMM.]
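The edge convention of section 1.1 can be illustrated with a minimal Python sketch. It is not part of SAX (which is implemented in Prolog): the names `EDGE_TO_ROLE`, `Arrow` and `roles_of`, as well as the box identifier `A0`, are assumptions made only for this example.

```python
from dataclasses import dataclass

# Box edge -> role of the arrow touching that edge (section 1.1).
EDGE_TO_ROLE = {
    "left": "input",
    "top": "control",
    "right": "output",
    "bottom": "mechanism",
}

@dataclass(frozen=True)
class Arrow:
    label: str  # noun phrase labelling the flow
    box: str    # hypothetical identifier of the box the arrow touches
    edge: str   # box edge through which the arrow goes in or out

def roles_of(box, arrows):
    """Group the labels of the arrows touching `box` by the role they play."""
    roles = {role: [] for role in EDGE_TO_ROLE.values()}
    for arrow in arrows:
        if arrow.box == box:
            roles[EDGE_TO_ROLE[arrow.edge]].append(arrow.label)
    return roles

# Three arrows taken from the labels of fig. 2.
example = [
    Arrow("Materiali", "A0", "left"),
    Arrow("Budget", "A0", "top"),
    Arrow("Prodotti", "A0", "right"),
]
example_roles = roles_of("A0", example)
```

Keeping the convention in a single table means that a description generator only needs the role names, never the geometry of the diagram.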
2 Related work

In this section we present related work and systems that have properties in common with SAX. Text generation as a method for validating conceptual models and easing the non-expert's comprehension of them has been treated by several researchers (see [Dalianis 1992, Reiter 1994, Gulla 1995, Passonneau et al. 1996]). Many researchers also agree upon the usefulness of natural language descriptions of models for the analyst herself: the text can help her in iteratively designing the model. Gulla argues that natural language paraphrasing is a useful model validation technique. In his system, explanations are generated from PRM models, which are an extension and formalization of traditional data flow diagrams. Reiter presents a generation system which explains Entity-Relationship diagrams, while Passonneau et al. have realized a system that generates descriptions of data flow graphs. The above-mentioned systems use full Natural Language Generation (NLG) technology and generate plain text. See [Reiter 1995] for a distinction between NLG, template-based and hybrid approaches to text production.

Other systems presented in the literature are relevant for our purposes because they produce hypertexts, although not in the conceptual model validation domain. We will take a look at some of these systems, which all use template-based or hybrid generation techniques to generate dynamic hypertexts, i.e. hypertexts that are generated in response to user requests (clicks on links), possibly taking the browsing context into account³.

IDAS [Reiter et al. 1995] generates hypertexts concerning technical documentation. Since the system generates very short dynamic hypertexts (one paragraph at most), less high-level text planning is needed than in other systems: the user performs this task by following one hypertextual link rather than another. IDAS was initially designed to use pure NLG technology, but hybrid approaches were applied after a benefit-cost evaluation. The authors present the idea that intermediate techniques can be used to reduce the costs of using NLG by sacrificing those NLG advantages that are not regarded as important in the current application.

Important aspects of hypertext generation are considered in [Knott et al. 1996], where the ILEX system is described. Upon request from the user, ILEX generates dynamic hypertext for simulated tours in a museum. In ILEX a hybrid text production approach is used: canned text is interleaved with information coming from KB entries. The canned text is annotated with conditions to guarantee a certain amount of adaptation to the context, such as production of effective referring expressions.
The system keeps track of the discourse history and does not repeat the description of a museum item: a pointer to the previously generated description is given instead.

PEBA-II [Dale and Milosavljevic 1996, Milosavljevic et al. 1996] dynamically generates hypertext descriptions of a zoological database through an online interface. Discourse history is used to obtain context-sensitive text. The authors argue that hypertext significantly eases the user modeling task since part of the content selection is made by the user.

[Geldof 1995] briefly sketches a system capable of generating hypertextual descriptions of KresT knowledge systems, which makes use of rigid templates for sentence-level generation. In [Geldof 1996] a template-based approach is used both at the planning level and at the realization level. Page-templates filled with information from databases are used to generate hypertext nodes in a movie festival context. A main motivation for using dynamic hypertext is the continuously changing nature of the database information: as time passes, some information becomes less interesting or not interesting at all.

As in some of the systems presented in this section, the application domain of SAX is the validation of conceptual models. In our system the validation is obtained through natural language hypertextual descriptions. In section 3 we will argue that a hybrid text production approach is suitable in our domain and with our system requirements.

³ The content and organization of static hypertext, on the contrary, are completely predetermined by the hypertext authors.
3 A hybrid approach based on schemas and templates

When designing an automatic text generation system, one should choose the kind of linguistic resources which are most suitable to the task. A clear distinction is usually made between template-based approaches and deep generation-based (NLG) approaches. Template-based approaches are usually rated as efficient but not flexible, while deep generation is considered flexible but also inefficient and resource consuming. Also, NLG systems are usually difficult to update and require specialized support, while template-based systems can be maintained by non-linguists. However, the distinction between these two approaches tends to become less clear-cut. On the one hand, NLG systems are beginning to use templates when deep generation is not strictly necessary. On the other hand, we think that template-based systems could become more flexible and powerful. In our system we pursued the goal of integrating the two approaches. Firstly, we decided to use a classical NLG approach for the planning of the global structure of the text using communication schemas, while using templates at the sentence level. Secondly, we tried to enhance the flexibility of traditional static templates. In this section we present the communication schemas used in SAX and the hyper-template formalism developed in the system.
3.1 The communication schemas
To represent the global structure of text we use communication schemas (see [Not and Pianta 1995]). A communication schema is a variant of rhetorical schemas [McKeown 1985], enhanced with the communicative intention that the corresponding presentational pattern is meant to satisfy. All the schemas in SAX have been identified through analysis of a corpus of hand-written SADT diagram descriptions. A communication schema has the following structure:

c_schema(
    head(),
    intentions(),
    effect(),
    constraints(),
    body(),
    order()
).
The head slot identifies the communication schema. The intentions slot is meant to allow a free documentation string about the communicative intentions behind the schema. On the other hand, the effect slot describes, in a simplified but formal way, the mental state that the schema is meant to induce in the reader. The constraints must be verified before applying the schema. The body slot includes a list of sub-schemas that articulate the main schema. Each sub-schema can be optional and its expansion can be restricted by local constraints. The order slot includes linear precedence constraints on sub-schemas. If no linear precedence constraint is specified, a default order is assumed. An example of a communication schema is shown in fig. 3. The schema describes how to present the overall activity modelled by the diagram. Observe that the body includes both a sub-schema and a template. The schemas are recursive structures terminated by a call to templates which realize the sentence level.
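How the order slot might interact with the default order can be sketched in Python. This is only an illustration under stated assumptions: the function name `order_body`, the representation of precedence constraints as `(earlier, later)` pairs, and the choice of the listed body order as the default are ours, not a description of the SAX planner.

```python
def order_body(body, before):
    """Order sub-schema names so that every (a, b) pair in `before`
    places a ahead of b; items not constrained keep their listed order."""
    remaining = list(body)
    ordered = []
    while remaining:
        for item in remaining:
            # An item is blocked while one of its predecessors is still pending.
            blocked = any(a in remaining and b == item for a, b in before)
            if not blocked:
                ordered.append(item)
                remaining.remove(item)
                break
        else:
            # No item could be placed: the constraints contradict each other.
            raise ValueError("cyclic precedence constraints")
    return ordered
```

With the single constraint of fig. 3, `order_body(["sub_activities_summary", "activity"], [("activity", "sub_activities_summary")])` places `activity` first; with an empty constraint list the body is returned in its listed order.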
3.2 A flexible hyper-template formalism
The main feature of the template formalism is the possibility of coping with morphological agreement and phonological adjustment. It also allows the functional description of hyper-links and images to be inserted in the final text. A template is a declarative structure including two kinds of elements:

- connectives
- gaps
c_schema(
    head(
        main_activity(ActivityId)),
    intentions(
        'present the main activity of the diagram'),
    effect(
        know(reader, structure_of(ActivityId))),
    constraints(
        (sadt_parameter(strategy, forward),
         activity_topology(ActivityId, ActTopology))),
    body(
        [activity(ActivityId, forward, ActTopology),
         template(sub_activities_summary(ActivityId, forward))]),
    order([before(activity(_, _), sub_activities_summary(_))])).
Figure 3: A communication schema used in SAX

Connectives are formed by preselected linguistic items, while gaps can be seen as variables which are instantiated by other linguistic items (fillers) during the generation process (template resolution). Both connectives and fillers may include fixed or flexible items. Flexible items are realized differently according to the context in which they occur. In the current implementation of the formalism, flexible items may undergo morphological and/or phonological variation. The author can define a template through the following syntax:

template(<head>, <body>)

A <head> is a term⁴ which can include variables, thus allowing the definition of parametric templates. The <body> can include a sequence of template units. Here is a list of the most important template units:
Strings. These are sequences of characters that will be inserted in the text without modifications, for example:

'is responsible for'

Potential words. A potential word is a word form which can undergo phonological adjustment. It is described by a term specifying the base form and its lexical category:

w(noun, responsibility)
Sequences of potential words are mapped onto sequences of strings by the Linguistic Realization component. For example, in Italian:

[w(preposition, di), w(article, i), w(name, responsabili)]

becomes:

['dei', 'responsabili']

⁴ In the logic programming terminology, a term can be a constant, a variable or a structure (also called a compound term). A structure comprises a functor and a sequence of one or more arguments which are terms. A functor is characterized by its name (which is an atom) and its arity, i.e. its number of arguments. Syntactically a structure has the form f(t1, t2, t3, ..., tn), which has functor f/n.
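The mapping from potential words to strings can be illustrated with a small Python sketch. It covers only the merging of an Italian preposition with a following article; the table `CONTRACTIONS` and the function `realize` are hypothetical names introduced for this example and say nothing about how the SAX Linguistic Realizer is actually organized.

```python
# A few Italian articulated prepositions (preposition + article).
CONTRACTIONS = {
    ("di", "i"): "dei",
    ("di", "il"): "del",
    ("di", "gli"): "degli",
    ("da", "i"): "dai",
}

def realize(potential_words):
    """Map a sequence of (category, form) pairs onto a list of strings,
    merging a preposition with the article that follows it when a
    contraction is known; every other form is emitted unchanged."""
    out = []
    i = 0
    while i < len(potential_words):
        cat, form = potential_words[i]
        if cat == "preposition" and i + 1 < len(potential_words):
            next_cat, next_form = potential_words[i + 1]
            merged = CONTRACTIONS.get((form, next_form))
            if next_cat == "article" and merged is not None:
                out.append(merged)
                i += 2  # both the preposition and the article are consumed
                continue
        out.append(form)
        i += 1
    return out
```

Applied to the sequence of the example above, written as `[("preposition", "di"), ("article", "i"), ("name", "responsabili")]`, the sketch yields the two strings `'dei'` and `'responsabili'`.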
Morphological bundles. These are sets of morphological features that are mapped onto potential words and then onto strings by the Linguistic Realization component. For example, the following bundle:

morpho([cat=noun, pred=company, num=plur])

will be mapped onto the potential word w(noun, companies) and then onto the string 'companies'. When used in template definitions, the values of morphological features can be variables:

morpho([cat=noun, pred=company, num=Num])

Morphological variables make it possible to handle agreement phenomena which are difficult to treat with static templates; these variables are instantiated during the template resolution process.
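The role of morphological variables can be sketched as follows. This Python fragment is an assumption-laden illustration: the mini-lexicon `LEXICON`, the function `realize_bundle`, and the convention that capitalised feature values are variables (borrowed from Prolog syntax) are ours, and the verb entries are invented for the example.

```python
# Hypothetical mini-lexicon: (category, predicate, number) -> word form.
LEXICON = {
    ("noun", "company", "sing"): "company",
    ("noun", "company", "plur"): "companies",
    ("verb", "produce", "sing"): "produces",
    ("verb", "produce", "plur"): "produce",
}

def realize_bundle(bundle, bindings):
    """Resolve a morphological bundle to a string. Feature values that
    start with a capital letter (e.g. "Num") are variables and are
    looked up in `bindings`; all other values are taken literally."""
    def value(v):
        return bindings[v] if v[:1].isupper() else v
    key = (value(bundle["cat"]), value(bundle["pred"]), value(bundle["num"]))
    return LEXICON[key]

# The same binding for Num makes the noun and the verb agree.
bindings = {"Num": "plur"}
subject = realize_bundle({"cat": "noun", "pred": "company", "num": "Num"}, bindings)
verb = realize_bundle({"cat": "verb", "pred": "produce", "num": "Num"}, bindings)
```

Because both bundles share the variable, changing the single binding switches the whole phrase between singular and plural, which is the agreement behaviour that static templates cannot express.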
Picture descriptors. A template can introduce a picture in a hypertext by specifying the absolute name of a file or a functional expression which is evaluated during template resolution.
Slots. A slot is a gap to be filled by linguistic items during template resolution. In the SAX application domain these linguistic items are selected from the labels of boxes and arrows of a diagram. More specifically, slots are defined in relation to a certain activity. The author can refer to "the label of", "the inputs going into" or "the outputs coming out of" an activity with a given identifier. Here is an example of a slot specification:

slot(inputs, ActId, Agreement)

This expression refers to the input labels of the activity identified by ActId (input parameter). The variable Agreement is instantiated as a result of filling the slot (output parameter). A slot expression can be extended by a further expression specifying the syntactic elaboration that must be performed on the labels that fill the slot. For example:

slot(activity_label, ActId) with_parse optional(nominalization)

This slot will be filled by the label of the ActId activity; if possible, the verb phrase that forms the label will be nominalized; otherwise the label will be inserted without elaboration.
Control expressions. Template definitions can include conditional and disjunctive expressions. Conditional expressions bind the resolution of a subpart of the template to the satisfaction of certain constraints. Here is an example:

template(controls(ActId),
    if_then_else(
        exist_several(controls, ActId),            %if
        [template(itemized_controls(ActId))],      %then
        [template(coordinated_controls(ActId))])). %else

Following this definition, the controls of an activity are presented in an itemized list only if there are several of them. The example also shows that templates can use other templates recursively in their definition. Disjunctive expressions give alternative ways of expressing something. The template resolution mechanism uses disjunctive expressions to avoid repetition of expressive clichés:

or(['taking into account', 'considering'])
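A resolver for these two control expressions might look like the following Python sketch. The class `Resolver`, the tuple encoding of control units, and in particular the round-robin policy used to vary the alternatives of a disjunction are assumptions made for illustration; the text above only states that disjunctions are used to avoid repetition, not how an alternative is chosen. Alternatives are assumed to be plain strings here.

```python
import itertools

class Resolver:
    """Resolves nested template units into a flat list of strings."""

    def __init__(self):
        self._cycles = {}  # one round-robin iterator per disjunction

    def resolve(self, unit, ctx):
        if isinstance(unit, str):
            return [unit]
        if isinstance(unit, list):
            # A list is a sequence of units, resolved left to right.
            return [s for u in unit for s in self.resolve(u, ctx)]
        kind = unit[0]
        if kind == "if_then_else":
            _, test, then_part, else_part = unit
            return self.resolve(then_part if test(ctx) else else_part, ctx)
        if kind == "or":
            alternatives = tuple(unit[1])
            if alternatives not in self._cycles:
                self._cycles[alternatives] = itertools.cycle(alternatives)
            # Each use of the same disjunction yields the next alternative.
            return self.resolve(next(self._cycles[alternatives]), ctx)
        raise ValueError(f"unknown template unit: {kind!r}")

def exist_several(ctx):
    """Assumed reading of 'several': more than one control."""
    return len(ctx["controls"]) > 1

controls_template = ("if_then_else", exist_several, ["itemized"], ["coordinated"])
```

The strings `"itemized"` and `"coordinated"` stand in for the two sub-templates of the example; in a fuller sketch they would themselves be resolved recursively.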
template(diagram_title(DiagId, ActId),
    [ /* format operator list */
      format([title1, title_case],
             /* slot without syntactic elaborations */
             slot(label, ActId)),
      /* special character */
      &newline,
      /* control expression */
      if_then_else(
          /* constraint conjunction */
          ( has_father_diagram(DiagId, FatherDiagId),
            not sadt_parameter(presentation, only_text) ),
          /* then */
          [format(
               /* parametric style */
               [picture_link(FatherDiagId)],
               /* picture with absolute identifier */
               [picture(image, 'father.gif')]),
           &newline],
          /* else */
          [&newline]),
      if_then_else(
          /* simple constraint */
          has_son_diagram(DiagId),
          /* pictures identified through functional descriptors;
             the first one is a clickable image */
          [picture(map, diagram_image(DiagId))],
          [picture(image, diagram_image(DiagId))])
    ]).

Figure 4: A sample hyper-template definition
Formatting. The template formalism makes it possible to include any subpart of a body within the scope of one or more formatting operators such as italic, bold, title1 etc. All HTML format operators can be used. In addition, some extra operators have been defined; for example, a list of items can be formatted as list (column of bullet items), enumerate (column of numbered items) but also as coordinate (simple coordination: ", ... and "). Note that a format operator can be applied to a slot. For example:

template(listed_outputs(ActId),
    ['The activity produces:',
     format([list, italic], slot(outputs, ActId))]).

The formalism also supports style definitions, as sets of format operators:

style(my_style, [list, italic]).
Links. Hypertext links are treated as a special class of format instructions. They are specified through compound terms, which refer to linked documents through absolute addresses (file name) or functional descriptors, evaluated at run time. Here is an example of a functional link description:

link(to, glossary(activity, paragraph))

This descriptor is evaluated as a link from an activity description to the corresponding entry of the activity glossary. Fig. 4 shows an example of a complex template definition.
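The effect of format operators on a filled slot can be approximated with a short Python sketch. The HTML tags chosen for each operator, the function names `coordinate` and `apply_format`, and the rule that operators are applied from the innermost (last listed) outwards are all assumptions of this example rather than documented SAX behaviour.

```python
from html import escape

def coordinate(items):
    """Simple coordination: 'a, b and c'."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]

def apply_format(operators, items):
    """Apply format operators to the items filling a slot. The last
    operator is applied first, so [list, italic] italicises each item
    and then arranges the items as a bullet list."""
    texts = [escape(item) for item in items]
    for op in reversed(operators):
        if op == "italic":
            texts = [f"<i>{t}</i>" for t in texts]
        elif op == "bold":
            texts = [f"<b>{t}</b>" for t in texts]
        elif op == "list":
            texts = ["<ul>" + "".join(f"<li>{t}</li>" for t in texts) + "</ul>"]
        elif op == "enumerate":
            texts = ["<ol>" + "".join(f"<li>{t}</li>" for t in texts) + "</ol>"]
        elif op == "coordinate":
            texts = [coordinate(texts)]
        else:
            raise ValueError(f"unknown format operator: {op!r}")
    return "".join(texts)
```

A style such as `my_style` would then simply be a named operator list passed as the first argument.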
4 The SAX system

The input to SAX is a SADT model description represented in the formal language used internally by the "DAFNE" CASE tool. The output of the system is a natural language description (in Italian), composed of one hypertext page for each diagram in the model plus one page for each of the two glossaries attached to the model and an index page.
4.1 The Generation Process
After a preliminary step (not shown in fig. 1) in which the input model is translated from the source language to a Prolog representation, text planning is activated. Hypertext generation is performed in three phases: Text Planning, Linguistic Realization and HTML Translation. The Text Planning component computes the structure and the content of the text on the basis of communication schemas and templates. The Text Planner recursively expands a root schema following a top-down strategy and building a text tree structure. The selection of communication schemas is guided by the topological structure of the diagram being described and by the user preferences on the description strategy (see section 4.3). When the Text Planner reaches the sentence level, a template solver is called which integrates diagram labels in the text tree structure. The output of the Text Planner is a textual tree with instantiated templates as leaves. Instantiated templates may include strings, potential words and ground morphological bundles. The Linguistic Realizer maps morphological bundles onto words and carries out all the required phonological and orthographic adjustments. The HTML Translator then maps the textual tree to an HTML document.
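The top-down expansion performed during text planning can be sketched in Python. The dictionary encoding of schemas, the use of plain functions as templates, and the names `expand` and `leaves` are assumptions made for this illustration; optional sub-schemas and ordering constraints are left out for brevity.

```python
def expand(name, schemas, templates, ctx):
    """Top-down expansion of a schema into a text tree whose leaves are
    instantiated templates. Returns None when the schema's constraints
    do not hold in the given context."""
    if name in templates:
        # Sentence level reached: the template solver fills in the text.
        return ("leaf", name, templates[name](ctx))
    schema = schemas[name]
    if not all(check(ctx) for check in schema.get("constraints", [])):
        return None
    children = [expand(child, schemas, templates, ctx) for child in schema["body"]]
    return ("node", name, [c for c in children if c is not None])

def leaves(tree):
    """Collect the instantiated templates in left-to-right order."""
    if tree[0] == "leaf":
        return [tree[2]]
    return [text for child in tree[2] for text in leaves(child)]
```

The list returned by `leaves` is what a realization stage and an HTML stage would then process in turn.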
4.2 Label transformations
The activity and arrow labels cannot be inserted in the text as they are. In order to produce a fluent and readable text, a label processor adapts labels to the context in which they are inserted. Basically, three kinds of operations are performed:
- a word can be substituted with a morphological bundle parametric with respect to some agreement features;
- a verb can be substituted by the corresponding nominalization;
- articles and prepositions can be inserted.
The adaptation of labels is performed through a special kind of syntactic analysis. During the system analysis phase two approaches to label transformation were considered. The first one is based on pattern matching, the second on a full cycle of syntactic analysis, transformation and re-generation. After some testing we realized that using pattern matching would have led to a great number of ad hoc rules; on the other hand, using the full machinery of a text analyzer and generator seemed overkill given the relatively simple syntactic transformations that were needed in our application domain. The solution adopted in the system is a hybrid one. The transformations are carried out by a Definite Clause Grammar which performs a shallow syntactic analysis of labels and produces as output a sequence of linguistic items which can be strings, potential words or morphological bundles, i.e. a subset of the kinds of items that can be used to define templates. These sequences are mapped onto actual sentences through morphological synthesis and phonological adjustment; deep sentence generation is unnecessary. Note that both the morphological and the syntactic analysis of labels can fail. In this case labels are returned unchanged; however, a different template is selected to arrange them in the text. In this way, robustness with respect to lack of linguistic coverage has been achieved, and performance degrades gracefully.
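The fallback behaviour described above can be illustrated with a deliberately simplified Python sketch. It replaces the Definite Clause Grammar with a naive split on the first space, uses English glosses instead of Italian labels, and relies on a hypothetical two-entry lexicon; the names `NOMINALIZATIONS`, `nominalize` and `describe` and both template wordings are assumptions of the example.

```python
# Hypothetical nominalization lexicon (English glosses for readability).
NOMINALIZATIONS = {"define": "definition", "manage": "management"}

def nominalize(label):
    """Naive analysis of an infinitive verb phrase 'verb object...'.
    Returns the nominalized phrase, or None when the analysis fails."""
    verb, _, rest = label.partition(" ")
    noun = NOMINALIZATIONS.get(verb.lower())
    if noun is None or not rest:
        return None
    return f"the {noun} of {rest}"

def describe(label):
    """Pick a template according to whether the transformation succeeded."""
    phrase = nominalize(label)
    if phrase is not None:
        return f"The activity consists of {phrase}."
    # Fallback template: the label is inserted unchanged, in quotes.
    return f'The activity is labelled "{label}".'
```

The point of the sketch is the control flow: a failed analysis never blocks generation, it only selects a more conservative template.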
4.3 User preferences
The analyst can make choices about the text properties by setting the values of some attributes. There are two global attributes, valid for the text as a whole: presentation format and description strategy. Setting the presentation format consists in choosing between plain ASCII text, HTML formatted text or HTML hypertext. Choosing the description strategy selects the overall content structuring strategy. As could be expected, and as a thorough analysis of a corpus of human-written descriptions has confirmed, a diagram is usually described by presenting one sub-activity after the other, starting from the top-left and ending at the bottom-right (forward strategy). There are situations, however, in which the reader is interested in a different presentation, for instance if she wants to trace how each output in a diagram is produced. To deal with such cases, SAX is able to structure its output along different guidelines. For instance, the description can be forced to proceed from each output to the activity that produces it, and then backward to the arrows involved in this activity (backward-flow strategy), and so on until all outputs have been presented.

Sometimes the choices made by the system with regard to the description of diagram elements lead to sentences that are, if not incorrect, at least clumsy. A way to overcome this problem would be to make a more sophisticated domain model available to the system in order to exploit semantic constraints. Unfortunately it is unrealistic to assume the availability of such a domain model: there is no plausible restriction on the domain a SADT model could be designed for. As an alternative, we decided to allow the analyst to force some of the choices by setting the values of local attributes associated with single diagram elements. For instance, box labels are often subject to nominalizations, and sometimes this may lead to poor text. If the analyst is not satisfied with a certain nominalization, she can set a local attribute preventing such a transformation. User preferences are collected by means of a graphical interface and stored with the model representation. The current version of the system allows the following attributes to be set:
label transformation. The analyst can force or prevent any of the implemented label transformations.
grammatical number specification. Sometimes the grammatical number of a label is ambiguous. The analyst can disambiguate it by adding morphological information to the words in a label.
verb selection. As the system cannot perform a deep semantic analysis of the labels, it describes the relation between activities and arrows with generic verbs, e.g. "activity x produces y". The analyst can force the system to use a more specific verb.
Local attributes allow user preferences to be kept and re-used in subsequent description generations for the same model: this represents a clear advantage over "brute" post-editing.
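One way to picture local attributes is as per-element overrides of system defaults. The following Python sketch assumes a dictionary keyed by element identifier; the names `DEFAULTS`, `attribute` and `output_sentence`, the identifiers `A1` and `A2`, and the sentence wording are hypothetical and do not reflect the actual storage format used with the model representation.

```python
# System-wide defaults (assumed values, for illustration only).
DEFAULTS = {"nominalize": True, "verb": "produces"}

def attribute(prefs, element_id, name):
    """Local attribute of a diagram element, falling back to the default."""
    return prefs.get(element_id, {}).get(name, DEFAULTS[name])

def output_sentence(prefs, activity_id, activity, output):
    """Describe an activity/output relation with the preferred verb."""
    verb = attribute(prefs, activity_id, "verb")
    return f"{activity} {verb} {output}."

# Preferences set once by the analyst and reused at every regeneration.
prefs = {"A2": {"verb": "manufactures", "nominalize": False}}
```

Since the overrides live alongside the model rather than in the generated text, regenerating the description after a model change does not discard them.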
5 System output

In this section a hypertext generated by SAX is presented. A discussion of the overall structure of the hypertexts produced by the system follows. Fig. 5 shows the hypertext generated for the SADT diagram previously presented in fig. 2. The figure shows the clickable diagram and the first part of the text (an approximate English translation can be read in fig. 6). The text is generated using a forward strategy (see section 4.3). The text shows some examples of label transformations. The arrow label "materiali" has undergone a transformation where a preposition ("da") and a determiner ("i") have been added to obtain "da i materiali"; after phonological adjustment the final "dai materiali" is obtained. There is also an example of verb nominalization. The activity label "definire prodotti e obiettivi di produzione" has become "la definizione dei prodotti e degli obiettivi di produzione".
Figure 5: Generated model description with clickable diagram image
The activity of managing industry production is carried out using the materials, the orders and the customer confirmations. The activity is influenced by the following factors:
- the budget
- the regulation
- the plant characteristics
- the company strategies

The following results are obtained from the activity:
- the production goals
- the production numbers
- the products
- the purchase orders

The activity of managing industry production can be divided into the following subactivities:
1. the definition of products and production goals
2. the manufacturing of quality products
3. the management of customer/provider relations
Figure 6: Translation of the generated text

The noun phrases referring to an activity (for instance "gestire la produzione dei beni industriali") represent a link to a glossary page where the activity is explained in more detail. The same holds for arrow references (for instance "dai materiali" and "la normativa"). These are just some of the available links. Let us take a look at the overall hypertext structure generated for a generic model. Recall from section 1.1 that diagrams are organized in a tree structure: there is one root diagram that is decomposed into several sub-diagrams, which in turn might be decomposed into other sub-diagrams etc. In SAX one hypertext diagram page is generated for each diagram in the SADT model. Two glossaries are also generated, one for the activity labels and one for the arrow labels. Finally, an index keeps track of all the available model pages. A network of links has been created to ease the navigation between all the model description pages. Fig. 7 contains a schematization of this network of links⁵. The links can be grouped in three classes:
- diagram/diagram links: they occur in two forms, from Up or Down buttons to parent or child diagrams respectively, or from the map images to child diagrams.
- diagram/glossary links: the text contains references to activities or flows, possibly rephrased with regard to the original model labels. Each reference points to the glossary page providing further information about the activity/flow.
- index/diagram links: the reader can access any diagram page or glossary from the index and vice versa.
⁵ The bold boxes in the figure are hypertext pages, the grey arrows are hyperlinks, the arrow icons inside the main diagram page are hyperlink buttons, and the activity and arrow references are pieces of text with connected links. Finally, the map image is a clickable version of the SADT diagram.
Figure 7: Page hierarchy and connecting links
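The page set and the three classes of links can be summarized with a small Python sketch. The function `build_pages`, the page names `activity_glossary`, `arrow_glossary` and `index`, and the diagram identifiers used in the usage note are assumptions introduced for illustration; the sketch records only which page links to which, not the HTML itself.

```python
def build_pages(tree, root):
    """`tree` maps a diagram id to the ids of its child diagrams.
    Returns a dict {page: list of pages it links to}."""
    diagrams = {root} | set(tree) | {c for cs in tree.values() for c in cs}
    parent = {c: d for d, cs in tree.items() for c in cs}
    glossaries = ["activity_glossary", "arrow_glossary"]
    pages = {}
    for d in diagrams:
        links = set(tree.get(d, []))   # Down buttons and clickable map
        if d in parent:
            links.add(parent[d])       # Up button
        links.update(glossaries)       # activity and arrow references
        links.add("index")
        pages[d] = sorted(links)
    for g in glossaries:
        pages[g] = ["index"]           # glossaries link back to the index
    pages["index"] = sorted(diagrams) + glossaries
    return pages
```

For a model with a root diagram `A0` and two child diagrams, `build_pages({"A0": ["A1", "A2"]}, "A0")` yields six pages: three diagram pages, two glossaries and the index.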
6 Conclusions and further development

This paper discussed the approach followed in SAX, a system for the automatic generation of hypertextual descriptions of SADT models. The adopted solutions combine the advantages of a hypertextual output format with those of a hybrid natural language generation architecture. Hypertext pages corresponding to whole diagrams are built statically, so that they can also be used for system documentation in printed form. A full network of links allows rapid access to all the information the domain expert needs to validate the model.

A hybrid solution has been chosen for text generation: a classical NLG approach is adopted for planning the global structure of the text, while flexible templates are used for sentence-level generation. The formalism devised for templates copes with morphological agreement and phonological adjustment, thus allowing the generation of flexible and fluent text. The decision to give up traditional sentence-level generation led to a system running on PCs, in the Windows environment, and capable of generating each diagram description in less than three seconds. The analyst can influence the output both at a global level, by stating a "description strategy" to follow, and locally, by constraining the way single diagram elements are linguistically expressed.

Future work will proceed in the following directions. First, a template editing environment suited to the non-linguist system maintainer will be developed. Furthermore, the exploitation of the text contained in glossary entries for generation purposes will be considered. As glossary entries are basically arbitrary texts, their use in the description generation process was at first discarded. Nevertheless, an evaluation conducted together with potential SAX users pointed out the convenience of having at least some of the information contained in the glossaries readily available while reading the description, including when the description is read as printed text. Our investigation is consequently proceeding in two directions. On the one hand, we are trying to apply superficial analysis techniques to extract information relevant to the reader from glossary entries; on the other hand, we are considering which constraints should be imposed on glossary entries in order to exploit them more thoroughly.
Other future work might include the realization of an English version of the system as well as an extension to different conceptual modeling formalisms.
References

[Dale and Milosavljevic 1996] Robert Dale and Maria Milosavljevic, March 1996. Authoring on Demand: Natural Language Generation in Hypertext Documents. In Proceedings of the First Australian Document Computing Conference, Melbourne, Australia.
[Dalianis 1992] Hercules Dalianis, 1992. A Method for Validating a Conceptual Model by Natural Language Discourse Generation. In Proceedings of the Fourth International Conference on Advanced Information Systems Engineering, Springer Verlag, 425-444.
[Geldof 1995] Sabine Geldof, May 1995. Interfacing Knowledge: Text Generation from Knowledge Systems. In Proceedings of the Fifth European Workshop on Natural Language Generation, Leiden, The Netherlands.
[Geldof 1996] Sabine Geldof, June 1996. Hyper-Text Generation from Databases on the Internet. In Proceedings of the Second International Workshop on Applications of Natural Language to Information Systems, Amsterdam, The Netherlands.
[Gulla 1995] Jon Atle Gulla, 1995. A General Explanation Component for Conceptual Modeling in CASE Environments. ACM Transactions on Information Systems.
[Knott et al. 1996] Alistair Knott, Chris Mellish, Jon Oberlander and Mick O'Donnell, 1996. Sources of Flexibility in Dynamic Hypertext Generation. In Proceedings of the International Workshop on Natural Language Generation.
[Marca and McGowan 1988] David A. Marca and Clement L. McGowan, 1988. SADT, Structured Analysis and Design Technique. McGraw-Hill Book Company.
[McKeown 1985] Kathleen McKeown, 1985. Text Generation. Cambridge University Press, Cambridge.
[Milosavljevic et al. 1996] Maria Milosavljevic, Adrian Tulloch and Robert Dale, January 1996. Text Generation in a Dynamic Hypertext Environment. In Proceedings of the 19th Australasian Computer Science Conference, Melbourne, Australia.
[Not and Pianta 1995] Elena Not and Emanuele Pianta, April 1995. Specifications for the Text Structurer. GIST deliverable, TST-2, LRE Project 062-09.
[Passonneau et al. 1996] Rebecca Passonneau, Karen Kukich, Jacques Robin, Vasileios Hatzivassiloglou, Larry Lefkowitz and Hongyan Jing, June 1996. Generating Summaries of Work Flow Diagrams. In Proceedings of the International Conference on Natural Language Processing and Industrial Applications, Moncton, Canada.
[Reiter 1994] Ehud Reiter, 1994. Linguistically Based Generation of Software Documentation. Final Technical Report RL-TR-94-110, Rome Laboratory (USAF), New York, USA.
[Reiter 1995] Ehud Reiter, 1995. NLG vs. Template. In Proceedings of the Fifth Workshop on Natural Language Generation, Leiden, The Netherlands.
[Reiter et al. 1995] Ehud Reiter, Chris Mellish and John Levine, 1995. Automatic Generation of Technical Documentation. Applied Artificial Intelligence, 9:259-287.