
ITRI-99-05

Component tasks in applied NLG systems

Lynne Cahill and Mike Reape

March 1999

Dept of Artificial Intelligence, University of Edinburgh

This work was produced under the RAGS project, supported by the EPSRC under grants L77041 and L77102.

Information Technology Research Institute Technical Report Series
ITRI, Univ. of Brighton, Lewes Road, Brighton BN2 4GJ, UK
TEL: +44 1273 642900    FAX: +44 1273 642908
EMAIL: [email protected]
NET: http://www.itri.brighton.ac.uk


Component tasks in applied NLG systems

Lynne Cahill and Mike Reape

1 Introduction

This paper reports on the findings of an investigation of the component tasks of twenty-one applied NLG systems. The systems were chosen according to three criteria:

1. they had to be fully implemented;
2. they had to be complete, i.e. perform complete generation of text from non-textual input;
3. they had to take non-hand-crafted input.

The systems discussed in this paper are those documented in A survey of applied NLG systems by Daniel Paiva ([Pai98]). They do not represent all existing systems which fulfill the three criteria above, but we hope they form a representative sample. In this report, we present the results of attempting to fill a simple table characterising where in the system architecture each of seven component tasks occurs. The architecture of each system is defined in terms of the three-stage pipeline architecture proposed in [Rei94], consisting of Content Determination, Sentence Planning and Surface Realisation. For each system, we provide approximate mappings from the system's modules to Reiter's, and specify in which of the three Reiter stages each task occurs.

The report is structured as follows. First we define the seven tasks in question. We then briefly discuss the interesting points of each system covered and present the correspondences between the actual system modules and Reiter's stages in tabular form. (In these tables, CD = content determination, SP = sentence planning and SR = surface realization.) We then present our table of "component task/stage in the Reiter architecture" correspondences. Finally, we draw some preliminary conclusions based on our table-filling exercise. This report is the second deliverable of the RAGS project, an EPSRC project funded under grants GR/L77041 to Chris Mellish (Edinburgh) and GR/L77102 to Donia Scott (Brighton).

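To make the three-stage architecture concrete, here is a minimal sketch of how such a pipeline composes. This is not code from any of the surveyed systems: the Message and SentencePlan types, the stock-report domain and all the function bodies are invented purely for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    """A language-neutral piece of information selected for expression."""
    predicate: str
    arguments: dict

@dataclass
class SentencePlan:
    """A sentence-sized grouping of messages with planning decisions attached."""
    messages: list = field(default_factory=list)

def content_determination(domain_data):
    """Stage 1: decide *what* to say, as a set of messages."""
    return [Message("rose", {"stock": s, "amount": a})
            for s, a in domain_data.items()]

def sentence_planning(messages):
    """Stage 2: group messages into sentence plans (aggregation, ordering, ...)."""
    return [SentencePlan(messages=[m]) for m in messages]

def surface_realisation(plans):
    """Stage 3: map each plan onto an actual sentence string."""
    return " ".join(
        f"{m.arguments['stock']} rose {m.arguments['amount']} points."
        for p in plans for m in p.messages)

def generate(domain_data):
    # The pipeline: each stage consumes the previous stage's output.
    return surface_realisation(sentence_planning(content_determination(domain_data)))

print(generate({"ACME": 3, "Widgets Inc": 2}))
```

The point of the sketch is only the shape of the dataflow: the question this report asks is where, in such a pipeline, each of the seven component tasks is actually carried out.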

2 The seven component tasks

The seven tasks are as follows:

1. lexicalisation
2. aggregation
3. rhetorical structuring
4. referring expression generation
5. ordering
6. segmentation
7. centering/salience/theme

It is worth noting that these tasks are not intended to be exhaustive or even necessarily representative. In particular, they are not evenly distributed over the Reiter architecture, with the Surface Realiser particularly under-represented. We now take each in turn and specify what it is taken to mean and the kinds of problems we found (or anticipate) in delimiting the definitions.

2.1 Lexicalisation

We take lexicalisation in the first instance to mean the choice of particular content words which will appear in the final output text. We do, however, differentiate between "lexicalisation" and "lexical choice". The first we take to indicate a broader meaning: the conversion of something (concepts or representations in some logical formalism, or something closer in form or denotation to linguistic lexical forms) into lexical items. The second we use in a narrower sense to mean deciding between lexical alternatives. Not all of the systems we looked at perform genuine "lexical choice", so we have opted for the term "lexicalisation" to cover the phenomena we are discussing here. More detailed discussion of lexicalisation in the given systems is in [Cah98].

2.2 Aggregation

This can have more or less specific interpretations. On one very general reading, it can be taken to mean the putting together of two pieces of information. On a narrower interpretation it can mean the putting together of sentences. Within the context of RAGS, it does not make much sense to work with the narrower interpretation, so when we talk of aggregation, we mean any process of putting together pieces of information which at some other level are separate. More detailed discussion of aggregation in the given systems is in [Rea].


2.3 Rhetorical structuring

This determines the rhetorical relations between pieces of information. It involves concepts such as "elaboration", "contrast", etc., and determines how the pieces of information should be related by the text structure.
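As a toy illustration of the idea (the relation name is from RST, but the data structures, messages and cue-word comment are invented, not taken from any surveyed system), rhetorical structuring attaches separate pieces of information under a named relation:

```python
def relate(relation, nucleus, satellite):
    """Attach two pieces of information under a named rhetorical relation."""
    return {"relation": relation, "nucleus": nucleus, "satellite": satellite}

plan = relate("contrast",
              {"msg": "stocks rose in the morning"},
              {"msg": "they fell sharply after lunch"})
# A later stage could express this relation with a cue word such as "but".
```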

2.4 Referring expressions

One of the key components of most generation systems is that which decides how to refer to concepts or entities. This relates to lexicalisation, but it may also be done at a higher level: determining, for instance, that a pronoun is appropriate in a particular case, without determining which pronoun is used. This is most obviously the case in multilingual systems, where the pronoun is determined by the language in question, but it may also apply in monolingual systems, which might, for instance, specify at some higher level that a feminine personal pronoun is required, but not determine whether it is "she" or "her" until after the syntactic structure for the surface realisation has been generated.
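A minimal sketch of this two-stage view follows; the function names, feature set and pronoun table are our own invention for the example, not any surveyed system's apparatus:

```python
# Stage 1 (e.g. sentence planning): decide the *type* of referring expression.
def choose_expression_type(entity, already_mentioned):
    # A previously mentioned entity can safely be pronominalised.
    return "personal_pronoun" if already_mentioned else "full_np"

# Stage 2 (e.g. surface realisation): decide the *form*, which needs syntax.
PRONOUNS = {("fem", "subject"): "she", ("fem", "object"): "her",
            ("masc", "subject"): "he", ("masc", "object"): "him"}

def realise_pronoun(gender, grammatical_role):
    # "she" vs "her" is only decidable once the syntactic role is fixed.
    return PRONOUNS[(gender, grammatical_role)]

assert choose_expression_type("mary", already_mentioned=True) == "personal_pronoun"
assert realise_pronoun("fem", "object") == "her"
```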

2.5 Ordering

The linear ordering of pieces of information may be determined at a fairly high level, but equally it is possible for the ordering of sentences in the surface text to be determined at quite a late stage in the generation process. Where appropriate, we distinguish between inter- and intra-sentential ordering.

2.6 Segmentation

Segmentation involves the dividing up of information/text into sentences and paragraphs. There would seem to be a balance between aggregation and segmentation: aggregation puts pieces of information together and segmentation takes them apart, so it may well be the case that where one system relies (almost) entirely on aggregation, another relies (almost) entirely on segmentation. That is, if you start with a large, complex, single representation of the information to be conveyed, then a process of segmentation is necessary to get down to the sentences in question. If, however, you start out with a large set of small pieces of information, then a process of aggregation is required to put some of those pieces together into single sentences. We do not restrict our definition of segmentation to segmentation into actual text units, but consider also the segmentation of information into "sentence-sized chunks".
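The duality can be sketched as follows (the data structures and both function bodies are invented for illustration; neither is taken from a real system). Aggregation packs small separate propositions into sentence-sized chunks, while segmentation carves one large structure down to the same granularity:

```python
def aggregate(propositions, max_per_sentence=2):
    """Start small: pack separate propositions into sentence-sized chunks."""
    return [propositions[i:i + max_per_sentence]
            for i in range(0, len(propositions), max_per_sentence)]

def segment(message_tree):
    """Start big: split one complex representation into sentence-sized chunks."""
    if message_tree.get("sentence_sized"):
        return [message_tree]
    # Recurse into sub-messages until everything is sentence-sized.
    return [chunk for sub in message_tree["parts"] for chunk in segment(sub)]

# Both routes end at the same granularity: a list of sentence-sized chunks.
chunks_a = aggregate(["p1", "p2", "p3"])
chunks_s = segment({"parts": [{"sentence_sized": True, "id": "s1"},
                              {"parts": [{"sentence_sized": True, "id": "s2"}]}]})
```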

2.7 Centering/salience/theme

Although we initially treated centering as a separate task, it was found that very few systems make use of centering, and similarly, rather few make any use of salience or theme. We have therefore collapsed the two into a single category. By centering, we mean the notion of "forward-looking" and "backward-looking" centers which affects pronominalisation. Like centering, salience/theme applies to the connectedness of sentences in a text: the topic of a sentence relates to the theme of the text as a whole, and salience is determined by this relation. Within the RAGS project, we do not restrict ourselves to any theoretical biases in the use of these terms, but consider the general concept and the usage of the authors of the systems. Salience is also related to rhetorical structuring.

3 Systems which accept input from a user

In the next part of this paper we consider a set of nine systems, all of which accept input from a user. The systems are: GIST, Drafter, Drafter2, Patent Claim Expert, Joyce, ModEx, Exclass, HealthDoc and GhostWriter. First, we give a brief description of each system and then a tentative analysis of what we have found. Each system description is followed by a tabular representation of our best guess as to the correspondence of the system modules to the three stages of the Reiter architecture.

3.1 GIST, Drafter and Drafter2

We consider these three systems together because of their historical relatedness. GIST and Drafter were variants on a system for different domains: GIST ([Con96], [PC96]) produces administrative texts (e.g. pension claim documentation), while Drafter ([PVF+95]) produces software manuals. Drafter2 ([PSE98], [PS98], [SPE98]) is a development of the Drafter concept, but with a rather different architecture. All three systems have an authoring tool, which allows the user to interact with the system to produce the domain model from which the text is ultimately generated. In addition, all three systems are fully multilingual, with texts generated in English and French in Drafter, English, German and Italian in GIST, and English, French and Italian in Drafter2. The texts can be generated in any of the languages regardless of the input language.

The architectures of GIST and Drafter (henceforth G+DI) are to all intents and purposes the same, and for most of the discussion they will be treated as one. There is a minor difference in their modular structure which is reflected in the system module correspondence tables, but this does not affect the assignment of the component tasks to modules. The architecture of Drafter2 (DII), however, is rather different and illustrates very clearly some of the problems with assigning the RAGS components to particular modules in the generic pipeline architecture.

Lexicalisation in all three systems takes place in both the surface realiser and the content determiner. Although there is often a one-to-one correspondence between concepts used at higher levels of processing and lexical items, and the concept names are often the same as the English lexical items which express them, this is irrelevant: as the multilingual case makes clear, the lexicalisation for different languages is necessarily distinct at the surface realisation level.

Referring expressions in G+DI are generated in both the sentence planner and the surface realiser. There are two stages to the generation of referring expressions: the determination of the general type of expression (e.g. personal pronoun) and the specification of the actual surface expression to be used (e.g. "she" vs "her"). In DII, there is no specific apparatus for generating referring expressions, but they emerge during the CD and SP phases.


Segmentation in G+DI takes place in the sentence planning module; in DII, it takes place throughout. In fact, the difference between G+DI and DII illustrates nicely the relation between aggregation and segmentation. Whereas in G+DI aggregation takes place in both the CD and SP modules and segmentation takes place in the SP module, in DII there is no aggregation, only segmentation. The information to be represented in G+DI is a collection of pieces of information, which are combined and then segmented into sentence-sized chunks, before being output as actual sentences. In DII the information starts out as a single high-level goal, with a set of phrase structure rules which defines the structure of the subgoals and their expression. The whole process can therefore be seen as one of repeated segmentation, from the top-level goal of the text to be generated right down to the generation of the individual words (a schematic sketch of this top-down expansion is given below).

Neither system has a component responsible for centering/salience/theme. Ordering is determined at both the SP and SR levels in G+DI, with some ordering relations being determined at the level of information and others by syntactic considerations. DII does not do any explicit ordering. Rhetorical structuring takes place in the CD module of both systems, although it can be argued in both cases that it does not really take place in the system itself at all: content determination includes the user's preparation of the domain model, and its content and structuring are largely predetermined for each system.
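The DII-style "repeated segmentation" can be pictured as top-down rewriting with phrase-structure-like rules. The rules and goal names below are invented for the example and are not Drafter2's actual formalism:

```python
# Hypothetical rewrite rules: each goal expands into an ordered list of
# subgoals, bottoming out in word-level goals (prefixed "w:").
RULES = {
    "instruct": ["precondition", "action"],
    "precondition": ["w:first", "w:save", "w:your", "w:work"],
    "action": ["w:press", "w:quit"],
}

def expand(goal):
    """Repeatedly segment a goal until only words remain."""
    if goal.startswith("w:"):
        return [goal[2:]]
    return [word for sub in RULES[goal] for word in expand(sub)]

print(" ".join(expand("instruct")))  # -> first save your work press quit
```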

Gist

                         CD    SP    SR
    User interface       ✓
    Strategic planner    ✓     ✓
    Tactical generators              ✓

Drafter

                         CD    SP    SR
    User interface       ✓
    Strategic planner    ✓
    Micro planner              ✓
    Tactical generators              ✓

Drafter2

                         CD    SP    SR
    Authoring tool       ✓
    Generator                  ✓     ✓

3.2 Patent Claim Expert

The Patent Claim Expert (PCE) ([SN96]) assists in the authoring of patent claims, once again using an authoring tool with which the user interacts to build a model from which the text is generated. It is an interesting system in that the texts it produces have to conform to very strict legal requirements; in particular, the texts all have to consist of a single (albeit possibly very complex) sentence. This puts a rather different perspective on some of the component tasks, for instance centering, segmentation and inter-sentential ordering. Nevertheless, the texts produced are very complex, with clausal relations which mirror sentential ones in more standard texts. The avoidance of ambiguity is also paramount because of the legal status of the texts.

Lexicalisation in the PCE takes place in Content Determination, being largely done by the user, either by choosing terms/expressions from menus or by typing them in. Although the authors claim that the system has been implemented for both English and Russian, there is no discussion of the Russian version of the system and very little about multilingual aspects of it. However, it is reasonable to assume that, although the lexicalisation is done in the CD part of the system, the language-specific determination of the precise output takes place in the surface realiser.

Aggregation takes place in the realiser, where related propositions are grouped in tree representations. This applies also to rhetorical structuring, the relations between the propositions being determined by these representations. Intra-sentential ordering takes place in the surface realiser; this is effectively the same as inter-sentential ordering for most systems, as the clauses within the single sentence are often equivalent, in both length and the information they convey, to separate sentences in "normal" texts. The generation of referring expressions is done by the user in deciding which items are coreferent.

                         CD    SP    SR
    Authoring module     ✓
    Text planner               ✓
    Realizer                         ✓

3.3 Joyce

Joyce ([KKR91], [Ram90], [RK]) generates descriptions of software components that have been developed using the Ulysses programming environment, which is intended to facilitate the design of secure, distributed software systems. It consists of three modules, the text planner, the sentence planner and the realiser, which correspond exactly to Reiter's three stages. The surface realiser in Joyce does not do any of the RAGS component tasks, performing only morphological realisation together with some syntactic fine-tuning. In addition, rather little is done by content determination, since the system interfaces to the Ulysses environment, which does much of the content determination itself.

The relation between aggregation and segmentation is interesting in Joyce, with segmentation taking place in the text planner and aggregation happening in the sentence planner. This is in some sense counterintuitive, but it does reflect what the system does: it divides the information to be conveyed into sentence-sized chunks and then aggregates those chunks into larger units.

As well as segmentation, rhetorical structuring takes place in the text planner, with the relations between the bits of information determined at that stage. The linear ordering of the chunks is also determined at this stage. Referring expressions are generated in the sentence planner, which is also responsible for lexicalisation.

                         CD    SP    SR
    Text planner         ✓
    Sentence planner           ✓
    Realizer                         ✓

3.4 The ModelExplainer (ModEx)

The ModelExplainer ([LRR96], [LRR97], [LR97]) generates descriptions of object-oriented software models in hypertext. The input to the generator is a model built interactively by the user. It has a four-stage pipeline architecture, consisting of a text planner, sentence planner, document realiser and formatter, the last of which falls outside the three-stage Reiter architecture.

ModEx is an interesting system in that it does not do any lexical choice or referring expression generation, this all being determined by the input data. ModEx requires that the relations in its models are defined as arcs between nodes representing the entities. The nodes must all be labelled with nouns and the arcs with third person present tense verbs; these verbs and nouns are then used directly by the system. The generation of texts is directed by the use of eight text schemas, which are chosen in the text planner; this takes care of the rhetorical structuring, ordering and segmentation. ModEx, like Joyce, does segmentation of the information into sentence-sized chunks in the text planner, before aggregating these into larger units in the sentence planner.

                         CD    SP    SR
    Text planner         ✓
    Sentence planner           ✓
    Realizer                         ✓
    Formatter            (outside the Reiter stages)

3.5 Exclass

Exclass ([CK94]) generates job descriptions from a knowledge base that is constructed by the user. The user specifies the input in terms of things like "main activities", "routine assignments" and "special assignments". The system consists of two modules: the "job description builder", which is the authoring tool that the user interacts with, and the "job description generator", which generates the text.

Exclass, like GIST and Drafter, makes use of an authoring tool with which the user interacts to build the abstract text plan. As with GIST and Drafter, a certain degree of lexical selection is done at this phase, the concepts chosen by the user almost always corresponding to the lexical item that is output. However, there is also, we believe, some choice between synonyms in the surface realiser, although this is not explicitly documented. The authoring tool (which broadly corresponds to the content determination module) is responsible for the segmentation, as well as the rhetorical structure and ordering, which are done by the user. Aggregation is done in Exclass at a late stage, by the use of ellipsis; this takes place in the surface realiser. Referring expression generation is not performed in any real sense, all referring expressions being "always generic or proper". There is no use of centering or salience.

                                  CD    SP    SR
    Job description builder       ✓
    Job description generator           ✓     ✓

3.6 HealthDoc

HealthDoc ([DHW95], [DHH97], [HDHP97], [HW96], [Par97], [WH96], [Wil95]) is an unusual system in that it does not generate language from another mode, but rather selects and repairs existing text parts. The input to HealthDoc is a full natural language text, which (allegedly) contains every piece of information which could be wanted, and the output is a smaller text, containing only that information relevant to the patient in question. Given this, no segmentation is done by the system, the sentences being already present in the input text. The system does, however, perform the remainder of the RAGS tasks, as it needs to process and alter the text as necessary to ensure textual coherence and cohesion. To this end, the sentence planner performs a certain amount of lexicalisation (changing words where necessary), aggregation and ordering. Rhetorical structuring takes place in both content determination and the sentence planner. Referring expressions are determined by the input. We believe that centering and salience are determined in the content determination phase, although this is not explicitly discussed.

                                   CD    SP    SR
    Authoring tool                 ✓
    Selection of content           ✓
    Sentence repair                      ✓
    Realization and formatting                 ✓

3.7 GhostWriter

GhostWriter ([MCM96], [Cer96]) "generates draft documentation of aircraft systems both in English and French". GhostWriter has a pipelined architecture with four modules:

1. a text planning module,
2. a sentence planning module,
3. a surface realization module, and
4. a post-editing module.

At face value, then, it would appear to be a perfect match with the Reiter architecture, with the post-editing module being an additional module. However, no real description of the text planner is given in the references. GhostWriter is an MTT-based system and the planner generates semantic representations (SemRs) of the kind found in Gossip and LFS. The text planning module performs a type of semantic aggregation and lexical choice, as well, we assume, as segmentation. As in FoG, Gossip and LFS, the result of the lexical choice is a deep syntactic representation (DSyntR). Referring expression generation is treated as part of lexical choice. The following quote exhausts what is published about the surface realization component: "As a basic component of the linguistic realisation step, we have used the GLOSE sentence generation system (. . . ). A significant part of its theoretical background relies on Meaning-Text Theory (MTT) (. . . )." The fourth module produces "post-edited text". This really covers all that is published about the architecture of GhostWriter; the references listed concentrate on knowledge acquisition and multilingual lexical choice.

                            CD    SP    SR
    text planner            ✓
    sentence planner              ✓
    surface realization                 ✓
    post editing            (outside the Reiter stages)

4 Systems which accept input from a knowledge base or database

In the next part of this paper we consider a set of ten systems, all of which accept input from a database or knowledge base. The systems are: Ana, PlanDoc, FoG, Gossip, LFS, the Caption Generation System (CGS), PostGraphe, Komet, AlethGen and Proverb. First, we give a brief description of each system and then a tentative analysis of what we have found. Each system description is followed by a tabular representation of our best guess as to the correspondence of the system modules to the three stages of the Reiter architecture.

4.1 Ana

Ana ([Kuk88]) takes "a set of half-hourly stock quotes from the Dow Jones News and Information Retrieval System" as input and produces "a summary report of the day's activities on Wall Street, similar in content and format to the human-generated reports found on the financial pages of daily newspapers". The system contains four modules:

1. a fact generator,
2. a message generator,
3. a discourse organizer, and
4. a surface generator.

The fact generator just prepares the numerical data for the rest of the system and apparently contains nothing of interest to us here. The message generator's function is to infer "messages" from the facts; that is, it determines which facts are interesting. The discourse organizer groups messages according to topic, and combines and eliminates them. The surface generator takes the ordered messages and maps them into phrases in its phrasal lexicon, builds clauses from the phrases, and combines clauses into complex sentences according to its clause-combining grammar. It is therefore responsible for referring expression generation and segmentation. Centering/salience/theme are unmentioned and apparently not treated. As far as we can tell, aggregation is only performed by virtue of the "clause-combining grammar" (if this should be called aggregation).

Lexical choice is nonstandard by today's standards in that a phrasal lexicon is used rather than a traditional grammar plus lexicon. Kukich says "By macro-level knowledge structures and processes is meant the use of semantic units consisting of whole messages, lexical items consisting of whole phrases, syntactic categories at the clause level, and a clause-combining as opposed to sentence-generating grammar." Kukich goes to some lengths to justify this approach based on "the knowledge engineering principle" and sublanguage considerations. There is also an accompanying emphasis on revision, given the engineering considerations of sublanguage, phrasal lexicon and clause-combining grammar, and on fluency, given that "the report generation system can produce fluent text without having the detailed knowledge needed to construct mature phrases from their elementary components". (A schematic sketch of the phrasal-lexicon approach follows.)

[Kuk88] is a rather long and detailed paper, but it is still rather difficult to tell what Ana does and doesn't do and where it does or doesn't do it. In terms of the Reiter architecture, it appears to be more of a two-stage pipeline, with the first three modules (encompassing content selection, text planning and segmentation) corresponding to content determination and the fourth module to surface realization.
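The following sketch is an illustrative approximation of the phrasal-lexicon idea only; the message types, phrases and joining rule are invented, not Ana's actual lexicon or grammar. Whole messages map to whole phrases, and a clause-combining step joins them rather than building sentences word by word:

```python
# Whole messages map directly onto whole phrases.
PHRASAL_LEXICON = {
    ("close", "up"): "the market closed higher",
    ("volume", "heavy"): "trading was heavy",
}

def realise_message(message):
    return PHRASAL_LEXICON[message]

def combine_clauses(clauses):
    """A clause-combining grammar, reduced here to a single joining rule."""
    return ", and ".join(clauses).capitalize() + "."

messages = [("close", "up"), ("volume", "heavy")]
print(combine_clauses([realise_message(m) for m in messages]))
# -> "The market closed higher, and trading was heavy."
```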

                            CD    SP    SR
    fact generator          ✓
    message generator       ✓
    discourse organizer     ✓
    surface generator                   ✓

4.2 PlanDoc

PlanDoc ([KMS+93, MKS94]) is a system which generates reports about telephone network planning operations. The system design and goals were determined by a sublanguage analysis of the reports that telephone routing engineers prepare manually. The system architecture is a pipeline with five modules:

1. a message generator,
2. an ontologizer,
3. a content planner,
4. a lexicalizer, and
5. a surface generator.

The message generator converts the telephone routing report into Lisp-readable format. The ontologizer augments the messages with semantic knowledge from the domain of discourse. The content planner determines which information in the messages should appear in the various paragraphs and organizes the overall narrative; it is in this module that aggregation, paraphrasing, cue word selection and "message combination" are performed. Among other things, it is claimed to choose "paraphrasing forms that maintain focus and ensure coherence". The output of the content planner is "a set of complex messages in hierarchical attribute-value format". The lexicalizer maps the attributes of the messages into case roles and selects content words for the values of the attributes. Finally, the surface generator is the FUF/SURGE package developed by Michael Elhadad; among other things, it chooses function words, performs morphology and linearizes the derivation tree.

The implementation status of centering and referring expression generation is unclear. Also, segmentation and rhetorical structuring are apparently done in the content planner, but it is not clear that the content planner does anything more than select, group and order messages. One notable feature of PlanDoc is that in a certain sense the lexicalizer is the sentence planner, since it maps message attributes to "semantic case roles" and chooses all content words; a schematic sketch of this mapping follows.
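In the sketch below, the attribute names, role labels and message contents are invented for the example; PlanDoc's real messages and case-role inventory are richer. The point is only the shape of the mapping: message attributes become semantic case roles, and content words are chosen for the values:

```python
# A complex message in attribute-value form (schematic, invented example).
message = {"act": "extend", "equipment": "fiber", "site": "CEV 2103"}

def lexicalise(msg):
    """Map message attributes to semantic case roles and pick content words."""
    return {
        "process": {"lex": msg["act"]},            # verb choice
        "medium": {"lex": msg["equipment"]},       # direct object
        "location": {"lex": "at " + msg["site"]},  # locative adjunct
    }

# The case-role structure is what a FUF/SURGE-style surface generator would
# consume; here we just flatten it for illustration.
roles = lexicalise(message)
print(f"{roles['process']['lex']} {roles['medium']['lex']} {roles['location']['lex']}")
```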

                            CD    SP    SR
    message generator       ✓
    ontologizer             ✓
    content planner         ✓
    lexicalizer                   ✓
    surface generator                   ✓

4.3 FoG

FoG ([GDK94], [KP91a], [KP91b], [BCG+90]) is a system that generates weather forecasts in English and French. Its main area of emphasis is the treatment of the "telegraphic style" of its weather forecasting sublanguage, and sublanguages more generally. FoG converts data to text in three stages: data extraction, conceptual processing and linguistic processing. The linguistic model chosen is Meaning-Text Theory (MTT) ([MP87]), and this particular choice of linguistic model deeply influences FoG's system architecture. The system has a pipeline architecture of either five or six modules (depending on the reference read):

0. the conceptualizer,
1. the planner,
2. the interlingual component,
3. the deep-syntactic component,
4. the surface-syntactic component, and
5. the morphological component.

The conceptualizer is an expert system that takes the raw data as input and produces a set of conceptual representations in frame-like form. Very little is said about the conceptualizer and it is not mentioned in some of the references. The planner takes these "unordered and unstructured" conceptual representations as input and maps them to an interlingual representation. [KP91a] state that "[the] conceptual representations undergo a three-step text planning process, which we will not describe here", and this is not further described in the references in this paper as far as the authors are aware. Two text planner designs are described in [GDK94], one based on data salience and one based on temporal order. Both use text schemas, and the work of the planner includes rhetorical structuring (via the schemas), ordering, segmentation and salience/theme.

The interlingual component carries out lexical choice and has roughly the same function as the "lexicalizer" of PlanDoc. The similarities are quite striking given that FUF/SURGE is based on Functional Unification Grammar and systemic functional linguistics while FoG's interlingual component is based on MTT. The deep-syntactic component takes a deep-syntactic representation as input and maps it to a self-descriptive surface-syntactic representation (SSyntR); this does not include any of the RAGS tasks. The surface-syntactic component takes a surface-syntactic representation and maps it to a morphological representation, and is roughly equivalent to the linearisation/morphological component of FUF/SURGE, which occurs after surface realization proper. Apparently, this also means that referring expression generation (if it is done at all) is performed in the surface-syntactic component. The morphological component "produces the final text inflecting the words when necessary".

The only component task remaining to be addressed is aggregation, on which the references are somewhat unclear. On the one hand, there may be no need for aggregation in a sublanguage with a "telegraphic style". On the other hand, implementations of MTT (especially by this group) normally make use of lexical functions for "paraphrasing". This paraphrasing could be considered a form of aggregation, but it is unclear from the references whether only later versions of the system use lexical functions, or indeed whether any of them do. In general, a comparison of FoG's architecture with the levels of MTT makes it quite clear that FoG's architecture is determined almost entirely by the MTT model. What is remarkable is the fact that PlanDoc and FoG both use lexical choice to build (deep) syntactic structure before surface realization, and they do it in approximately the same location in their respective pipelines relative to the other modules. We will see two more examples of MTT-based systems developed by the Montréal/CoGenTex group in the next two sections, namely Gossip and LFS.

                                    CD    SP    SR
    conceptualizer                  ✓
    planner                         ✓
    interlingual component                ✓
    deep-syntactic component              ✓
    surface-syntactic component                 ✓
    morphological component                     ✓

4.4 Gossip

Gossip ([CI89], [IKP91], [KP91b]) produces summaries of users' activities on an operating system, with the intention of assisting the administrator in detecting suspicious usage of the system. In the references to this system, much is made of the adoption of MTT as the basis for the system design. In particular, the lexicon plays the central role and is involved in "(1) semantic net simplification, (2) determination of root lexical node for the deep syntactic dependency tree, (3) possible application of deep paraphrase rules using lexical functions, and (4) surface syntactic realization." Content determination is based on topic/comment structure, and the semantic nets input to the linguistic module contain theme/rheme specifications which influence lexical and syntactic choices during realization. First, the planner produces "sentence-sized semantic nets" which it marks with theme/rheme information; an MTT "generator" does not do text planning. The theme/rheme constraints influence clause ordering (our "ordering"), pronominalization (i.e., referring expression generation) and lexical choice.

Gossip consists of four modules, three comprising the planner and one the linguistic module, and, like FoG, is based on Meaning-Text Theory. We will not discuss the breakdown of the planner into modules, as this is difficult to ascertain from the references. The planner consists of two processes: content determination, which is based on a representation of textual knowledge called the Topic Tree (based on topic/comment structure), and a process of text structuring. Both processes are heavily dependent on the conceptual knowledge of the domain. The output of the planner is an ordered set of sentence-sized semantic nets or semantic representations (SemRs) marked with theme/rheme information. These are the input to the lexicalisation process, which we assume occurs in the linguistic module.

The planner performs ordering and theme/rheme identification, which is used throughout the entire linguistic generation process. We also know that whatever rhetorical structuring is done is done here, and that segmentation is definitely done in the planner. Centering theory as such is not mentioned explicitly, although topic/comment structure and theme/rheme marking are. Referring expression generation is also unmentioned, although pronominalization is mentioned without an indication of its implementation status; we assume it occurs in the surface realiser. It is not at all clear whether any form of aggregation is implemented.


                            CD    SP    SR
    planner                 ✓
    unnamed                       ?
    unnamed                       ?
    linguistic module                   ✓

4.5 LFS

LFS ([IKLP92], [KP91b], [LR97], [MP87]) generates labour force reports in both English and French from statistical database tables of numerical values. LFS has a pipeline architecture with six modules:

1. the planner,
2. the interlingual component,
3. the semantic component,
4. the deep-syntactic component,
5. the surface-syntactic component, and
6. the morphological component.

As should be obvious, LFS is based on MTT and its architecture is very similar to FoG's; it too was developed by the Montréal/CoGenTex group. It is unclear whether one can infer that referring expression generation is performed in the planner or whether the planner just specifies referring expression types for later realization. Theme/rheme specifications are added by the text planner, which carries out rhetorical structuring and segmentation as well. In LFS, aggregation is specifically mentioned as occurring in the planner (although not named as such). The interlingual component performs lexical choice. LFS thus uses a "semantic component" like Gossip and an "interlingual component" like FoG. Furthermore, it splits the process of lexical choice into a purely lexical choice part (performed by the interlingual component) and an "arborization" (deep syntactic structure generation) part. Some lexical functions may be applied at this point, implying paraphrase and possibly aggregation as well.

                                    CD    SP    SR
    planner                         ✓
    interlingual component          ?     ✓
    semantic component                    ✓
    deep-syntactic component              ?     ✓
    surface-syntactic component                 ✓
    morphological component                     ✓

4.6 Caption Generation System

The Caption Generation System (CGS) ([MMCRng], [MRM+95], [MP93]) generates explanatory captions for graphical presentations of various types. Its input is a picture representation (the graphical elements and their mapping from the data set) generated by the graphics presentation system SAGE, annotated with a complexity metric. The major areas of emphasis are multi-modality and system evaluation. CGS's architecture is a highly articulated pipeline with seven modules which between them include all of the component tasks. Since the purpose here is to locate those tasks in the pipeline and in the Reiter architecture, and these locations are all indicated explicitly, the review of CGS will be relatively brief. The seven modules of CGS are:

1. the text planning module,
2. the ordering module,
3. the aggregation module,
4. the centering module,
5. the referring expression module,
6. the lexical choice module, and
7. the surface realization module.

The text planning module performs rhetorical structuring and presumably segmentation through its use of decomposition operators; it uses the [MP93] planner. The ordering module takes a partially ordered text plan and imposes a total order on the speech acts. The aggregation module is able to get by with a simple strategy because of the nature of the application, namely, "[it] only conjoins pairs of contiguous propositions about the same grapheme types in the same space". The centering module also performs intraclausal ordering amongst co-arguments to increase the coherence of the text; it is based on the centering and focus models of Sidner, Grosz, Joshi and Weinstein. The center and focus information/structure is used in referring expression generation, in deciding when to perform subordination and aggregation, and in intraclausal ordering. The referring expression module is largely based on [DR95]. CGS is a FUF/SURGE-based system and the lexical choice module performs the same function as in PlanDoc. The surface realization module is FUF/SURGE.


                                    CD    SP    SR
    text planning module            ✓
    ordering module                 ✓
    aggregation module              ✓
    centering module                      ✓
    referring expression module           ✓
    lexical choice module                 ?     ?
    FUF/SURGE                                   ✓

4.7 PostGraphe

PostGraphe ([Fas96], [FL96], [GL96]) generates pairs of figures and French texts for statistical reports from annotated statistical data. PostGraphe's architecture is a pipeline with two modules:

1. the planning module, and
2. the realization module.

[GL96] call the process the planning module implements deep generation. It is divided into two steps in PostGraphe. First, the conceptual representation is segmented and structured to build a discourse representation, which consists of sentence-sized chunks linked by rhetorical relations. Second, the discourse representation is traversed and the chunks converted into sentences. The first step of the planner thus performs both rhetorical structuring and segmentation. The second step seems to mark elements of the segmented discourse representations as instructions for the surface realization component, so it corresponds neither to any of the component tasks nor to any of the modules of the Reiter architecture. Prior to the planner, a component would be needed to translate the underlying conceptual structure (domain knowledge) into segmented discourse representations; [GL96] say "We do not know yet how to produce the semantic representation from the conceptual representation . . . ". PostGraphe uses "Segmented Discourse Representation Theory (SDRT) proposed by Asher (1993), which extends DRT by adding rhetorical relations such as those found in RST (Mann and Thompson 1987)", so it clearly would be doing rhetorical structuring but for the comment that "We do not yet produce this discourse structure, but we are working on the problem . . . ".

The planning module has four phases, the last of which is the post-optimization phase; this at least potentially does some aggregation, removing any redundancy that arises from a lack of grouping of intentions at earlier stages. The realization module is based on Prétexte, a Prolog implementation of systemic grammar. As in Penman/KPML, Prétexte makes two passes over the semantic input, establishing functional structure during the first pass and performing lexical choice (and, presumably, referring expression generation) during the second. Final constituent structure is determined after the first two passes (i.e., linearization). This is really all we know about PostGraphe's architecture based on the references, which concentrate mostly on graphics generation and the linguistic representation of time and its generation. In particular, centering, referring expression generation and salience/theme are unaddressed or unmentioned.

                            CD    SP    SR
    planning module         ✓     –
    realization module            –     ✓

4.8 Komet

Komet ([BT95], [TB94]) is a multilingual (English, German and Dutch) system which produces biographical texts from a database of facts about people; the examples shown all involve architects. The pipeline architecture does not represent the work Komet does very well, and it is difficult to assign specific tasks to particular modules. The diagrammatic representations of the system indicate a modular structure, but the "levels" of processing actually form more of a continuous network, which is traversed during the construction of a GSP ("Generic Structure Potential") that is then used to generate the text. For our purposes, the construction of the GSP probably covers the content determination and sentence planning modules, with the surface realiser performing the linguistic realisation. The overall organisation of the system does allow for the possibility of more varied choices in generation, i.e. for generating language in different modes, registers and genres, but in practice the system as implemented always generates chronologically ordered text in the biography genre. Rhetorical structuring, segmentation, ordering, centering/salience and, we believe, referring expression generation are all determined by the GSP, with lexicalisation and (we believe) aggregation taking place in the realiser.

                            CD    SP    SR
    GSP construction        ✓     ✓
    Linguistic realiser                 ✓

4.9 AlethGen

AlethGen ([Coc96a], [Coc96b], [CD94b], [CD94a], [CD94c]) is a system which generates letters in French to customers from facts in a customer database. It is particularly interesting in that it combines full linguistic generation with the use of canned and semi-fixed text. For the fully linguistically generated text, lexicalisation takes place in the surface realiser, with some cue expressions being chosen in the sentence planner; however, it is not clear how this relates to the generation of canned text. The "rhetorical planner" and "noun phrase and anaphora planning", which together correspond to the sentence planning module, take care of aggregation, rhetorical structuring, referring expression generation and ordering, with segmentation (we believe) in Content Determination (although it is possibly not present in the system at all, being predetermined in the input). No processing of centering or salience is performed by the system.


                                          CD    SP    SR
    Conceptual planner                    ✓
    Rhetorical planner                          ✓
    Noun phrase and anaphora planning           ✓
    Linguistic realization                            ✓

4.10 Proverb

Proverb ([GS86], [Hua94], [Hua96], [Hua97], [HF96], [HF97], [Met91]) is probably the system which most closely fits the Reiter three-stage pipeline architecture, having a macroplanner, microplanner and realizer which correspond, for all intents and purposes, to the content determiner, sentence planner and surface realiser. The surface realiser does not perform any of the tasks in which we are interested, doing only morphological and some surface syntactic processing. The macroplanner (content determiner) is responsible for segmentation of the input proof into sentence-sized information chunks, as well as for rhetorical structuring and inter-sentential ordering. Intra-sentential ordering is performed by the microplanner (sentence planner). The microplanner also performs lexicalisation and referring expression generation, specifically generating (and choosing between) paraphrases, and is responsible for aggregation, putting together "chunks" of the proof.

                    CD    SP    SR
    macroplanner    ✓
    microplanner          ✓
    realizer                    ✓

5 System/Task tables

The following two tables give our interpretation of where in each system the component tasks are performed. Where a task is definitely not performed by a system, an X is placed in the table. Where it is unclear whether or where a task is performed, we have made a guess and indicated this with brackets or a question mark. The stages are labelled with the names [Rei94] gives the three stages (CD = Content Determination; SP = Sentence Planning; SR = Surface Realisation). Where a task is performed in more than one stage, all relevant stages are given. A number of systems also have (user) to indicate that the task is performed by the user, but may be considered part of the system, as the user interacts with the system in performing the task. Table 1 is for the systems which accept input from a user and Table 2 is for the systems which accept input from a database or knowledge base.

6 Conclusions

The analysis described above leads us to a number of conclusions about the architecture of applied NLG systems.

                 Lex      Agg       Rhet    Ref     Ord     Seg       Cent/Sal
    Gist         CD/SR    CD/SP     CD      SP/SR   CD/SP   SP        X
    Drafter      CD/SR    CD/SP     CD      SP/SR   CD/SP   SP        X
    Drafter2     CD/SR    SR        (user)  CD/SP   X       CD/SP/SR  X
    Pat Claim    CD/SR    SP        SP      CD      SR      X         X
    Joyce        SP       SP        CD      SP      CD      CD        X
    ModEx        X        SP        CD      (user)  CD      CD        X
    Exclass      CD/(SR)  SR        (user)  (user)  (user)  CD        X
    HealthDoc    SP       SP        CD/SP   (user)  SP      (user)    CD(?)
    GhostWriter  SP(?)    SP        CD      SP(?)   ?       CD        CD/SP(?)

Table 1: Systems which accept input from a user

                 Lex     Agg        Rhet    Ref     Ord     Seg     Cent/Sal
    Ana          SR      SR         CD      SR      CD      SR      ?
    PlanDoc      SP/SR   CD/SP(?)   CD(?)   CD(?)   SR(?)   CD      CD
    FoG          SP/SR   SP         CD      SR      CD      CD      X
    Gossip       SP/SR   ?          CD(?)   SR      CD      CD(?)   CD/SP/SR
    LFS          SP/SR   CD(?)      CD      CD      CD      CD      CD/SP/SR
    CGS          SR      CD         CD      SP      CD      CD(?)   CD/SP
    PostGraphe   CD      CD(?)      CD      SR      CD      CD      ?
    Komet        SR      (SR)       CD/SP   CD/SP   CD/SP   CD/SP   CD/SP
    AlethGen     SR/SP   SP         SP      SP      SP      (CD)    X
    Proverb      SP      SP         CD      SP      CD/SP   CD      X

Table 2: Systems which accept input from a database or knowledge base

Although there is no immediately obvious consensus on the modules required by an applied NLG system, virtually all of the systems we looked at could be viewed as having a pipeline architecture in which every module corresponds to one, part of one, or more than one of Reiter's three modules.

The nature of the system's input has a significant influence on the presence/absence, nature and timing of certain of the tasks. Those systems that take user input, for instance, tend to perform lexicalisation in a rather different way (often involving virtually no actual choice by the system itself) and at a different point (in both the CD and SR) than those which take database/knowledge base input, which tend to perform more traditional "lexical choice" either in the SP or the SR.

A more detailed analysis of the ordering of the component tasks in the database-input systems also reveals some interesting facts about the systems in question. Although the systems permitted between them a large number of possible orderings, there were some very clear tendencies. For instance, rhetorical structuring is virtually always the first or one of the first tasks performed, and lexical choice is virtually always the last or one of the last. There are a number of factors which prohibit a definitive ordering analysis, not least of which is the lack of sufficiently detailed descriptions of the systems in question. However, there are significant groupings of tasks which are always either in interchangeable order (i.e., we don't know which order they happen in) or adjacent in the ordering. These groups are as follows:

- rhetorical structuring, centering/salience/theme, ordering;
- aggregation, segmentation;
- lexical choice, referring expression generation.

Despite the tempting conclusion that these three groupings correspond to the three-stage Reiter architecture, this cannot quite be made to work, primarily because of the surface realiser. As we have already noted, the seven component tasks chosen for the RAGS analysis do not tend to occur in the surface realiser in many systems at all. However, it is true to say that lexicalisation and referring expression generation are the most likely to occur in the surface realiser. Equally, it is true to say that rhetorical structuring, salience/theme and (inter-sentential) ordering are most likely to occur in content determination. The middle grouping is more problematic in that its members vary much more widely in where in the pipeline they occur.

The results of this analysis have been in some respects disappointing, and it has not been possible to ascertain conclusively at what stage many of these chosen tasks actually happen in existing applied NLG systems. However, the exercise has been valuable in determining what generalisations it is possible to make about such systems and in determining the degree of flexibility that will be needed if we are to accommodate the large proportion of existing fully implemented applied systems in the RAGS architecture.

References

[ANL94] Proceedings of the Fourth Conference on Applied Natural Language Processing, Stuttgart, Germany, 1994.

[ANL97] Proceedings of the Fifth Conference on Applied Natural Language Processing, Washington, DC, 1997.

[BCG+90] Laurent Bourbeau, Denis Carcagno, E. Goldberg, Richard Kittredge, and Alain Polguère. Bilingual generation of weather forecasts in an operational environment. In Proceedings of the 13th International Conference on Computational Linguistics (COLING-90), volume 1, pages 90–92, Helsinki, 1990.

[BT95] J.A. Bateman and E. Teich. Selective Information Presentation in an Integrated Publication System: an Application of Genre-driven Text Generation. Information Processing and Management, 31(5):753–767, 1995.

[Cah98] Lynne Cahill. Lexicalisation in applied NLG systems. Manuscript, ITRI, University of Brighton, 1998.

[CD94a] J. Coch and R. David. Causality and Multisentential Text. In Proceedings of the 1st International Workshop on Computational Semantics, pages 41–50, ITK, Tilburg University, 1994.

[CD94b] J. Coch and R. David. Representing Knowledge for Planning Multisentential Text. In ANLP'94 [ANL94], pages 203–204.

[CD94c] J. Coch and R. David. Une application de génération de textes. In Proceedings of Le Traitement Automatique du Langage Naturel en France aujourd'hui (TALN-94), pages 37–45, Marseille, April 1994.

[Cer96] F. Cerbah. A Study of Some Lexical Differences between French and English Instructions in a Multilingual Generation Framework. In INLG'96 [INL96], pages 131–140.

[CI89] D. Carcagno and L. Iordanskaja. Content Determination and Text Structuring in GOSSIP. In Extended Abstracts of the 2nd European Natural Language Generation Workshop (ENLG'89), pages 15–21, University of Edinburgh, 1989.

[CK94] D. Caldwell and T. Korelsky. Bilingual Generation of Job Descriptions from Quasi-Conceptual Forms. In ANLP'94 [ANL94], pages 1–6.

[Coc96a] J. Coch. Evaluating and Comparing Three Text-Production Techniques. In Proceedings of the 16th International Conference on Computational Linguistics (COLING'96), pages 249–254, Copenhagen, Denmark, 1996.

[Coc96b] J. Coch. Overview of AlethGen. In INLG'96 [INL96], pages 25–28. Demonstrations and Posters.

[Con96] GIST Consortium. GIST - Generating InStructional Text. Technical Report LRE Project 062-09 Final Report, Information Technology Research Institute (ITRI), University of Brighton, 1996.

[DHH97] C. DiMarco, G. Hirst, and E. Hovy. Generation by selection and repair as a method for adapting text for the individual reader. In Proceedings of the Workshop on Flexible Hypertext, 8th ACM International Hypertext Conference, Southampton, May 1997.

[DHW95] C. DiMarco, G. Hirst, L. Wanner, and J. Wilkinson. HealthDoc: Customizing patient information and health education by medical condition and personal characteristics. In First International Workshop on Artificial Intelligence in Patient Education, Glasgow, August 1995.

[DR95] Robert Dale and Ehud Reiter. Computational interpretations of the Gricean maxims in the generation of referring expressions. Cognitive Science, 18:233–263, 1995.

[Fas96] M. Fasciano. Génération intégrée de textes et de graphiques statistiques. PhD thesis, Département d'informatique et de recherche opérationnelle, Université de Montréal, 1996.

[FL96] M. Fasciano and G. Lapalme. PostGraphe: a system for the generation of statistical graphics and text. In INLG'96 [INL96], pages 51–60.

[GDK94] E. Goldberg, N. Driedger, and R. Kittredge. Using natural-language processing to produce weather forecasts. IEEE Expert, 9(2):45–53, 1994.

[GL96] M. Gagnon and G. Lapalme. From conceptual time to linguistic time. Computational Linguistics, 22(1):91–127, 1996.

[GS86] Barbara J. Grosz and Candace L. Sidner. Attention, intentions and the structure of discourse. Computational Linguistics, 12(3):175–204, July–September 1986.

[HDHP97] G. Hirst, C. DiMarco, E. Hovy, and K. Parsons. Authoring and generating health-education documents that are tailored to the needs of the individual patient. In Proceedings of the 6th International Conference on User Modeling, Italy, June 1997.

[HF96] X. Huang and A. Fiedler. Paraphrasing and aggregating argumentative text using text structure. In INLG'96 [INL96], pages 21–30.

[HF97] X. Huang and A. Fiedler. Proof verbalization as an application of NLG. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI'97), Nagoya, Japan, 1997.

[Hua94] X. Huang. Planning argumentative texts. In Proceedings of the 15th International Conference on Computational Linguistics (COLING'94), pages 329–333, Kyoto, 1994.

[Hua96] X. Huang. Human Oriented Proof Presentation: A Reconstructive Approach. DISKI 112. Infix, Sankt Augustin, 1996.

[Hua97] X. Huang. Planning reference choices for argumentative texts. In Proceedings of the 8th Conference of the European Chapter of the Association for Computational Linguistics (EACL'97), pages 190–197, Madrid, 1997.

[HW96] E. Hovy and L. Wanner. Managing sentence planning requirements. In K. Jokinen, M. Maybury, M. Zock, and I. Zukerman, editors, Proceedings of the ECAI'96 Workshop Gaps and Bridges: New Directions in Planning and Natural Language Generation, 1996.

[IKLP92] Lidija Iordanskaja, Richard Kittredge, Benoit Lavoie, and Alain Polguère. Generation of extended bilingual statistical reports. In Proceedings of the 14th International Conference on Computational Linguistics (COLING'92), pages 1019–1023, Nantes, 1992.

[IKP91] L. Iordanskaja, R. Kittredge, and A. Polguère. Lexical selection and paraphrase in a meaning-text generation model. In Cécile L. Paris, William R. Swartout, and William C. Mann, editors, Natural Language Generation in Artificial Intelligence and Computational Linguistics, pages 292–312. Kluwer Academic Publishers, Boston, 1991.

[INL94] Proceedings of the Seventh International Workshop on Natural Language Generation, Kennebunkport, Maine, 1994.

[INL96] Proceedings of the Eighth International Workshop on Natural Language Generation, Herstmonceux, Sussex, UK, 1996.

[KKR91] Richard Kittredge, Tanya Korelsky, and Owen Rambow. On the need for domain communication knowledge. Computational Intelligence, 7(4):305–314, November 1991.

[KMS+93] K. Kukich, K. McKeown, J. Shaw, J. Robin, J. Lim, N. Morgan, and J. Phillips. User-needs analysis and design methodology for an automated documentation generator. In Proceedings of the 4th Bellcore/BCC Symposium on User-Centered Design, Piscataway, NJ, 1993.

[KP91a] R. Kittredge and A. Polguère. Dependency Grammars for Bilingual Text Generation: Inside FoG's Stratificational Models. In Proceedings of the International Conference on Current Issues in Computational Linguistics, pages 318–330, Penang, 1991.

[KP91b] R. Kittredge and A. Polguère. Generating extended bilingual texts from application knowledge bases. In Proceedings on Fundamental Research for the Future Generation of Natural Language Processing, pages 147–160, Kyoto, 1991.

[Kuk88] Karen Kukich. Fluency in natural language reports. In Natural Language Generation Systems, pages 280–311. Springer-Verlag, New York, NY, 1988.

[LR97] B. Lavoie and O. Rambow. A fast and portable realizer for text generation systems. In ANLP'97 [ANL97], pages 265–268.

[LRR96] B. Lavoie, O. Rambow, and E. Reiter. The ModelExplainer. In INLG'96 [INL96], pages 9–12. Demonstrations and Posters.

[LRR97] B. Lavoie, O. Rambow, and E. Reiter. Customizable descriptions of object-oriented models. In ANLP'97 [ANL97], pages 253–256.

[MCM96] B.P. Marchant, F. Cerbah, and C. Mellish. The GhostWriter Project: a demonstration of the use of AI techniques in the production of technical publications. In Proceedings of Expert Systems, 16th Conference of the British Computer Society, pages 9–25, Cambridge, 1996.

[Met91] Marie Meteer. Bridging the generation gap between text planning and linguistic realization. Computational Intelligence, 7(4):296–304, November 1991.

[MKS94] K. McKeown, K. Kukich, and J. Shaw. Practical issues in automatic documentation generation. In ANLP'94 [ANL94], pages 7–14.

[MMCRng] V. O. Mittal, J. D. Moore, G. Carenini, and S. Roth. Describing complex charts in natural language: A caption generation system. Computational Linguistics, forthcoming.

[MP87] Igor Mel'čuk and Alain Polguère. A Formal Lexicon in Meaning-Text Theory (or How to Do Lexica with Words). Computational Linguistics, 13:276–289, 1987.

[MP93] Johanna D. Moore and Cécile L. Paris. Planning text for advisory dialogues: capturing intentional and rhetorical information. Computational Linguistics, 19(4):651–695, 1993.

[MRM+95] V. O. Mittal, S. Roth, J. D. Moore, J. Mattis, and G. Carenini. Generating explanatory captions for information graphics. In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI'95), pages 1276–1283, Montreal, Canada, August 1995.

[Pai98] Daniel Paiva. A survey of applied natural language generation systems. Technical Report 98-03, Information Technology Research Institute (ITRI), University of Brighton, 1998. Available at http://www.itri.brighton.ac.uk/techreports.

[Par97] K. Parsons. An authoring tool for customizable documents. Master's thesis, Department of Computer Science, University of Waterloo, May 1997.

[PC96] R. Power and N. Cavallotto. Multilingual generation of administrative forms. In INLG'96 [INL96], pages 17–19.

[PS98] R. Power and D. Scott. Multilingual authoring using feedback texts. In Proceedings of the 17th International Conference on Computational Linguistics and 36th Annual Meeting of the Association for Computational Linguistics, pages 1053–1059, Montreal, Canada, 1998.

[PSE98] R. Power, D.R. Scott, and R. Evans. What You See Is What You Meant: direct knowledge editing with natural language feedback. In Proceedings of the 13th Biennial European Conference on Artificial Intelligence (ECAI'98), Brighton, UK, August 1998.

[PVF+95] Cécile Paris, Keith Vander Linden, Markus Fischer, Anthony Hartley, Lyn Pemberton, Richard Power, and Donia Scott. A support tool for writing multilingual instructions. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 1398–1404, Montreal, Canada, 1995.

[Ram90] Owen Rambow. Domain communication knowledge. In Proceedings of the Fifth International Natural Language Generation Workshop, pages 87–94, Dawson, PA, 1990.

[Rea] Mike Reape. Survey of aggregation in NLG. Manuscript, University of Edinburgh.

[Rei94] Ehud Reiter. Has a consensus NL generation architecture appeared and is it psycholinguistically plausible? In INLG'94 [INL94], pages 163–170.

[RK] Owen Rambow and Tanya Korelsky. Applied text generation. Pages 40–47.

[SN96] S. Sheremetyeva and S. Nirenburg. Knowledge elicitation for authoring patent claims. IEEE Computer, pages 57–63, July 1996.

[SPE98] D.R. Scott, R. Power, and R. Evans. Generation as a solution to its own problem. In Proceedings of the Ninth International Workshop on Natural Language Generation, pages 256–265, Niagara-on-the-Lake, Ontario, 1998.

[TB94] E. Teich and J.A. Bateman. Towards the application of text generation in an integrated publication system. In INLG'94 [INL94].

[WH96] L. Wanner and E. Hovy. The HealthDoc Sentence Planner. In INLG'96 [INL96], pages 1–10.

[Wil95] J. Wilkinson. Aggregation in natural language generation: Another look. Co-op work term report, September 1995.
