Language-Specific Mappings from Semantics to Syntax

Judy Delin
Department of English Studies
University of Stirling
Stirling FK9 4LA, U.K.
jld1@stir.ac.uk

Donia R. Scott
ITRI, University of Brighton
Mithras Annexe, Lewes Road
Brighton BN2 4AT, U.K.
drs2@itri.bton.ac.uk

Anthony Hartley
ITRI, University of Brighton
Mithras Annexe, Lewes Road
Brighton BN2 4AT, U.K.
afh@itri.bton.ac.uk
Abstract
We present a study of the mappings from semantic content to syntactic expression with the aim of isolating the precise locus and role of pragmatic information in the generation process. From a corpus of English, French, and Portuguese instructions for consumer products, we demonstrate the range of expressions of two semantic relations, GENERATION and ENABLEMENT (Goldman, 1970), in each language, and show how the available choices are constrained syntactically, semantically, and pragmatically. The study reveals how multilingual NLG can be informed by language-specific principles for syntactic choice.
1
Introduction
We report here on work which addresses the message-to-syntax mapping in the context of automatic generation of instructional texts: the kinds of texts found in the procedural parts of manuals or information leaflets, for example those for pharmaceutical products. Instructional texts do not simply consist of lists of imperatives: instructions may also describe, eulogise, inform and explain. Generating good-quality draft instructions requires a detailed specification of how to map from semantic representations of the task actions onto a wide range of linguistic expressions. Our corpus is composed of naturally-occurring instructions in the three languages of study. Our overall approach is to obtain different-language drafts that are congruent with the technical content embodied in the task to be performed (and with other relevant information about the task). A satisfactory level of congruence requires the use of syntactic and pragmatic rules appropriate to each target language, mapping from the semantics to appropriate expression in a way that is free from influence from any source language.¹ We begin the generation process with a plan-based model of the underlying task.²

In our study, we have looked at two specific procedural relations that can hold between pairs of actions in a task, identified by the philosopher Alvin Goldman as the relations of GENERATION and ENABLEMENT (Goldman, 1970). These relations have the advantage of being formally specified (see e.g. (Pollack, 1986; Balkanski, 1993)), and need to be expressed regularly within instructional texts. In section 2, we give a brief definition of generation and enablement, before going on in section 3 to describe how the two relations are realised in the corpus of Portuguese, English, and French instructions.

¹ See (Hartley and Paris, 1995) for discussion of the advantages of this approach over the inherent limitations of a translation-based approach to producing multilingual instructions.
² See (Paris et al., 1995) for a discussion of the modelling of domain knowledge for instructions in the context of a support tool for drafting instructional texts.

2
The Semantic Relations
Generation and enablement are relations that can hold between pairs of states, events, processes, or actions. A simple test of generation holding between action pairs is whether it can be said that by performing one of the actions (α) under appropriate conditions, the other (β) will automatically occur (Pollack, 1986). If so, it can be said that α generates β. The two actions must be performed, or perceived to be performed, by the same human agent, and the relation must be asymmetric (i.e. if α generates β, then β cannot generate α). Simple examples of generation are as follows:³

(1) Heat gently to soften the coating.
(2) Dial the numbers of the Mercury authorisation code by pressing the appropriate numbers on the keypad.

³ So far, we have concentrated on examples of these semantic relations that do not cross sentence boundaries.

In example 1, the action of heating gently has the effect of softening the coating. In example 2,
pressing the correct keypad numbers has the automatic effect of dialing the numbers of the Mercury authorisation code. In each case, by performing the α action (or set of actions), the user has automatically performed the β action. Note that the two actions can be presented in either order: generatING first, or generatED first.

The term enablement is commonly used to refer to the procedural relation between preconditions and actions. It obtains between two actions where the execution of the first brings about a set of conditions that are necessary, but not necessarily sufficient, for the subsequent performance of the second (Pollack, 1986). This is different from the generation case, since enablement requires the further intervention of an agent (and it need not be the same agent) to bring about the β eventuality.

(3) Close cover and test as recommended in 'Operation' section.
(4) For prolonged viewing, the slide may be pushed downwards and then backwards until it locks under the ledges at each end of the slot.

Example 3, taken from the instructions for a household smoke alarm, shows the enablING action appearing first: closing the cover enables testing to take place, but does not automatically result in a test. Example 4, from the instructions for a home photographic slide viewer, presents the enablED action (prolonged viewing) first, and describes to the user what must be done to facilitate it.

These two relations have been formalised by Pollack (1986) and Balkanski (1993) for the purposes of plan recognition, and can be represented in a plan formalism that is a simple extension of STRIPS-style operators developed by Fikes (1971) and expanded in the NOAH system (Sacerdoti, 1977). Here, we summarise the two relations in the form of the following planning statements:

• α generates β iff α is the body of a plan e whose goal is β.
• α enables β if α is a precondition of a plan e and β is the goal of plan e, or if β is the body of e and α is a precondition of β.
In order to generate instructions clearly, it must be obvious which, if either, of the two relations is intended at any given point: confusion of one with the other will lead to inadequate, incomplete, or even dangerous execution of the task described.
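The two planning statements of this section can be illustrated with a minimal sketch. It is not the representation used in our system: the `Plan` layout, the action names, and the `relation` helper are assumptions made purely for illustration, and the second clause of the enablement statement is simplified by attaching preconditions to the plan rather than to the body action itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """A simplified STRIPS-like plan operator (hypothetical layout)."""
    goal: str
    body: tuple = ()           # actions whose execution achieves the goal
    preconditions: tuple = ()  # actions that must already have been carried out


def generates(alpha, beta, plans):
    """alpha generates beta iff alpha is (in) the body of a plan whose goal is beta."""
    return any(alpha in p.body and p.goal == beta for p in plans)


def enables(alpha, beta, plans):
    """alpha enables beta if alpha is a precondition of a plan whose goal or body is beta."""
    return any(
        alpha in p.preconditions and (p.goal == beta or beta in p.body)
        for p in plans
    )


def relation(alpha, beta, plans):
    """Name the relation holding from alpha to beta, or return None if neither holds."""
    if generates(alpha, beta, plans):
        return "generation"
    if enables(alpha, beta, plans):
        return "enablement"
    return None


# Toy plans loosely modelled on examples (1) and (3); the action labels are invented.
PLANS = [
    Plan(goal="soften coating", body=("heat gently",)),
    Plan(goal="test alarm", body=("run test",), preconditions=("close cover",)),
]

print(relation("heat gently", "soften coating", PLANS))
print(relation("close cover", "test alarm", PLANS))
```

Under these assumptions the asymmetry of generation follows from the plan structure: swapping the arguments of `generates` for the first toy plan finds no plan whose goal is "heat gently", so the reverse relation does not hold.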
3
From Semantics to Syntax
How, then, are generation and enablement realised in the three languages of study? In what follows, we look at the syntactic resources that are used in each language to convey the two parts of the two relations, and at the constraints on the ordering of the two parts; then at which discourse markers play a role in further ensuring the clarity of the relation intended; and finally we show how different rhetorical interpretations result from these choices. Together, these factors explain a significant amount of the cross-linguistic variation that occurs within the instructions sublanguage. In what follows, however, it is not our intention to suggest an ordering for the set of decisions that need to be made for generation: so far, our research suggests that a complex interaction of factors is involved in choice of expression, and further research is required to establish their relative priorities in the decision-making process.

Our corpora for the study consisted of 65 examples of generation, and 65 examples of enablement, for each of the three languages of study.⁴
3.1
Syntactic Resources
The distribution of expressions among the two components (ED and ING) of the generation relation for Portuguese is shown in figure 1.
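[Figure 1: Distribution of expressions among the ED and ING components of the generation relation in Portuguese; rows include Infinitive and Imperative.]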