Underspecification in Discourse Structure and Semantics

Claire Gardent
Computational Linguistics, University of Saarbrücken, Germany
[email protected]

Bonnie Webber
Computer and Information Science, University of Pennsylvania
200 South 33rd Street, Philadelphia PA 19104-6389, USA
[email protected]

March 30, 1998

1 Description theory

Traditionally, grammars generate concrete data structures. Typically, these are trees whose nodes are labelled with syntactic and semantic information. However, a number of approaches have questioned this basic assumption, arguing that grammars should manipulate more abstract structures, namely descriptions of trees. In this paper, we show that a description-based approach is also useful in discourse, supporting the incremental construction of discourse structure for monologic discourse and the semantics to be associated with such structure.

In the past, descriptions have been used for at least three distinct purposes in language processing.

[M.P.Marcus et al.1983] used tree descriptions for deterministic parsing. Crucially, these descriptions allow statements about dominance as opposed to immediate dominance. By permitting attachment to be underspecified, descriptions avoid local attachment ambiguities. Instead of being committed to being the daughter of some existing node, an incoming constituent can be minimally restricted to being dominated by that node. For instance, given the initial fragment "I drove my aunt", Marcus' approach only commits the NP "my aunt" to being part of the VP. This is expressed by requiring that the VP node dominates the NP node. In this way, if the sentence turns out to be "I drove my aunt from Peoria's car", the initial description remains consistent with the final one.

Within the framework of Feature-Based Tree Adjoining Grammar (FTAG, [Vijay-Shanker and Joshi1988]), [Vijay-Shanker1992] has used descriptions to maintain monotonicity. Adjunction (the operation used in TAG to combine trees) is non-monotonic in the sense that relations that hold in the trees being combined no longer hold in the resulting tree. Vijay-Shanker shows that the problem can be resolved if the grammar manipulates partial structures (i.e. tree descriptions) instead of fully specified ones (i.e. trees). Specifically, he proposes to replace each node at which adjunction can take place with a quasi-node, i.e. a pair of nodes which stand in a dominance relation. Whereas the top node of a quasi-node expresses restrictions about the root node of auxiliary trees that are adjoined at this node, its bottom node expresses restrictions about their foot node. When processing ends, the actual structures derived by the grammar are obtained by collapsing quasi-nodes to single nodes, thereby unifying their respective feature structures.

Finally, [Muskens1997] has used descriptions to capture semantic underspecification, in particular scope ambiguities: since one description can be satisfied by several structures, descriptions can be used to provide a compact representation for the several semantic representations triggered by scope ambiguities. For instance, given the sentence "Every man likes a woman", Muskens' grammar generates a description stating inter alia that both the node labelled with the quantifier "every man" and that labelled with "a woman" dominate the node labelled with the verb semantics. However, the relation between the two quantifier nodes is left undefined. As a result, two trees satisfy the description, one tree where "every man" dominates "a woman" and the other where "a woman" dominates "every man".

We show in this paper that a description-based approach benefits discourse and discourse processing as well. The benefits correlate with those given above.

First, with respect to attachment ambiguities in deterministic parsing, the precise point of attachment of an incoming discourse constituent can also only become known at a later stage in processing. Consider the following example from [Moser and Moore1995]:

(1) a. altho you know that part1 is good,
    b. you should eliminate part2 before troubleshooting in part3.
    c1. this is because part2 is moved frequently
    c2. and thus is more susceptible to damage.

The parallel is between c1 and "my aunt" in the "I drove my aunt" example mentioned earlier. Here, one only wants to limit c1 to being dominated by an evidence relationship with b. If the discourse ends with c1, this underspecified dominance relation can be further specified to immediate dominance. However, if the discourse ends with c2, both c1 and c2 are immediately dominated by a cause node, the resulting unit being immediately dominated by the evidence relation with b. In section 2.2, we show how the description approach permits dealing with such local ambiguities.

Second, the benefit of monotonicity that Feature-Based Tree Adjoining Grammar has gained from the use of descriptions is also available to the growing number of researchers [Polanyi1986, Webber1991, Prust et al.1994, Schilder1997b, Cristea and Webber1997] who have found appealing an approach to discourse based on the structures and operations of Tree Adjoining Grammar [Joshi1987]. As we will see in section 2.3, not only does such an approach begin to eliminate some unnecessary distinctions between sentence-level syntax/semantics and discourse-level syntax/semantics, it also allows discourse to take advantage of insights and techniques developed through a considerable body of research on sentence-level syntax and semantics.

Third, as with intra-sentential semantics, discourse relations may evince genuine scope ambiguities. For example,

(2) a. I try to read a novel
    b. if I feel bored
    c. or I am unhappy.

Here, there are two discourse relations: a conditional relation (cued by "if") and a contrast relation (cued by "or"). Different readings follow from what is taken to be their relative scopes: one reading in which the speaker tries to read a novel under one of two conditions (boredom or unhappiness), the other where the speaker is unhappy if s/he can't read a novel when bored. In section 3, we show that discourse-level scope ambiguities and their disambiguation can be handled in much the same way Muskens treats sentence-level scope ambiguities.

2 Discourse Attachment Ambiguities

As example (1) shows, incremental discourse processing exhibits cases of local attachment ambiguity similar to those encountered in incremental sentence processing. In this section, we start by clarifying the syntactic and semantic difficulties raised by such cases (section 2.1). We then show that Marcus' proposed tree descriptions can be used to capture their syntax (section 2.2) and that Vijay-Shanker's quasi-nodes permit capturing their semantics (section 2.3).

2.1 The problem

Just as incremental processing is one goal of research in sentence-level processing, so it is in discourse research. Here we illustrate with an example how incremental discourse processing deals with attachment ambiguities. Consider the following pair of texts, (3a-c1) and (3a-c2):

(3) a. The trains aren't running now.
    b. The conductors' union called a strike last Sunday.
    c1. So we must drive to your sister's.
    c2. Then the signalmen's union walked out in sympathy.

Depending on whether the third clause is (c1) or (c2), (3b) will structurally serve as either a daughter or a grand-daughter of the node that also dominates (3a). In other words, labelling non-terminal nodes with the coherence relation that holds between their daughters, the resulting structures for (3a-c1) and (3a-c2) are respectively:

(4)        cause                 result
          /     \               /      \
      result     c1            a       seq
      /    \                          /   \
     a      b                        b     c2

To capture this attachment ambiguity during processing, researchers have used an adjunction operation which, in essence, substitutes a local tree for a node on the right frontier (the set of rightmost nodes) of the current discourse tree^1. This handles the attachment ambiguity of example (3) as follows, considering first (3a-c1).^2

^1 Such an adjunction operation is used under this name in [Polanyi1986] and [Webber1991, Cristea and Webber1997]. [Schilder1997b] refers to it as "insertion". It is also found in the "construction rules" of [Prust et al.1994] and the topic-based updating of extended Discourse Representation Structures found in [Asher1993].

^2 Here and in what follows we represent the semantics of a clause by the letter labelling it. This can be thought of as its denotation, which we will refer to as its assertion. So in example (3), the first clause is the a. clause and therefore its semantics is represented as "a".

Clauses (3a) and (3b) are taken to license two single-node trees, labelled "a" and "b" respectively. To maximize discourse coherence between these clauses, the hearer tries to infer a coherence relation between them: at a minimum, (3b) further elaborates the situation introduced in (3a). Since the event described in (3b) is explicitly noted as being prior to that in (3a), the hearer might infer that it caused the event described in (3a), licensing a result relation between the clauses. This supports the construction of a local tree representing the discourse segment associated with (3a-b). This local structure can be represented as:

(5)   result
      /    \
     a      b

As (3c1) is processed, a cause relation is taken to hold between (3a-b) and (3c1), thereby licensing the final tree:

(6)        cause
          /     \
      result     c1
      /    \
     a      b

Consider now the processing of discourse (3a-c2). As before, a result relation is inferred to hold between (3a) and (3b), again licensing structure (5). As (3c2) is processed, however, a sequence (seq) relation is inferred to hold between (3b) and (3c2), thereby licensing a second local tree:

(7)    seq
      /   \
     b     c2

At this point, the two trees are combined using adjunction, whereby the b node of (5) is replaced with the local tree given in (7). This results in the following discourse tree:

(8)   result
      /    \
     a     seq
          /   \
         b     c2

There are two difficulties with such an approach. First, adjunction is non-monotonic in that relations that hold in the trees being combined no longer hold in the resulting structure. For instance, (5) is a component of (8), but whereas in (5) the b node is an immediate daughter of the result node, this no longer holds in (8).

Second, the effect of adjunction on the semantic information associated with the tree is unclear. Crucially, (3a-c2) means that the attributed cause for the trains not running is the sequence of events comprising the conductors' union striking and the signalmen calling a strike in sympathy. In other words, there is a final inference which revises the first inference about the causal relation and extends its scope from (3b) to the sequence of (3b) and (3c2).

In previous work, [Prust et al.1994] have ignored the problem and would (incorrectly) assign (3a-c2) the conjunctive reading result(a,b) and sequence(b,c). This is because they use a definition of adjunction that stipulates that adjunction only affects the node of the current discourse tree at which it takes place; the rest of the tree remains unchanged. Hence the mother node of a node at which adjunction takes place remains unchanged, and in particular so does its semantics. The approach taken in [Asher1993] involves a complex substitution operation (topic-based updating), which does achieve the correct result. However, this operation is destructive (i.e. non-monotonic) and must perform a global substitution on the current discourse structure, which is computationally costly. The same non-monotonicity problem holds of Schilder's insertion operation as well [Schilder1997b]. In what follows, we show that the description approach provides a simple solution to the problem of discourse attachment while preserving both monotonicity and incrementality^3.

^3 In his recently proposed treatment of "flashbacks", [Schilder1997a] has used descriptions as a compact representation of scope ambiguity in discourse, but does not consider their use in handling attachment ambiguity nor the use of Vijay-Shanker's quasi-nodes in providing a monotonic treatment of semantics.

2.2 Discourse Attachment and Syntax

To capture discourse attachment ambiguity while preserving determinism and monotonicity, we use domination in much the same way Marcus did. Thus the following description characterises both structures in (4):

(9)   result
      /    :
     a      b

The dashed line (drawn here as ":") indicates domination, the plain line immediate domination. This description requires that the node labelled result dominates the "b" node and immediately dominates the "a" node, which is true of both trees in (4)^4.

^4 A more precise definition of descriptions can be formulated using a tree logic, e.g. [Vijay-Shanker1992]. In such logics, models are labelled trees and the language allows us to describe these trees: constants refer to tree nodes and relation symbols denote relations between nodes, typically dominance, immediate dominance and label-hood (i.e. which node has which label). For this paper, however, we will continue to use a graphic presentation such as (9) above, as it is easier to read than sets of logical formulae and is sufficient to show that an appropriate modelling of discourse semantics gains from using descriptions rather than trees.

In fact, the description in (9) is not quite general enough, because attachment ambiguity may also affect the left-hand argument of a coherence relation. For instance, given the discourse (10a-b1), (a) appears as the immediate (left) daughter of the conditional relation cued by "if", while given the discourse (10a-b2), (a) is just contained in that daughter, being immediately dominated by a contrast relation cued by "or".

(10) a. If Mary is mad at her husband,
     b1. they don't go out.
     b2. or he is mad at her, they don't go out.

Consequently, we require that the description for local discourse trees be:

(11)     R
        :  :
       a    b

where R stands for the coherence relation, and a and b for the assertions of two adjacent discourse segments respectively. Such descriptions allow us to capture the indeterminacy of attachment and construct appropriate syntactic structures, thereby handling the syntactic aspect of local attachment ambiguities.

2.3 Discourse Attachment and Semantics

Recall that in example (3), attachment indeterminacy leads to revising the inference that a result relation holds between (3a) and (3b) (i.e. result(a,b)) to the inference that such a relation holds between (3a) and (3b-c) (i.e. result(a,sequence(b,c))). In other words, the semantics of a discourse segment is modified from a relation holding between a and b to a relation between a and an assertion containing b. This is in fact an instantiation of a more general scheme: in discourse, when a coherence relation R is taken to hold between the assertions a and b of two adjacent discourse segments, the only relation holding between these two assertions that is guaranteed to be an invariant of the possible continuations of this discourse is that R holds between some propositions containing a and b. Using Vijay-Shanker's quasi-nodes [Vijay-Shanker1992], we can capture this generalisation by making use of the following type of tree description for basic trees:

(12)    R(A,B)
        /    \
       A      B
       :      :
       a      b

Here, nodes linked by dashed lines (drawn as ":") are quasi-nodes (pairs of nodes linked by dominance). Labels on the nodes are first-order terms representing their associated semantic information. Capital letters indicate variables, lower-case letters indicate constants, and shared variables indicate re-entrancy. During processing, when two node descriptions are identified and taken to refer to the same node, their labels must unify.

Our hypothesis then is that whenever a coherence relation R is inferred to hold between the assertions a and b of two discourse segments, a description is licensed which says that a local tree should be built whose root semantics is R(A,B), where A and B are the semantics of the daughter nodes, which are taken to dominate the nodes labelled a and b respectively. Intuitively, A and B represent the final arguments of R, whereas a and b stand for its current arguments.

Given this, example (3a-c2) is analysed as follows. Inferring the causal relation to hold between (3a) and (3b) licenses the following description:

(13)   result(A,B)_3
        /       \
      A_2       B_4
       :         :
      a_1       b_5

(Each node here is indexed with a unique integer, written after an underscore.) The recognition of a sequence relation holding between (b) and (c2) extends this description to:

       result(A,B)_3
        /        \         seq(C,D)_7
      A_2        B_4        /      \
       :           :      C_6      D_8
       :            :    :          :
      a_1            b_5           c2_9

When the end of discourse is reached, the scope of the various relations becomes known, thereby licensing node identifications: the top quasi-nodes of relations (which represent their realised arguments) are identified with the bottom quasi-nodes representing the discourse segments that provide their arguments. For instance, consider the arguments to the sequence relation: node 6 can be identified with node 5, fixing to b the left-hand argument of this relation. Similarly, nodes 8 and 9 can be identified, thereby fixing its right-hand argument to c2. From this, it follows that node 4 (which is constrained to dominate node 5) must be identified with node 7, yielding the structure shown below:

       result(a,seq(b,c))_3
         /             \
     a_{1,2}       seq(b,c)_{4,7}
                    /         \
               b_{5,6}      c2_{8,9}

In this structure, by unification, the semantics of the root node becomes result(a,sequence(b,c)), as appropriate.

By contrast, the derivation for (3a-c1) is as follows. As with (3a-c2), a result relation is inferred to hold between (3a) and (3b), thus licensing the description in (13). As (3c1) is processed and a cause relation is inferred to hold between (3a-b) and (3c1), this initial description is expanded to:

(14)          cause(C,D)_7
               /        \
            C_6          D_8
             :            :
       result(A,B)_3     c1_9
         /      \
      A_2       B_4
       :         :
      a_1       b_5

When the discourse ends after (c1), the quasi-nodes can be closed off: 1 is identified with 2, 4 with 5, 3 with 6 and 8 with 9. As a result there is only one tree structure satisfying description (14), namely:

           cause(result(a,b),c)_7
             /              \
    result(a,b)_{3,6}     c1_{8,9}
       /        \
   a_{1,2}    b_{4,5}
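The two derivations above can be replayed with a small unification sketch. Everything in it is an illustrative assumption rather than the authors' implementation: node labels are stored in a dictionary keyed by node index, compound terms are tuples such as `("result", "A", "B")`, strings starting with a capital letter are variables, and the unifier omits the occurs check for brevity.

```python
def is_var(term):
    """Variables are strings beginning with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()

def walk(term, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(term) and term in subst:
        term = subst[term]
    return term

def unify(x, y, subst):
    """Return an extended substitution, or None if x and y do not unify."""
    x, y = walk(x, subst), walk(y, subst)
    if x == y:
        return subst
    if is_var(x):
        return {**subst, x: y}
    if is_var(y):
        return {**subst, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            subst = unify(xi, yi, subst)
            if subst is None:
                return None
        return subst
    return None

def resolve(term, subst):
    """Apply a substitution throughout a term."""
    term = walk(term, subst)
    if isinstance(term, tuple):
        return tuple(resolve(t, subst) for t in term)
    return term

def identify(labels, pairs):
    """Identify each pair of nodes by unifying their labels."""
    subst = {}
    for m, n in pairs:
        subst = unify(labels[m], labels[n], subst)
        if subst is None:
            raise ValueError(f"nodes {m} and {n} cannot be identified")
    return subst

# Extended description for (3a-c2): nodes 1-9 as indexed in the text.
labels_c2 = {1: "a", 2: "A", 3: ("result", "A", "B"), 4: "B", 5: "b",
             6: "C", 7: ("seq", "C", "D"), 8: "D", 9: "c2"}
root_c2 = resolve(labels_c2[3],
                  identify(labels_c2, [(1, 2), (5, 6), (8, 9), (4, 7)]))

# Description (14) for (3a-c1).
labels_c1 = {1: "a", 2: "A", 3: ("result", "A", "B"), 4: "B", 5: "b",
             6: "C", 7: ("cause", "C", "D"), 8: "D", 9: "c1"}
root_c1 = resolve(labels_c1[7],
                  identify(labels_c1, [(1, 2), (4, 5), (3, 6), (8, 9)]))

print(root_c2)
print(root_c1)
```

The same initial labels for nodes 1 to 5 serve both derivations. Only the later identifications differ, so nothing asserted after (3b) has to be retracted, and the root semantics comes out as result(a,seq(b,c2)) in one case and cause(result(a,b),c1) in the other.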

3 Scope Ambiguity

Discourse exhibits scope ambiguities in much the same way sentences do: whenever n scope bearing elements co-occur, the number of possible readings is $2^{n-1}$. Whereas in sentences scope ambiguities stem from such items as quantifiers and modal operators, in discourse they are associated with the coherence relations that hold between the assertions of discourse segments. This is illustrated by example (2) which (as noted earlier) is ambiguous between two readings: one in which if scopes over or, and the other where or scopes over if.

Further, just as sentence-level scope ambiguities can be resolved by further discourse, so can discourse-level ones. Consider:

(15) a. Sarah reads novels
     b. if she is unhappy
     c. or Jon does yoga
     d. if he has a headache.

As in example (2), there are two possible readings for (15a-c) (either if scopes over or, or vice versa). But if the discourse continues with (15d), there is only one possible reading, namely the one where or scopes over if.

Discourse-level scope ambiguities can be captured as in [Muskens1997] by leaving the structural relations holding between scope bearing elements underspecified. If, as in (2), the discourse is really ambiguous, then the description will be compatible with two tree structures. For example, the description for (2) is^5:

       (A if B)_3
        /      \          (C or D)_7
      A_2      B_4         /      \
       :         :       C_6      D_8
       :          :     :          :
      a_1          b_5            c_9

where this description is satisfied by the two following tree structures:

      (a if (b or c))_3                  ((a if b) or c)_7
        /          \                       /           \
    a_{1,2}     or(b,c)_{4,7}       (a if b)_{3,6}    c_{8,9}
                 /       \            /       \
            b_{5,6}    c_{8,9}    a_{1,2}   b_{4,5}

On the other hand, if further discourse resolves the scope ambiguity, additional constraints are added to the description, which restrict the set of trees satisfying it. Thus the description for (15) is:

     (A if B)_3         (C or D)_7         (G if H)_11
      /      \           /      \           /       \
    A_2      B_4       C_6      D_8      G_10      H_12
     :         :      :           :      :           :
    a_1         b_5                c_9              d_13

^5 For simplicity, we abbreviate the discourse relations of condition and contrast to if and or respectively.

Interpreting (15) as either (Sarah reads novels if she is unhappy) or (Jon does yoga if he has a headache) triggers additional constraints (i.e. quasi-node identifications) so that the only tree structure satisfying the final description is:

        (a if b) or (c if d)
          /            \
      (a if b)       (c if d)
       /    \         /    \
      a      b       c      d
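To illustrate how one underspecified description stands for several readings, the following sketch enumerates the binary trees compatible with a left-to-right sequence of clauses and relations. It is a simplification made for illustration only: candidate readings are generated as bracketings instead of being computed from quasi-node identifications, and the disambiguating constraint for (15) is approximated by the hypothetical requirement that "or" label the root.

```python
def readings(clauses, relations):
    """Enumerate binary trees over `clauses`, where relations[i] links
    clause i and clause i + 1 in the surface order of the discourse."""
    if len(clauses) == 1:
        return [clauses[0]]
    trees = []
    for i, rel in enumerate(relations):
        # Take relations[i] as the widest-scope relation and recurse.
        lefts = readings(clauses[:i + 1], relations[:i])
        rights = readings(clauses[i + 1:], relations[i + 1:])
        trees += [(rel, left, right) for left in lefts for right in rights]
    return trees

# Example (2): a if b or c.
readings2 = readings(["a", "b", "c"], ["if", "or"])

# Example (15): a if b or c if d, keeping only trees rooted in "or"
# (an assumed stand-in for the additional constraints in the text).
readings15 = readings(["a", "b", "c", "d"], ["if", "or", "if"])
selected15 = [tree for tree in readings15 if tree[0] == "or"]

print(readings2)
print(selected15)
```

Under these assumptions, example (2) yields exactly the two trees shown above, and the filter leaves a single tree for (15), corresponding to (a if b) or (c if d).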

4 Conclusion

In this paper, we have shown how a set of techniques developed to handle well-known problems in sentence processing can also benefit the processing of monologic discourse. We believe that this should not be surprising. Clearly, there are phenomena in discourse that go far beyond what one sees at the level of the sentence, especially when one considers issues of speaker intention and the presence of multiple speakers, as in dialogue. Nevertheless, as [Scott and de Souza1990] have pointed out, it is not sentences that are the units of discourse but clauses. When sentences contain multiple clauses, those clauses bear the same discourse relations to each other as they would as clauses realized as independent sentences. It is thus not surprising that certain problems in recognizing coherence relations resemble well-known problems in parsing sentences, and that similar techniques can be used for both.

References

[Asher1993] N. Asher. 1993. Reference to abstract objects in discourse. Kluwer, Dordrecht.

[Cristea and Webber1997] D. Cristea and B. Webber. 1997. Expectations in incremental discourse processing. Proceedings of ACL.

[Joshi1987] A. Joshi. 1987. An introduction to Tree Adjoining Grammar. In A. Manaster-Ramer, ed., Mathematics of Language. John Benjamins, Amsterdam.

[Moser and Moore1995] M. Moser and J. Moore. 1995. Investigating cue selection and placement in tutorial discourse. In Proc. ACL, pages 130-135, MIT, Boston MA.

[M.P.Marcus et al.1983] M.P. Marcus, D. Hindle, and M.M. Fleck. 1983. Talking about talking about trees. In Proceedings of ACL, Cambridge, MA.

[Muskens1997] R. Muskens. 1997. Order-independence and underspecification. University of Tilburg.

[Polanyi1986] L. Polanyi. 1986. The linguistic discourse model: Towards a formal theory of discourse structure. Technical Report TR-6409, BBN Laboratories.

[Prust et al.1994] H. Prust, R. Scha, and M. van den Berg. 1994. Discourse grammar and verb phrase anaphora. Linguistics & Philosophy, 17:261-327.

[Schilder1997a] F. Schilder. 1997a. Towards a theory of discourse processing: flashback sequences described by D-trees. In Proceedings of the Formal Grammar Conference (ESSLLI'97), Aix-en-Provence.

[Schilder1997b] F. Schilder. 1997b. Tree discourse grammar, or how to get attached to a discourse. In Proceedings of the Tilburg Conference on Formal Semantics, Tilburg, Netherlands, January.

[Scott and de Souza1990] D. Scott and C. Sieckenius de Souza. 1990. Getting the message across in RST-based text generation. In R. Dale, C. Mellish, and M. Zock, editors, Current Research in Natural Language Generation. Academic Press, London, England.

[Vijay-Shanker and Joshi1988] K. Vijay-Shanker and A. Joshi. 1988. Feature based tags. In Proceedings of ACL, Budapest.

[Vijay-Shanker1992] K. Vijay-Shanker. 1992. Using descriptions of trees in a tree-adjoining grammar. Computational Linguistics.

[Webber1991] B. Webber. 1991. Structure and ostension in the interpretation of discourse deixis. Language and Cognitive Processes.
