
Advanced Natural Language Processing
Lecture 23: Discourse Anaphora and Coreference (Part 1)
Bonnie Webber and Mark Steedman
12 November 2010

Referring Expressions and Anaphora
• Violent strikes rocked Happyland again. A spokesman for the country’s Department of Peace said they would meet with the strikers tomorrow. Another spokesman said that this was intended to demonstrate the country’s commitment to resolving the dispute.
• “Violent strikes”, “Happyland”, “the country”, “the country’s Department of Peace”, “a spokesman for the country’s Department of Peace”, “they”, “the strikers”, “tomorrow”, “another spokesman”, “this”, “the country’s commitment”, “the dispute” are all referring expressions.
• “again”, “the country”, “they”, “the strikers”, “another spokesman”, “this”, “the dispute” are all anaphoric expressions.


Reference
• Reference is the relation between linguistic referring expressions and referents.
• Referents are entities in the model or representation of the world. Note: referents may or may not correspond to things in the real world.
• We distinguish indefinite reference (eg, violent strikes, a spokesman, another spokesman) and definite reference (eg, Happyland, the dispute, they).
• Anaphoric expressions are linguistic expressions that rely on something from the previous discourse for their interpretation.

Definite Reference
• Definite referring expressions like “Happyland” or “the country’s Department of Peace” presuppose the availability of a uniquely identifiable referent.
• This means that if the hearer does not already have such a referent in their discourse model, then they will accommodate it – that is, add an appropriate referent to their model, provided it is not inconsistent.
Quick Quiz:
1. Is everything calm in Happyland?
2. Which department is trying to deal with the situation?
Neither Happyland nor its Department of Peace exists! But we are happy to accommodate both referents and answer questions about them.


Coreference
• If two referring expressions have the same referent, they are said to corefer.
• The referents of indefinite NPs are often the referents of subsequent coreferential definite referring expressions:
(1) There are fairies_i in my garden. The fairies_i are having a ball.
• They also support anaphoric expressions, either coreferential (2) or not (3):
(2) There are fairies_i in my garden_m. They_i are having a ball.
(3) There are fairies_i in my garden_m. Other fairies_j live elsewhere_k.
other fairies ≡ fairies other than fairies_i (ie, the ones in my garden_m)
elsewhere ≡ places other than my garden_m

Indefinite Reference
• Indefinite NPs usually introduce new referents to the discourse:
– There are fairies at the bottom of my garden.
– I need a beer.
• Unlike definites, when in the scope of propositional attitude verbs (eg, want, need), indefinite NPs are highly ambiguous concerning the referent:
– Harry wants to marry a Norwegian.
• This might be a specific Norwegian that the speaker knows, or a Norwegian that only Harry knows, or an arbitrary Norwegian.
• Do I need a specific beer (eg, your pint) or any arbitrary beer?


Coreference and Pronouns
• Pronouns are a type of anaphoric expression because they rely on the previous discourse for their interpretation.
– Definite pronouns: he, she, it, they, etc.
– Indefinite pronouns: one, some, elsewhere, another, other, etc. (some, another, and other are also anaphoric as articles; we’ll hear more about them shortly).
• Definite pronouns are a form of definite reference, so must corefer.
• Indefinite pronouns are a form of indefinite reference, so are not used to corefer.

Coreference and World Knowledge
• Consider who the definite pronoun they corefers with in this minimal pair:
– The policemen_i refused the women a permit for the demonstration because they_i feared violence.
– The policemen refused the women_i a permit for the demonstration because they_i advocated violence.
• Winograd (1972) suggested that resolving this coreference depends on world knowledge, applied to the propositions “the policemen/demonstrators feared/advocated violence”.
– A simple way of representing this knowledge and inference is with a head-dependency parsing model of the kind proposed by Collins (1997), as sketched below. Note: there is an obvious danger of sparse data.
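A minimal sketch of this idea, with dependency counts invented purely for illustration (a real model would estimate them from a parsed corpus, which is exactly where the sparse-data danger bites):

```python
from collections import Counter

# Invented counts of (subject-head, verb-head) dependencies; a real
# model would estimate these from a parsed corpus (cf. Collins 1997).
DEP_COUNTS = Counter({
    ("policemen", "feared"): 10, ("women", "feared"): 2,
    ("policemen", "advocated"): 1, ("women", "advocated"): 8,
})

def resolve_they(verb, candidates):
    """Prefer the antecedent whose head occurs most often as the
    subject of the verb in the clause containing the pronoun."""
    return max(candidates, key=lambda c: DEP_COUNTS[(c, verb)])

print(resolve_they("feared", ["policemen", "women"]))     # policemen
print(resolve_they("advocated", ["policemen", "women"]))  # women
```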


Binding Conditions
• All anaphora, including pronominal anaphora, can depend on material in previous clauses or even previous sentences, as on the previous slide.
• However, pronouns can also refer sentence-internally, subject to certain “binding conditions”:
– John likes *him/himself
– John thinks Mary likes him/*himself
• Binding conditions need to be acknowledged in automated anaphora resolution, to avoid errors (a crude filter is sketched below).
• You can learn about binding conditions in a book on English grammar (eg, Rodney Huddleston and Geoffrey Pullum, The Cambridge Grammar of the English Language, both the regular and the student editions).

Bound Anaphora
• Such binding may be dependent on a quantified variable:
– Every man thinks Mary likes him.
– Every farmer who owns a donkey feeds it.
– Every man thinks every woman thinks she likes him.
– Every man thinks every woman thinks he likes her.
• Note: Like free anaphora, bound anaphora is subject to no clear syntactic constraint.
– Both nested and crossed dependencies are freely allowed.
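For automated resolution, the binding conditions act as a hard filter on intra-sentential candidates. A deliberately crude sketch of such a filter (string-based; real binding theory is stated over c-command and binding domains, which are ignored here):

```python
def binding_ok(pronoun, antecedent, clause_mates):
    """Crude binding filter: a reflexive must corefer with one of its
    clause-mate arguments, while a plain pronoun must not. Real
    binding theory is considerably more involved; see Huddleston &
    Pullum for the full picture."""
    if pronoun.endswith("self") or pronoun.endswith("selves"):
        return antecedent in clause_mates
    return antecedent not in clause_mates

# John likes himself.           (reflexive bound by a clause-mate: OK)
assert binding_ok("himself", "John", clause_mates={"John"})
# *John likes him (him = John). (plain pronoun bound by a clause-mate)
assert not binding_ok("him", "John", clause_mates={"John"})
# John thinks Mary likes him.   (John is not a clause-mate of "him")
assert binding_ok("him", "John", clause_mates={"Mary"})
```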

Types of Anaphora
Anaphors are linguistic expressions that rely on something from the previous discourse for their interpretation.
• Definite NP anaphors
• Definite pronoun anaphors
• Event anaphors (aka “discourse deixis”)
(4) We’ll meet the strikers tomorrow. This will show our commitment. ??What will show our commitment??
• Comparative anaphors
(5) I like Labrador Retrievers. Smaller dogs are too noisy. ??Dogs smaller than what??
(6) One student thought Bayes Nets were great. Another student thought they were confusing. ??A student other than who??
(7) The QLT Regulations allow certain overseas lawyers and other UK qualified lawyers to become qualified as solicitors in England and Wales. ??UK qualified lawyers other than who??
• Discourse adverbials
(8) He wouldn’t eat the bananas. Instead he ate more spinach. Instead of what?
(9) Stop at red lights. Otherwise you might get hurt. If you don’t do what?
(10) He sped past the speed camera. Afterwards he received a speeding ticket. After what?
• Verb phrase anaphora
(11) I don’t have time to buy milk. Can you do it for me? Do what for me?

Questions to ask of a Resolution Algorithm
• What factors does the algorithm (try to) take account of?
• What is its coverage? Does it handle all/some instances of a particular form (eg, all definite pronouns, all singular definite pronouns) or all/some instances serving a particular function?
• When are resolution decisions made (eg, incrementally as encountered, or after the rest of the sentence is parsed and interpreted)?
• What is the effect of such a decision? Does it change the state of the system?
• What has the algorithm been evaluated on?
• How well does it do?


Hobbs 1978
Hobbs (1978) established a baseline in terms of syntax-guided resolution of coreferential pronouns (ie, excluding it in weather or time phrases).

Input: the parse tree of each sentence in the text, up to and including the current S.

Overview: parse trees are searched for the antecedent in order of recency. Syntactic preferences are approximated by the order in which the search is performed.

Grammar fragment defining SynStruc, used in search order:
S → NP VP
NP → (Det) Nominal ({PP | Rel})*
NP → pronoun
NP → npr
Det → determiner
Det → NP ’s
PP → preposition NP
Rel → wh-word S
VP → verb NP (PP)*
Nominal → (adj)* noun (PP)*

The algorithm:
1. Begin at the NP node immediately dominating the pronoun.
2. Go up to the first NP or S above it. Call the node X and the path to it p.
3. Do a left-to-right (LR) breadth-first traversal of all branches below X to the left of p. Propose as antecedent any NP node encountered that has an NP or S between it and X.
4. If X is the highest S in the sentence, consider the parse trees of previous Ss in recency order, and traverse each in turn in LR breadth-first order. When an NP is encountered, propose it as an antecedent. If X is not the highest S, go to step 5.
5. From X, go up to the first NP or S above it. Call it X and the path to it p.
6. If X is an NP and p doesn’t pass through the Nominal that X immediately dominates, propose X as an antecedent.
7. Do a LR breadth-first traversal of all branches below X to the left of p. Propose any NP encountered as the antecedent.

8. If X is an S, do a LR breadth-first traversal of all branches below X to the right of p, but don’t go below any NP or S encountered. Propose any NP encountered as the antecedent.
9. Go to step 4.

Notes:
• When an NP is proposed as antecedent, gender/number agreement is checked.
• The algorithm covers he, she, they, his (he+’s), her (she), them (they), their (they+’s); also it, when not referring to a clause or a time/weather construction.
• Performance of the algorithm is improved by applying simple selectional restrictions – eg, dates, places and large fixed objects can’t move. Cf. head dependencies.
• The algorithm distinguishes the pronouns in (12a) and (12b): the PP in (12a) attaches to NP, and “his” can corefer with “driver”, while the PP in (12b) attaches to Nominal, and “his” cannot.
(12) a. Mr Smith saw a driver in his truck.
b. Mr Smith saw a driver of his truck.

Applying Hobbs’ algorithm
(13) John saw a beautiful Integra at the dealership. He showed it to Bob. He bought it.
[Figure: parse trees of “He showed it to Bob” and “He bought it”, over which the search runs.]
(14) The castle in Camelot remained the residence of the king until 536 when he moved it to London.
[Figure: parse tree of (14).]
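A skeletal Python rendering of the search, for concreteness. This is our own simplification: it covers steps 1–5 and 7 over toy trees, omits steps 6 and 8, and leaves the gender/number and selectional checks from the Notes to the caller.

```python
from collections import deque

class Node:
    """Toy parse-tree node; labels follow the grammar fragment above."""
    def __init__(self, label, *children):
        self.label, self.children, self.parent = label, list(children), None
        for c in children:
            c.parent = self

def up_to_np_or_s(node):
    """Steps 2/5: climb to the first NP or S above `node`; return that
    node X and the child of X lying on the path p."""
    child = node
    while child.parent is not None:
        if child.parent.label in ("NP", "S"):
            return child.parent, child
        child = child.parent
    return None, None

def left_bfs(x, path_child):
    """LR breadth-first walk of the branches below X to the left of p."""
    q = deque(x.children[: x.children.index(path_child)])
    while q:
        n = q.popleft()
        yield n
        q.extend(n.children)

def np_or_s_between(n, x):
    """True iff an NP or S node lies strictly between n and X."""
    a = n.parent
    while a is not None and a is not x:
        if a.label in ("NP", "S"):
            return True
        a = a.parent
    return False

def hobbs_candidates(pronoun_np, previous_trees):
    """Yield candidate antecedent NPs in Hobbs' search order.
    `previous_trees` is ordered oldest-first, as in the Input above."""
    x, p = up_to_np_or_s(pronoun_np)               # steps 1-2
    first_pass = True
    while x is not None:
        for n in left_bfs(x, p):                   # step 3 / step 7
            if n.label == "NP" and (not first_pass or np_or_s_between(n, x)):
                yield n
        if x.parent is None:                       # step 4: highest S
            for tree in reversed(previous_trees):  # recency order
                q = deque([tree])
                while q:
                    n = q.popleft()
                    if n.label == "NP":
                        yield n
                    q.extend(n.children)
            return
        x, p = up_to_np_or_s(x)                    # step 5
        first_pass = False
```

The intervening-NP-or-S condition in step 3 is Hobbs’ way of building a binding-style disjoint-reference constraint into the search itself, so that a pronoun is not resolved against a coargument NP.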

Performance of Hobbs’ algorithm
Applied to one sample each of technical writing, fiction and news-magazine text:

          #    C0   C1-C0  C2-C1  C3-C2  C9-C8  Alg correct  Alg correct after selection
he       139  126     10      2      0      1          130             130
she        7    7      0      0      0      0            7               7
it        71   64      4      1      2      0           55              59
they      83   74      9      0      0      0           73              79
Total    300  271     23      3      2      1          265             275

C0 = entities in the current and previous S, if the pronoun is before the verb; entities in the current S, if after the verb.
C1 = entities in the current and previous S.
Cn = entities in the current and previous n Ss.

Hobbs’ Algorithm: Instances of conflict
Important: Hobbs observed that over half the cases (168/300 = 56%) were “no conflict”: there was only one nearby plausible antecedent against which to resolve the pronoun. (He didn’t say precisely what he meant by “nearby”.) Of the remaining 132:

         Conflicts before selection   Alg correct   Conflicts after selection   Alg correct
he                31                      22                  31                    22
she                0                       0                   0                     0
it                48                      33                  44                    33
they              53                      43                  45                    41
Total            132                      98                 120                    96

Kehler et al. (2004) tried to exploit these “no conflict” cases in a “self-training” approach to pronoun resolution.

Lappin & Leass 1994
Lappin and Leass (1994) developed the first algorithm to be applied to, and evaluated on, a substantial text corpus: IBM computer training manuals.

Contrast:
• Hobbs organises the search space, such that the first NP found that satisfies the constraints is taken to be a pronoun’s antecedent.

Main features:
• L&L approximate a discourse model through equivalence classes – bundling together all references to the same entity.
• L&L use an empirically developed weighting scheme (in effect, a hand-built perceptron) on equivalence classes. The class with the highest weight is taken to be the pronoun’s referent.

Lappin & Leass 1994: Overview
• L&L associate an initial salience value with a new referential NP, based on a set of salience factors ≈ updating the DM with a new entity.
• L&L resolve pronouns against entities in the DM, updating the salience of the chosen referents before halving all values. This produces a recency preference.
• Since several NPs (with different salience values based on salience factors) may refer to the same referent, we need a way of combining all their contributions. So all co-referring NPs are put into the same equivalence class. The weight that a salience factor assigns to an entity is the highest of the weights it assigns to the members of its equivalence class.
• Mention of an entity in successive sentences results in weights being added. Multiple mentions within the same sentence result in the maximum value of each salience factor being assigned.

Lappin & Leass 1994: Salience Value
• A referential NP is paired with its salience value, which is the sum of its salience factors:

sentence recency (added to current S)                             100
subject emphasis (added if NP is S subject)                        80
existential emphasis (added if “there is NP ...”)                  70
direct object emphasis                                             50
indirect object or oblique complement emphasis                     40
non-adverbial emphasis (added if NP isn’t in a demarcated advP)    50
head noun emphasis (added if NP isn’t embedded in another NP)      80

• A hand-built perceptron!
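The factor table can be checked mechanically. A minimal sketch (the names are ours) that scores an NP by summing the factors it triggers; it reproduces the initial values in the worked example below: Sue, a subject, gets 100+80+50+80 = 310, and an Alfa Romeo, a direct object, gets 100+50+50+80 = 280.

```python
# Salience-factor weights as in the table above.
SALIENCE_FACTORS = {
    "sentence_recency": 100,
    "subject_emphasis": 80,
    "existential_emphasis": 70,
    "direct_object_emphasis": 50,
    "indirect_object_emphasis": 40,
    "non_adverbial_emphasis": 50,
    "head_noun_emphasis": 80,
}

def salience_value(triggered):
    """An NP's salience value is the sum of the factors it triggers."""
    return sum(SALIENCE_FACTORS[f] for f in triggered)

# "Sue drives an Alfa Romeo": both NPs are recent, non-adverbial head
# nouns; Sue is also the subject, the Alfa Romeo a direct object.
assert salience_value({"sentence_recency", "subject_emphasis",
                       "non_adverbial_emphasis", "head_noun_emphasis"}) == 310
assert salience_value({"sentence_recency", "direct_object_emphasis",
                       "non_adverbial_emphasis", "head_noun_emphasis"}) == 280
```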

L&L: Resolving a 3rd-person pronoun (L→R)
1. Identify potential equivalence classes (from up to the previous four sentences) whose salience value exceeds some threshold.
2. Remove any that don’t agree in number or gender with the pronoun.
3. Remove any that fail binding (ie, intra-sentential syntactic coreference) constraints.
4. Compute the total salience value of each remaining referent by adding salience factors corresponding to role parallelism (+35) and cataphora (−175), to model these preferences and dispreferences.
5. Select the referent (equivalence class) with the highest salience value, and add the pronoun to its equivalence class. In case of a tie, choose the closest in terms of string position (direction-independent). Update salience values.

Applying the L&L algorithm
(1) Sue drives an Alfa Romeo.

ID        Equiv Class_i    Value_i    Equiv Class_o    Value_o
e_Sue     Sue              310        Sue              310
e_Alfa    Alfa Romeo       280        Alfa Romeo       280

(2) She drives too fast.

ID        Equiv Class_i    Value_i    Equiv Class_o    Value_o
e_Sue     Sue              155        Sue, she_2       310
e_Alfa    Alfa Romeo       140        Alfa Romeo       140

(3) Mary races her on week-ends.

ID        Equiv Class_i         Value_i    Equiv Class_o          Value_o
e_Mary    –                     –          Mary                   310
e_Sue     Sue, she_2            155        Sue, she_2, her_3      280
e_Alfa    Alfa Romeo            70         Alfa Romeo             70

(4) She goes to Laguna Seca.

ID        Equiv Class_i         Value_i    Equiv Class_o          Value_o
e_Mary    Mary                  155        Mary, she_4            310
e_Sue     Sue, she_2, her_3     140        Sue, she_2, her_3      140
e_Alfa    Alfa Romeo            35         Alfa Romeo             35
e_LS      Laguna Seca           270        Laguna Seca            270

(4') She often beats her.

ID        Equiv Class_i         Value_i    Equiv Class_o               Value_o
e_Mary    Mary                  155        Mary, she_4'                310
e_Sue     Sue, she_2, her_3     140        Sue, she_2, her_3, her_4'   280
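A toy rendering of the resolve-and-halve cycle behind these tables. This is our own simplification: binding filters and the +35/−175 adjustments are omitted, and combining an old class value with a new mention’s value by taking the max is a shortcut that happens to reproduce the table (eg, max(155, 310) = 310 for e_Sue after (2)); L&L’s actual factor-combination scheme is more intricate.

```python
from dataclasses import dataclass

@dataclass
class EquivClass:
    mentions: list      # all NPs taken to refer to this entity
    value: float        # current salience value
    gender: str         # toy agreement features
    number: str

def new_sentence(classes):
    """Halve all salience values at each sentence boundary; this is
    the source of the recency preference (e_Sue: 310 -> 155)."""
    for c in classes:
        c.value /= 2

def resolve(pronoun, gender, number, mention_value, classes):
    """Steps 1-5 in miniature: filter by agreement, take the
    highest-valued class, fold the pronoun in, update the value."""
    cands = [c for c in classes
             if c.gender == gender and c.number == number]
    best = max(cands, key=lambda c: c.value)
    best.mentions.append(pronoun)
    best.value = max(best.value, mention_value)
    return best

# (1) Sue drives an Alfa Romeo.  (2) She drives too fast.
classes = [EquivClass(["Sue"], 310, "fem", "sg"),
           EquivClass(["Alfa Romeo"], 280, "neut", "sg")]
new_sentence(classes)                          # 155 and 140
resolve("she_2", "fem", "sg", 310, classes)    # she_2 is a subject: 310
print([(c.mentions, c.value) for c in classes])
# [(['Sue', 'she_2'], 310), (['Alfa Romeo'], 140.0)]
```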

Performance of the Lappin & Leass algorithm
RAP was tuned on a corpus consisting of 560 third-person pronouns (including reflexives and reciprocals) from five different computer manuals. There was some lexical and syntactic substitution to improve parses, but not so as to change syntactic relations.

Results of the training phase:
                                  Total        Intersentential   Intrasentential
Number of pronouns                560          89                471
Number of correct resolutions     475 (85%)    72 (81%)          403 (86%)

Results of the testing phase:
                                  Total        Intersentential   Intrasentential
Number of pronouns                360          70                290
Number of correct resolutions     310 (86%)    52 (74%)          258 (89%)

Interim conclusion: Hobbs and L&L
• Applied to the same test cases, Hobbs’ algorithm and RAP agreed on 83% of cases, showing significant convergence between salience as measured by RAP and the configurational prominence used in Hobbs’ algorithm, on a language (English) whose relatively fixed word order is usually a good indicator of grammatical roles.
• Lappin and Leass report a positive but non-significant effect in an experiment by Ido Dagan on interpolating a head-dependency model based on the Penn Treebank.
Note: the latter inconclusive effect merely tells us that the model is too small and the data too sparse (and out of domain).

Centering Theory
Centering (Grosz et al. 1995) is a theory of:
• entity coherence, in terms of how entities are introduced and discussed in a discourse, and
• entity salience, predicting which entities are most salient at a given time and hence require least effort to access.
The hearer’s constantly changing local attentional state tracks changes in the entities introduced into the discourse and in the extent of their salience.

Main claims of Centering
• Coherence: all but the first utterance of a segment have a unique main link with the previous utterance, called the backward-looking center (Cb) of the utterance. This unique link simplifies the complexity of the procedures required to integrate an utterance into the discourse.
• Salience: the set of entities realised by an utterance can be rank-ordered by their salience. This is called the Cf-list of an utterance, the list of its forward-looking centers.

Coherence and Transitions
Centering takes (local) discourse coherence to depend on how entities are introduced and discussed. Thus it needs a means of characterising how certain ways of introducing and discussing entities are more or less coherent than others. It does so by means of transitions between adjacent utterances.

Preferred center Cp(U_j): the first element in the Cf-list of U_j.

Consider the transitions between utterances U_{j-1} and U_j, writing B_j for Cb(U_j) and P_j for Cp(U_j):

                               B_j = P_j         B_j ≠ P_j
B_j = B_{j-1}, or no B_{j-1}   continue (CON)    retain (RET)
B_j ≠ B_{j-1}                  smooth-shift      rough-shift

Further Claims of Centering
• The identity of the Cb is determined by this ranking: for utterance U_j, Cb(U_j) is normally the highest-ranked element of Cf(U_{j-1}) that is realised in U_j.
• If any entity is realised in an utterance by a pronoun, its Cb must be. (Rule 1 in [GJW 95].)
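The table operationalises directly. A small sketch (the names are ours), checked against utterance (7) of the worked example later in the lecture:

```python
def transition(cb, cb_prev, cp):
    """Classify the Centering transition between U_{j-1} and U_j;
    cb_prev is None when there is no previous Cb."""
    if cb_prev is None or cb == cb_prev:
        return "continue" if cb == cp else "retain"
    return "smooth-shift" if cb == cp else "rough-shift"

# (7) "Mary races her on week-ends": Cb7 = Sue = Cb6, but Cp7 = Mary,
# so the transition is retain.
assert transition(cb="Sue", cb_prev="Sue", cp="Mary") == "retain"
```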

Summary of the Centering Algorithm [BFP 87]
1. CREATE a list of referring expressions (REs) for U_n, ordered by grammatical role.
2. IDENTIFY all possible Cf-lists for U_n (ie, a set of lists) by associating each RE with each discourse entity it can refer to.
3. IDENTIFY all possible Cbs for U_n (ie, a set of elements). These are all discourse entities from U_{n-1}, plus NIL (to allow for the absence of a Cb).
4. GENERATE all possible Cb_n/Cf_n combinations (called “anchors”).
5. FILTER each anchor by binding constraints and adherence to the centering rules:
• Go through Cf_{n-1}, keeping only those entities which appear in the Cf of the anchor. If the anchor’s Cb does not equal its first element, eliminate the anchor. (Constraint: the Cb realises the highest-ranking element of Cf(U_{n-1}) appearing in U_n.)
• If no entity realised as a pronoun in the anchor’s Cf equals the anchor’s Cb, eliminate the anchor. (Constraint: if anything is pronominalised, the Cb is.)
6. CLASSIFY by transition type and RANK by transition ordering.
7. SELECT the highest-ranking assignment.

(5) Sue drives an Alfa Romeo.
(6) She drives too fast.
Cf_6 = (E1: Sue)
Cb_6 = E1: Sue
Cp_6 = E1: Sue
continue

(7) Mary races her on week-ends.
Cf_7 = (R1: E2, R2: her→E1)
Cb_7 = E1: Sue
Cp_7 = E2: Mary
retain: Cb_7 = Cb_6 but Cb_7 ≠ Cp_7

(8) She goes to Laguna Seca.
i. continue: Cb_8 = Cb_7 and Cb_8 = Cp_8
Cb_8 = E1: Sue
Cf_8 = (R3: E1)
Cp_8 = E1: Sue

ii. smooth-shift: Cb_8 ≠ Cb_7 but Cb_8 = Cp_8
Cb_8 = E2: Mary
Cf_8 = (R3: E2)
Cp_8 = E2: Mary
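Putting the summary algorithm and the transition ranking together, a toy sketch (our own simplification: binding and agreement filters are omitted) that reproduces the preference for reading i of (8), where continue beats smooth-shift:

```python
from itertools import product

RANK = {"continue": 0, "retain": 1, "smooth-shift": 2, "rough-shift": 3}

def transition(cb, cb_prev, cp):
    if cb_prev is None or cb == cb_prev:
        return "continue" if cb == cp else "retain"
    return "smooth-shift" if cb == cp else "rough-shift"

def bfp_resolve(res, cf_prev, cb_prev):
    """`res` lists the REs of U_n in grammatical-role order as
    (form, candidate_entities) pairs; `cf_prev` is Cf(U_{n-1}),
    highest-ranked first."""
    best = None
    for cf in product(*[cands for _, cands in res]):      # step 2
        for cb in list(cf_prev) + [None]:                 # step 3
            realised = [e for e in cf_prev if e in cf]
            # Cb must be the highest-ranked elt of Cf(U_{n-1}) in U_n.
            if realised and cb != realised[0]:
                continue
            if not realised and cb is not None:
                continue
            # If anything is pronominalised, the Cb must be.
            prons = [e for (form, _), e in zip(res, cf) if form == "pro"]
            if prons and cb not in prons:
                continue
            t = transition(cb, cb_prev, cp=cf[0])         # step 6
            if best is None or RANK[t] < RANK[best[2]]:
                best = (cb, cf, t)                        # step 7
    return best

# (8) "She goes to Laguna Seca", following (7): she -> Sue or Mary,
# with Laguna Seca a new entity.
res = [("pro", ["Sue", "Mary"]), ("name", ["Laguna Seca"])]
print(bfp_resolve(res, cf_prev=["Mary", "Sue"], cb_prev="Sue"))
# ('Sue', ('Sue', 'Laguna Seca'), 'continue')
```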

(8') She often beats her.
i. Cb_8' = E2: Mary
Cf_8' = (R3: E2, R4: E1)
Cp_8' = E2: Mary
smooth-shift: Cb_8' ≠ Cb_7 but Cb_8' = Cp_8'

ii. Cb_8' = E2: Mary
Cf_8' = (R3: E1, R4: E2)
Cp_8' = E1: Sue
rough-shift: Cb_8' ≠ Cb_7 and Cb_8' ≠ Cp_8'

References

Collins, Michael, 1997. “Three Generative Lexicalized Models for Statistical Parsing.” In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, Madrid. San Francisco, CA: Morgan Kaufmann, 16–23.

Grosz, Barbara, Joshi, Aravind, and Weinstein, Scott, 1995. “Centering: A Framework for Modeling the Local Coherence of Discourse.” Computational Linguistics 21(2):203–225.

Hobbs, Jerry, 1978. “Resolving Pronoun References.” Lingua 44:311–338.

Kehler, Andrew, Appelt, Douglas, Taylor, Lara, and Simma, Alexandr, 2004. “The (Non)Utility of Predicate-Argument Frequencies for Pronoun Interpretation.” In Proceedings of HLT/NAACL 2004, Boston, May. New Brunswick, NJ: ACL.

Lappin, Shalom and Leass, Herbert, 1994. “An Algorithm for Pronominal Anaphora Resolution.” Computational Linguistics 20(4):535–561.

Winograd, Terry, 1972. Understanding Natural Language. Edinburgh: Edinburgh University Press.