Localizing Dependencies and Supertagging
Srinivas Bangalore – Data-Oriented Parsing (Chapter 15)

Dafydd Jones 17 May 2005

Supertags
● "Simple" tags reflect a word's syntactic category
  – nouns: N, verbs: V, adjectives: JJ
  – car/N, jump/V
● Supertags are:
  – associated with at least one lexical item
  – larger than context-free rules
  – example elementary tree: VP → V NP

Examples – "purchase"
The domain over which a word specifies its syntactic constraints:
● Noun Phrase
● Nominal Modifier
● Nominal Predicative

Properties: Lexicalization
A lexicalized grammar, e.g. one containing supertags, consists of:
● a finite set of elementary structures (strings, trees, directed acyclic graphs, etc.), each structure anchored on a lexical item
● lexical items, each associated with at least one of the elementary structures of the grammar
● a finite set of operations combining these structures
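A minimal sketch (not the chapter's implementation) of these three components, using an assumed nested-list tree encoding in which "NP!" marks a substitution site and "N*" a foot node; the tree names and shapes are illustrative:

```python
# Elementary structures: trees encoded as nested lists ["label", child, ...],
# each anchored on a lexical item ("NP!" = substitution site, "N*" = foot node).
elementary_structures = {
    "purchase_NP":    ["NP", ["N", "purchase"]],                            # noun phrase
    "purchase_Nmod":  ["N", ["N", "purchase"], ["N*"]],                     # nominal modifier
    "includes_trans": ["S", ["NP!"], ["VP", ["V", "includes"], ["NP!"]]],   # transitive verb
}

# Lexical items, each associated with at least one elementary structure:
lexicon = {
    "purchase": ["purchase_NP", "purchase_Nmod"],
    "includes": ["includes_trans"],
}

# A finite set of operations combining these structures
# (sketched under "Combining Supertags" below):
operations = ["substitution", "adjunction"]
```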

Properties: EDL – Extended Domain of Locality
● Every supertag must contain all and only the arguments of the anchor in the same structure.
● For each lexical item, the grammar must contain a supertag for each syntactic environment the lexical item can appear in.

Properties: FRD – Factoring Recursion away from the Domain of Dependencies
● Recursive constructs are represented as auxiliary supertags
● Initial supertags define the domains for agreement and subcategorization
● Auxiliary trees, by adjunction to initial supertags, allow for the long-distance behaviour of these dependencies

Combining Supertags
● Substitution
  – inserts elementary trees at the substitution nodes of other elementary trees
  – the root label must match the label of the substitution node
● Adjunction
  – an auxiliary tree is inserted into an elementary tree at a node whose label matches both the root and the foot node of the auxiliary tree
  – the node adjoined to splits
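A minimal runnable sketch of the two operations, under the same assumed nested-list encoding as above ("NP!" for a substitution site, "N*" for a foot node); it handles the leftmost matching node only and is not the chapter's implementation:

```python
def substitute(tree, initial):
    """Fill the leftmost substitution node whose label matches the initial tree's root."""
    label, *children = tree
    if label == initial[0] + "!":            # root label matches the substitution node
        return initial, True
    out, done = [], False
    for c in children:
        if not done and isinstance(c, list):
            c, done = substitute(c, initial)
        out.append(c)
    return [label] + out, done

def adjoin(tree, aux):
    """Adjoin an auxiliary tree at the leftmost node matching its root; the node splits."""
    label, *children = tree
    if label == aux[0]:                      # node matches both root and foot label of aux
        return _fill_foot(aux, tree), True
    out, done = [], False
    for c in children:
        if not done and isinstance(c, list):
            c, done = adjoin(c, aux)
        out.append(c)
    return [label] + out, done

def _fill_foot(aux, subtree):
    """The subtree below the adjunction site moves under the auxiliary tree's foot node."""
    label, *children = aux
    if label == subtree[0] + "*":
        return subtree
    return [label] + [_fill_foot(c, subtree) if isinstance(c, list) else c for c in children]

includes = ["S", ["NP!"], ["VP", ["V", "includes"], ["NP!"]]]
price    = ["NP", ["N", "price"]]
purchase = ["N", ["N", "purchase"], ["N*"]]      # auxiliary tree: nominal modifier

tree, _ = substitute(includes, price)            # "price" fills the subject NP slot
tree, _ = adjoin(tree, purchase)                 # "purchase" adjoins to the N above "price"
print(tree)   # ['S', ['NP', ['N', ['N', 'purchase'], ['N', 'price']]], ['VP', ...]]
```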

Example tree

Combining Supertags: Examples
Parse tree for the sentence: "The purchase price includes two ancillary companies" (figure)

Extracting Supertags
● Where do they come from?
● Supertags can be extracted from an annotated corpus, e.g. the Penn Treebank
● Use a head-word percolation table to build the trunk
● A mechanism decides between adjuncts and complements
● Results in up to 99.96% coverage of the corpus
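A hedged sketch of the head-word percolation step used to build a supertag's trunk from a Treebank tree; the percolation table below is illustrative (real tables are Magerman/Collins-style), and the adjunct/complement decision is omitted:

```python
# Illustrative percolation table: for each label, preferred head-child labels
# and the search direction among the children.
PERCOLATION = {
    "S":  (["VP", "S"], "right"),
    "VP": (["VB", "VBD", "VBZ", "VP"], "left"),
    "NP": (["NN", "NNS", "NP"], "right"),
}

def head_child(label, children):
    """Pick the head daughter of a node according to the percolation table."""
    prefs, direction = PERCOLATION.get(label, ([], "left"))
    order = children if direction == "left" else list(reversed(children))
    for pref in prefs:
        for child in order:
            if child[0].startswith(pref):
                return child
    return order[0]                          # fall back to the first child in search order

def trunk(tree):
    """Follow head children down to the lexical anchor, yielding the trunk labels."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [label]                       # preterminal: the anchor sits below it
    return [label] + trunk(head_child(label, children))

wsj = ["S", ["NP", ["NN", "price"]],
            ["VP", ["VBZ", "includes"], ["NP", ["NNS", "companies"]]]]
print(trunk(wsj))                            # ['S', 'VP', 'VBZ'] — trunk anchored on "includes"
```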

Supertag Disambiguation
● Even when a word has a unique standard POS tag – e.g. Verb – it will most likely have multiple supertags
● After parsing, each lexical item should be associated with exactly one supertag
● The parser could do this – but it is expensive
● Supertag disambiguation before parsing greatly speeds up the work of the parser
● Use information about local dependencies and statistical information to disambiguate

Using Structural Information
● Span constraint
  – calculate the minimum number of lexical items the supertag covers
  – if the input contains fewer items than the span, eliminate the supertag
● Left/Right span constraint
  – similarly, if the span required to the left/right of the anchor is larger than the available input, eliminate the supertag
● Neighbouring lexical items
  – if the terminals specified in the supertag do not appear in the input, eliminate the supertag

Structural Filtering: Example ●

Leads to a reduction of almost 50% of supertag ambiguity Span constraint: “Includes batteries!” Left Span: “The price includes sales tax” Lexical items: Active/Passive - “..included by..”
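A minimal sketch of the three filters (the per-supertag span figures and names below are assumed placeholders, not values from the chapter):

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    min_span: int          # minimum number of lexical items the supertag covers
    left_span: int         # minimum items required to the left of the anchor
    right_span: int        # minimum items required to the right of the anchor
    terminals: tuple = ()  # lexical items the supertag itself mentions (e.g. "by")

def keep(cand, sentence, anchor_pos):
    if len(sentence) < cand.min_span:                       # span constraint
        return False
    if anchor_pos < cand.left_span:                         # left-span constraint
        return False
    if len(sentence) - anchor_pos - 1 < cand.right_span:    # right-span constraint
        return False
    if any(t not in sentence for t in cand.terminals):      # neighbouring lexical items
        return False
    return True

sentence = ["includes", "batteries"]
transitive = Candidate("transitive", min_span=3, left_span=1, right_span=1)
imperative = Candidate("imperative", min_span=2, left_span=0, right_span=1)
passive_by = Candidate("passive",    min_span=3, left_span=1, right_span=1, terminals=("by",))
anchor = sentence.index("includes")
print([c.name for c in (transitive, imperative, passive_by) if keep(c, sentence, anchor)])
# ['imperative'] — "Includes batteries!" rules out the transitive and passive supertags
```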

Trigram Model
● Adapted from the state-of-the-art method for POS tagging, which can achieve around 97% accuracy
● Based on sequences of n tags
  – Unigram: the most likely tag for each lexical item

  – n = 3 is usually taken: trigrams

    \hat{T} = \arg\max_{T} \prod_{i=1}^{N} \Pr(T_i \mid T_{i-2}, T_{i-1}) \cdot \Pr(W_i \mid T_i)

● Contextual probability × word-emit probability
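A hedged sketch of this argmax; the brute-force search below is only to keep the code short (a real tagger would use Viterbi decoding), and the probability tables are toy placeholders rather than WSJ estimates:

```python
import math
from itertools import product

def best_sequence(words, tagset, p_context, p_emit):
    """argmax over tag sequences of prod_i Pr(T_i | T_i-2, T_i-1) * Pr(W_i | T_i)."""
    def score(tags):
        padded = ("<s>", "<s>") + tags        # pad the left context of the first two words
        return sum(math.log(p_context(padded[i + 2], padded[i], padded[i + 1]))
                   + math.log(p_emit(words[i], padded[i + 2]))
                   for i in range(len(words)))
    return max(product(tagset, repeat=len(words)), key=score)

# Toy probabilities, just to make the sketch runnable (supertag names are illustrative):
tags = ["A_NXN", "B_Nmod", "A_nx0Vnx1"]
def p_context(t, t2, t1): return 0.5 if t == "A_nx0Vnx1" else 0.25
def p_emit(w, t): return 0.6 if (w, t) in {("price", "A_NXN"), ("includes", "A_nx0Vnx1")} else 0.2
print(best_sequence(["price", "includes"], tags, p_context, p_emit))
# ('A_NXN', 'A_nx0Vnx1')
```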

Trigrams: Results
● Trained on 1,000,000 words of the Wall Street Journal corpus
  – sections 00 to 24, except section 20
● Tested on 47,000 words
  – section 20 of the WSJ
● 300 supertags (tree frames) used
● Results: 92.2% accuracy
  – can be improved slightly by assigning sets of supertags (average ambiguity: 1.3 supertags)

Parsing from Supertags
● Use the directly encoded requirements to establish dependency links
● Fill substitution nodes with complements
● Attach the foot node to the supertag being modified

Lightweight Dependency Analyzer
● Heuristic-based, linear-time, deterministic algorithm
● Algorithm:
  Pass 1: for each modifier supertag s:
    compute dependencies for s
    mark complements (unavailable for pass 2)
  Pass 2: for each non-modifier supertag s (anchor w):
    for each frontier node d in s:
      connect the nearest word to the left/right of w (depending on the direction of d)
      whose supertag matches label(d) at an internal node, ignoring marked supertags
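A hedged Python sketch of the two passes. The data layout is assumed: each word carries its supertag's root label, a modifier flag, and frontier slots as (label, direction, kind) triples; for brevity a slot is matched against a candidate's root label rather than any internal node:

```python
from dataclasses import dataclass, field

@dataclass
class Tagged:
    word: str
    label: str                                    # root label of the word's supertag
    modifier: bool = False                        # auxiliary (modifier) supertag?
    slots: list = field(default_factory=list)     # frontier nodes: (label, direction, kind)
    marked: bool = False                          # complements marked in pass 1

def nearest(items, i, direction, label):
    """Nearest unmarked word on the given side of position i whose supertag matches label."""
    rng = range(i - 1, -1, -1) if direction == "left" else range(i + 1, len(items))
    return next((j for j in rng if items[j].label == label and not items[j].marked), None)

def lda(items):
    links = []                                    # dependency links: (anchor, linked word)
    for modifier_pass in (True, False):           # pass 1: modifier supertags, pass 2: the rest
        for i, it in enumerate(items):
            if it.modifier != modifier_pass:
                continue
            for label, direction, kind in it.slots:
                j = nearest(items, i, direction, label)
                if j is not None:
                    links.append((it.word, items[j].word))
                    if modifier_pass and kind == "comp":
                        items[j].marked = True    # complement filled in pass 1: unavailable later
    return links

sentence = [
    Tagged("the",      "D",  modifier=True, slots=[("NP", "right", "foot")]),
    Tagged("price",    "NP"),
    Tagged("includes", "S",  slots=[("NP", "left", "comp"), ("NP", "right", "comp")]),
    Tagged("tax",      "NP"),
    Tagged("in",       "P",  modifier=True, slots=[("NP", "left", "foot"), ("NP", "right", "comp")]),
    Tagged("December", "NP"),
]
print(lda(sentence))
# [('the', 'price'), ('in', 'tax'), ('in', 'December'), ('includes', 'price'), ('includes', 'tax')]
```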

LDA: Results
● Trained on 200,000 words
● Tested on 2,000 sentences (from WSJ section 20)
● Recall: 82.3%
● Precision: 93.8%
● Produces partial linkages because of the need to satisfy local constraints
● Robust against incomplete sentences (fragments)

Applications
● Information Retrieval
  – Generate patterns to use in post-filtering of IR results, to improve precision
  – Make use of contextual syntactic information in the query, obtained using a supertagger
  – Manually select a set of relevant sample sentences, and a word of interest
  – Associate supertags with the words in the training sentences, and generalize the context words to their tag names
  – Experiment shows an increase in precision from 33.3% to 79.3%
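A hedged sketch of the pattern idea (in the spirit of the Glean experiments cited below): supertag the sample sentences, keep the word of interest literal, generalize its context words to their supertag names, and keep a retrieved sentence only if its tagged context matches a learned pattern. The supertag names are illustrative:

```python
def make_pattern(tagged, word, window=1):
    """tagged: list of (word, supertag) pairs; returns the generalized context pattern."""
    i = next(k for k, (w, _) in enumerate(tagged) if w == word)
    ctx = tagged[max(0, i - window): i + window + 1]
    return tuple(w if w == word else tag for w, tag in ctx)   # context words -> tag names

def keep_sentence(tagged, word, patterns, window=1):
    """Post-filter: keep a retrieved sentence only if its context matches a learned pattern."""
    if not any(w == word for w, _ in tagged):
        return False
    return make_pattern(tagged, word, window) in patterns

train = [("company", "A_NXN"), ("acquires", "A_nx0Vnx1"), ("rival", "A_NXN")]
patterns = {make_pattern(train, "acquires")}                  # ('A_NXN', 'acquires', 'A_NXN')
test = [("firm", "A_NXN"), ("acquires", "A_nx0Vnx1"), ("stake", "A_NXN")]
print(keep_sentence(test, "acquires", patterns))              # True: same supertag context
```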

Conclusions
● By localizing dependencies, we assign richer descriptions to words
● Simple disambiguation leads to an "almost parse"
● Reduces the work required by the parser
● Leads to robust analysis of irregular input
● Fast, "shallow" method that yields useful linguistic analysis

References
● Bangalore, S. 2003. Localizing Dependencies and Supertagging. Chapter 15 in Data-Oriented Parsing, eds. Rens Bod, Remko Scha and Khalil Sima'an. CSLI Publications.
● Bangalore, S. and Joshi, A. 1999. Supertagging: An Approach to Almost Parsing. Computational Linguistics 25(2).
● Chandrasekar, R. and Srinivas, B. 1998. Glean: Using Syntactic Information in Document Filtering. Information Processing and Management 34(5).
● Supertagging without Tears. http://www.cis.upenn.edu/~mickeyc/stag/supertags.html