Incremental Joint POS Tagging and Dependency Parsing in Chinese
Jun Hatori (University of Tokyo), Takuya Matsuzaki (University of Tokyo), Yusuke Miyao (National Institute of Informatics), Jun'ichi Tsujii (Microsoft Research Asia)
IJCNLP 2011, Chiang Mai, Thailand, 11/11/11
Why Joint?
Jointly solve POS tagging and dependency parsing (assuming gold segmentation)
◦ The traditional pipeline approach to POS tagging and dependency parsing may suffer from error propagation.
◦ Chinese POS tagging sometimes requires long-range syntactic information.
  - Noun or verb?
  - 的: DEG (genitive marker) vs. DEC (complementizer)
Overview
Joint POS tagging and dependency parsing
◦ First incremental approach
  - A simple extension of the shift-reduce algorithm
  - Advantageous in computational efficiency
◦ Achieves new state-of-the-art performance for Chinese tagging and parsing
  - Still competitive in speed with the baseline systems
  - First positive tagging result for a joint approach
◦ Experiments are based on Mandarin, but the approach is generally applicable to other languages as well.
Challenges for a Joint Model
Computational complexity
◦ The search space grows by a factor of T^N, where T is the number of tags and N is the sentence length.
◦ Solution: incremental (shift-reduce) parsing with beam search, plus partial state packing (DP with a graph-structured stack); see the sketch after this slide.
Lack of look-ahead POS information
◦ POS tags of the look-ahead words are still undetermined when choosing the next action.
◦ Solution: introduce the concept of "delayed features".
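Below is a minimal sketch, in Python, of the joint shift-reduce idea under the assumptions stated in the comments; it is not the authors' implementation. SHIFT both reads a word and assigns it a POS tag (hence the T^N search space), beam search keeps only the highest-scoring candidates, and a simple state signature stands in for the graph-structured-stack DP. The tag set, scoring function, and signature are placeholders.

```python
# A minimal, self-contained sketch (NOT the authors' implementation) of the joint
# shift-reduce idea: SHIFT also assigns a POS tag to the incoming word, so the
# search space grows roughly by a factor of T^N (T tags, N words), which is why
# beam search with partial state packing is used.  Scores are random stand-ins
# for a trained linear model; the tag set and signature are illustrative only.
import random
from collections import namedtuple

# stack: tuple of word indices; buffer_idx: next word to read;
# tags: ((word, tag), ...); arcs: ((head_idx, dep_idx), ...); score: float
State = namedtuple("State", "stack buffer_idx tags arcs score")

TAGS = ["NN", "VV", "DEG", "DEC"]  # toy tag set (T = 4)

def legal_actions(state, n_words):
    acts = []
    if state.buffer_idx < n_words:
        # SHIFT-t: shift the next word AND assign it tag t (joint tagging).
        acts += [("SHIFT", t) for t in TAGS]
    if len(state.stack) >= 2:
        acts += [("LEFT", None), ("RIGHT", None)]  # arc-standard reductions
    return acts

def apply_action(state, action, words):
    kind, tag = action
    # Placeholder score; a real model scores features over the stack, buffer,
    # assigned tags, and partial arcs.  "Delayed features" would be evaluated
    # later, once the look-ahead words' tags have been fixed by SHIFT actions.
    delta = random.random()
    if kind == "SHIFT":
        i = state.buffer_idx
        return State(state.stack + (i,), i + 1,
                     state.tags + ((words[i], tag),), state.arcs,
                     state.score + delta)
    s = state.stack
    if kind == "LEFT":   # top of stack becomes head of the element below it
        arc, new_stack = (s[-1], s[-2]), s[:-2] + (s[-1],)
    else:                # RIGHT: element below becomes head of the top
        arc, new_stack = (s[-2], s[-1]), s[:-1]
    return State(new_stack, state.buffer_idx, state.tags,
                 state.arcs + (arc,), state.score + delta)

def signature(state):
    # Packing key: states agreeing on this signature are merged and only the
    # best-scoring one is kept -- a crude stand-in for DP with a
    # graph-structured stack.
    return (state.stack[-2:], state.buffer_idx, state.tags[-2:])

def joint_parse(words, beam_size=8):
    beam, finished = [State((), 0, (), (), 0.0)], []
    while beam:
        packed = {}
        for st in beam:
            acts = legal_actions(st, len(words))
            if not acts:               # terminal: everything shifted and reduced
                finished.append(st)
                continue
            for act in acts:
                nxt = apply_action(st, act, words)
                key = signature(nxt)
                if key not in packed or nxt.score > packed[key].score:
                    packed[key] = nxt
        beam = sorted(packed.values(), key=lambda s: -s.score)[:beam_size]
    return max(finished, key=lambda s: s.score)

best = joint_parse("他 忘记 了 会议".split())
print("tags:", best.tags)
print("arcs:", best.arcs)
```

In the full system, the action scores come from a trained discriminative model over rich features (including the delayed features above), and state packing follows the graph-structured-stack dynamic programming mentioned in the slide.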
Baseline Tagger
Trigram POS tagger
◦ Viterbi search with a beam size of 16
◦ Uses the features described in [Zhang & Clark, 2008]
  - e.g., "把" ∘ q0.w = 忘记 ∘ q0.t = "VV"
◦ Standard pruning for Chinese: a tag dictionary restricts frequent words to the POS tags that appear in the training data
(A toy sketch of such a tagger follows below.)
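As a rough illustration of this baseline (an assumed sketch, not the actual code), the following Python snippet runs a left-to-right beam search of width 16 over tag trigrams and applies tag-dictionary pruning; the scoring function merely stands in for a trained model with Zhang & Clark (2008)-style features, and the tag set and dictionary entries are toy examples.

```python
# A minimal sketch (assumed details, not the authors' code) of the baseline
# trigram tagger: left-to-right beam search (beam size 16) over tag sequences,
# with a tag dictionary that restricts frequent words to the tags observed in
# training.  The scoring function is a placeholder for a trained linear model
# with Zhang & Clark (2008)-style features.
BEAM_SIZE = 16
TAGS = ["NN", "VV", "DEG", "DEC", "AD", "P"]          # toy tag set

# Hypothetical tag dictionary built from training counts.
TAG_DICT = {"忘记": ["VV"], "的": ["DEG", "DEC"]}

def candidate_tags(word):
    # Frequent words are pruned to dictionary tags; others get the full tag set.
    return TAG_DICT.get(word, TAGS)

def local_score(word, prev2, prev1, tag):
    # Placeholder: a deterministic pseudo-score standing in for the dot product
    # of a weight vector with trigram tag and word/tag features.
    return (hash((word, prev2, prev1, tag)) % 1000) / 1000.0

def tag_sentence(words):
    beam = [(0.0, ())]                                # (score, tag sequence)
    for word in words:
        expanded = []
        for sc, tags in beam:
            prev1 = tags[-1] if len(tags) >= 1 else "<s>"
            prev2 = tags[-2] if len(tags) >= 2 else "<s>"
            for t in candidate_tags(word):
                expanded.append((sc + local_score(word, prev2, prev1, t),
                                 tags + (t,)))
        beam = sorted(expanded, key=lambda h: -h[0])[:BEAM_SIZE]
    return list(zip(words, beam[0][1]))

print(tag_sentence("他 忘记 了 会议 的 时间".split()))
```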
Experiment
Penn Chinese Treebank 5 (CTB-5)
◦ Gold segmentation is assumed for the input.
Baseline models
◦ Pipeline POS tagger and dependency parser
  - Baseline-Tagger: re-implementation of [Zhang & Clark 08]
  - Parser-HS: dependency parser by [Huang & Sagae 10]
  - Parser-ZN: dependency parser by [Zhang & Nivre 11]
◦ Third-order graph-based joint models by [Li+ 11]
Joint models
◦ Joint-HS: joint model using the features of Parser-HS
◦ Joint-ZN: joint model using the features of Parser-ZN
Feature ablation results
Delayed features, dynamic programming, and syntactic features all improved parsing accuracy. DP is not effective for Joint-ZN because of its richer feature set.
[Bar chart: parsing accuracy (y-axis 79.5–82) of Joint-HS and Joint-ZN under four settings: default, w/o delayed features, w/o DP, w/o syntactic features]
Final result

Model         POS accuracy   Dependency UAS   Root accuracy   Speed (sentences/sec)
Pipeline-HS   93.82†         77.13            72.59           32.7
Pipeline-ZN   93.82†         77.83            74.82           4.8
Joint-HS      94.01          79.83†           73.86           9.5
Joint-ZN      93.94          81.33†           77.93           1.5
◦ 0.1–0.2% improvement in tagging accuracy
◦ 2.7–3.5% improvement in parsing accuracy
◦ Joint parsing takes roughly 3x the time of the corresponding pipeline
Analysis
The joint model resolved many POS ambiguities that critically affect the syntactic structure. Most of the error patterns that increased are not critical to the syntactic structure.
Related work
Dual decomposition
◦ Rush et al. (2010) combine a constituency parser and a trigram POS tagger.
Graphical model
◦ Lee et al. (2011) solve morphological disambiguation and dependency parsing jointly for morphologically rich languages.
Graph-based model
◦ Li et al. (2011) build a third-order joint POS tagging and dependency parsing model with finely tuned pruning techniques.
Conclusion
Proposed the first incremental framework for joint POS tagging and dependency parsing.
It outperforms the pipeline and baseline models and achieves the best reported accuracies for Chinese.
◦ Tagging of syntactically influential POS tags is selectively improved.
◦ Still competitive in speed with the baseline systems, and comparable to standalone parsers.