Jun 2, 2018 - (, 0.75). (, 0.35). (, 0.20). Prior Work: Lack of Accurate ...
Joint Bootstrapping Machines for High Confidence Relation Extraction Author(s): Pankaj Gupta1,2, Benjamin Roth1 and Hinrich Schütze1 Presenter: Pankaj Gupta @NAACL/ACL New Orleans, USA. 2nd June 2018 1CIS, University of Munich (LMU) | 2 Machine Intelligence, Siemens AG | June 2018 Intern © Siemens AG 2017
Outline
1. Semantic Relation Extraction 2. Semi-supervised Bootstrapping for Relation Extraction - seeding entity-pairs: BREE/BREDS system - seeding templates: BRET system 3. Joint Bootstrapping Machines (JBM) - seeding entity-pairs + templates: BREJ system 4. Experimental Evaluation
Intern © Siemens AG 2017 Seite 2
May 2017
Corporate Technology
Semantic Relation Extraction
The US Federal Trade Commission announced the decision by Google to buy DoubleClick.
Tech company Microsoft earnings hurt by Nokia acquisition. ….
…
…
Set of Sentences
…
Relationship Triples
Intern © Siemens AG 2017 Seite 3
May 2017
Corporate Technology
Relation Extractors: Prior Work Vs This Work (, 0.30) (, 0.75) (, 0.20)
Relation Extractor (traditional)
(, 0.35) (, 0.20)
Intern © Siemens AG 2017 Seite 4
May 2017
Corporate Technology
Relation Extractors: Prior Work Vs This Work (, 0.30)
Output Extractions
(, 0.75) (, 0.20)
Relation Extractor (traditional)
>0.70
(, 0.35) (, 0.20)
(, 0.75)
Intern © Siemens AG 2017 Seite 5
May 2017
Corporate Technology
Relation Extractors: Prior Work Vs This Work (, 0.30)
Output Extractions
(, 0.75) (, 0.20)
>0.70
(, 0.35) (, 0.20)
(, 0.75)
Prior Work: Lack of Accurate Estimation of Confidence)
Intern © Siemens AG 2017 Seite 6
May 2017
Corporate Technology
Relation Extractors: Prior Work Vs This Work (, 0.30)
Output Extractions
(, 0.75) (, 0.20)
>0.70
(, 0.35) (, 0.20)
(, 0.75)
Prior Work: Lack of Accurate Estimation of Confidence) (, 0.90) (, 0.95) (, 0.92) (, 0.95)
Relation Extractor (this work)
(, 0.92)
Intern © Siemens AG 2017 Seite 7
May 2017
Corporate Technology
Relation Extractors: Prior Work Vs This Work (, 0.30)
Output Extractions
(, 0.75) (, 0.20)
>0.70
(, 0.35) (, 0.20)
(, 0.75)
Prior Work: Lack of Accurate Estimation of Confidence) (, 0.90)
(, 0.90)
(, 0.95)
(, 0.95) (, 0.92) (, 0.95) (, 0.92)
>0.70
(, 0.92) (, 0.95) (, 0.92)
This Work: Reliable High Confidence Extractions Intern © Siemens AG 2017 Seite 8
May 2017
Corporate Technology
Approaches for Relation Extraction
Relation Extraction
Rule-based
Supervised
Semi-Supervised
Distant-Supervised
Bootstrapping • Plethora of unlabelled data • With few seed instances • Deal with semantic drift • Effective confidence assessment
Intern © Siemens AG 2017 Seite 9
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 10
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 11
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 12
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 13
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 14
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 15
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 16
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 17
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 18
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 19
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 20
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 21
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 22
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 23
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 24
May 2017
Corporate Technology
Bootstrapping in General: Expand Seed Set Via Multiple Hops
Intern © Siemens AG 2017 Seite 25
May 2017
Corporate Technology
Bootstrapping with Seed Types: Entity-pairs Document Collection ...Google to buy DoubleClick... ...Microsoft earnings hurt by Nokia acquisition… ...Google acquired YouTube… ...
Seed Entity-pairs
Bootstrapping Machine (BREE) (rely on input seeds and contextual similarity)
Output
Intern © Siemens AG 2017 Seite 26
May 2017
Corporate Technology
Bootstrapping with Seed Types: Templates Document Collection ...Google to buy DoubleClick... ...Microsoft earnings hurt by Nokia acquisition… ...Google acquired YouTube… …
Seed Templates
Bootstrapping Machine (BRET) (rely on input seeds and contextual similarity)
Output
Intern © Siemens AG 2017 Seite 27
May 2017
Corporate Technology
Bootstrapping with Seed Types: Entity Pairs + Templates Document Collection ...Google to buy DoubleClick... ...Microsoft earnings hurt by Nokia acquisition… ...Google acquired YouTube… …
Joint Bootstrapping Machine (BREJ) Seed: Entity-pairs + Templates
(rely on input seeds and contextual similarity)
Output
Intern © Siemens AG 2017 Seite 28
May 2017
Corporate Technology
Bootstrapping Formulation: BREX
Intern © Siemens AG 2017 Seite 29
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Intern © Siemens AG 2017 Seite 30
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Intern © Siemens AG 2017 Seite 31
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Intern © Siemens AG 2017 Seite 32
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Similar context + entity pairs
Similar context, but entity pairs does not match to seed
match to seed set
Intern © Siemens AG 2017 Seite 33
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Intern © Siemens AG 2017 Seite 34
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Intern © Siemens AG 2017 Seite 35
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Entity Pair : BREE
Can not learn “ ’s purchase of ” for REL acquired with low confidence due to tiny seed set
Intern © Siemens AG 2017 Seite 36
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Intern © Siemens AG 2017 Seite 37
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Intern © Siemens AG 2017 Seite 38
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Intern © Siemens AG 2017 Seite 39
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Intern © Siemens AG 2017 Seite 40
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Intern © Siemens AG 2017 Seite 41
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Precision-Oriented !!
Intern © Siemens AG 2017 Seite 42
May 2017
Corporate Technology
Bootstrapping Relation Extractor with Templates : BRET
Can not capture reliable patterns e.g. “ ’s pursuit of ” for REL acquired due to semantically distant from the tiny seed set
Intern © Siemens AG 2017 Seite 43
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Intern © Siemens AG 2017 Seite 44
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Disjunctive Matching
Intern © Siemens AG 2017 Seite 45
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Contains instances due to both entity pairs and templates
Intern © Siemens AG 2017 Seite 46
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Intern © Siemens AG 2017 Seite 47
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Intern © Siemens AG 2017 Seite 48
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Intern © Siemens AG 2017 Seite 49
May 2017
Corporate Technology
Joint Bootstrapping with Entity Pairs + Templates: BREJ
Amplified Extractor Confidence due to effective estimation using entity-pairs and templates
Intern © Siemens AG 2017 Seite 50
May 2017
Corporate Technology
WHY Joint Bootstrapping Machine (BREJ) ?
Joint Bootstrapping for Relation Extraction (BREJ): an alternative to the entity-pair-centered bootstrapping advantage of both entity-pair and template-centered methods jointly learn extractors consisting of instances due to both entity pair and template seeds
The proposed BREJ offers to: Scale up positives in reliable extractors Boost extractors’ confidence Generate High Confidence extractions Better deal with semantic drift Improve system recall Intern © Siemens AG 2017 Seite 51
May 2017
Corporate Technology
Cross-Context Attentive Similarity Today Microsoft acquires Nokia for $8 Billion
BEFORE (v-1)
BETWEEN (v0)
AFTER (v1)
In BREE/BREDS (Batista et al., 2015) system:
BEF
[X]
W-1 (=0.0)
today
BEF
acquires
BET
W0 (=1.0) BET
AFT
acquire
[Y]
W1 (=0.0)
j
Tech company Microsoft earnings hurt by Nokia acquisition
for $8 billion
AFT
i
[X]
Tech company
BEF
Earnings hurt by
BET
Proposed Similarity measure: BET
acquire
[Y]
max
acquisition
AFT
Intern © Siemens AG 2017 Seite 52
May 2017
Corporate Technology
Experimental Evaluation: Data and Evaluation • Use processed dataset (Batista et al., 2015) of 1.2 million sentences from new articles • Bootstrap four relationships: - acquired (ORG-ORG), - founder-of (ORG-PER), - headquartered (ORG-LOC) and - affiliation (ORG-PER) • Bootstrap with seed entity-pairs (BREDS), templates (BRET) and both (BREJ) • Investigate four similarity measures • Evaluation based on: - Bronzi et al. (2012) to estimate precision and recall using FreebaseEasy (Bast et al., 2014) : Pointwise Mutual Information (PMI) (Turney, 2001) to evaluate system automatically Intern © Siemens AG 2017 Seite 53
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
Intern © Siemens AG 2017 Seite 54
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
Intern © Siemens AG 2017 Seite 55
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
Intern © Siemens AG 2017 Seite 56
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
Intern © Siemens AG 2017 Seite 57
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
Intern © Siemens AG 2017 Seite 58
May 2017
Corporate Technology
Experimental Evaluation: State-of-the-art Comparison
0.11+ F1
0.13+ F1
Intern © Siemens AG 2017 Seite 59
May 2017
Corporate Technology
Results Analysis: Scaling low-confidence extractions to high-confidence
Different thresholds over the extracted instances for acquired relationship: Evaluate only the extractions with score > threshold Instance Confidence scaled and moved from low-to-high confidence region - Higher #out, Recall and F1 in BREJ than BREE Intern © Siemens AG 2017 Seite 60
May 2017
Corporate Technology
Qualitative Analysis: High Confidence Relation Extractors
Intern © Siemens AG 2017 Seite 61
May 2017
Corporate Technology
Qualitative Analysis: High Confidence Relation Extractors
Intern © Siemens AG 2017 Seite 62
May 2017
Corporate Technology
Qualitative Analysis: High Confidence Relation Extractors
Intern © Siemens AG 2017 Seite 63
May 2017
Corporate Technology
Summary !! Proposed a Joint Bootstrapping Machine (JBM) for relation extraction alternative to the entity-pair-centered bootstrapping takes advantage of both entity-pair and template-centered methods jointly learn extractors consisting of instances due to both entity pair and template seeds In results, BREJ offers to: Scale up the confidence score for reliable extractors Extract High Confidence relationship triples Better deal with semantic drift via effective confidence estimation Improve system recall Demonstrate (on average for four relationships): 0.11+ in F1 (0.85 Vs 0.74) due to BREJ 0.13+ in F1 (0.87 Vs 0.74) due to BREJ+cross-context attentive similarity
Intern © Siemens AG 2017 Seite 64
May 2017
Corporate Technology
Thanks !! 1.0 Related Work
(BREE)
(, 0.99)
High Confidence
(, 0.98) (, 0.97) (, 0.96) (, 0.95)
This Work: BREJ
(, 0.35) (, 0.30) (, 0.20)
Low Confidence
(, 0.15) (, 0.10)
.00 Intern © Siemens AG 2017 Seite 65
May 2017
Output Extractions, Confidence Corporate Technology
References Eugene Agichtein and Luis Gravano. 2000. Snowball: Extracting relations from large plain-text collections. In Proceedings of the 15th ACM conference on Digital libraries. Association for Computing Machinery, Washington, DC USA, pages 85–94 Gabor Angeli, Melvin Jose Johnson Premkumar, and Christopher D Manning. 2015. Leveraging linguistic structure for open domain information extraction. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing. Association for Computational Linguistics, Beijing, China, volume 1, pages 344– 354. Hannah Bast, Florian Baurle, Bj ¨ orn Buchhold, and El- ¨ mar Haußmann. 2014. Easy access to the freebase dataset. In Proceedings of the 23rd International Conference on World Wide Web. Association for Computing Machinery, Seoul, Republic of Korea, pages 95–98. David S. Batista, Bruno Martins, and Mario J. Silva. ´ 2015. Semi-supervised bootstrapping of relationship extractors with distributional semantics. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Lisbon, Portugal, pages 499–504. Sergey Brin. 1998. Extracting patterns and relations from the world wide web. In International Workshop on The World Wide Web and Databases. Springer, Valencia, Spain, pages 172–183. Mirko Bronzi, Zhaochen Guo, Filipe Mesquita, Denilson Barbosa, and Paolo Merialdo. 2012. Automatic evaluation of relation extraction systems on large-scale. In Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Webscale Knowledge Extraction (AKBCWEKEX). Association for Computational Linguistics, Montreal, Canada, pages 19–24. Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R. Hruschka Jr., and Tom M. Mitchell. 2010. Toward an architecture for neverending language learning. In Proceedings of the 24th National Conference on Artificial Intelligence (AAAI). Atlanta, Georgia USA, volume 5, page 3. Laura Chiticariu, Yunyao Li, and Frederick R. Reiss. 2013. Rule-based information extraction is dead! long live rule-based information extraction systems! In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Seattle, Washington USA, pages 827–832. Intern © Siemens AG 2017 Seite 70
May 2017
Corporate Technology
References Anthony Fader, Stephen Soderland, and Oren Etzioni. 2011. Identifying relations for open information extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Edinburgh, Scotland UK, pages 1535–1545. Pankaj Gupta, Thomas Runkler, Heike Adel, Bernt Andrassy, Hans-Georg Zimmermann, and Hinrich Schutze. 2015. Deep learning methods for the extraction of relations in natural language text. Technical report, Technical University of Munich, Germany. Pankaj Gupta, Hinrich Schutze, and Bernt Andrassy. 2016. Table filling multi-task recurrent neural network for joint entity and relation extraction. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers. Osaka, Japan, pages 2537–2547. Sonal Gupta, Diana L. MacLean, Jeffrey Heer, and Christopher D. Manning. 2014. Induced lexicosyntactic patterns improve information extraction from online medical forums. Journal of the American Medical Informatics Association 21(5):902– 909. Sonal Gupta and Christopher Manning. 2014. Improved pattern learning for bootstrapped entity extraction. In Proceedings of the 18th Conference on Computational Natural Language Learning (CoNLL). Association for Computational Linguistics, Baltimore, Maryland USA, pages 98–108. Marti A Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 15th International Conference on Computational Linguistics. Nantes, France, pages 539–545. Winston Lin, Roman Yangarber, and Ralph Grishman. 2003. Bootstrapped learning of semantic classes from positive and negative examples. In Proceedings of ICML 2003 Workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining. Washington, DC USA, page 21.
Intern © Siemens AG 2017 Seite 71
May 2017
Corporate Technology
References Mausam, Michael Schmitz, Robert Bart, Stephen Soderland, and Oren Etzioni. 2012. Open language learning for information extraction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, Jeju Island, Korea, pages 523–534. Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proceedings of the Workshop at the International Conference on Learning Representations. ICLR, Scottsdale, Arizona USA. Thien Huu Nguyen and Ralph Grishman. 2015. Relation extraction: Perspective from convolutional neural networks. In Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. ACL, Denver, Colorado USA, pages 39–48. Robert Parker, David Graff, Junbo Kong, Ke Chen, and Kazuaki Maeda. 2011. English gigaword. Linguistic Data Consortium . Jeffrey Pennington, Richard Socher, and Christopher Manning. 2014. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. ACL, Doha, Qatar, pages 1532–1543. Ellen Riloff. 1996. Automatically generating extraction patterns from untagged text. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI). Portland, Oregon USA, pages 1044– 1049. Peter D. Turney. 2001. Mining the web for synonyms: Pmi-ir versus lsa on toefl. In Proceedings of the 12th European Conference on Machine Learning. Springer, Freiburg, Germany, pages 491–502. Ngoc Thang Vu, Heike Adel, Pankaj Gupta, and Hinrich Schutze. 2016a. Combining recurrent and con- ¨ volutional neural networks for relation classification. In Proceedings of the 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). ACL, San Diego, California USA, pages 534–539. Ngoc Thang Vu, Pankaj Gupta, Heike Adel, and Hinrich Schutze. 2016b. Bi-directional recurrent neural network with ranking loss for spoken language understanding. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, Shanghai, China, pages 6060–6064. Intern © Siemens AG 2017 Seite 72
May 2017
Corporate Technology