Natural Language Processing and Text Mining for BIOfid
Giuseppe Abrami, Sajawel Ahmed, Rüdiger Gleim, Wahed Hemati, Alexander Mehler, Tolga Uslu
Goethe-Universität Frankfurt, Text Technology Lab (TTLab)

08.03.2018

1st meeting of the Scientific Advisory Board on FID Biodiversity Research

Motivation Text sample

Motivation Semantic tagging

Motivation "Three-World" approach

Motivation Natural Language Processing: Example Frame Analysis

Motivation Semantic search: Scenarios

Agenda
1. Motivation
2. Natural Language Processing
3. Ontologies
4. Machine Learning
5. Annotation
6. Conclusion

Natural Language Processing Example: Parsing

Natural Language Processing Result annotation: UIMA XMI and TEI P5

Natural Language Processing The BIOfid-NLP-Model

Natural Language Processing TextImager

Ontologies Architecture

Ontologies User Interface

Ontologies Resources I

Ontologies Resources II

Ontologies Ontology Matching

Machine Learning ML Architecture: Entire model

Machine Learning ML Architecture: Model components

Machine Learning ML Architecture: ML Model Class

Machine Learning ML Architecture: Structural Semantics

Machine Learning ML Architecture: Named Entity Recognition

Machine Learning ML Architecture: Named Entity & Kind Recognition

Machine Learning ML Architecture: Natural Language Inference (NLI)

Machine Learning ML Architecture: Ontology Enrichment

Annotation Requirements analysis

Annotation TextAnnotator (Helfrich et al. 2018; Abrami & Mehler 2018)

Conclusion

Bibliography

Abrami, Giuseppe and Alexander Mehler (2018). "A UIMA Database Interface for Managing NLP-related Text Annotations". In: Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 7 - 12. LREC 2018. accepted. Miyazaki, Japan.
Ahmed, Sajawel (2017). "Steps towards Question Answering with Deep Neural Memory Networks". http://publica.fraunhofer.de/documents/N-467083.html. Master Thesis. Fraunhofer IAIS, PricewaterhouseCoopers, Goethe University Frankfurt.
Batista-Navarro, R., M.A. LaPorte, M. Regan, W. Ulate and Claus Weiland (Sep. 2018). "Workflow of text-mining, term recommendation and data annotation pipeline". In: Proceedings of ICEI 10th International Conference on Ecological Informatics. submitted. Jena, Germany.
Bohnet, Bernd and Joakim Nivre (2012). "A Transition-based System for Joint Part-of-speech Tagging and Labeled Non-projective Dependency Parsing". In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. EMNLP-CoNLL ’12. Jeju Island, Korea: Association for Computational Linguistics, pp. 1455–1465.
Bowman, Samuel R., Gabor Angeli, Christopher Potts and Christopher D. Manning (2015). "A large annotated corpus for learning natural language inference". In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics.
Finkel, Jenny Rose, Trond Grenager and Christopher Manning (2005). "Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling". In: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics. ACL ’05. Ann Arbor, Michigan: Association for Computational Linguistics, pp. 363–370.
Gleim, Rüdiger et al. (2018). "Practitioner’s view: A comparison and a survey of lemmatization and morphological tagging in German and Latin". In: Journal of Language Modelling. submitted.
Hemati, Wahed, Tolga Uslu and Alexander Mehler (2016). "TextImager: a Distributed UIMA-based System for NLP". In: Proceedings of the COLING 2016 System Demonstrations. Federated Conference on Computer Science and Information Systems. Osaka, Japan.
Komninos, Alexandros and Suresh Manandhar (2016). "Dependency based embeddings for sentence classification tasks". In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 1490–1500.
Levy, Omer and Yoav Goldberg (2014). "Dependency-based word embeddings". In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Vol. 2, pp. 302–308.
Levy, Omer, Yoav Goldberg and Ido Dagan (2015). "Improving Distributional Similarity with Lessons Learned from Word Embeddings". In: Transactions of the Association for Computational Linguistics 3, pp. 211–225.
Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S Corrado and Jeff Dean (2013). "Distributed representations of words and phrases and their compositionality". In: Advances in neural information processing systems, pp. 3111–3119.
Moro, Andrea, Alessandro Raganato and Roberto Navigli (2014). "Entity Linking meets Word Sense Disambiguation: a Unified Approach". In: Transactions of the Association for Computational Linguistics (TACL) 2, pp. 231–244.
Naderi, Nona and Graeme Hirst (2017). "Classifying Frames at the Sentence Level in News Articles". In: Policy 9, pp. 4–233.
Parsons, Terence (1994). Events in the Semantics of English: A Study in Subatomic Semantics. Vol. 19. Current Studies in Linguistics Series. Cambridge: MIT Press.
Pilehvar, Mohammad Taher and Roberto Navigli (2015). "From senses to texts: An all-in-one graph-based approach for measuring semantic similarity". In: Artificial Intelligence 228, pp. 95–128.
Remus, Robert, Uwe Quasthoff and Gerhard Heyer (2010). "SentiWS-A Publicly Available German-language Resource for Sentiment Analysis". In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC’10). Malta.
Strötgen, Jannik and Michael Gertz (June 2013). "Multilingual and cross-domain temporal tagging". In: Language Resources and Evaluation 47.2, pp. 269–298.
Uslu, Tolga, Alexander Mehler, Daniel Baumartz, Alexander Henlein and Wahed Hemati (2018). "fastSense: An Efficient Word Sense Disambiguation Classifier". In: Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 7 - 12. LREC 2018. accepted. Miyazaki, Japan.
Uslu, Tolga, Alexander Mehler, Andreas Niekler and Wahed Hemati (2018). "Towards a DDC-based Topic Network Model of Wikipedia". In: Proceedings of 2nd International Workshop on Modeling, Analysis, and Management of Social Networks and their Applications (SOCNET 2018), February 28, 2018. accepted.
Wang, Zhen, Jianwen Zhang, Jianlin Feng and Zheng Chen (2014). "Knowledge Graph Embedding by Translating on Hyperplanes". In: AAAI. Vol. 14, pp. 1112–1119.

Bibliography How to cite these slides
Abrami, G., Ahmed, S., Gleim, R., Hemati, W., Mehler, A. and Uslu, T. (March 2018). Natural Language Processing and Text Mining for BIOfid. Presentation at the 1st Meeting of the Scientific Advisory Board of the BIOfid Project.

BibTeX
@misc{Abrami:Ahmed:Gleim:Hemati:Mehler:Uslu:2018,
  author = {Abrami, Giuseppe and Ahmed, Sajawel and Gleim, R{\"u}diger and Hemati, Wahed and Mehler, Alexander and Uslu, Tolga},
  title = {{Natural Language Processing and Text Mining for BIOfid}},
  howpublished = {Presentation at the 1st Meeting of the Scientific Advisory Board of the BIOfid Project},
  address = {Goethe-University, Frankfurt am Main, Germany},
  year = {2018},
  month = {March},
  day = {08}
}
