Relation Classification for Clinical Data using Attention-based Neural Networks

Desh Raj (130101018)
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati

Objectives

Our objectives in this work are as follows:
• We aim to extract structured knowledge from biomedical data such as journal articles, discharge summaries, and electronic health records.
• To achieve this, we classify relations existing between entities such as problem, treatment, and test.
• We exploit the power of attention-based convolutional neural networks to focus on the important words that decide which class of relations an entity pair belongs to.

Proposed Methods

We propose two models for this purpose:
1. Attention for feature extraction [1]:
   i. Variant 1: Instance-based attention
   ii. Variant 2: Keyword-based attention
2. Multi-level attention: This adds attention-based pooling on top of the feature selection [2].

Conclusion

The attention model fails due to several theoretical and implementation-related reasons that we have identified. To alleviate these issues:
• Use relative probability as the degree of relevance.
• Downsample the “None” class during training.
• Implement attention-based pooling and compare results.
• Integrate domain knowledge for attention.
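The “None”-class downsampling suggested above could look like this minimal sketch; the keep ratio, data layout, and the name `downsample_none` are assumptions for illustration, not the project's actual code:

```python
import random

def downsample_none(sentences, labels, none_label="None", keep_ratio=0.3, seed=13):
    """Keep every positive-relation example, but only a random fraction of
    "None" examples, so the majority class does not swamp training."""
    rng = random.Random(seed)
    kept = [(s, y) for s, y in zip(sentences, labels)
            if y != none_label or rng.random() < keep_ratio]
    return [s for s, _ in kept], [y for _, y in kept]

# toy example: 8 "None" pairs and 2 labelled pairs
X = [f"sentence {i}" for i in range(10)]
y = ["None"] * 8 + ["TrIP", "TeCP"]
X_ds, y_ds = downsample_none(X, y)
```

Because the sampling is random per example, the class balance after downsampling varies run to run; fixing the seed keeps experiments reproducible.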

Figure: Instance-based attention model for feature extraction. In this model, the attention assigned to any word is the average of its inner products with the two entity words.
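The averaging step in this caption can be sketched in a few lines; this is a toy illustration with random vectors, and `instance_attention` and its shapes are assumptions rather than the reported implementation:

```python
import numpy as np

def instance_attention(word_vecs, e1_vec, e2_vec):
    """Instance-based attention: score each word by the average of its inner
    products with the two entity vectors, then normalize with a softmax."""
    scores = (word_vecs @ e1_vec + word_vecs @ e2_vec) / 2.0
    exp = np.exp(scores - scores.max())      # numerically stable softmax
    alpha = exp / exp.sum()
    weighted = alpha[:, None] * word_vecs    # re-weighted word features for the CNN
    return alpha, weighted

# toy sentence of 4 words with 3-dimensional embeddings
rng = np.random.default_rng(0)
words = rng.normal(size=(4, 3))
alpha, feats = instance_attention(words, words[0], words[3])
```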

Figure: Multi-level attention-based CNN. It consists of two attention layers: primary attention for feature selection, and secondary attention for pooling.
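A minimal sketch of the secondary (pooling) attention, simplified from the mechanism in [2]: a learned correlation matrix scores each convolution position against each relation class, and a weighted pooling replaces the usual max-pool. The shapes and the name `attention_pooling` are illustrative assumptions:

```python
import numpy as np

def attention_pooling(conv_out, rel_emb, U):
    """Secondary attention (simplified from [2]):
    conv_out: (n, d) convolution outputs at n window positions
    rel_emb:  (d, k) embeddings of the k relation classes
    U:        (d, d) learned correlation matrix
    Correlate each position with each class, softmax over positions,
    then take the max over classes of the attention-weighted features."""
    G = conv_out @ U @ rel_emb            # (n, k) position-class correlations
    E = np.exp(G - G.max(axis=0))
    A = E / E.sum(axis=0)                 # softmax over positions, per class
    return (conv_out.T @ A).max(axis=1)   # (d,) pooled feature vector

rng = np.random.default_rng(1)
pooled = attention_pooling(rng.normal(size=(5, 4)),   # 5 positions, d = 4
                           rng.normal(size=(4, 3)),   # k = 3 relation classes
                           rng.normal(size=(4, 4)))
```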

Motivation

Some words or phrases are more important than others for identifying a relation. Examples:

“The fire inside the World Trade Center was caused by exploding fuel.”

“The flowers are carried into the chapel.”

• The first example motivates attention at the feature layer → similarity with entity words or domain-specific keywords.
• The second example motivates attention-based pooling to identify important phrases → global feature extraction.

Important Result

Integrating domain knowledge to model the degree of relevance for attention improves performance significantly compared to an instance-based attention model.

Future Work

An attention-based pooling mechanism may learn to focus on important sentence-level features. Further, it may be important to study the effect of attention in conjunction with recurrent networks and LSTMs. These models should also be validated on larger and more unstructured data sets such as the DDI Extraction 2013 data set.

Experimental Results

Figure: Description of the partial i2b2 2010 data set used for evaluation.

Failure Analysis

1. The inner product of word vectors is not always effective for modeling attention.
2. The absolute attention value is drastically reduced in long sentences due to the softmax → the learning rate is compromised.
3. The PIP class is often wrongly classified due to the presence of strong indicators for other classes.
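Failure 2 above is easy to see numerically: with identical scores, softmax attention assigns each word 1/n, so absolute weights vanish in long sentences, while the relative probability proposed earlier (each weight divided by the maximum weight) stays on a fixed scale. A toy sketch:

```python
import numpy as np

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

for n in (10, 100):
    alpha = softmax(np.zeros(n))   # identical scores: every weight is 1/n
    rel = alpha / alpha.max()      # relative probability: stays at 1.0 regardless of n
    print(n, alpha[0], rel[0])
```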

References

[1] Sunil Kumar Sahu, Ashish Anand, Krishnadev Oruganty, and Mahanandeeshwar Gattu. Relation extraction from clinical texts using domain invariant convolutional neural network. In ACL BioNLP, 2016.
[2] Linlin Wang, Zhu Cao, Gerard de Melo, and Zhiyuan Liu. Relation classification via multi-level attention CNNs. In ACL, 2016.

Acknowledgements

I would like to thank my BTP supervisor, Prof. Ashish Anand, for his insights and suggestions, and the CSE department for the resources used in this project.

Table: Relation classification results using various models.

Model               Precision  Recall  F1-score
CNN [1]             71.0%      55.4%   60.3%
Att-CNN Variant 1   23.7%      5.6%    9.06%
Att-CNN Variant 2   51.6%      34.9%   41.2%

Figure: An attention model.

Figure: Degree of relevance for the sample sentence “We will monitor for signs of alcohol withdrawal according to the CIWA scale.” with relation TeCP. (a) Attention proportionate to the highest value. (b) Absolute attention assigned to different words. Red bars denote the words comprising the entity pair.

Contact Information

• Web: http://www.rdesh26.wixsite.com/home
• Email: [email protected]
• Phone: +91-8011025825