Learning the Scope of Negation via Shallow Semantic Parsing
Junhui Li, Guodong Zhou∗, Hongling Wang, Qiaoming Zhu
School of Computer Science and Technology, Soochow University at Suzhou
{lijunhui, gdzhou, redleaf, qmzhu}@suda.edu.cn
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), pages 671–679, Beijing, August 2010
∗ Corresponding author
Abstract
In this paper we present a simplified shallow semantic parsing approach to learning the scope of negation (SoN). This is done by formulating it as a shallow semantic parsing problem with the negation signal as the predicate and the negation scope as its arguments. Our parsing approach to SoN learning differs from the state-of-the-art chunking ones in two aspects. First, we extend SoN learning from the chunking level to the parse tree level, where structured syntactic information is available. Second, we focus on determining whether a constituent, rather than a word, is negated or not, via a simplified shallow semantic parsing framework. Evaluation on the BioScope corpus shows that structured syntactic information is effective in capturing the domination relationship between a negation signal and its dominated arguments. It also shows that our parsing approach substantially outperforms the state-of-the-art chunking ones.
1 Introduction
Whereas negation in predicate logic is well-defined and syntactically simple, negation in natural language is much more complex. Generally, learning the scope of negation involves two subtasks: negation signal finding and negation scope finding. The former decides whether the words in a sentence are negation signals (i.e., words indicating negation, e.g., no, not, fail, rather than), where the semantic information of the words, rather than the syntactic information, plays a critical role. The latter determines the sequences of words in the sentence which are negated by the given negation signal. Compared with negation scope finding, negation signal finding is much simpler and has been well resolved in the literature, e.g., with accuracies of 95.8%-98.7% on the three subcorpora of the BioScope corpus (Morante and Daelemans, 2009). In this paper, we focus on negation scope finding instead. That is, we assume that golden negation signals are given.
Finding negative assertions is essential in information extraction (IE), where in general, the aim is to derive factual knowledge from free text. For example, Vincze et al. (2008) pointed out that the extracted information within the scopes of negation signals should either be discarded or presented separately from factual information. This is especially important in the biomedical domain, where various linguistic forms are used extensively to express impressions, hypothesized explanations of experimental results or negative findings. Szarvas et al. (2008) reported that 13.45% of the sentences in the abstracts subcorpus of the BioScope corpus and 12.70% of the sentences in the full papers subcorpus of the BioScope corpus contain negative assertions.
In addition to the IE tasks in the biomedical domain, SoN learning has attracted increasing attention in some natural language processing (NLP) tasks, such as sentiment classification (Turney, 2002). For example, in the sentence “The chair is not comfortable but cheap”, although the polarities of the words “comfortable” and “cheap” are both positive, the polarity of “the chair” regarding the attribute “cheap” remains positive while the polarity of “the chair” regarding the attribute “comfortable” is reversed due to the negation signal “not”.
Most of the initial research on SoN learning focused on negated terms finding, using either heuristic rules (e.g., regular expressions) or machine learning methods (Chapman et al., 2001; Huang and Lowe, 2007; Goldin and Chapman, 2003). Negation scope finding was largely ignored until the recent release of the BioScope corpus (Szarvas et al., 2008; Vincze et al., 2008). Morante et al. (2008) and Morante and Daelemans (2009) pioneered the research on negation scope finding by formulating it as a chunking problem, which classifies the words of a sentence as being inside or outside the scope of a negation signal. However, this chunking approach suffers from low performance, in particular on long sentences, because it ignores structured syntactic information. For example, given golden negation signals on the BioScope corpus, Morante and Daelemans (2009) achieved only 50.26% in PCS (percentage of correct scope) measure on the full papers subcorpus (22.8 words per sentence on average), compared to 87.27% in PCS measure on the clinical reports subcorpus (6.6 words per sentence on average).
This paper explores negation scope finding from a parse tree perspective and formulates it as a shallow semantic parsing problem, which has been extensively studied in the past few years (Carreras and Màrquez, 2005). In particular, the negation signal is recast as the predicate and the negation scope is recast as its arguments. The motivation is that structured syntactic information plays a critical role in negation scope finding and deserves much more attention, as indicated by previous studies in shallow semantic parsing (Gildea and Palmer, 2002; Punyakanok et al., 2005). Our parsing approach to negation scope finding differs from the state-of-the-art chunking ones in two aspects. First, we extend negation scope finding from the chunking level to the parse tree level, where structured syntactic information is available. Second, we focus on determining whether a constituent, rather than a word, is negated or not. Evaluation on the BioScope corpus shows that our parsing approach substantially outperforms the state-of-the-art chunking ones.
The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the BioScope corpus on which our approach is evaluated. Section 4 describes our parsing approach by formulating negation scope finding as a simplified shallow semantic parsing problem. Section 5 presents the experimental results. Finally, Section 6 concludes the work.
2 Related Work
While there is a certain amount of literature within the NLP community on negated terms finding (Chapman et al., 2001; Huang and Lowe, 2007; Goldin and Chapman, 2003), there are only a few studies on negation scope finding (Morante et al., 2008; Morante and Daelemans, 2009).
Negated terms finding. Rule-based methods dominated the initial research on negated terms finding. As a representative, Chapman et al. (2001) developed a simple regular expression-based algorithm to detect negation signals and identify medical terms which fall within the negation scope. They found that this algorithm can effectively identify a large portion of the pertinent negative statements in discharge summaries when determining whether a finding or disease is absent. Besides, Huang and Lowe (2007) first proposed some heuristic rules from a parse tree perspective to identify negation signals, taking advantage of syntactic parsing, and then located negated terms in the parse tree using a corresponding negation grammar. As an alternative to the rule-based methods, various machine learning methods have been proposed for finding negated terms. As a representative, Goldin and Chapman (2003) adopted both Naïve Bayes and decision trees to distinguish whether an observation is negated by the negation signal “not” in hospital reports.
Negation scope finding. Morante et al. (2008) pioneered the research on negation scope finding, largely due to the availability of a large-scale annotated corpus, the BioScope corpus. They approached the negation scope finding task as a chunking problem which predicts whether a word in the sentence is inside or outside of the negation scope, with proper post-processing to ensure the continuity of the negation scope. Morante and Daelemans (2009) further improved the performance by combining several classifiers.
Similar to SoN learning, there are some efforts in the NLP community on learning the scope of speculation. As a representative, Özgür and Radev (2009) divided speculation learning into two subtasks: speculation signal finding and speculation scope finding. In particular, they formulated speculation signal finding as a classification problem while employing some heuristic rules from the parse tree perspective for speculation scope finding.
3 Negation in the BioScope Corpus
This paper employs the BioScope corpus (Szarvas et al., 2008; Vincze et al., 2008)¹, a freely downloadable negation resource from the biomedical domain, as the benchmark corpus. In this corpus, every sentence is annotated with negation signals and speculation signals (if any), as well as their linguistic scopes. Figure 1 shows a self-explanatory example. In this paper, we only consider negation signals, rather than speculation ones. Our statistics show that 96.57%, 3.23% and 0.20% of negation signals are represented by one word, two words and three or more words, respectively. Additionally, adverbs (e.g., not, never) and determiners (e.g., no, neither) account for 45.66% and 30.99% of negation signals, respectively.
These findings indicate that corticosteroid resistance in bronchial asthma can not be explained by abnormalities in corticosteroid receptor characteristics.
Figure 1: An annotated sentence in the BioScope corpus.
The BioScope corpus consists of three subcorpora: the full papers and the abstracts from the GENIA corpus (Collier et al., 1999), and clinical (radiology) reports. Among them, the full papers subcorpus and the abstracts subcorpus come from the same genre, and thus share some common statistical characteristics, such as the number of words in the negation scope to the right (or left) of the negation signal and the average scope length. In comparison, the clinical reports subcorpus consists of clinical radiology reports with short sentences. For detailed statistics about the three subcorpora, please see Morante and Daelemans (2009).
For preprocessing, all the sentences in the BioScope corpus are tokenized and then parsed using the Berkeley parser² (Petrov and Klein, 2007) trained on the GENIA TreeBank (GTB) 1.0 (Tateisi et al., 2005)³, which is a bracketed corpus in (almost) PTB style. 10-fold cross-validation on GTB 1.0 shows that the parser achieves an F1-measure of 86.57. It is worth noting that the GTB 1.0 corpus includes all the sentences in the abstracts subcorpus of the BioScope corpus.
4 Negation Scope Finding via Shallow Semantic Parsing
In this section, we first formulate the negation scope finding task as a shallow semantic parsing problem. Then, we deal with it using a simplified shallow semantic parsing framework.
4.1 Formulating Negation Scope Finding as a Shallow Semantic Parsing Problem
Given a parse tree and a predicate in it, shallow semantic parsing recognizes and maps all the constituents in the sentence into their corresponding semantic arguments (roles) of the predicate. As far as negation scope finding is concerned, the negation signal can be regarded as the predicate⁴, while the scope of the negation signal can be mapped into several constituents which are negated and thus can be regarded as the arguments of the negation signal. In particular, given a negation signal and its negation scope which covers word_m, …, word_n, we adopt the following two heuristic rules to map the negation scope of the negation signal into several constituents which can be deemed as its arguments in the given parse tree.
1) The negation signal itself and all of its ancestral constituents are non-arguments.
2) If constituent X is an argument of the given negation signal, then X should be the highest constituent dominated by the scope of word_m, …, word_n. That is to say, X’s parent constituent must cross-bracket or include the scope of word_m, …, word_n.
¹ http://www.inf.u-szeged.hu/rgai/bioscope
² http://code.google.com/p/berkeleyparser/
³ http://www-tsujii.is.s.u-tokyo.ac.jp/GENIA
⁴ If a negation signal consists of multiple words (e.g., rather than), the last word (e.g., than) is chosen to represent the negation signal.
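[Figure 2 (parse tree): (S0,11 (NP0,1 These findings) (VP2,11 (VBP2,2 indicates) (SBAR3,11 (IN3,3 that) (S4,11 (NP4,5 corticosteroid resistance) (VP6,11 (MD6,6 can) (RB7,7 not) (VP8,11 (VB8,8 be) (VP9,11 explained by abnormalities))))))). RB7,7 is marked as the predicate; NP4,5, MD6,6 and VP8,11 are marked as its arguments.]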
Figure 2: An illustration of a negation signal and its arguments in a parse tree.
The first rule ensures that no argument covers the negation signal while the second rule ensures no overlap between any two arguments. For example, in the sentence “These findings indicate that corticosteroid resistance can not be explained by abnormalities”, the negation signal “can not” has the negation scope “corticosteroid resistance can not be explained by abnormalities”. As shown in Figure 2, the node “RB7,7” (i.e., not) represents the negation signal “can not” while its arguments include three constituents {NP4,5, MD6,6, and VP8,11}.
It is worth noting that according to the above rules, negation scope finding via shallow semantic parsing, i.e., determining the arguments of a given negation signal, is robust to some variations in parse trees. This is also empirically justified by our later experiments. For example, if the VP6,11 in Figure 2 is incorrectly expanded by the rule VP6,11→MD6,6+RB7,7+VB8,8+VP9,11, the negation scope of the negation signal “can not” can still be correctly detected as long as {NP4,5, MD6,6, VB8,8, and VP9,11} are predicted as the arguments of the negation signal “can not”.
Compared with common shallow semantic parsing, which needs to assign a semantic label to each argument, negation scope finding does not involve semantic label classification and thus can be divided into three consecutive phases: argument pruning, argument identification and post-processing.
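The two mapping rules of Section 4.1 can be illustrated with a small sketch. The `Node` class, `figure2_tree` and `scope_to_arguments` below are hypothetical names introduced only for this illustration; they are not the authors' implementation. The sketch reads rule 2 as "take the highest constituents that lie entirely inside the scope" and applies rule 1 by never accepting the negation signal or any of its ancestors.

```python
class Node:
    """A parse-tree constituent covering words start..end (inclusive)."""

    def __init__(self, label, start, end, children=()):
        self.label, self.start, self.end = label, start, end
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    @property
    def name(self):
        return f"{self.label}{self.start},{self.end}"


def figure2_tree():
    """Builds the tree of Figure 2 and returns (root, negation signal node)."""
    rb = Node("RB", 7, 7)
    vp8 = Node("VP", 8, 11, [Node("VB", 8, 8), Node("VP", 9, 11)])
    vp6 = Node("VP", 6, 11, [Node("MD", 6, 6), rb, vp8])
    s4 = Node("S", 4, 11, [Node("NP", 4, 5), vp6])
    sbar = Node("SBAR", 3, 11, [Node("IN", 3, 3), s4])
    vp2 = Node("VP", 2, 11, [Node("VBP", 2, 2), sbar])
    return Node("S", 0, 11, [Node("NP", 0, 1), vp2]), rb


def scope_to_arguments(root, signal, scope_start, scope_end):
    """Maps a gold scope (word indices, inclusive) to argument constituents."""
    # Rule 1: the signal and all of its ancestors are non-arguments.
    blocked = set()
    node = signal
    while node is not None:
        blocked.add(id(node))
        node = node.parent

    arguments = []

    def visit(node):
        inside = scope_start <= node.start and node.end <= scope_end
        if inside and id(node) not in blocked:
            # Rule 2: keep the highest constituent inside the scope
            # and do not descend any further below it.
            arguments.append(node)
            return
        for child in node.children:
            visit(child)

    visit(root)
    return arguments


root, signal = figure2_tree()
print([n.name for n in scope_to_arguments(root, signal, 4, 11)])
```

For the scope covering words 4 to 11, the sketch returns the three constituents named in the text, NP4,5, MD6,6 and VP8,11.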
4.2 Argument Pruning
Similar to the predicate-argument structures in common shallow semantic parsing, the negation signal-scope structures in negation scope finding can also be classified into several types, and argument pruning can be done by employing several heuristic rules to filter out constituents that are most likely non-arguments of a negation signal. Similar to the heuristic algorithm proposed by Xue and Palmer (2004) for argument pruning in common shallow semantic parsing, the argument pruning algorithm adopted here starts by designating the negation signal as the current node and collecting its siblings. It then iteratively moves one level up to the parent of the current node and collects its siblings. The algorithm ends when it reaches the root of the parse tree. To sum up, except for the negation signal and its ancestral constituents, any constituent in the parse tree whose parent covers the given negation signal is collected as an argument candidate. Taking the negation signal node “RB7,7” in Figure 2 as an example, constituents {MD6,6, VP8,11, NP4,5, IN3,3, VBP2,2, and NP0,1} are collected in turn as its argument candidates.
4.3 Argument Identification
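Argument identification operates on the candidates produced by the pruning step of Section 4.2. As a minimal sketch of that candidate collection (the `Node` class, `figure2_tree` and `collect_candidates` are illustrative names assumed here, not taken from the paper), the bottom-up sibling walk can be written as:

```python
class Node:
    """A parse-tree constituent covering words start..end (inclusive)."""

    def __init__(self, label, start, end, children=()):
        self.label, self.start, self.end = label, start, end
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    @property
    def name(self):
        return f"{self.label}{self.start},{self.end}"


def figure2_tree():
    """Builds the tree of Figure 2 and returns (root, negation signal node)."""
    rb = Node("RB", 7, 7)
    vp8 = Node("VP", 8, 11, [Node("VB", 8, 8), Node("VP", 9, 11)])
    vp6 = Node("VP", 6, 11, [Node("MD", 6, 6), rb, vp8])
    s4 = Node("S", 4, 11, [Node("NP", 4, 5), vp6])
    sbar = Node("SBAR", 3, 11, [Node("IN", 3, 3), s4])
    vp2 = Node("VP", 2, 11, [Node("VBP", 2, 2), sbar])
    return Node("S", 0, 11, [Node("NP", 0, 1), vp2]), rb


def collect_candidates(signal):
    """Collects the siblings of the signal, then of each of its ancestors."""
    candidates = []
    current = signal
    while current.parent is not None:  # stop once the root is reached
        for sibling in current.parent.children:
            if sibling is not current:
                candidates.append(sibling)
        current = current.parent  # move one level up
    return candidates


_, signal = figure2_tree()
print([n.name for n in collect_candidates(signal)])
```

On the tree of Figure 2 this yields MD6,6, VP8,11, NP4,5, IN3,3, VBP2,2 and NP0,1, in the order given in Section 4.2. Neither the signal nor any of its ancestors can appear in the output, since only siblings along the path to the root are collected.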
Here, a binary classifier is applied to classify the argument candidates as either valid arguments or non-arguments. Similar to argument identification in common shallow semantic parsing, the structured syntactic information plays a critical role in negation scope finding.
Basic Features. Table 1 lists the basic features for argument identification. These features are also widely used in common shallow semantic parsing for both verbal and nominal predicates (Xue, 2008; Li et al., 2009).
Feature   Remarks
b1        Negation: the stem of the negation signal, e.g., not, rather_than. (can_not)
b2        Phrase Type: the syntactic category of the argument candidate. (NP)
b3        Path: the syntactic path from the argument candidate to the negation signal. (NP<S>VP>RB)
b4        Position: the positional relationship of the argument candidate with the negation signal. “left” or “right”. (left)
Table 1: Basic features and their instantiations for argument identification in negation scope finding, with NP4,5 as the focus constituent (i.e., the argument candidate) and “can not” as the given negation signal, regarding Figure 2.
Additional Features. To capture more useful information in the negation signal-scope structures, we also explore various kinds of additional features. Table 2 shows the features that better capture the details regarding the argument candidate and the negation signal. In particular, we categorize the additional features into three groups according to their relationship with the argument candidate (AC, in short) and the given negation signal (NS, in short).
Feature   Remarks
argument candidate (AC) related
ac1       the headword (ac1H) and its POS (ac1P). (resistance, NN)
ac2       the left word (ac2W) and its POS (ac2P). (that, IN)
ac3       the right word (ac3W) and its POS (ac3P). (can, MD)
ac4       the phrase type of its left sibling (ac4L) and its right sibling (ac4R). (NULL, VP)
ac5       the phrase type of its parent node. (S)
ac6       the subcategory. (S:NP+VP)
combined features (AC1-AC2): b2&fc1H, b2&fc1P
negation signal (NS) related
ns1       its POS. (RB)
ns2       its left word (ns2L) and right word (ns2R). (can, be)
ns3       the subcategory. (VP:MD+RB+VP)
ns4       the phrase type of its parent node. (VP)
combined features (NS1-NS2): b1&ns2L, b1&ns2R
NS-AC-related
nsac1     the compressed path of b3: compressing sequences of identical labels into one. (NP<S>VP>RB)
nsac2     whether AC and NS are adjacent in position. “yes” or “no”. (no)
combined features (NSAC1-NSAC7): b1&b2, b1&b3, b1&nsac1, b3&NS1, b3&NS2, b4&NS1, b4&NS2
Table 2: Additional features and their instantiations for argument identification in negation scope finding, with NP4,5 as the focus constituent (i.e., the argument candidate) and “can not” as the given negation signal, regarding Figure 2.
Some of the features proposed above may not be effective in argument identification. Therefore, we adopt the greedy feature selection algorithm described in Jiang and Ng (2006) to select useful features incrementally according to their contributions on the development data. The algorithm repeatedly selects the single feature which contributes most, and stops when adding any of the remaining features fails to improve the performance. As far as the negation scope finding task is concerned, the whole feature selection process can be done by first running the selection algorithm with the basic features (b1-b4) and then incrementally picking up effective features from (ac1-ac6, AC1-AC2, ns1-ns4, NS1-NS2, nsac1-nsac2, and NSAC1-NSAC7).
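The greedy feature selection procedure described in this section can be sketched as a forward search. This is only an illustration under stated assumptions: `greedy_select` and `toy_score` are hypothetical names, the `evaluate` callback stands in for training the classifier and measuring performance on the development data, and the toy gains are invented numbers rather than results from the paper.

```python
def greedy_select(base, pool, evaluate):
    """Forward selection: add the best remaining feature until none helps.

    base     -- features that are always used (here: b1-b4)
    pool     -- additional candidate features
    evaluate -- callable mapping a frozenset of features to a score
                (assumed: higher is better, e.g. PCS on development data)
    """
    selected = list(base)
    remaining = list(pool)
    best = evaluate(frozenset(selected))
    while remaining:
        # Score every remaining feature when added to the current set.
        scored = [(evaluate(frozenset(selected + [f])), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no remaining feature improves the performance
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


def toy_score(features):
    """Invented additive scores, used only to exercise the search."""
    gains = {"NSAC5": 4.0, "ns2R": 2.0, "ac5": -1.0}
    return 70.0 + sum(gains.get(f, 0.0) for f in features)


selected, score = greedy_select(
    ["b1", "b2", "b3", "b4"], ["ac5", "ns2R", "NSAC5"], toy_score
)
print(selected, score)
```

With the toy scores, the search adds NSAC5 first, then ns2R, and stops because adding ac5 would lower the score. In a real setting each call to `evaluate` is a full train-and-test cycle, so the cost grows quadratically with the size of the feature pool.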
4.4 Post-Processing
Although a negation signal in the BioScope corpus always has only one continuous block as its negation scope (including the negation signal itself), the negation scope finder may produce a discontinuous negation scope due to independent prediction in the argument identification phase. Given the golden negation signals, we observed that 6.2% of the negation scopes predicted by our negation scope finder are discontinuous. Figure 3 demonstrates the projection of all the argument candidates onto the word level. According to our argument pruning algorithm in Section 4.2, except for the words of the negation signal, the projection covers the whole sentence and each constituent (LACi or RACj in Figure 3) receives a probability distribution of being an argument of the given negation signal in the argument identification phase.
The evaluation is made using accuracy. We report accuracy using three measures: PCLB and PCRB, which indicate the percentages of correct left boundaries and correct right boundaries, respectively, and PCS, which indicates the percentage of correct scopes as a whole.
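To make the three measures concrete, the following sketch computes them from gold and predicted scopes. It assumes that each scope is a single continuous block represented as a (left, right) pair of word indices; the function name `scope_accuracy` and the example spans are hypothetical and do not come from the paper.

```python
def scope_accuracy(gold, predicted):
    """Returns PCLB, PCRB and PCS (in percent) for paired scope lists.

    gold, predicted -- equally long lists of (left, right) word indices,
                       one pair per negation signal
    """
    if not gold or len(gold) != len(predicted):
        raise ValueError("need two non-empty lists of equal length")
    total = len(gold)
    pairs = list(zip(gold, predicted))
    left_ok = sum(g[0] == p[0] for g, p in pairs)   # correct left boundary
    right_ok = sum(g[1] == p[1] for g, p in pairs)  # correct right boundary
    scope_ok = sum(g == p for g, p in pairs)        # both boundaries correct
    return {
        "PCLB": 100.0 * left_ok / total,
        "PCRB": 100.0 * right_ok / total,
        "PCS": 100.0 * scope_ok / total,
    }


gold = [(4, 11), (0, 3), (2, 5), (1, 6)]
pred = [(4, 11), (0, 2), (3, 5), (1, 6)]
print(scope_accuracy(gold, pred))
```

In this made-up example three of the four left boundaries and three of the four right boundaries are correct, but only two scopes match on both sides, so PCLB and PCRB are 75.0 while PCS is 50.0. PCS can never exceed either boundary measure, since a correct scope requires both boundaries to be correct.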
5.2
In order to select beneficial features from the additional features proposed in Section 4.3, we randomly split the abstracts subcorpus into training and development datasets in a proportion of 4:1. After performing the greedy feature selection algorithm on the development data, features {NSAC5, ns2R, NS1, ac1P, ns3, NSAC7, ac4R} are selected successively for argument identification. Table 3 presents the effect of the selected features in an incremental way on the development data. It shows that the additional features significantly improve the performance by 11.66 percentage points in PCS measure, from 74.93% to 86.59% (χ²; p < 0.01).
[Figure 3 (diagram): the argument candidates projected onto the word sequence in the order LACm, …, LAC1, negation signal, RAC1, …, RACn, with m candidates to the left of the negation signal and n candidates to its right.]
Figure 3: Projecting the left and the right argument candidates into the word level.
Since a negation signal is deemed inside of its negation scope in the BioScope corpus, our post-processing algorithm first includes the negation signal in its scope and then starts to identify the left and the right scope boundaries, respectively. As shown in Figure 3, the left boundary has m+1 possibilities, namely the negation signal itself, the leftmost word of constituent LACi (1