Using Syntactic Information for Improving why-Question Answering
Suzan Verberne, Lou Boves, Nelleke Oostdijk, Peter-Arno Coppen
Radboud University Nijmegen
Presenter: Sai Qian

Structure
• Introduction & Related work
• Paragraph retrieval for why-QA
• Answer re-ranking
• Discussion
• Future directions

Introduction & Related work
• 5% of all questions posed to QA systems are why-questions
• Difference from factoid questions
– Answers cannot be stated in a single phrase
– Paragraph retrieval instead of named entity retrieval
• Improving the QA system
– Better retrieval technique (search engine)
– Better ranking system
• Does syntactic knowledge about the question and its answer help?

Introduction & Related work
• A substantial body of work improves QA systems by adding syntactic information
– Tiedemann, 2005
– Quarteroni et al., 2007
– Higashinaka and Isozaki, 2008
• Syntactic information gives a small but significant improvement on top of the traditional bag-of-words approach

Paragraph retrieval for why-QA
• Baseline system
– Wumpus Search Engine
– Question analysis
  • Remove stop words
  • Remove punctuation
  • Remains: set of question content words
– Ranking: QAP algorithm (passage scoring algorithm)
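The question analysis step of the baseline can be sketched roughly as follows. This is a minimal illustration and not the presenters' implementation: the stop word list is a small hypothetical one (the slides do not say which list was used), and the function name `question_content_words` is invented for this example.

```python
import string

# Hypothetical, deliberately small stop word list; the actual list
# used with the baseline system is not given in the slides.
STOP_WORDS = {
    "why", "do", "does", "did", "is", "are", "was", "were",
    "the", "a", "an", "of", "on", "in", "than", "to",
}


def question_content_words(question: str) -> list[str]:
    """Return the content words of a question, in their original order.

    Steps: lowercase, remove punctuation, split on whitespace,
    then drop every token that appears in STOP_WORDS.
    """
    no_punct = question.lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    return [tok for tok in no_punct.split() if tok not in STOP_WORDS]


print(question_content_words("Why do people sneeze?"))  # ['people', 'sneeze']
```

The remaining words would then be sent as the query to the retrieval engine; how the query is passed to Wumpus is not covered by this sketch.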

Answer re-ranking
• QAP algorithm (baseline system)
– Term overlap between query and passage
– Passage length
– Total corpus frequency for each term
• Example questions
– Why do people sneeze?
– Why do women live longer than men on average?
– Why are mountain tops cold?
• Aim: find the syntactic information that discloses a relation between the question and its answer

Answer re-ranking
• Re-ranking system
– Idea: term overlap
– Term: a subset of question terms
– Feature: a set of question items and a set of answer items
– Proportion: (formula omitted)
• Defined features: 32 in total
– F1: head; F2: modifier; F3: noun phrase
– F4: subject; F6: main verb; F10: direct object
– …
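To make the "feature = question items + answer items" idea concrete, here is a small sketch of one possible overlap proportion. The exact formula did not survive in the slide text, so the symmetric definition below is an assumption: items of each side that also occur on the other side, divided by the total number of items. The function name `overlap_score` is hypothetical.

```python
def overlap_score(question_items, answer_items):
    """Assumed symmetric overlap proportion for one feature.

    question_items: items extracted from the question for this feature
                    (for example, the heads or the main verb).
    answer_items:   items extracted from the candidate answer.
    Returns a value between 0.0 and 1.0.
    """
    q = list(question_items)
    a = list(answer_items)
    if not q and not a:
        return 0.0  # nothing to compare
    q_set, a_set = set(q), set(a)
    q_in_a = sum(1 for item in q if item in a_set)  # question items found in the answer
    a_in_q = sum(1 for item in a if item in q_set)  # answer items found in the question
    return (q_in_a + a_in_q) / (len(q) + len(a))


# One of two question items matches, one of three answer items matches: 2 / 5
print(overlap_score(["people", "sneeze"], ["sneeze", "reflex", "nose"]))  # 0.4
```

Under this assumption each of the 32 features would yield one such value per question-answer pair, computed over a different kind of syntactic item.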

Answer re-ranking
• Feature extraction
– Parser
  • Pelican Parser: more detailed
  • EP4IR Dependency Parser: more robust
– Lemmatization
  • “sailors of the old”
  • Applied only to verbs
• Re-ranking
– Scoring: 0-10 for each feature
– Feature selection: genetic algorithm (optimizing MRR, the mean reciprocal rank)
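The re-ranking and its evaluation measure can be sketched as below. Two assumptions are made here: the 0-10 score is read as a per-feature weight, and candidates are ordered by the weighted sum of their feature values. The names `rerank` and `mean_reciprocal_rank`, and the feature names in the example, are hypothetical. Only the MRR definition itself is standard: the mean over questions of 1/rank of the first relevant answer, counting 0 when none is retrieved.

```python
def rerank(candidates, weights):
    """Order candidate answers by a weighted sum of feature values.

    candidates: list of (answer_id, {feature_name: value}) pairs.
    weights:    {feature_name: weight}; here assumed to lie in 0..10.
    Returns answer ids, best first.
    """
    def total(features):
        return sum(weights.get(name, 0) * value for name, value in features.items())

    ordered = sorted(candidates, key=lambda cand: total(cand[1]), reverse=True)
    return [answer_id for answer_id, _ in ordered]


def mean_reciprocal_rank(rankings, relevant):
    """MRR over questions.

    rankings: one ranked list of answer ids per question.
    relevant: one set of correct answer ids per question.
    """
    if not rankings:
        return 0.0
    rr_sum = 0.0
    for ranked, gold in zip(rankings, relevant):
        for rank, answer_id in enumerate(ranked, start=1):
            if answer_id in gold:
                rr_sum += 1.0 / rank  # only the first relevant answer counts
                break
    return rr_sum / len(rankings)


candidates = [
    ("p1", {"qap": 0.5, "main_verb": 0.0}),
    ("p2", {"qap": 0.4, "main_verb": 1.0}),
]
print(rerank(candidates, {"qap": 10, "main_verb": 3}))  # ['p2', 'p1']
```

A genetic algorithm, as named on the slide, would search over weight vectors and use `mean_reciprocal_rank` on the training questions as its fitness function; that search loop is not shown here.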

Answer re-ranking
• Result
• Features that substantially contribute to the ranking score

Discussion
• Error analysis
– No effect: 35/93
  • 25/35: no relevant answer
  • 10/35: RR = 1
– Improved: 40/93
– Deteriorated: 18/93
– 11 drop out of the top 10; 22 enter the top 10

Discussion
• Examples of deteriorated QA pairs
– Why do neutral atoms have the same number of protons as electrons? (answer in “Oxidation number”)
– Why do flies walk on food? (answer in “Insect Habitat”)
– Why is Wisconsin called the Badger State? (answer in “Wisconsin”)
• Reason
– No lexical overlap between the question focus and the document title
– Feature 28 & Feature 13

Discussion
• Feature selection analysis
– QAP: baseline system
– Cue words: because, since, therefore, in order to, due to, …
– Main verbs: lemmatization leads to more matches
– Question focus & document title
• Parser comparison
– Only EP4IR is applied to the answer documents

Future directions
• Improving retrieval
• Collecting a larger data set to improve feature selection
• Investigating information for why-questions beyond syntactic description
• Improving constituent extraction in the EP4IR parser