Interactive Chinese Question Answering System in Medicine Diagnosis

Xipeng Qiu
School of Computer Science, Fudan University
[email protected]

Jiatuo Xu
Shanghai University of Traditional Chinese Medicine
[email protected]

Abstract

In this paper, we propose a general framework for an interactive question answering system for medical diagnosis, which interacts with the user to obtain a more refined question description and returns answers. The system first collects FAQ pairs from cQA websites and builds a medical ontology with an incremental method. It then analyzes the question and asks the user for missing information. After receiving the user's feedback, it performs question retrieval and extracts answers. Experiments show that our system performs better with user feedback.

1. Introduction

Automatic question answering (QA) is an important research topic in information retrieval and natural language processing [39, 40]. It is an alternative to keyword-based information retrieval systems such as Google (http://www.google.com) and Baidu (http://www.baidu.com). The input of a QA system is a question, and the output is the corresponding answers extracted from a large corpus or the web [20]. However, these systems cannot handle complicated questions that require domain knowledge, such as questions in the medical domain. Fig. 1 shows the general framework of question answering in the open domain.

To alleviate this problem, we can resort to large-scale online FAQ archives for specific domains. In recent years, community-based question answering (cQA) services such as Baidu Zhidao (http://zhidao.baidu.com) have become very popular. Instead of finding answers through forums or search engines, users can post their questions on cQA websites and wait for other people to answer them. While forums focus on discussion and communication between users, cQA services focus on answering users' questions, so users get faster responses on cQA websites. These cQA websites also provide an interface for retrieving already-answered questions, but it is almost always based on keyword search. This is still not enough to give users exact information: the user must come up with appropriate keywords to represent his needs, and good answers are often mingled with many bad or wrong ones. Therefore, the major issue is finding the exact answer when answers to many complicated questions already exist. Related work includes question suggestion, answer quality prediction, question-answer pair extraction, etc. [14, 18, 23, 26, 32, 9, 27].

In this paper, we propose a general framework for an interactive Chinese question answering system for medical diagnosis, which interacts with the user to obtain refined question descriptions and returns extracted answers. The system first collects FAQ pairs from cQA websites and builds a medical ontology with an incremental method. It then analyzes the question and asks the user about missing information. After receiving the user's feedback, it performs question retrieval and extracts answers.

In the rest of the paper, we describe our system in Section 2, evaluate it with experiments in Section 3, and give conclusions in Section 4.

2. System Framework

In this section, we introduce our interactive Chinese question answering system for medical diagnosis.

Figure 1. The flowchart of the open domain question answering system (components: Question Analysis, Question Classification, Semantic Analysis, Query Generation, Web Retrieval over the WWW, Answer Extraction, Answer Ranking)

2.1. Topical Crawler

Topical crawlers play an important role in domain-specific search engines. A topical crawler starts with some seed keywords or URLs and gathers web pages whose content is similar to the seeds [35, 28, 5]. The context is one of the most useful features for guiding the crawler to highly relevant target pages. In our system, we collect medical web pages by analyzing the anchor text attached to hyperlinks. We first collect anchor texts with their corresponding categories from two Chinese cQA websites that provide question categories. We then select the anchor texts whose categories relate to medical keywords, such as 医疗/疾病 (medical care/disease). Finally, we build a two-class classifier that labels anchor texts as medical or non-medical; the classifier is naive Bayes with a multinomial distribution [17].
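A minimal sketch of this classification step (assuming scikit-learn and toy training anchors; the paper does not give its feature set, so character n-grams are used here to avoid word segmentation):

# Sketch of the medical/non-medical anchor-text classifier (multinomial
# naive Bayes). The training anchors and the character-n-gram features
# are illustrative assumptions, not the paper's actual setup.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labeled anchor texts (1 = medical, 0 = non-medical).
anchors = ["糖尿病的症状", "头晕怎么治疗", "旅游景点推荐", "音乐下载"]
labels = [1, 1, 0, 0]

# Character n-grams avoid the need for a Chinese word segmenter here.
clf = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(1, 2)),
    MultinomialNB(),
)
clf.fit(anchors, labels)

# Keep only hyperlinks whose anchor text is classified as medical.
print(clf.predict(["治疗高血压的药"]))  # expected: [1] (medical) on this toy data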

2.2. Medical Ontology Construction

To take advantage of medical domain knowledge, we need to establish an ontology of medical terms, concepts, entities and their relations. Since collecting this knowledge manually is difficult, we use an automatic method. There is already work on extracting information from a collected corpus automatically [7, 37]; the objective of information extraction (IE) is to extract from text certain pieces of information related to a prescribed set of concepts. We first manually collect some initial information, including names of drugs, symptoms and diseases and the relations between them. We then grow the medical ontology with information extraction methods, as sketched below.
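A minimal sketch of this incremental, pattern-based growth step (the extraction pattern and corpus are illustrative; the paper does not specify its IE method in detail):

import re

# Seed knowledge (illustrative): disease -> known symptoms.
ontology = {"糖尿病": {"多饮", "多尿"}}

# A toy extraction pattern of the form "<disease>的症状有/包括<symptom>";
# the real system's patterns and corpus are not given in the paper.
PATTERN = re.compile(r"(\S+)的症状(?:有|包括)(\S+?)[,，。]")

corpus = ["糖尿病的症状包括体重下降。", "高血压的症状有头晕。"]

for sentence in corpus:
    for disease, symptom in PATTERN.findall(sentence):
        # Incrementally grow the ontology with newly extracted relations.
        ontology.setdefault(disease, set()).add(symptom)

print(ontology)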

2.3. QA Pairs Extraction

There are many methods for extracting the best answer to a question on cQA websites [26, 15]. Answer quality matters because there are many duplicated or poorly posed questions whose answers vary greatly in quality, so measuring relevance alone is not enough: the quality of the answers must be considered as well. To predict the best answer, we use the features described in [15]: the answerer's acceptance ratio, answer length, the questioner's self-evaluation, the answerer's activity level, the answerer's category specialty, users' recommendations, and the number of answers. A sketch of this prediction step follows.

2.4. Question Analysis

In a typical question answering system, the first step of question analysis is question classification [29, 24, 10, 44], whose categories are kept consistent with entity extraction in later steps. However, a Chinese medical QA system faces some difficulties. First, English and Chinese question sentences differ considerably. Second, most questions are not fact-based and are hard to categorize. In our system, we build a question analysis model on top of the medical ontology [13]. It first identifies the focus words in the question and finds the related concepts in the medical ontology. It then classifies the question into a category and decides what information is missing from the question, as in the sketch below.
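A minimal sketch of this step (the ontology entries and slot inventory are illustrative assumptions):

# Locate a focus concept in the question, assign a coarse category, and
# list the slots (auxiliary information) still missing from the question.
ONTOLOGY = {
    "头晕": {"type": "symptom", "required_slots": ["age", "other_symptoms"]},
    "糖尿病": {"type": "disease", "required_slots": ["duration"]},
}

def analyze_question(question: str, provided: dict):
    for concept, info in ONTOLOGY.items():
        if concept in question:
            missing = [s for s in info["required_slots"] if s not in provided]
            return {"focus": concept, "category": info["type"], "missing": missing}
    return {"focus": None, "category": "unknown", "missing": []}

print(analyze_question("有什么方法能治疗头晕？", provided={}))
# -> focus 头晕, category symptom, missing ['age', 'other_symptoms']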

2.5. Interactive Feedback

A user often inputs a question with just the main symptom, but this is often not enough to determine the cause of the disease. For example, consider 有什么方法能治疗头晕？("What treatments are there for dizziness?"). There are many possible causes of dizziness, and the corresponding treatments vary greatly with the cause and the patient's state of health. To get exact answers, the user is asked to provide some extra information, such as his age and other symptoms; there is also related research on interactive question answering [11, 12, 25]. Given the symptoms the user provides, the system first retrieves the related symptoms from the collected medical knowledge, and then interacts with the user to confirm all the signs of his disease, as in the following sketch.
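A minimal sketch of the clarification loop (the related-symptom table and prompts are illustrative):

# Starting from the reported symptom, ask about related symptoms from the
# medical knowledge base to narrow down the cause.
RELATED_SYMPTOMS = {
    "头晕": ["恶心", "耳鸣", "视物模糊"],  # dizziness -> nausea, tinnitus, blurred vision
}

def clarify(symptom: str) -> list[str]:
    confirmed = [symptom]
    for related in RELATED_SYMPTOMS.get(symptom, []):
        answer = input(f"您是否还有以下症状：{related}？(y/n) ")
        if answer.strip().lower() == "y":
            confirmed.append(related)
    return confirmed  # used to refine the retrieval query

if __name__ == "__main__":
    print(clarify("头晕"))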

Figure 2. The flowchart of the interactive Chinese question answering system (components: topical web crawler over the WWW and cQA websites, QA pairs, medical ontology, auxiliary information, question analysis with question focus extraction, question type identification and question filtering, detection of lacking information, user feedback, answer candidate extraction and answer ranking)

2.6. FAQ Retrieval

Given a FAQ corpus, retrieving useful information for the user's question is still a problem, and much work has aimed to improve FAQ retrieval performance [41, 22, 2, 19, 4, 3, 14, 16, 43]. An important problem is how to calculate the similarity between the user's question and a FAQ pair, which requires some semantic analysis. Measuring semantic similarity between questions is not trivial: two questions with the same meaning may use very different wording. For example, Q1: 糖尿病患者长期服用什么药比较有效，副作用比较小？("What drugs are effective for long-term use by diabetes patients, with few side effects?") and Q2: 有什么能有效降低血糖并且对身体无害的药？("What drugs effectively lower blood sugar without harming the body?") have almost identical meanings but are lexically very different. Similarity measures developed for document retrieval work poorly when there is little word overlap: if the FAQ corpus contains the QA pair for Q2 but the user asks Q1, traditional information retrieval methods treat Q1 and Q2 as almost unrelated and return no answer. A solution to this issue is query expansion [31, 38, 42]. In our system, we expand the query using the domain ontology: for a disease name, we add keywords for its corresponding symptoms, as sketched below.
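A minimal sketch of this expansion (the mini-ontology is illustrative):

# When a disease name appears in the user's query, append its known symptom
# terms so that questions phrased via symptoms (like Q2) can still match.
DISEASE_SYMPTOMS = {
    "糖尿病": ["血糖", "多饮", "多尿"],  # diabetes -> blood sugar, thirst, polyuria
}

def expand_query(terms: list[str]) -> list[str]:
    expanded = list(terms)
    for term in terms:
        expanded.extend(DISEASE_SYMPTOMS.get(term, []))
    return expanded

# "糖尿病" expands to include "血糖", letting Q1 match the FAQ entry for Q2.
print(expand_query(["糖尿病", "药", "副作用"]))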

2.7. Answer Extraction

On cQA websites, repliers often provide background or related information for a question, which helps the questioner find out the facts himself. Sometimes, however, especially for factoid and list questions such as 请问糖尿病的症状有哪些？("What are the symptoms of diabetes?"), the user needs exact answers rather than related pieces of information, so we must extract the answers from the related information [8, 33, 34, 45]. We first extract the entities from this information and classify them into entity categories such as Person, Location, Organization, Duration, Quantity and Date [1]; in our system, we use conditional random fields [21, 30] to label the entities and their corresponding categories. We then score the entities and filter them with a threshold. An entity's score has two components: whether or not the entity's category matches the query's category, and the frequency and positions of the entity's occurrences within the retrieved passages [1]. A sketch of this scoring follows.
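A minimal sketch of the two-component entity score (the equal weighting and the linear position discount are our assumptions; [1] defines its own formulation):

def score_entity(entity_category: str, query_category: str,
                 occurrences: list[int], passage_length: int) -> float:
    # Component 1: does the entity's category match the expected answer type?
    category_score = 1.0 if entity_category == query_category else 0.0
    # Component 2: frequent entities near the start of a passage score higher.
    position_score = sum(1.0 - pos / passage_length for pos in occurrences)
    return 0.5 * category_score + 0.5 * position_score

# An entity of the right category occurring early and often scores highest.
print(score_entity("Quantity", "Quantity", occurrences=[3, 40], passage_length=200))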


2.8. Answer Re-ranking

Before returning answers to the user, the system needs to re-rank them to improve performance, for example by removing redundant answers [6]. We can also use richer features [36] to score each answer candidate. A minimal sketch of redundancy removal follows.
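This sketch drops near-duplicate answers by character-overlap similarity, keeping the higher-scored one (the threshold and the character-level Jaccard measure are illustrative choices, not the paper's method):

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rerank(answers: list[tuple[str, float]], threshold: float = 0.8):
    # Sort by score, then greedily keep answers not too similar to kept ones.
    kept: list[tuple[str, float]] = []
    for text, score in sorted(answers, key=lambda x: -x[1]):
        if all(jaccard(text, k) < threshold for k, _ in kept):
            kept.append((text, score))
    return kept

print(rerank([("多休息，按时服药。", 0.9), ("按时服药，多休息。", 0.85)]))
# -> the near-duplicate second answer is removed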

3. Experiments

We implemented our system and collected about 84,000 QA pairs in the medical domain from two cQA websites: Baidu Zhidao (http://zhidao.baidu.com) and WenWen (http://wenwen.soso.com/). We evaluate our results with precision at rank 1 (P@1), the percentage of questions with a correct answer in the first position. As a baseline, we use a keyword query consisting of just the terms of the question. We randomly selected 100 questions and evaluated the quality of their answers manually.

Table 1. Results of different systems (P@1)

Systems       P@1
Baseline      79%
No feedback   82%
Feedback      87%

Table 1 shows the results of our system. User feedback improves answer quality considerably.
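For reference, P@1 as used in Table 1 is simply the fraction of test questions whose top-ranked answer was judged correct (toy judgments shown for illustration):

def precision_at_1(top_answer_correct: list[bool]) -> float:
    return sum(top_answer_correct) / len(top_answer_correct)

# e.g. 87 of 100 questions answered correctly at rank 1 -> 0.87
print(precision_at_1([True] * 87 + [False] * 13))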

4. Conclusion

In this paper, we propose a framework for an interactive question answering system in the medical domain. It integrates question analysis, query expansion, ontology construction, answer extraction and answer ranking. We also discuss the difficulties in each part and our preliminary solutions. The proposed framework can also be applied to other domains, such as music or travel.

5. Acknowledgement

This work was supported by the National High Technology Research and Development Program of China (863 Program) (No. 2007AA02Z429) and the Natural Science Foundation of China (No. 30300443 and 60435020).


References

[1] S. Abney, M. Collins, and A. Singhal. Answer extraction. In Proceedings of the Sixth Conference on Applied Natural Language Processing, pages 296–301, 2000.
[2] R. Baeza-Yates, B. Ribeiro-Neto, et al. Modern Information Retrieval. Addison-Wesley, Harlow, England, 1999.
[3] R. Burke, K. Hammond, and J. Kozlovsky. Knowledge-based information retrieval from semi-structured text. In AAAI Fall Symposium on AI Applications in Knowledge Navigation and Retrieval, pages 19–24, 1995.
[4] R. Burke, K. Hammond, V. Kulyukin, S. Lytinen, N. Tomuro, and S. Schoenberg. Question answering from frequently asked question files: Experiences with the FAQ Finder system. AI Magazine, 18(2):57–66, 1997.
[5] S. Chakrabarti, K. Punera, and M. Subramanyam. Accelerated focused crawling through online relevance feedback. In Proceedings of the 11th International Conference on World Wide Web, pages 148–159, 2002.
[6] C. Clarke, G. Cormack, and T. Lynam. Exploiting redundancy in question answering. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 358–365, 2001.
[7] J. Cowie and W. Lehnert. Information extraction. Communications of the ACM, 39(1):80–91, 1996.
[8] D. Demner-Fushman and J. Lin. Knowledge extraction for clinical question answering: Preliminary results. In Proceedings of the AAAI-05 Workshop on Question Answering in Restricted Domains, pages 9–13, 2005.
[9] S. Ding, G. Cong, C.-Y. Lin, and X. Zhu. Using conditional random fields to extract contexts and answers of questions from online forums. In Proceedings of ACL-08: HLT, pages 710–718, Columbus, Ohio, June 2008. Association for Computational Linguistics.
[10] J. Ely, J. Osheroff, P. Gorman, M. Ebell, M. Chambliss, E. Pifer, and P. Stavri. A taxonomy of generic clinical questions: Classification study, 2000.
[11] T. Hao, D. Hu, L. Wenyin, and Q. Zeng. Semantic patterns for user-interactive question answering. Concurrency and Computation, 20(7):783, 2008.
[12] S. Harabagiu, A. Hickl, J. Lehmann, and D. Moldovan. Experiments with interactive question-answering. Ann Arbor, 100, 2005.
[13] U. Hermjakob. Parsing and question classification for question answering. In Proceedings of the Workshop on Question Answering at ACL-2001, 2001.
[14] J. Jeon, W. Croft, and J. Lee. Finding similar questions in large question and answer archives. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pages 84–90, 2005.
[15] J. Jeon, W. Croft, J. Lee, and S. Park. A framework to predict the quality of answers with non-textual features. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 228–235, 2006.
[16] V. Jijkoun and M. de Rijke. Retrieving answers from frequently asked questions pages on the web. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pages 76–83, 2005.
[17] M. Jordan. Learning in Graphical Models. Kluwer Academic Publishers, 1998.
[18] P. Jurczyk and E. Agichtein. Discovering authorities in question answer communities by using link analysis. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, pages 919–922, 2007.
[19] H. Kim and J. Seo. High-performance FAQ retrieval using an automatic clustering method of query logs. Information Processing and Management, 42(3):650–661, 2006.
[20] C. Kwok, O. Etzioni, and D. Weld. Scaling question answering to the web. In Proceedings of the 10th International Conference on World Wide Web, pages 150–161, 2001.
[21] J. D. Lafferty, A. McCallum, and F. C. N. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning, pages 282–289, San Francisco, CA, USA, 2001. Morgan Kaufmann Publishers Inc.
[22] C. Lee. Intention Extraction and Semantic Matching for Internet FAQ Retrieval. Master's thesis, Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan, ROC, 2000.
[23] C. Lengeler, D. Savigny, H. Mshinda, C. Mayombana, S. Tayari, C. Hatz, A. Degrémont, and M. Tanner. Community-based questionnaires and health statistics as tools for the cost-efficient identification of communities at risk of urinary schistosomiasis. International Journal of Epidemiology, 20(3):796–807, 1991.
[24] X. Li and D. Roth. Learning question classifiers. In Proceedings of the 19th International Conference on Computational Linguistics, pages 556–562, 2002.
[25] J. Lin, D. Quan, V. Sinha, K. Bakshi, D. Huynh, B. Katz, and D. Karger. What makes a good answer? The role of context in question answering. Human-Computer Interaction, 2003.
[26] X. Liu, W. Croft, and M. Koll. Finding experts in community-based question-answering services. In Proceedings of the 14th ACM International Conference on Information and Knowledge Management, pages 315–316. ACM, New York, NY, USA, 2005.
[27] Y. Liu and E. Agichtein. You've got answers: Towards personalized models for predicting success in community question answering. In Proceedings of ACL-08: HLT, Short Papers, pages 97–100, Columbus, Ohio, June 2008. Association for Computational Linguistics.
[28] F. Menczer, G. Pant, P. Srinivasan, and M. Ruiz. Evaluating topic-driven web crawlers. In Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 241–249, 2001.
[29] D. Metzler and W. Croft. Analysis of statistical question classification for fact-based questions. Information Retrieval, 8(3):481–504, 2005.
[30] F. Peng, F. Feng, and A. McCallum. Chinese segmentation and new word detection using conditional random fields. In Proceedings of the 20th International Conference on Computational Linguistics, 2004.
[31] Y. Qiu and H. Frei. Concept based query expansion. In Proceedings of the 16th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 160–169, 1993.
[32] B. Smyth, E. Balfe, J. Freyne, P. Briggs, M. Coyle, and O. Boydell. Exploiting query repetition and regularity in an adaptive community-based web search engine. User Modeling and User-Adapted Interaction, 14(5):383–423, 2004.
[33] R. Srihari and W. Li. A question answering system supported by information extraction. In Proceedings of the Sixth Conference on Applied Natural Language Processing, pages 166–172, 2000.
[34] R. Srihari, W. Li, and N. Cymfony. Information extraction supported question answering. NIST Special Publication SP, pages 185–196, 2000.
[35] P. Srinivasan, F. Menczer, and G. Pant. A general evaluation framework for topical crawlers. Information Retrieval, 8(3):417–447, 2005.
[36] M. Surdeanu, M. Ciaramita, and H. Zaragoza. Learning to rank answers on large online QA collections. In Proceedings of ACL-08: HLT, pages 719–727, Columbus, Ohio, June 2008. Association for Computational Linguistics.
[37] J. Turmo, A. Ageno, and N. Català. Adaptive information extraction. ACM Computing Surveys (CSUR), 38(2), 2006.
[38] E. Voorhees. Query expansion using lexical-semantic relations. Springer-Verlag, New York, NY, USA, 1994.
[39] E. Voorhees. The TREC-8 question answering track report. NIST Special Publication SP, pages 77–82, 2000.
[40] E. Voorhees. Overview of the TREC 2003 question answering track. In Proceedings of the Twelfth Text REtrieval Conference (TREC 2003), 142, 2003.
[41] C. Wu, J. Yeh, and Y. Lai. Semantic segment extraction and matching for internet FAQ retrieval. IEEE Transactions on Knowledge and Data Engineering, pages 930–940, 2006.
[42] J. Xu and W. Croft. Query expansion using local and global document analysis. In Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 4–11, 1996.
[43] S. Yang, F. Chuang, and C. Ho. Ontology-supported FAQ processing and ranking techniques. Journal of Intelligent Information Systems, 28(3):233–251, 2007.
[44] W. Zhang and T. Chen. Classification based on symmetric maximized minimal distance in subspace (SMMS). In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2003.
[45] Z. Zheng. AnswerBus question answering system. In Proceedings of the Second International Conference on Human Language Technology Research, pages 399–404, 2002.