A Constraint based Question Answering over Semantic Knowledge Base
Magesh Vasudevan1, B.K.Tripathy2
School of Computing Science and Engineering, VIT University, Vellore-632014, Tamil Nadu, India.
Mail id: 1 [email protected], 2 [email protected].
Abstract. The proposed system aims at extracting meaning from a natural language query in order to query semantic knowledge sources. Semantic knowledge sources are systems conceptualized with an ontology. A concept is characterized through other concepts, each acting as a constraint over the others. The system applies this same principle to extract meaning from the natural language query. The constraints and entities in the query, together with the relationships between the entities, are sufficient to transform a natural language query into SPARQL (a query language for semantic knowledge sources). Further, the SPARQL query is generated from an intermediate query through a recursive procedure, which is more efficient than mapping the question onto patterns. The system is compared with other systems evaluated on the QALD (Question Answering over Linked Data) standard.
Keywords: Answer Extraction, Information Retrieval, Natural Language Query, Ontology, Linked Data.
1 Introduction
Information retrieval systems identify a set of documents that match the user query. Question answering systems, in contrast, must return the exact answer rather than a ranked set of documents. The proposed system queries Linked Data. Linked Data is built upon ontologies, which conceptualize the knowledge available in a domain. Ontologies are represented using OWL (Web Ontology Language) [13], and RDF (Resource Description Framework) [14] is used to realize the linked data, with the semantics added by OWL. Ontologies are either designed to meet the requirements of a particular domain or generated automatically from large open domain knowledge. The latter method is used to generate DBpedia [15], a large semantic knowledge base generated from Wikipedia. The DBpedia classes, individuals and the relationships between them have to be mapped to the natural language question. An ontology is a form of conceptualization of a domain, and a domain can be either closed or open. In this system, we use an existing open domain knowledge source, DBpedia. An ontology conveys the meaning of information through concepts, roles and individuals; concepts are analogous to classes, individuals to instances and
cases. The functional box is needed whenever operations must be performed on numeric or date values. Here, the age has to be computed from a date of birth, and numeric unit conversions have to be performed. Deciding whether an entity represents a concept in the ontology, or whether a relationship between entities represents the concept the system searches for, is highly complex. This dynamically changes with the change of policies at different layers in the system. The mean response time is 19 seconds; the mean query generation time is 14 seconds. The response time must be reduced by computing the semantic similarity of phrases in a distributed environment. This computation constitutes 73% of the time for generating the query.
An agent based global state is the most suitable solution for question answering systems, because the global state of the representation of the query is preserved. Each agent is designated a particular identification task, communicates with the global structure and resolves the approach to take. Here, if an entity is not a direct concept in the ontology, it can relate to a predicate. The agents collaborate to resolve the final structure of the process, which will lead to a more accurate identification of the question structure with the knowledge base. We are currently designing an agent based question answering system for linked data, with a generalization of clue vectors as the global state. In this case each entity is associated with a frame containing its possible KB representations, and the final representation is resolved based upon the clue vector.
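The date and unit operations assigned to the functional box in this section can be illustrated with a minimal Python sketch. It is not the implementation used in the system; the function names `age_on` and `convert_length` and the conversion table `TO_METRES` are hypothetical and chosen only for illustration. The sketch shows how an age in whole years can be derived from a date of birth, and how a numeric value can be converted between length units through a common base unit.

```python
from datetime import date

def age_on(birth: date, today: date) -> int:
    """Return the age in whole years on the day `today` (hypothetical helper)."""
    years = today.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the current year.
    if (today.month, today.day) < (birth.month, birth.day):
        years -= 1
    return years

# Illustrative conversion factors to metres; an assumption, not taken from the paper.
TO_METRES = {"m": 1.0, "km": 1000.0, "cm": 0.01}

def convert_length(value: float, src: str, dst: str) -> float:
    """Convert `value` from unit `src` to unit `dst` via metres (hypothetical helper)."""
    return value * TO_METRES[src] / TO_METRES[dst]

if __name__ == "__main__":
    print(age_on(date(1990, 6, 15), date(2020, 6, 14)))
    print(convert_length(2.5, "km", "m"))
```

In such a design, a date of birth retrieved from the knowledge base would be passed through `age_on` before it is compared with a numeric constraint from the question, and any quantity would be converted to the unit named in the question before comparison.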
6 Conclusion
Question answering can be seen as a tool to evaluate the efficiency of retrieving information from a knowledge base. Semantic knowledge bases have been considered for artificial intelligence systems because their structure makes information meaningful. The future web will offer one such form of global access, which could be achieved through the evolution of the semantic web. A human-like computing system has to perceive the world through its sensors. Owing to the strong dependency between natural language and semantic knowledge bases, the information perceived by such systems has to be in natural language, and the communication with the user has to be in natural language as well. Question answering systems can therefore be used to evaluate the efficiency of a knowledge base in conceptualizing or perceiving a given natural language input. The system has achieved a better F-measure than the existing systems. However, the response time must be reduced, and a dynamic change of approach based upon the global state must be adopted.
References
1. Unger, C., Bühmann, L., Lehmann, J., Ngonga Ngomo, A. C., Gerber, D., & Cimiano, P.: Template-based question answering over RDF data, in: Proceedings of the 21st international conference on World Wide Web (pp. 639-648). ACM (2012, April).
2. Gerber, D., & Ngomo, A. C. N.: Bootstrapping the linked data web, in: 1st Workshop on Web Scale Knowledge Extraction @ ISWC (Vol. 2011).
3. Walter, S., Unger, C., Cimiano, P., & Bär, D.: Evaluation of a layered approach to question answering over linked data, in: The Semantic Web–ISWC 2012 (pp. 362-374). Springer Berlin Heidelberg (2012).
4. Cabrio, E., Aprosio, A. P., Cojan, J., Magnini, B., Gandon, F., & Lavelli, A.: QAKiS @ QALD-2, in: Proceedings of Interacting with Linked Data (ILD 2012) (pp. 87-95).
5. Hakimov, S., Tunc, H., Akimaliev, M., & Dogdu, E.: Semantic question answering system over linked data using relational patterns, in: Proceedings of the Joint EDBT/ICDT 2013 Workshops (pp. 83-88). ACM.
6. He, S., Liu, S., Chen, Y., Zhou, G., Liu, K., & Zhao, J.: CASIA @ QALD-3: A question answering system over linked data, in: Proceedings of the Question Answering over Linked Data lab (QALD-3) at CLEF 2013.
7. Shekarpour, S., Ngonga Ngomo, A. C., & Auer, S.: Question answering on interlinked data, in: Proceedings of the 22nd international conference on World Wide Web (pp. 1145-1156). International World Wide Web Conferences Steering Committee (2013, May).
8. Dima, C.: Intui2: A prototype system for question answering over linked data, in: Proceedings of the Question Answering over Linked Data lab (QALD-3) at CLEF 2013.
9. Lopez, V., Unger, C., Cimiano, P., & Motta, E.: Evaluating question answering over linked data, in: Web Semantics: Science, Services and Agents on the World Wide Web, 21, 3-13 (2013).
10. De Marneffe, M. C., & Manning, C. D.: The Stanford typed dependencies representation, in: Coling 2008: Proceedings of the workshop on Cross-Framework and Cross-Domain Parser Evaluation (pp. 1-8). Association for Computational Linguistics.
11. Kibriya, A. M., Frank, E., Pfahringer, B., & Holmes, G.: Multinomial naive Bayes for text categorization revisited, in: AI 2004: Advances in Artificial Intelligence (pp. 488-499). Springer Berlin Heidelberg.
12. Varelas, G., Voutsakis, E., Raftopoulou, P., Petrakis, E. G., & Milios, E. E.: Semantic similarity methods in WordNet and their application to information retrieval on the web, in: Proceedings of the 7th annual ACM international workshop on Web information and data management (pp. 10-16). ACM (2005, November).
13. McGuinness, D. L., & Van Harmelen, F.: OWL web ontology language overview, in: W3C recommendation, 10(10), 2004.
14. Klyne, G., & Carroll, J. J.: Resource description framework (RDF): Concepts and abstract syntax, 2006.
15. Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., & Ives, Z.: DBpedia: A nucleus for a web of open data (pp. 722-735). Springer Berlin Heidelberg (2007).