Available online at www.sciencedirect.com

Procedia Computer Science 19 (2013) 163 – 170

The 4th International Conference on Ambient Systems, Networks and Technologies (ANT-2013)

Semantic Query-Manipulation and Personalized Retrieval of Health, Food and Nutrition Information

Ahmed Al-Nazer, Tarek Helmy*

Information and Computer Science Department, College of Computer Science & Engineering, King Fahd University of Petroleum & Minerals, Dhahran 31216, Mail Box 413, Saudi Arabia
* On leave from College of Engineering, Tanta University, Egypt
[g199739540, helmy]@kfupm.edu.sa

Abstract

Semantic manipulation of website content is important in many domains, and it is critical in domains such as health and nutrition, where users need to retrieve precise, trusted, and relevant health and food information. Even a high-quality, semantic, Web-based search engine is not enough to retrieve precise health- and nutrition-related information, because the huge amount of information scattered throughout the Web means that the retrieved information might not fit the user’s specific needs. Semantic query manipulation and personalization techniques can therefore help and guide users in retrieving more relevant health and nutrition information consistent with their needs. In this paper, we present our efforts to develop a framework for semantic query manipulation and personalization of health and nutrition information. We propose a user profile ontology based on culture, language, health and nutrition. The profile is used to enrich the query and to personalize the retrieved health and food information to be consistent with the user’s needs. Moreover, we propose query templates that are used for semantic manipulation and mapping of the user’s natural language queries into ontology-based queries. We have implemented the proposed framework, and the empirical evaluations show promising improvements in the relevancy of the retrieved results and in the user’s satisfaction.

© 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of Elhadi M. Shakshuki.

Keywords: Personalization; query manipulation; food and nutrition; semantic Web; ontology

1. Introduction

The semantic representation of Web content and semantic query manipulation help in retrieving more accurate results. However, they are not the only success factors in retrieving relevant information for the user, because a huge amount of information is scattered throughout the Web. The retrieved information therefore needs to be filtered and personalized to fit the user’s exact needs; e.g., the health advice that fits one user based on his/her age, gender and health conditions might not fit another user with different conditions. Thus, personalization techniques will help and guide users in retrieving high-quality health

1877-0509 © 2013 The Authors. Published by Elsevier B.V. Selection and peer-review under responsibility of Elhadi M. Shakshuki doi:10.1016/j.procs.2013.06.026


Ahmed Al-Nazer and Tarek Helmy / Procedia Computer Science 19 (2013) 163 – 170

and nutrition-related Web contents. For personalization, we need a personal profile for each user that defines his/her interests, preferences, health conditions and culture, and that is used to customize the retrieved results. Not all users share a common cultural background, and each culture has its own tastes [1]. Since we are focusing on food and nutrition, this matters directly: some foods are accepted in one culture but are not preferred in another. The remainder of the paper starts with a survey of the related work, followed by a description of the proposed framework architecture. Then, we present the three main components: user profile ontology, query semantic analysis and results personalization. Next, we describe the experimental results and show some use cases. Finally, we conclude the paper and highlight directions for future work.

2. Related Work

HealthFinland [2] is an intelligent semantic portal that provides relevant health information retrieved from the Web and from various governmental, non-governmental, business and other organizations. It helps users find relevant health content using basic vocabularies, without the need for technical medical terminology. The limitation of HealthFinland is that it does not address personalized retrieval. Personalized Health Information Retrieval System (PHIRS) [3] is a health information recommendation system that addresses user modeling and implements user-profile matching, which customizes the retrieved health information to match the individual’s needs. PHIRS has two limitations: 1) it does not have enough features to identify the relevant health information; and 2) its personalization does not touch on the culture or language of the user.
CarePlan [4] automatically generates customized, patient-specific healthcare plans and determines the best clinical care plan based on the patient’s medical and personal profiles, medical knowledge, clinical pathways, and personalized educational healthcare programs. The limitations of CarePlan are the lack of full implementation details, the absence of food and nutrition information related to the patient, the educational focus of the profile, and the lack of cultural aspects. The authors in [5] propose an adaptive searching mechanism for medical information that retrieves cardiologic medical information from heterogeneous, distributed medical databases that mediate medical decisions of critical health conditions. The mechanism supports generating a personalized searching process for the users based on their personal profiles, but it does not use the semantic Web or cultural attributes in the personalization process. The authors in [6] introduce a trusted model as a one-stop-shop access point to personalized health and medical information. The model centralizes personal information management to facilitate specific information aggregation tasks of individual clients. Experiments were conducted to demonstrate trade-off levels between retrieval performance and the degree of privacy preservation in the proposed query mixing strategies. This trade-off did not consider personalization from the user’s cultural point of view. A mixed-initiative socio-semantic conversational search and recommendation system for finding health information is presented in [7]. In this system, users can have a live conversation about their health issues; the system connects relevant users together in the same conversation and provides context-based recommendations. The recommendations are based on the social context only.
Based on this survey, there is a lack of culture- and language-based personalization for the health, food and nutrition domain, which would help in giving better recommendations to the users. Hence, we extend the current approaches by building a framework for a cross-cultural and cross-lingual recommendation tool with an ontology-based user profile to retrieve the relevant health and nutrition information that fits the user’s needs.

3. Query Manipulation and Personalization Framework

This work is part of a larger project that aims to build a framework to help users find semantic health and nutrition information that fits their needs. The architecture of the project’s framework has three main


components: 1) the ontology management component, which maintains the health and nutrition domain ontology; 2) the annotation component, which annotates the health and nutrition data sources based on the domain ontology; and 3) the query manipulation and results personalization component, the focus of this paper, which users interact with and which personalizes the retrieved health and food information. Figure 1 shows the details of the proposed framework for the query manipulation and results personalization component.

Fig. 1. Proposed framework architecture for query manipulation and results personalization component

In the proposed framework, the user enters the query using the portal interface in one of two ways: either by going through a wizard to complete the query, or by using a free-text natural language query. The cross-language service is used to translate the user’s query as needed. Then, the ontology query-template reasoning engine is used to select the appropriate query template for the user’s query. This requires referencing the knowledge repository that hosts the ontologies and the knowledge base. This helps in enriching the user’s query based on the user profile. Next, the reasoning engine processes the query and returns the personalized search results. The user browses the results and all interactions are logged in the action log database. The user preferences learner analyzes this log and updates the user profile with new preferences.

4. User Profile Ontology

This section highlights the user profile ontology and answers the following questions: Why do we need an ontology to represent the user profile? What are the factors that affect the food preferences? How do we capture the user’s preferences and how do we update them in a timely manner?

4.1. Food and health preferences

Choosing the right food depends on many factors that sometimes conflict with each other. These factors can be categorized into three groups: food preferences, health conditions, and culture/religion constraints. The food preferences group recognizes that each user has his/her own taste in food and that s/he likes some types of food while disliking others. The health condition factor involves any diseases and/or allergies that the user may have. The health condition factor therefore restricts some types of food and encourages other types. The culture/religion factor takes into account that each culture has its own preferred food and that some religions have food restrictions. This is obvious when someone travels to countries where s/he sees different tastes and different recipes outside of his/her own culture.

4.2. Preferences capturing

The basic step to capturing the user’s preferences is to make a form and ask the user to fill it out, but the fact is that most users do not appreciate the value of taking the time to thoroughly fill out the profile


forms [8]. So, there should be a mechanism to infer the user’s interests; hence, we describe three ways to elicit the profile fields, each applied only after consulting with the user in order to protect the user’s privacy. First, we can use the user’s queries, which are a good source for understanding the user’s needs and interests. For example, if the user is always asking about a certain type of food, we may infer that this food type interests him/her. Second, we can monitor the user’s behaviors and interactions, explicit or implicit, and use the results to enhance the user profile. For example, if the user always clicks on a certain data source, this suggests that s/he trusts this source more than others, and therefore the results from this source should be prioritized. Third, we can interface with an external system that holds the user’s information, such as a medical information system, so that changes there are reflected immediately in the user profile. The same three ways can be used to dynamically update the user profile and adapt it to the latest user preferences.

4.3. User profile representation

User profiles are used to grasp the user’s needs and understand the query’s meaning as it relates to the user [9]. We capture and represent the preferences and interests of the users in order to personalize the retrieved results and give user-specific recommendations. There are many ways to represent the user profile; the authors of [10] describe three. The first is the keyword profile, which captures keywords and assigns a weight to each keyword. The second is the semantic network profile, in which the keywords are added to a network of nodes where we can explicitly model the relationship between specific words and higher-level concepts. The third is the concept profile, in which the nodes represent abstract topics; this helps in building a deeper concept hierarchy, which can be based on a taxonomy or on ontologies.
Since this work is part of a project in which we represent the health and food information in an ontological format, we choose to represent the profile as a concept profile, which helps in enriching the user’s query and matching the query with the domain ontology.

4.4. User profile ontology

We created ontologies for the user profile, the culture and the religion. Then, we created the necessary relations between them and the food and health ontologies. The user profile ontology is represented as an ontological concept that consists of many properties, as shown in the first box of Figure 2. For clarification, we visualize the user profile ontology properties in four categories: one category has the user’s basic information, such as name and age; one category has the user’s basic health information, such as the weight and the blood type; one category has the user’s medical information, such as the diseases and allergies; and finally, one category has the usage statistics, such as previous searches and user feedback. Each arrow represents a relationship between two concepts, which is referred to in RDF terminology as a “triple” [11].

Fig. 2. The user profile ontology
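To make the idea of a profile expressed as triples more concrete, the following is a minimal sketch in plain Python. It is not the ontology used in the paper: the category prefixes and property names (`basic:age`, `medical:hasDisease`, `link:hasCulture`, and so on) are hypothetical, and the sample values are borrowed from the use case in Section 7.1. Each profile property becomes one (subject, predicate, object) triple, and multi-valued properties yield one triple per value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One (subject, predicate, object) statement, in the RDF sense."""
    subject: str
    predicate: str
    obj: str


def profile_to_triples(user_id, profile):
    """Flatten a nested profile dict into a list of triples.

    The top-level keys are the property categories; list values produce
    one triple per element.
    """
    triples = []
    for category, properties in profile.items():
        for prop, value in properties.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                triples.append(Triple(user_id, f"{category}:{prop}", str(v)))
    return triples


# Hypothetical property names; values taken from the use case in Section 7.1.
example_profile = {
    "basic": {"age": 50, "gender": "male"},
    "health": {"bloodType": "O+"},
    "medical": {"hasDisease": ["diabetes", "iron-deficiency anemia"]},
    "usage": {"previousSearch": ["which fruits are suitable for me?"]},
    # Relations to the separate culture and religion ontologies.
    "link": {"hasCulture": "Middle Eastern", "hasReligion": "Muslim"},
}

triples = profile_to_triples("user:1", example_profile)
for t in triples:
    print(t.subject, t.predicate, t.obj)
```

A real implementation would more likely store these statements in an RDF store with proper URIs; the flat list here only illustrates the shape of the data.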


5. Semantic Query Manipulation

This section explains how we understand and process the user’s query. We start by explaining the concept of query template, along with an example. Then, we show the query processing steps. After that, we go into more detail with the query enrichment step and show how we utilize the user profile in expanding the user’s query. Finally, we explain how we match the user’s query with the query templates. After the matching, the semantic query is ready to be sent to the reasoning component to retrieve the results.

5.1. Query templates

Since we are not doing natural language processing (NLP), it is necessary to define specific query templates in order to scope the user’s queries and match them to the related ontologies. Query templates, in our research, represent all expected queries from the user; define the concepts that could be extracted from the user’s query; correlate different ontologies that are needed to answer the query; and finally, specify the answer template for each query. Each query template consists of the attributes shown in Table 1.

Table 1. Query template attributes

| Field | Description | Example |
| Template-ID | Template identification | 1 |
| Ontology-Lookup | Ontologies needed to answer user’s query | Food |
| Ontology-Entities | Ontologies needed to reason and retrieve the results | Relation (disease), user, culture |
| Confirmation-Question-Template | Template for the confirmation question | Does {0} {1} {2}? |
| Subjective-Question-Template | Template for the listing question | List all {0} that {1} {2} |
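As an illustration of how a template with the attributes of Table 1 might be held in memory and instantiated, here is a small Python sketch. The class name `QueryTemplate`, the `render` method, and the slot values in the usage lines are hypothetical; the paper does not state what the placeholders {0}, {1} and {2} stand for, so the sketch assumes they are a concept, a relation and a second concept, in that order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """One query template; field names mirror the attributes in Table 1."""
    template_id: int
    ontology_lookup: str
    ontology_entities: tuple
    confirmation_question_template: str
    subjective_question_template: str

    def render(self, question_type, slots):
        """Fill the pattern for the given question type with the slot values."""
        if question_type == "confirmation":
            pattern = self.confirmation_question_template
        elif question_type == "subjective":
            pattern = self.subjective_question_template
        else:
            raise ValueError(f"unknown question type: {question_type}")
        return pattern.format(*slots)


# Example values from Table 1.
TEMPLATE_1 = QueryTemplate(
    template_id=1,
    ontology_lookup="Food",
    ontology_entities=("Relation (disease)", "user", "culture"),
    confirmation_question_template="Does {0} {1} {2}?",
    subjective_question_template="List all {0} that {1} {2}",
)

# Hypothetical slot values, assumed to be (concept, relation, concept).
print(TEMPLATE_1.render("subjective", ("fruits", "prevent", "anemia")))
print(TEMPLATE_1.render("confirmation", ("honey", "affect", "diabetes")))
```

Keeping both question patterns on the same template object means the question type chosen during query processing only selects a pattern; the ontology lookup information is shared.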

5.2. Query processing steps

After getting the user’s query, we identify its language, since each language has its own syntax and way of processing; we consider both English and Arabic. Then, a spell checker is used to check the spelling of the query and suggest corrections if needed. After that, the query is classified as either a confirmation question, which has a yes-or-no answer, or a subjective question, whose answer is a list of items. Next, noise words, such as do, does, an, the, etc., are removed in order to keep only the words that could be related to the domain ontology. Then, we identify the concepts related to the food and health ontology by looking them up in a populated list of all the ontology’s classes and the knowledge base’s instances. After that, WordNet is used to identify the possible relations between these concepts by finding all words that are synonymous with the pre-defined relations. Next, we enrich the query based on the user profile. Then, we match the identified concepts and relations to the best query template. Finally, a semantic annotation that represents the user’s query is produced for retrieval. Figure 3 shows the query processing steps.

Fig. 3. Query processing steps
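A few of these steps (question classification, noise-word removal, concept lookup and relation detection) can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the implemented system: language identification, spell checking, enrichment and template matching are omitted; the first-word rule for classifying questions is a guess at one possible heuristic; the concept list and the relation synonym table are tiny hypothetical stand-ins for the populated ontology list and for WordNet.

```python
import re

# "do, does, an, the" come from the text; the remaining entries are assumptions.
NOISE_WORDS = {"do", "does", "an", "the", "a", "is", "are", "for", "which", "what"}

# Assumed heuristic: a question opening with an auxiliary verb expects yes/no.
AUXILIARIES = {"do", "does", "is", "are", "can"}

# Hypothetical stand-in for the list of ontology classes and instances.
CONCEPTS = {
    "fruits": "Food",
    "honey": "Food",
    "iron": "Nutrient",
    "diabetes": "Disease",
    "heart": "BodyPart",
}

# Hypothetical stand-in for WordNet: surface word -> pre-defined relation.
RELATION_SYNONYMS = {
    "prevent": "prevent",
    "prevents": "prevent",
    "avoid": "prevent",
    "improve": "improve",
    "help": "improve",
}


def process_query(query):
    """Classify the question, drop noise words, and pick out concepts/relations."""
    tokens = re.findall(r"[a-z]+", query.lower())
    is_confirmation = bool(tokens) and tokens[0] in AUXILIARIES
    content = [t for t in tokens if t not in NOISE_WORDS]
    return {
        "type": "confirmation" if is_confirmation else "subjective",
        "concepts": [(t, CONCEPTS[t]) for t in content if t in CONCEPTS],
        "relations": [RELATION_SYNONYMS[t] for t in content if t in RELATION_SYNONYMS],
    }


print(process_query("Does honey prevent diabetes?"))
print(process_query("Which fruits help the heart?"))
```

Note that the question type is decided before noise words are removed, since words such as "does" carry the classification signal but are noise for concept lookup.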

5.3. Query enrichment

Although this step belongs to the query processing phase, personalization already starts here, because the user profile ontology is used to enrich the query. The user profile ontology has defined relations to the food and health ontologies, in addition to the culture and religion ontologies. The properties of the user


profile ontology can be used not only to enrich and expand the query, but also to fill the required fields of the query template. This leads to more accurate and relevant results, because the large set of result records is filtered based on the user profile, health condition, culture and religion.

5.4. Matching user’s query with query templates

Matching the user’s query to the pre-defined query templates is not a black-or-white decision; it is more complicated. Identifying the concepts and relations within the user’s query that are related to the domain ontology is not always sufficient to match them with a query template. We try to fill the most appropriate query template with the concepts and relations that were identified in the query-processing phase. However, there are cases where the information is incomplete, and we need to depend on other sources to fill the query template. After extracting everything we can from the query, we use the domain ontology to detect the missing information based on what was found. Then, we look at the user profile information, if any, and fill in the missing information from the profile properties. Finally, we can go back to the user and ask him/her explicitly for more information in order to be able to match the query template.

6. Results Personalization

Personalization helps in getting relevant results for the user’s query. As shown in the query-processing steps, personalization starts with the query enrichment step, where we utilize the user profile to expand the query and to fill in the incomplete query templates. Here, we go into more detail with the results personalization steps and show how we capture the user’s feedback.

6.1. Results personalization steps

Personalizing the results involves presenting the results in the most effective way possible through several steps. The first step is answering the user’s query in the same language s/he asks it in, regardless of the language of the ontology and the knowledge base, which holds the annotated data. The second step is answering the user’s query in the appropriate syntax based on the question type; a confirmation question is different from a subjective question, as the user expects a “yes” or “no” answer in the first type, while s/he expects a list of items in the second type. So, the answer is personalized to express the understanding of the query and to be familiar to the user. The third step is ranking the results based on the user’s preferences and interests. While many healthy foods may be recommended by the system, it makes sense to show what the user likes first and what s/he does not like last. Finally, the system filters out the non-relevant food or health information based on the user profile.

6.2. User’s feedback

Continuous feedback collection is required to improve the user’s experience. Feedback is not only explicit, but also implicit, as it can be collected through different measures. Many measures can reflect the implicit feedback, such as the time spent browsing the results, clicks on the data sources, clicks on the result facets related to the search results, etc. All interactions and feedback are recorded in the usage log, which is analyzed after each query to determine how effective the results are and how we can improve future recommendations. The outcome is reflected in the user profile ontology.

7. Experimentation and Evaluation

We developed the interface screens and implemented the semantic calls to the knowledge base. Figure 4 shows snapshots of the main screen and the user profile form. Next, we present a use case that shows a personalization example of using the system. Then, we show the query manipulation experimental results.
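Before turning to the use case, the filtering and ranking steps described in Section 6.1 can be illustrated with a short Python sketch. The data model is an assumption made for this example only: result items carry a set of tags, and the profile lists restricted tags, liked items and disliked items. The item names and tags are placeholders and do not encode any real nutrition facts.

```python
def personalize(results, profile):
    """Drop items that conflict with the profile, then show liked items first.

    Items sharing a tag with profile["restricted_tags"] are filtered out.
    The remainder is ordered liked < neutral < disliked; sorted() is stable,
    so the incoming order is preserved within each group.
    """
    restricted = set(profile["restricted_tags"])
    allowed = [item for item in results if not restricted & set(item["tags"])]

    def rank_key(item):
        if item["name"] in profile["likes"]:
            return 0
        if item["name"] in profile["dislikes"]:
            return 2
        return 1

    return sorted(allowed, key=rank_key)


# Placeholder data, purely illustrative.
results = [
    {"name": "item_a", "tags": {"high_sugar"}},
    {"name": "item_b", "tags": set()},
    {"name": "item_c", "tags": set()},
    {"name": "item_d", "tags": set()},
]
profile = {
    "restricted_tags": {"high_sugar"},  # e.g. derived from a health condition
    "likes": {"item_d"},
    "dislikes": {"item_b"},
}

ordered = [item["name"] for item in personalize(results, profile)]
print(ordered)
```

Filtering before ranking keeps restricted items from surfacing even when the user likes them, which matches the order of concerns in the text: health and cultural constraints first, taste second.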


Fig. 4. (a) Portal main screen snapshot; (b) User profile screen snapshot

7.1. Personalization use case

The user starts by registering in the system and creating a personal profile. Then, the user enters a query; e.g., “which fruits are suitable for me?” The system manipulates the query and enriches it with the user profile. In this example, the user profile contains the following: age (50 years), gender (male), blood type (O+), health condition (diabetes, iron-deficiency anemia), culture (Middle Eastern) and religion (Muslim). Then, the system tries to find the query template that best matches the user’s query. Next, the system looks in the knowledge base for fruits that suit a person with a low concentration of iron, as he has malnutrition, and with less sugar, as he is diabetic. Also, the system matches his age and gender, as some foods are not good for older males, and finally the system factors in his culture and religion by filtering the inappropriate results. The results of the search are refined again to match other inferred user preferences, such as knowing from previous use that the user prefers some specific types of fruits, so that we give them precedence. After showing the results, the system monitors the user’s interactions while s/he navigates through the results and collects the user’s feedback to update the profile.

7.2. Experiment evaluation for semantic query manipulation

We have collected 100 questions from different users and tested these questions in our system to evaluate how effectively we could semantically interpret the questions. Our target is to calculate how many named entities we can find in these questions by comparing the system performance to the manual annotation of these questions. Table 2 shows the experiment statistics regarding the discovered concepts. It shows the number of relations found between food and health condition. Then, it shows the number of food items and nutrition items found in the queries. Next, it shows the number of diseases, body functions (e.g., improve vision) and body parts (e.g., heart) discovered. Then, we calculate the Precision, which is the number of correct concepts found by the system divided by the total number of concepts found by the system. Finally, we calculate the Recall, which is the number of correct concepts found by the system divided by the total number of correct concepts found manually. The overall recall is 82.13%, which means that we are able to discover most of the concepts in the questions. Also, the overall precision is 97.15%,


which is high because we pre-populate all of the concepts, except the relations, from the knowledge base. This also explains the lower precision for relations, 91.36%, because we use WordNet to discover the synonyms of the relations.

Table 2. Experimental results statistics for query manipulation

| Concept | Total found concepts | Found correct concepts | Correct concepts manually | Precision | Recall |
| Relation | 81 | 74 | 92 | 91.36% | 80.43% |
| Food items | 71 | 71 | 83 | 100.00% | 85.54% |
| Nutrition items | 16 | 16 | 19 | 100.00% | 84.21% |
| Diseases | 53 | 53 | 65 | 100.00% | 81.54% |
| Body functions | 10 | 10 | 13 | 100.00% | 76.92% |
| Body items | 15 | 15 | 19 | 100.00% | 78.95% |
| Total | 246 | 239 | 291 | 97.15% | 82.13% |
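The precision and recall figures follow directly from the counts in Table 2 using the definitions given in Section 7.2. The short Python sketch below recomputes them as a check; the counts are copied from the table, while the function and variable names are our own.

```python
# Counts from Table 2, per concept type:
# (total found by the system, found and correct, correct per manual annotation)
TABLE_2 = {
    "Relation": (81, 74, 92),
    "Food items": (71, 71, 83),
    "Nutrition items": (16, 16, 19),
    "Diseases": (53, 53, 65),
    "Body functions": (10, 10, 13),
    "Body items": (15, 15, 19),
}


def precision_recall(found, correct, manual):
    """Precision = correct / found by system; recall = correct / found manually."""
    return correct / found, correct / manual


def summarize(table):
    """Per-row precision/recall plus a 'Total' row computed from summed counts."""
    rows = {name: precision_recall(*counts) for name, counts in table.items()}
    totals = tuple(sum(counts[i] for counts in table.values()) for i in range(3))
    rows["Total"] = precision_recall(*totals)
    return rows


summary = summarize(TABLE_2)
for name, (precision, recall) in summary.items():
    print(f"{name:16s} precision={precision:.2%}  recall={recall:.2%}")
```

The totals are computed from the summed counts (239/246 and 239/291) rather than by averaging the per-row percentages, which is what reproduces the 97.15% and 82.13% reported in the table.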

8. Conclusion and Future Work

In this paper, we propose a framework for semantic query manipulation and personalization of health and nutrition information. We present the user profile ontology and its relation to other domain ontologies. Then, we explain the semantic query processing steps and present the result personalization steps. A complete scenario is illustrated to visualize the framework, followed by experimental results. The empirical evaluation shows promising improvements in the relevancy of the retrieved results and in the user’s satisfaction. As future work, and in order to validate the efficiency of the proposed framework, we will publicize the portal and collect users’ satisfaction feedback.

Acknowledgements

The authors would like to acknowledge the support provided by King Abdulaziz City for Science and Technology (KACST) through the Science & Technology Unit at King Fahd University of Petroleum & Minerals (KFUPM) for funding this work through project No.10-INF1381-04 as part of the National Science, Technology and Innovation Plan. Thanks are extended to the project’s consultants, Dr. Jeffrey M. Bradshaw, Dr. Yuri Tijerino and Dr. Andrzej Uszok.

References

[1] D. Matsumoto and L. Juang. Culture and Psychology. Cengage Learning, Inc. 5th edition, United States; 2012.
[2] O. Suominen, E. Hyvönen, K. Viljanen and E. Hukka. HealthFinland-a national semantic publishing network and portal for health information. Web Semantics: Science, Services and Agents on the World Wide Web, ch.7; 4: 2009, pp. 287-297.
[3] Y. Wang and Z. Liu. Personalized health information retrieval system. AMIA Annual Symposium Proceedings. Washington: DC; 2005, p. 1149.
[4] S. R. R. Abidi and H. Chen. Adaptable personalized care planning via a semantic web framework. 20th International Congress of the European Federation for Medical Informatics (MIE 2006), Maastricht: Netherlands; 2006.
[5] S. Chessa, E. de la Vega, C. Vera, M. T. Arredondo, M. Garcia, A. Blanco and R. de las Heras. Adaptive searching mechanisms for a cardiology information retrieval system. Computers In Cardiology 2005; 32, pp. 147-150.
[6] Y. Li, J. Mostafa and X. Wang. A privacy enhancing infomediary for retrieving personalized health information from the Web. Personal Information Management A SIGIR 2006 Workshop; 2006, pp. 82-85.
[7] S. Sahay and A. Ram. Socio-semantic health information access. AAAI 2011 Spring Symposium; 2011.
[8] F. Carmagnola and F. Cena. User identification for cross-system personalisation. Information Sciences: an International Journal; 2009; 179(1-2), pp. 16-32.
[9] X. Tao, Y. Li and N. Zhong. A personalized ontology model for web information gathering. IEEE Transactions on Knowledge and Data Engineering, IEEE Computer Society Digital Library; 2011; 23(4), pp. 496–511.
[10] S. Gauch, M. Speretta, A. Chandramouli and A. Micarelli. User profiles for personalized information access. The Adaptive Web, Methods and Strategies of Web Personalization, P. Brusilovsky, A. Kobsa, and W. Nejdl, Eds. Berlin, Germany: Springer-Verlag, 2007, pp. 54–89.
[11] RDF: Resource Description Framework, http://www.w3.org/RDF/, last visited on: 21-01-2013.