Data Visualization for a Geographic Information System of Schistosomiasis Mass Drug Administration

Lemuel Clark P. Velasco
MSU – Iligan Institute of Technology, Iligan City, The Philippines
[email protected]

Joshua Dave E. Gomora
MSU – Iligan Institute of Technology, Iligan City, The Philippines
[email protected]

Leonard N. Aguanta
MSU – Iligan Institute of Technology, Iligan City, The Philippines
[email protected]

Joannah Marie D. Acibar
MSU – Iligan Institute of Technology, Iligan City, The Philippines
[email protected]
ABSTRACT Salvador, Lanao del Norte is one of the municipalities in the Philippines where schistosomiasis, a parasitic disease that leads to chronic illness, is endemic. Altair GIS was developed so that the Municipal Health Office (MHO) of Salvador could manage data on the community’s compliance with mass drug administration. To help the MHO understand this compliance, this study enhanced Altair GIS with visualizations built on the Google Visualization API that provide spatial and visual analysis of the data. Data visualization analysis was conducted and the resulting data visualization architecture design was implemented, with results showing that the mapping, visualization and statistical models of Altair GIS can be successfully integrated. With these visualizations, Altair GIS can assist the decision making and strategy formulation of the MHO by supporting intervention decisions and contextualized information and education campaigns that increase awareness of and compliance with schistosomiasis mass drug administration.
CCS Concepts Applied computing → Life and medical sciences → Health care information systems
Keywords Data visualization, geographic information system, schistosomiasis
1. INTRODUCTION
Mass drug administration (MDA) is a public health strategy that involves administering certain drugs to the entire population of a determined geographic area for the purpose of eradicating a certain disease. Despite opposing views on the effectiveness of MDA, it is still widely used to control the spread of malaria, lymphatic filariasis and other tropical diseases [1, 2, 3]. One of the tropical diseases for which MDA has been widely adopted is schistosomiasis, a parasitic disease that leads to chronic illness [4, 5]. Schistosomiasis is common in communities with bodies of water whose ecosystems facilitate the completion of the disease’s transmission cycle. The cycle starts when parasite eggs from infected excreta reach fresh water and hatch into larvae that infect snails; people who come into contact with this water can then be infected by the schistosomes released from the snails.
First documented in the Philippines in 1975, the disease is endemic in 24 identified provinces of the country, with more than five million people at risk and approximately one million infected [6]. Salvador, Lanao del Norte is among the municipalities in the Philippines where schistosomiasis is endemic. It has an exposed population of 3,722 in four endemic barangays (Curva Miagao, Daligdigan, Inasagan, and Sudlon) and an estimated prevalence rate of 3.75% as of 2014 [7]. The municipality is a beneficiary of the Schistosomiasis Control Program of the Department of Health-National Center for Disease Prevention and Control, whose objectives focus on reducing the prevalence rate by 50% in endemic provinces [8, 9]. Part of the Schistosomiasis Control Program is conducting MDA with the goal of increasing mass treatment coverage of the population in endemic provinces to 85%. The Municipal Health Office (MHO) of Salvador, Lanao del Norte aims to increase the population’s participation in MDA. Household data on participation in the mass drug administration were collected from the four barangays of Salvador. A total of 540 response records were collected by the MHO in 2015, covering the Knowledge, Attitudes, and Practices (KAP) of representatives of residents in the schistosomiasis endemic barangays of the municipality. As long as the population participates, the MDA of the Schistosomiasis Control Program, which uses the drug praziquantel, contributes to the significant reduction in disease prevalence observed since 1980 [6, 8]. Although the program has made significant progress in schistosomiasis control, the literature suggests that for MDA to be successful, an effective information system that processes data on the population’s participation in MDA can help decision makers come up with appropriate strategies to increase participation [4, 8, 10, 11].
With this, Altair Geographic Information System (GIS) was developed and deployed to the MHO of Salvador by implementing a mapping model to process spatial MDA data [12, 13]. For the MHO to understand the data, additional data analysis features that include descriptive and inferential statistical tools were also integrated into Altair GIS [14]. While initial attempts were made to develop map and chart visualizations of the MDA data for the MHO, the results of these attempts were not integrated into the statistical subsystem of Altair GIS [15]. Software applications like Altair GIS that analyze and communicate information through visual representation can make sense of the population’s geographic data and of data regarding MDA participation. Data visualization in Altair GIS that integrates principles of information resource management as well as appropriate data representation and management has the potential to produce visual analyses for the decision makers involved in the Schistosomiasis Control Program’s MDA. The goal of this study is to design and develop data visualizations for Altair GIS that provide spatial and visual analysis of a community’s participation in MDA and that help decision makers craft public health strategies.
2. METHODOLOGY
2.1 Data Visualization Analysis
The first phase of the study involved data analysis, which aims to understand the characteristics of the data and determine proper ways of presenting it. The households’ basic information, their drug intake, and their knowledge, attitudes and practices pertaining to mass drug administration were analyzed. The structure of the database was then developed. After these sections of the data were identified, they were analyzed to identify possible entities, their attributes and the relationships among them. The output is an Entity-Relationship (ER) diagram that describes the relationships among entities and the attributes of each entity. Primary key and foreign key constraints were also defined. After the ER diagram was created, it was implemented in a MySQL database through the schemas generated from the diagram. MySQL is an open source database management system that offers high performance, scalability and flexibility, especially for web-based applications. With the user requirements given by the MHO of Salvador, the researchers were able to identify the spatial and visual features that Altair GIS needed to implement. A review of the data variables related to MDA was conducted to ensure that the statistical features of Altair GIS could be properly linked to the proposed visualizations. The MDA data were cleaned, verified, and transformed so that everything except the data of interest was removed. The data analysis features already integrated in Altair GIS were inferential results such as data correlation and independence, as well as descriptive results such as percentages of male/female, age groups, income and educational level per cluster, purok, barangay or municipality. Data were then organized based on variable type and on the results they can generate, whether descriptive or inferential statistics.
Each data variable and data analysis result was then assigned a corresponding data visualization. The identified visualizations were then integrated with the results generated by the data analysis features of Altair GIS, which contain descriptive and inferential statistical tools.
2.2 Data Visualization Implementation
The visualization model architecture describes the integration of data with appropriate visualizations. The design of the visualization model includes identifying the visualization model components, designing the visualization model, and implementing the architecture components using scripts. The components of the visualization model were determined by identifying the tools to be used, such as the Google Visualization API and data tables, and the mechanism for integrating the data. The user requirements determined which components of the Google Visualization API were included in the implementation of the visualization model. The visualization model architecture consists of the identified components and the processes between these components.
3. RESULTS AND DISCUSSION
3.1 Data Visualization Analysis Results
The entity relationship diagram of the GIS, as shown in Figure 1, is composed of nine (9) entities, namely household, group occupation, answer, batch, doctor, choice, staff, question and group.
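As an illustration of the entity-relationship design with primary and foreign key constraints used for the MDA data, the following is a minimal sketch only. It assumes a simplified subset of the entities (doctor, batch, household, answer), uses hypothetical column names that do not come from the actual Altair GIS schema, and substitutes an in-memory SQLite database for MySQL so that it can be run without a server.

```python
import sqlite3

# Hypothetical, simplified subset of the schema. Table and column names are
# illustrative assumptions, and SQLite stands in for the MySQL database.
SCHEMA = """
CREATE TABLE doctor (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE batch (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    doctor_id INTEGER NOT NULL REFERENCES doctor(id)
);
CREATE TABLE household (
    id        INTEGER PRIMARY KEY,
    barangay  TEXT NOT NULL,
    latitude  REAL NOT NULL,
    longitude REAL NOT NULL
);
CREATE TABLE answer (
    id           INTEGER PRIMARY KEY,
    household_id INTEGER NOT NULL REFERENCES household(id),
    batch_id     INTEGER NOT NULL REFERENCES batch(id),
    value        TEXT NOT NULL
);
"""


def build_database():
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def foreign_key_rejected(conn):
    """Return True if a batch pointing at a missing doctor is refused."""
    try:
        conn.execute(
            "INSERT INTO batch (name, doctor_id) VALUES ('2015 survey', 999)"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

Keeping each answer in its own row, linked to a household and a batch by foreign keys, is one way to keep values atomic and to let a new batch of survey answers be added without altering existing records.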
Fig. 1 Entity relationship diagram of the GIS

In order to avoid data update, insertion, and deletion anomalies, normalization was observed throughout the database design process. The database of the MDA data was normalized up to the Third Normal Form, which includes keeping every attribute value atomic. Atomicity prevents data inconsistency by disallowing two or more values in a single entry, which would otherwise complicate modification of the data. The Household entity represents the components of the basic information section, and the fields from this section are all incorporated in the Household entity as attributes. The Batch entity was created to allow the user to add a new set of survey answers. Moreover, the Batch entity contains the name attribute and is linked to the Doctor entity in order to keep track of the assigned doctor.

The spatial data visualized in the maps are the households’ coordinates as well as the attributes from the Household entity. The MHO of Salvador specified the scoring system to be used for the answers on Knowledge, Attitudes and Practices (KAP). As defined in the requirements of the MHO, the answers in the Knowledge section of the MDA survey were classified as either positive or negative answers. Separate scoring for positive answers and negative answers was implemented. The overall knowledge score of a household was determined by comparing the absolute values of the positive score and the negative score. For instance, if the absolute value of the negative score is greater than that of the positive score, then the household has a negative overall score. For queries that involved only the basic information of the households, the locations of the matching households were visualized using markers, which allows the exact locations of the households to be identified.
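The comparison of positive and negative knowledge scores described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the list-of-numbers input are hypothetical, and the handling of a tie (reported here as "neutral") is an assumption, since the MHO requirements quoted in this paper do not state how equal scores are resolved.

```python
def overall_knowledge(answer_scores):
    """Classify a household from the scores of its Knowledge answers.

    Positive and negative answers are summed separately, and the sign of
    the overall result follows whichever sum has the larger absolute value.
    """
    positive = sum(s for s in answer_scores if s > 0)
    negative = sum(s for s in answer_scores if s < 0)
    if abs(negative) > abs(positive):
        return "negative"
    if abs(positive) > abs(negative):
        return "positive"
    # Assumption: tie handling is not specified in the source requirements.
    return "neutral"
```

For example, a household with answer scores of 2, 1 and -4 has a positive score of 3 and a negative score of -4, so its overall classification is negative.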
Due to the large number of MDA records, the researchers decided to focus on enhancing the grouping feature of the system, which groups data of similar meaning. It was found necessary to group similar answers and choices of a specific question so that the visualizations would not be overwhelmed with too much data and would be displayed properly across the user interface. Thus, a grouping feature was implemented in the user interface in the form of modals, as shown in Figure 2. The members field for adding groups is implemented using the Select2 JavaScript library, which queries existing answers or choices from the database, displays them in that field, and shows the selected choices as tags inside the field. The user supplies the member answers, a group name, and a score. Scores are provided to give weight to the acceptability of the answer. A positive score means the answer is acceptable while a negative score means the answer is rejected. The scores of all the answers or choices under that group are then updated.
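The grouping mechanism can be pictured with the short sketch below. It is not the Altair GIS implementation: the function names, the dictionary-based storage and the sample answers are hypothetical, and the sketch only mirrors the behavior stated in the text, namely that a group has a name, a score and member answers, and that the score of every member is updated to the group score.

```python
from collections import Counter


def apply_group(scores, groups, name, members, score):
    """Return copies of the score and group mappings with one group applied.

    Every member answer receives the group's score and is tagged with the
    group's name. The input mappings are left unchanged.
    """
    updated_scores = dict(scores)
    updated_groups = dict(groups)
    for member in members:
        updated_scores[member] = score
        updated_groups[member] = name
    return updated_scores, updated_groups


def count_by_group(responses, groups):
    """Count responses per group; ungrouped answers keep their own label."""
    return Counter(groups.get(response, response) for response in responses)
```

With this arrangement, two differently worded answers that were placed in the same group are counted as one category, which reduces the number of slices or bars a chart has to display.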
Fig. 2 Grouping of similar data

The MDA data contain the basic information of the households along with their knowledge, attitudes, and practices towards schistosomiasis. The MHO categorized the reports into two groups: each report falls under either descriptive or inferential statistics. Table 1 shows that the visualizations for the descriptive reports are pie, bar and tree map charts. With these, the MHO can easily monitor the population by a specific characteristic. The visualizations for the inferential reports are bar, line, pie, and candlestick charts. The inferential analysis is intended to return visualizations and values that are essential for concluding whether the paired variables are correlated or dependent.

Table 1. Summary of MDA data and assigned visualization

Descriptive Reports
Pie Chart: Age; Sex; Compliance to MDA
Bar Chart: Education; Income
Tree Map Chart: Grouped Data

Inferential Reports
Bar Chart, Line Chart: Sex, Age, Occupation, Education, Income vs. Intake
Pie Chart, Line Chart: Knowledge, Attitude and Practice vs. Compliance to MDA
Candlestick Chart: Age, Income, Knowledge, Attitude and Practice vs. Compliance to MDA

3.2 Data Visualization Implementation Results
The visualization model architecture design in Figure 3 shows that data are retrieved from the database and transformed into JSON format using AJAX requests in order to load the data into the web application. The researchers used the free Google Visualization API, the core library for creating the visualizations of the MDA data. The Google Visualization API allows the creation of charts and reporting applications over structured data and helps integrate these directly into the web. Google Charts is used through JavaScript embedded in the web page. The system loads the Google Chart libraries, lists the data to be charted, selects options to customize the chart, and finally creates the chart object with an id that identifies the container of the chart. Later in the web page, the chart is loaded in the <div> element created with that id to display the Google Chart.
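The transformation of database records into JSON for charting can be sketched as follows. The sketch assumes that the server responds to the AJAX request with the JSON literal form accepted by the DataTable constructor of the Google Visualization API, that is, a "cols" list describing each column and a "rows" list of cells; the function name, column ids and sample values are hypothetical and are not taken from Altair GIS.

```python
import json
from collections import Counter


def to_datatable(category_label, value_label, values):
    """Count category values and shape them as a DataTable JSON literal."""
    counts = Counter(values)
    return {
        "cols": [
            {"id": "category", "label": category_label, "type": "string"},
            {"id": "count", "label": value_label, "type": "number"},
        ],
        # One row per category, sorted by name so the output is stable.
        "rows": [
            {"c": [{"v": category}, {"v": count}]}
            for category, count in sorted(counts.items())
        ],
    }


def to_json(table):
    """Serialize the table for the body of an AJAX response."""
    return json.dumps(table)
```

On the client side, a response in this shape could be passed to the DataTable constructor and then drawn by a chart object bound to the container element.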
Fig. 3 Visualization model architecture design

Descriptive reports visualize the basic information of the respondents in Salvador, Lanao del Norte. This includes sex, age, educational attainment, monthly income, medicine intake, and the reasons why respondents were not able to comply with the medication prescription. Numerous factors need to be taken into consideration in order to produce a well-designed and meaningful visualization. These factors are the legends, labels, axes, colors, chart types, and the overall appearance [16]. Even the smallest detail can alter the outcome of a chart visualization, so the researchers paid attention to details such as axes, colors, labels, and legends. The numerical axis always starts at zero, because comparing truncated bars may lead to inaccurate conclusions. The charts were well labeled and each slice, bar or line was designated with a proper legend. Different colors should be used for different categories, but it is not preferred to have more than six colors in one visualization [17]. Google Chart Tools has an option to enable a 3D effect on its charts, but the researchers chose not to use this feature because 3D effects reduce comprehension. Above all, adding too much information to a single chart was avoided, as it eliminates the advantages of processing the data visually [18]. As shown in Figure 4, pie charts can be applied to the sex, age, and MDA compliance of the respondents because these variables have six or fewer categories, which makes the charts easier to read. Unfortunately, due to the limited interactivity options of Google Chart Tools, the suggestion of arranging the wedges clockwise could not be applied. Having dynamic data as input also made it difficult to assign shades from dark to light. Google Chart Tools has its own sorting of slices, which hindered the researchers from further customizing the visualization.
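Some of the design rules above can be expressed as a small selection routine. This is only an illustrative sketch: the threshold of six categories follows the text, but the rule of falling back to a column chart and the function name are assumptions rather than the logic of Altair GIS. The option names "is3D" and "vAxis.minValue" are intended to correspond to Google Charts configuration options, and the chart type strings to its chart class names.

```python
MAX_PIE_CATEGORIES = 6  # the text prefers at most six categories or colors


def chart_spec(category_counts):
    """Pick a chart type and options for a mapping of category to count."""
    if len(category_counts) <= MAX_PIE_CATEGORIES:
        # Few categories: a flat pie chart, with the 3D effect left off.
        return {"type": "PieChart", "options": {"is3D": False}}
    # Assumed fallback for many categories: vertical bars whose numeric
    # axis starts at zero so that bar lengths remain comparable.
    return {"type": "ColumnChart", "options": {"vAxis": {"minValue": 0}}}
```

A variable such as sex, with two categories, would be drawn as a pie chart under this rule, while a variable with seven or more categories would be drawn as bars on a zero-based axis.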
Fig. 6 Line chart visualization
Fig. 4 Pie chart visualization

Bar graphs have an advantage over pie charts since they can take four different forms, namely vertical, horizontal, stacked, and grouped bar charts. Horizontal bar graphs are recommended when categories have long names that would be hard to fit under a vertical bar. When it is necessary to show how different sub-groups answered, stacked bar graphs can be used. Grouped bar graphs are similar to stacked bar graphs; the only difference is that each sub-group gets its own bar. The distribution of mass drug administration participation shown in Figure 5 uses a grouped bar graph rather than a stacked bar graph in order to clearly show the differences in participation counts between the barangays.
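The data behind a grouped bar graph of participation per barangay can be prepared as in the sketch below. It assumes that each record is a pair of a barangay name and a participation flag, and it produces a header row followed by one row per barangay, which is the array layout that helpers such as arrayToDataTable in Google Charts accept. The function name, series labels and sample records are hypothetical.

```python
def grouped_bar_rows(records):
    """Build header and data rows with one bar group per barangay."""
    counts = {}
    for barangay, participated in records:
        # Index 0 counts participating households, index 1 the others.
        pair = counts.setdefault(barangay, [0, 0])
        pair[0 if participated else 1] += 1
    header = ["Barangay", "Participated", "Did not participate"]
    return [header] + [[name, *counts[name]] for name in sorted(counts)]
```

Because each barangay row carries two separate series, the chart draws two adjacent bars per barangay instead of stacking them, which keeps the count differences between barangays easy to compare.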
Fig. 5 Bar chart visualization

Line charts can easily be used to display multiple data sets, while area charts can only be used when there is a summation relationship between data sets [19]. The researchers therefore chose the line chart, since the data to be visualized are continuous and do not have a summation relationship. Figure 6 shows a line chart in which the horizontal axis corresponds to the batches and the vertical axis corresponds to the size. This allows the user to oversee the answers of the respondents. The line chart gives the user a fairly good idea of how the answers have changed across batches.

The survey answers and their corresponding household locations were visualized using heatmaps. The scores of the survey answers were indirectly represented by heatmap gradients on the map. The MHO of Salvador required that survey answers be classified as positive or negative. As shown in Figure 7, households that complied with the MDA have a heatmap gradient colored sky blue, while those that did not comply with the MDA are colored pink. Heatmaps made it possible to visualize on the map the locations of households together with their answers to the survey. Using different heatmap gradients for different answers allows the users to see and analyze patterns and relationships among the survey answers [12, 15]. Through these visualizations, the MHO was able to identify that households that reside near health centers are more likely to comply with the MDA. The data visualization results helped the MHO of Salvador make judgments about the differences between groups of data, which gave insights and led to the formulation of strategies for increasing the rate of schistosomiasis MDA compliance.

Fig. 7 Heatmap visualization
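The separation of households into two heatmap gradients, as described for the heatmap visualization, can be sketched as follows. The record layout, function name and coordinates are hypothetical; only the two colors (sky blue for households that complied and pink for those that did not) are taken from the text. Passing each list of points to a separate heatmap layer of the mapping library is assumed and is left out of the sketch.

```python
# Layer colors follow the text; everything else is an illustrative assumption.
LAYER_COLORS = {"complied": "skyblue", "not_complied": "pink"}


def heatmap_layers(households):
    """Split (latitude, longitude, complied) records into two layers."""
    layers = {
        key: {"color": color, "points": []}
        for key, color in LAYER_COLORS.items()
    }
    for latitude, longitude, complied in households:
        key = "complied" if complied else "not_complied"
        layers[key]["points"].append((latitude, longitude))
    return layers
```

Keeping the two groups in separate layers is what allows each group to be drawn with its own gradient, so that clusters of compliant and non-compliant households can be told apart on the same map.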
4. CONCLUSION AND RECOMMENDATIONS
This study developed visualizations that process the mass drug administration data for schistosomiasis in Salvador, Lanao del Norte. The Google Visualization API was successfully used to generate spatial and data visualizations that can help the decision makers of the Municipal Health Office come up with interventions and programs that will optimize the population’s participation in MDA. Data visualization analysis was conducted, similar data were grouped for visualization, and the identified charts were assigned to specific data variables and reports. The possibility of implementing a data visualization architecture design that incorporates visualizations into the existing Altair GIS was also explored in this study, with the aim of bringing together maps, charts and statistical tools that decision makers can use to craft action plans for mass drug administration. The visualizations implemented in Altair GIS served as a basis for decision making on raising the community’s compliance with the mass drug administration of the Schistosomiasis Control Program. Other visualization tools can still be explored to address the limitations of the Google Visualization API. Additional statistical tools along with their assigned visualizations may also be added to Altair GIS for specific investigation of various relationships in the data. Furthermore, the data visualization design presents opportunities for other developers to replicate the GIS and its visualizations for other endemic diseases such as dengue and malaria.
5. ACKNOWLEDGEMENTS
This research would not have been possible without the assistance of Dr. Jordan C. Pinzon, the Municipal Health Officer of Salvador, Lanao del Norte.
6. REFERENCES
[1] World Health Organization, “Mass drug administration, mass screening and treatment and focal screening and treatment for malaria,” Malaria Policy Advisory Committee Meeting, 16–18 September 2015, Geneva, Switzerland.
[2] C. Lahariya and A. Mishra, “Strengthening of mass drug administration implementation is required to eliminate lymphatic filariasis from India: an evaluation study,” Journal of Vector Borne Diseases, Issue 45, December 2008, pp. 313–320.
[3] Global Health Services, “Review of Mass Drug Administration and Primaquine Use,” Global Health Group Background Paper, January 2014.
[4] G. F. Chami, A. A. Kontoleon, E. Bulte, A. Fenwick, N. B. Kabatereine, E. M. Tukahebwa, and D. W. Dunne, “Profiling Nonrecipients of Mass Drug Administration for Schistosomiasis and Hookworm Infections: A Comprehensive Analysis of Praziquantel and Albendazole Coverage in Community-Directed Treatment in Uganda,” Clinical Infectious Diseases by Oxford University Press for the Infectious Diseases Society of America, January 2015, DOI: 10.1093/cid/civ829.
[5] F. O. Richards Jr., A. Eigege, E. S. Miri, M. Jinadu, D. R. Hopkins, “Integration of mass drug administration programmes in Nigeria: the challenge of schistosomiasis,” Bulletin of the World Health Organization, August 2006, 84(8).
[6] M. T. Inobaya, R. M. Olveda, V. Tallo, D. P. McManus, G. M. Williams, D. A. Harn, Y. Li, T. N. Chau, D. U. Olveda, A. G. Ross, “Schistosomiasis mass drug administration in the Philippines: lessons learnt and the global implications,” Institut Pasteur. Published by Elsevier Masson SAS, 2015 Jan;17(1):6-15. doi: 10.1016/j.micinf.2014.10.006.
[7] J. C. Pinzon, “Effects of selected strategies on mass drug administration participation for schistosomiasis in Salvador, Lanao del Norte,” unpublished action plan and project presented to the faculty of Master in Public Management major in Health Systems and Development of the Development Academy of the Philippines, 2015.
[8] X. Zhou, R. Bergquist, L. Leonardo, R. Olveda, “Schistosomiasis: The Disease and its Control,” Regional Network for Research, Surveillance and Control for Asian Schistosomiasis, 2008.
[9] F. O. Richards Jr., A. Eigege, E. S. Miri, M. Jinadu, D. R. Hopkins, “Integration of mass drug administration programmes in Nigeria: the challenge of schistosomiasis,” Bulletin of the World Health Organization, August 2006, 84(8).
[10] A. G. Ross, R. M. Olveda, Y. Li, “An audacious goal: the elimination of schistosomiasis in our lifetime through mass drug administration,” Lancet, 2015 May 30;385(9983):2220-1. doi: 10.1016/S0140-6736(14)614173.
[11] A. G. Ross, R. M. Olveda, L. Acosta, D. A. Harn, D. Chy, Y. Li, D. J. Gray, C. A. Gordon, D. P. McManus, G. M. Williams, “Road to the elimination of schistosomiasis from Asia: the journey is far from over,” Microbes Infect. 2013 Nov;15(13):858-65. doi: 10.1016/j.micinf.2013.07.010.
[12] L. C. Velasco, J. P. Postrano, L. C. Diaz, L. A. Catane, “A Geographic Information System Using Google Maps for Schistosomiasis Survey Data,” Asia Pacific Journal of Science, Mathematics and Engineering, June 2015.
[13] United States Geologic Survey, Global Geographic Information Systems. 2007, February 14. Retrieved September 20, 2015, from USGS Planetary GIS Web Server: http://webgis.wr.usgs.gov/globalgis/tutorials/what_is_gis.htm
[14] J. M. D. Acibar, L. N. Aguanta, J. D. E. Gomora, and L. C. P. Velasco, “Data Analysis with Visualization for a Geographic Information System of Schistosomiasis Community Health Data,” Pre-proceeding of the 6th Workshop on Computation: Theory and Practice WCTP, September 2016.
[15] L. C. Velasco, J. D. E. Gomora, L. N. Aguanta, J. M. D. Acibar, J. P. Postrano, L. C. Diaz, L. A. Catane, “Data Visualization of Schistosomiasis Community Health Data Using Google Maps and Google Charts,” poster presented during the joint meetings of the 41st Asia Pacific Advanced Network and 30th Pacific Rim Application and Grid Middleware Assembly, January 2016.
[16] Duke University Libraries, Introduction to Data Visualization: Chart Dos and Don’ts. 2015. Retrieved from http://guides.library.duke.edu/datavis/
[17] D. Borland and R. M. Taylor II, Rainbow color map still considered harmful. IEEE Computer Graphics and Applications, 2007, 27(2), 14-17.
[18] N. Yau, World Happiness Report makes statisticians unhappy. 2012. Retrieved from http://flowingdata.com/2012/04/25/world-happiness-report-makes-statisticians-unhappy/
[19] S. Choudhury, Choosing the right chart type: Line vs. Area Charts. 2014. Retrieved from http://www.fusioncharts.com/blog/2013/06/line-charts-vs-area-charts/