City University London
MSc in Health Informatics

Project Report 2017

How can verbal autopsy data be visualized on a global scale to provide insights into patterns of causes of death and their associated uncertainty?

Ewoma Obaro
Supervised by: Dr. Jon Bird
Submitted: 31st March 2017

By submitting this work, I declare that this work is entirely my own except those parts duly identified and referenced in my submission. It complies with any specified word limits and the requirements and regulations detailed in the assessment instructions and any other relevant programme and module documentation. In submitting this work I acknowledge that I have read and understood the regulations and code regarding academic misconduct, including that relating to plagiarism, as specified in the Programme Handbook. I also acknowledge that this work will be subject to a variety of checks for academic misconduct.

Signed: Ewoma Obaro


Abstract

Background: Every year 37 million deaths are not registered or reported, creating a gap in global mortality data in low income countries (LIC) and middle income countries (MIC). Verbal Autopsy (VA) is a potential approach to close the gap. The cause of death (CoD) data stem from fourteen INDEPTH HDSS sites in sub-Saharan Africa and eight sites in Asia.

Methods: The initial design stage uses paper prototyping. The data is sorted and then manipulated in Google Fusion Tables. The Google Maps JavaScript API and the Google Fusion Layer Wizard help build the custom map. Adobe Dreamweaver is the web development tool used to implement the website.

Results: A functioning web-based application is available for use by population analysts and data scientists. The map markers show CoD in the INDEPTH regions and represent uncertainty through marker size.

Summary: Visualizing VA data and their associated uncertainty on a global scale can reveal trends in levels of mortality and CoD patterns. With future work, the project could enable routine analysis of mortality data for those engaged in health population science.

Key Words: Verbal Autopsy, Causes of Death, Uncertainty, Data Visualization, High Mortality, Google Fusion Tables


Acknowledgments

I am using this opportunity to express my gratitude to everyone who supported me throughout the course of my master’s degree in Health Informatics. I am thankful for the inspiring guidance, invaluable constructive criticism and friendly advice I received during the project work. I am sincerely grateful to Dr. Jon Bird, my supervisor, for sharing his truthful and illuminating views on a number of issues related to the project. I would like to show my appreciation and thank Kwame Baah, Colour Scientist, for his advice and input on the website design and use of colour. I want to thank my family and friends who have supported me in my pursuit of completing my master’s degree in Health Informatics. Their encouragement to work daily helped me finish my project in time.

Thank you all, Ewoma Obaro


Table of Contents

Abstract
Acknowledgments
Table of Contents
1 Introduction and Objectives
1.1 Research question
1.2 Objectives
1.2 Beneficiaries
1.3 Outline of methods
1.3.1 Design and Implementation
1.3.2 Testing & Evaluation
1.4 Work Plan
1.5 Report structure
2 Context
2.1 Literature review
2.2 Verbal Autopsy Data
2.2.1 PCVA
2.2.2 InterVA
2.2.3 Uncertainty
2.2.4 Mobile VA Data
2.3 Data visualization
2.3.1 Use of colour
2.4 Paper prototyping
2.5 Google Fusion Tables
2.6 Google Maps JavaScript API
2.6.1 HealthMap
2.7 Website development
2.8 Evaluation
3 Methods
3.1 Paper prototype
3.2 Website Implementation
3.3 Data Collection
3.3.1 INDEPTH Data Repository
3.3.2 UNESCO Institute for Statistics (UIS)
3.3.3 KML files
3.4 Creating Google Fusion Map
3.4.1 Fusion Tables Layer Wizard
3.5 Customizing visualization with Google Maps JavaScript API
3.6 Evaluation
4 Results
4.1 Paper prototyping evaluation
4.2 Data Collection
4.3 Maps Creation
4.3.1 Layer 0
4.3.2 Layer 1
4.3.3 Layer 2
4.3.4 Fusion Tables Layer Wizard
4.3.5 Legend
4.4 Website Implementation
4.5 Evaluation
4.5.1 Testing
4.5.2 Questionnaire VA Map
5 Discussion
5.1 Objectives
5.2 Academic Context
5.3 Generalisation and validity
5.4 Recommendations
6 Evaluation, Reflections, and Conclusions
6.1 Critical review
6.2 Personal Reflections on the project
6.3 Conclusions
6.3.1 Contribution to health information science and future work
6.3.2 Summary
7 Glossary
8 References
Appendices
Appendix A: Project Proposal for MSc in Health Informatics
Appendix B: Data preparation (Excel)
Appendix C: Google Fusion Tables
Appendix D: HTML Code
Appendix E: Examples of filled in questionnaires
Appendix F: Table of Figures

1 Introduction and Objectives

In 2015, six million children under five years old still died from preventable diseases (You et al., 2015). Even though the mortality rate has been halved since 1990, it is still “unacceptably high” (The World Bank, 2015). Reducing mortality in children and adults depends on strong national health systems, which need to generate sufficient information for statistical analyses and appropriate health coverage. Local and global public health planning and interventions rely on comprehensive and timely data regarding key aspects of health, including levels and trends in fertility, mortality and causes of death. Ideally, civil registration and vital statistics (CRVS) systems encompass information on demography, causes of death and birth registration by age, sex and geographical location. However, birth statistics tend to be more accurate because births are registered to grant individual identity, nationality and rights. Not enough emphasis is put on death and its causes. Around 62% of deaths go uncounted worldwide, especially in low- and middle-income countries (Mikkelsen et al., 2015). Cause of death (CoD) statistics are a cornerstone of health information, yet they are insufficiently researched and funded (Murray et al., 2014). Public health decision-making is therefore constrained by ill-defined and unknown causes of death, which hinders effective public health programs. Where vital registration is not available at community level, alternative methods of ascertaining and estimating cause of death distributions must be used in the interim (Soleman et al., 2006). Verbal Autopsy (VA) is an established research method which gathers health information to determine probable causes of death in high-mortality settings. The World Health Organization (WHO) has developed a standardized VA instrument to support the routine assignment of the cause of death.

This research paper addresses the lack of tools which present cause of death data derived from VA on a global scale. The variety of VA tools, such as InterVA and the Tariff 2.0 method, makes it difficult to compare data across time and place (Soleman et al., 2006). Health population scientists often need to report statistics. To give them more insight into cause-specific mortality patterns, this project tries to provide a simple web-based tool which allows geographical information to be explored. Trends might be identified regarding epidemic outbreaks or other emerging health problems. The tool also benefits non-specialists in information management, who can recreate a functioning, inexpensive application relatively easily. Compared with purely statistical figures, visualizing data is a different approach to increasing the capture and availability of data. Medical data representation using maps has become popular over time and can be traced back to the 1850s. Dr John Snow studied mortality rates spatially by mapping cases of cholera in the Soho district of London and thereby identified the source of the outbreaks (Snow, 1855). Mapping the health outcomes singled out the Broad Street water pump, which led to immediate intervention by removing the pump handle. Snow improved the health and quality of life of the population.

This computing research project follows the common development life cycle of analysis, design, implementation and testing. Instead of a published systems development methodology, it adopts prototyping as an alternative approach. After the prototype is analysed, designed and implemented, the researcher can create a revised version by modifying the analysis and design model (Oates, 2005). Designing and building a data visualization tool relies on usability testing with participants (Ingleshwar, 2007).
If the tool is implemented to satisfaction after iterative modification, feedback from the target user group will be required. Surveys help to monitor the strengths and weaknesses of the tool, so that the users’ needs can be adequately met and the desired outcome achieved.

1.1 Research question

The final project will concentrate on answering the research question: “How can verbal autopsy data be visualized on a global scale to provide insights into patterns of causes of death and their associated uncertainty?”

1.2 Objectives

This project concentrates on giving insights into the health of populations by analysing verbal autopsy data. Epidemiological research should help describe, explain, predict and control the mortality within a population (Young, 2004). The main objective is to design and implement a prototype system which allows the interpretation of mortality data on a global and cross-regional scale by plotting geographic coordinates on a map. This is made feasible by the widespread use of mobile VA, the successor to paper-based systems, which facilitates the collection of large volumes of data (WHO, 2016a). By representing location data, mapping health outcomes conveys more than spatial variation in mortality. Linking incidence with the spatial and temporal dimensions could help (health population) scientists discover previously unseen patterns (Cossman et al., 2003). In order to evaluate the different results, the main objective needs to be subdivided into smaller ones so that the implications of the project findings can be analysed. Kirk (2012) illustrates in detail four objectives concerning data visualization which facilitate design decisions:

1. Strive for form and function
The tool should be “aesthetically inviting and functionally effective”; however, functionality should be secured first (Kirk, 2012, p.39).

2. Justifying the selection of everything we do
The tool should fulfil its intended purpose, and every function should be explained and justified. The literature review shows the necessity of the tool and how it was constructed.

3. Creating accessibility through intuitive design
The web-based application should be user friendly and avoid a cluttered interface.

4. Never deceive the receiver
The tool should represent the actual results even though the output might not be desirable.

1.2 Beneficiaries

Acknowledging the global deficit in cause of death registration, health population scientists develop methods to understand why and where people die. These developments form a trivial strategy for addressing health inequalities. The field of health informatics uses information technology to improve health and conduct high quality research. Implementing a web-based data visualization tool for cause of death data could benefit population scientists and even policy makers. The findings from this research paper will be freely available for research purposes to enable a comprehensive view on a global scale. The product will not be limited to population scientists: it is also open to members of the public, raising their awareness of how important the data of the deceased are.


However, if the findings do not show significant patterns, the system will need to be redesigned in future work. The project’s three-month time frame and initial outline limit the number of testing and evaluation periods.

1.3 Outline of methods

1.3.1 Design and Implementation

This project within the studies of health informatics demonstrates a comprehensive understanding of experimental data visualization web applications and their limitations. Technical issues and parameters that were not apparent before using the online tools have been addressed. The researcher illustrates practical knowledge of various functioning technologies based on an informed literature review. As a first step, a prototype was designed using paper as a low-cost resource, a widely used method in user-centred design processes (Snyder, 2004). Before the tabular data could be visualized, the microdata needed to be analysed and altered and then uploaded to the cloud-based data management service. The initial paper design was then mirrored in the maps creation, and custom filters were created by consulting accessible and relevant online databases (UIS.Stat - http://data.uis.unesco.org). The final product was a website in which the map is embedded.

1.3.2 Testing & Evaluation

Google Fusion Tables and Google Maps are interactive by design, so participants were asked to carry out tasks and give feedback on the real user interface. The design approach was therefore iterative, with two evaluation stages in the form of questionnaires. The first stage occurred after completing the paper prototyping. The second stage followed the testing of the tool and the implemented web page. The testing group consisted of several individuals from the computer science department, individuals with a background in health population science, and non-specialists.

1.4 Work Plan

The initial work plan timeline (see Appendix A) changed due to extenuating circumstances. The tasks became more clearly defined with regard to design and implementation. Some new tasks were added after researching more about Google Fusion Tables and its limitations, e.g. adjusting the Google Fusion Layer Wizard with the Google Maps JavaScript API. The proposal stated that Processing would be considered as the programming language. However, after consulting the project supervisor, it was decided that the Google Maps JavaScript API would present fewer difficulties to understand. Figure 1 shows the project timeline and the completed tasks.


Figure 1: Work Plan Individual Project

1.5 Report structure

After the introduction, chapter 2 examines the academic context, explaining why verbal autopsy is significant, and reviews the literature relevant to building the data visualization tool. Chapter 3 describes all necessary activities and tasks for this project, mainly the prototyping process, the data collection, the maps creation and the website implementation. Chapter 4 lists the results in detail, together with the products deriving from the methods. Chapter 5 discusses the results in comparison to the objectives and the usability of the tool. The conclusion, chapter 6, describes the project as a whole and intends to answer the original research question. It sums up the knowledge acquired and suggests a proposal for future work.


2 Context

2.1 Literature review

The literature consulted and reviewed focuses on three major aspects of the project. The first is the nature of VA data and its uncertainty, together with the domain of visualization and analysis of cause of death. The second is the early design process, with a brief description of a form of low-fidelity prototyping. Lastly, this paper reviews methods and tools to map and visualise geographical data and how the evaluation could be conducted.

2.2 Verbal Autopsy Data

Verbal Autopsy (VA) is a research method used to ascertain probable causes of death. In the absence of instruments for efficiently collecting cause of death data for deaths that occur outside health care settings, VA has proven to be as efficient as death certification in high-quality hospitals (Murray et al., 2014). A standardized questionnaire elicits information about the period preceding the death from next of kin or close caregivers. The causes of death have to be carefully determined to ensure that all relevant information for mortality data is recorded and that the certifier does not select some conditions for entry and reject other vital ones (World Health Organization, 2016). International standard death certification is based on recording the single underlying cause of death. Using VA data to establish the disease that initiated the chain of events can be a delicate matter because of the various signs and symptoms which have been reported (Fottrell and Byass, 2010). Assigning a single cause of death when multiple causes are present may therefore result in inadequate estimates, and a level of uncertainty will remain. The WHO has defined a set of rules to select the underlying cause of death, but the single underlying cause alone does not provide sufficient information for deaths that are due to the effect of several etiologically independent conditions (Anderson et al., 2001). The picture of a population’s health can be distorted because comorbidities, especially in children and older age groups, will be ruled out (Soleman et al., 2006). There are different methods and tools of analytical assessment for defining the cause of death (Figure 2).

Figure 2: Verbal autopsy process and factors influencing cause-specific mortality fractions (Soleman et al., 2006)

2.2.1 PCVA

In general, two or more physicians assess VA data and then assign probable causes of death. Physicians need to be trained in assigning causes of death. In low-income settings, the physician is often unfamiliar with the deceased patients, the population health or the local terms referring to the signs, symptoms and conditions of death (Fantahun et al., 2006). Using physicians to analyse the data is a valid approach and the most traditional analytical method, named physician-certified VA (PCVA). However, the method is expensive and inefficient because it relies on review by physicians who often do not have the time and resources. Sometimes feedback on VA data can take several years, leaving relatives or next of kin waiting for answers about the unknown cause of death of their loved ones.

2.2.2 InterVA

InterVA is a probabilistic model which applies Bayes' theorem to interpret community-based VA interviews. It helps to characterize patterns of cause-specific mortality by calculating the probabilities of a given set of causes of death from the input disease indicators. A study by Byass et al. (2015) highlights that InterVA is effective in terms of cost, time, repeatability and analysis of large data sets. VA data covering 54,182 deaths were compared using the InterVA-4 model and PCVA (Byass et al., 2015). It was demonstrated that the VA method with automated interpretation is functionally equivalent to PCVA. Yet neither PCVA nor InterVA represents absolute findings regarding cause of death, because of their related uncertainty.

2.2.3 Uncertainty

The collected VA data addresses the prevalence of diseases within a population. However, VA collects estimates of cause-specific mortality fractions for populations rather than certainty about the cause of death in particular cases (Fottrell and Byass, 2010, p. 43). The Bayesian nature of InterVA is a key reason that there is uncertainty associated with the outputs (Bird and Fottrell, 2014).
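The way a Bayesian model leaves residual uncertainty can be illustrated with a minimal sketch. It is not InterVA itself: the causes, prior probabilities and indicator likelihoods below are invented for illustration, and the sketch assumes that indicators are conditionally independent given the cause.

```python
def posterior(priors, likelihoods, reported):
    """Apply Bayes' theorem: P(cause | indicators) is proportional to
    P(cause) times the product of P(indicator | cause) over reported indicators."""
    unnormalised = {}
    for cause, prior in priors.items():
        p = prior
        for indicator in reported:
            p *= likelihoods[cause][indicator]
        unnormalised[cause] = p
    total = sum(unnormalised.values())
    # Normalise so that the probabilities over all causes sum to 1.
    return {cause: p / total for cause, p in unnormalised.items()}


# Hypothetical numbers for illustration only (not InterVA's probabilities).
priors = {"malaria": 0.3, "pneumonia": 0.2, "other": 0.5}
likelihoods = {
    "malaria": {"fever": 0.9, "cough": 0.2},
    "pneumonia": {"fever": 0.7, "cough": 0.9},
    "other": {"fever": 0.1, "cough": 0.1},
}

result = posterior(priors, likelihoods, reported=["fever", "cough"])
for cause, p in sorted(result.items(), key=lambda item: -item[1]):
    print(f"{cause}: {p:.2f}")
```

In this invented example the most likely cause receives a probability of roughly 0.68 rather than 1, which is the kind of residual uncertainty that a visualization of VA data needs to convey.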
The InterVA output is influenced by the information provided by next of kin and by the limited categories of causes of death in the application.

2.2.4 Mobile VA Data

Projects conducting community-based in-depth verbal investigations have increased awareness of the potential of new information technologies, such as mobile phones, to accelerate data collection and transmission and to strengthen data analysis and dissemination. Projects such as the Millennium Villages Project (MVP) in Ghana in 2008 used mobile applications to gather cause of death data in real time (WHO, 2016a). Collecting cause of death data on mobile devices adds a spatial component, because location data are received through an automated process. VA data obtained using mobile devices that connect via the web to remote providers must ensure comparable diagnostic accuracy.

2.3 Data visualization

Data visualization enables information to be interpreted and processed to its greatest potential by making patterns recognizable. Modern data visualization often comprises dynamic interaction with a user-friendly interface instead of a static display of complex data sets (Wood, 2016a).

2.3.1 Use of colour

The choice of colour is not arbitrary; it is a design decision. It can have a great impact on how the user perceives the visualized data. Brewer et al. (2003) defined a useful set of colour tables to map different types of data. ColorBrewer (colorbrewer2.org) is a diagnostic tool for evaluating colour schemes, including how well similar colours can be differentiated. The quantitative nature of the mortality data and its level of measurement requires a continuous ordered colour scheme (Wood, 2016b) (Figure 3).
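A continuous ordered scheme can be sketched as a linear mapping from a quantitative value to a shade of grey. The function name, the value range and the use of a light grey for the lowest value are illustrative assumptions and are not taken from the project's implementation.

```python
def grey_for_value(value, vmin, vmax, lightest=230, darkest=0):
    """Map a value in [vmin, vmax] linearly to a hex grey; darker means higher."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp so that out-of-range values still receive a valid colour.
    t = (min(max(value, vmin), vmax) - vmin) / (vmax - vmin)
    level = round(lightest + t * (darkest - lightest))
    return f"#{level:02x}{level:02x}{level:02x}"


# Hypothetical mortality rates per 1,000 population, for illustration only.
for rate in (0, 25, 50):
    print(rate, grey_for_value(rate, vmin=0, vmax=50))
```

Because the mapping is monotonic, the visual order of the shades matches the numeric order of the data, which is the property a sequential scheme relies on.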

Figure 3: Level of Measurement

Researchers often prefer grey scales for academic papers because they are inexpensive to print in widespread journals and are highly effective in showing disease prevalence (Samarasundera et al., 2012) or, for this project, the mortality rate. While the absence of colour can represent low levels of mortality, high levels are represented with dark grey or black. Samarasundera et al. further caution against using white to represent a low-level value: within ratio data white can be read as absolute zero, so that colour-coded areas are misinterpreted as lacking measurement or data on the map. According to Wood’s under- and postgraduate lecture in Data Visualization (2016), diverging colour schemes are common and can be equally effective. Diverging schemes emphasize quantitative data progressions from a critical midpoint (or neutral point). Positive change can be distinguished from negative change by using two hues, with two saturated contrasting colours for the extremes of the data. Sequential schemes are preferable if the range of the data is not as relevant. Diverging schemes, on the other hand, accentuate the mean of the collected data (Meyer and Greenberg, 1988). The combination of red and green as a colour scheme should be avoided because some users have colour vision deficiency, or colour blindness. Even though this impairment concentrates on 10% of the male population, 1% of the female population can be affected as well (Ware, 2013). How distinguishable colours are also depends on the size of the colour-coded area. The colour sequences which will always be perceived are, as mentioned, grey scales but also yellow-blue dimensions (Meyer and Greenberg, 1988b).

2.4 Paper prototyping

Prototyping is a creative process that occurs at early stages of the design process. There are different methods to design the user interface and functions of the data visualisation tool for this project. The prototype will be analysed, designed and afterwards implemented with the intent of later modifying it and creating a revised system prototype (Oates, 2005). Paper prototyping enables qualitative usability feedback: representative users perform realistic tasks by interacting with a paper version of the interface that is manipulated by the researcher. The users often do not need instructions on how the interface is intended to work (Snyder, 2004). Paper is a resource-friendly, low-fidelity material that serves as a base for (incomplete) sketches, enabling broad concepts to be rapidly tested and rejected. In comparison to high-fidelity prototyping, paper prototyping is not close to the final product and is not as detailed or functional.


The first step includes determining the appropriate user for testing and defining typical tasks the user will try to do. The desired system will be a web-based data visualization tool so that all necessary windows and graphical control elements have to be hand -sketched to perform the task (Snyder, 2004). For more flexibility, every representation of a computer screen and subsequent screen should be drawn separately. The researcher will represent the “human computer” which is activated by the user identifying different tasks. Every click command on the fictional browser triggers the replacement of the current screen with the subsequent. The “human computer” replaces the hand sketched screen with another screen because the user followed a link, tried to make an entry or pressed button (Linek and Tochtermann, 2015). Paper prototyping is popular due to the inexpensive nature of the material and the way it can be tailored flexibly to explore immediate design ideas. Another advantage is the increase of creativity during crafting of the experimentation and communicating with the users. Often users feel more comfortable criticizing the hand crafted system. Disadvantages are also associated with paper prototyping especially concerning validity and bias (Snyder, 2003). Developers often realize that they have made limited assumptions how every user will perceive the prototype. Snyder elaborates that users might criticize how paper can simulate real usability problems and the lack substantial evidence. Bias can reach from unrealistic test settings, impressions of the user interface, to filtering and prioritization of information received from the prototype. An Austrian study from the Center for Usability Research & Engineering (CURE) argues that low-fidelity computer based prototyping can have the same quality of user statements as paper based prototyping (Sefelin et al., 2003). 
A paper prototype is preferable when the design team includes members who lack sufficient software skills (Sefelin et al., 2003b). It can also encourage team building, although this was not practised during this research.

2.5 Google Fusion Tables

Google Fusion Tables is an experimental data visualisation web application which has become popular for sharing, managing and collecting data. Fusion Tables retrieves geocode coordinates from input data by using a combination of free Google services to represent spatial data (Shepard, 2014). Shepard further reasons that Google Fusion Tables is an uncomplicated way for web developers with intermediate skills to create a robust data management solution (Shepard, 2014b). To perform simple database queries, the input data has to be structured in tables before uploading to Google Fusion. Google Forms and Google Spreadsheets are initially created with data containing text fields with latitude and longitude information. Alternatively, the data are generated in Excel and formatted as CSV files. After the spreadsheet is imported into Google Fusion, the application allows the mapping of data points and automated synchronization in Google Drive. The visualized data can be shared by changing the permission of new tables, which are private by default, to a public view. After visualizing the desired data, the map properties can be customized by altering the script code. Kathryn Hurley, Developer Programs Engineer for Fusion Tables, provided a useful guide on how to make a map with Google Fusion Tables. Her guide forms the necessary base for this individual project (Hurley, 2011) (see Chapter 3 Methods).

2.6 Google Maps JavaScript API

The Google Maps API allows writing an application which directly embeds the Fusion Table content on a web page with JavaScript. While Google Fusion Tables present a highly interactive image archive, the API overlays objects from multiple tables to create a more complex map. For example, data journalists made a collection of 16 Google Fusion border files available online which can be overlaid on Google Maps to show various regions and administrative districts (http://datadrivenjournalism.net/resources/borders_and_boundaries_16_google_fusion_border_files_for_you_to_use).

2.6.1 HealthMap

HealthMap (www.healthmap.org) is an example of an established web-accessible data visualization system which addresses emerging disease threats on a global scale. The system is free, automated and real-time. A group of researchers, epidemiologists and software developers at Boston’s Children’s University Informatics Program created HealthMap utilizing informal online sources from the WHO, public health officials, nongovernmental organizations, etc. (Alexander, 2014). Their goal was to monitor and update disease outbreaks automatically and in real time by aggregating and mapping outbreak data with location data. HealthMap relies on publicly available open products like Google Maps and the Google Map API for PHP to create this free application. The map markers represent disease alerts over a period of time which can be specified. The marker colours derive from the Red-Yellow-Blue colour model and indicate the importance of the disease or the volume of the alerts. The marker size indicates whether the alert is at a country or local level. This project adopts the concept: the colour scheme displays the different causes of death and the marker size the level of uncertainty. Another function on HealthMap is the time series, which shows the number of alerts for a specific disease in the country of the user's current location within the past 12 months. This feature could be adapted for this project. For the verbal autopsy map, the researcher wants to pinpoint the cause of death in a specific year.

2.7 Website development

Dreamweaver CS5 supports building pages visually through buttons, dialog boxes and panels (McFarland, 2010). In his manual, McFarland explains how to write simple HTML pages and offers templates which are immediately fully interactive. With Dreamweaver, the GUI design already exists and only needs to be adjusted slightly. The book further includes W3Schools (https://www.w3schools.com/) tutorials so that the web developer can become acquainted with the markup language, application tools and code needed to build a functional web page.

2.8 Evaluation

The objectives defined prior to developing the visualization tool must be evaluated to state whether the project was successful or not. The researcher and the user can have different views of the product, and the evaluation process can identify its specific weaknesses. Researchers from various disciplines use survey questionnaires to measure the input and output of their projects (Lavrakas, 2008). Different methods exist to evaluate a functioning web-based system. The CoD data are primarily qualitative. The project research will help to gain an understanding of data collection in the INDEPTH countries and of projecting the information onto the customized map. Evaluating qualitative data often uses a mixture of unstructured or semi-structured techniques (Henderson and Segal, 2013). The evaluation will include open-ended survey responses from the interviews. However, the researcher also wants to evaluate the user’s attitude towards the website. Therefore, a multi-method approach that also draws on quantitative data is possible. Questionnaires in quantitative analysis often follow the format of Likert scales because they are effective and easy to construct. The respondents will be able to indicate their level of agreement or disagreement while testing the tool. In conclusion, the user’s interest in the product can be determined and the product, if necessary, altered to the user’s satisfaction before mass production.


3 Methods

Chapter 3 describes the methods chosen for the design, implementation and evaluation of the web-based verbal autopsy visualization tool.

3.1 Paper prototype

The paper prototyping procedure is based on the multi-method approach for usability testing by Linek and Tochtermann, two German experts in usability and media informatics (Linek and Tochtermann, 2015). The focus lies on paper prototyping, while the two supplemental methods (advanced scribbling and a handicraft task) are omitted due to the scope of this project and the initial project proposal. The study was conducted at two different locations, mostly at the City University London giCentre. The sample comprised three persons (two males and one female, aged between 31 and 50), and each session lasted ten minutes on average. The test material consisted of a prepared map and paper snippets of different functions. The test session comprises the system review, carried out by interacting with the sketches, and the semi-structured interview (Figure 4 and 5).

Figure 4: Paper prototyping Part 1

Figure 5: Paper Prototyping Part 2 (labelled elements: Task 1 to Task 8, info box, legend)

During paper prototyping, the researcher acts as the “human computer” and switches between screens with every click on an active field. The page presented first was the home screen with a world map. The participants worked through eight usability tasks without being told how many tasks they had identified correctly. The main tasks included:

Task 1: Zooming (+, -)
Task 2: Comparing mortality types by different factors (Country, Region, Education, GDP, Gender)
Task 3: Adjusting the time period
Task 4: Find address (a text field follows where an address can be entered manually)
Task 5: My location (the centre of the map readjusts to the user's current coordinates)
Task 6: Engaging with the language sign
Task 7: Engaging with the settings sign
Task 8: Interacting with the circles on the map (opening individual info boxes)

Afterwards, the researcher requested feedback through a quality survey.


3.2 Website Implementation

Before creating the web-based application, a local server environment was installed that provides up-to-date versions of Apache, PHP5 and MySQL. WampServer, used in this case, is the corresponding software package for Microsoft Windows systems. The website content was developed with Adobe Dreamweaver and edited in Notepad++. Dreamweaver includes pre-made CSS templates which are easy to manipulate and which follow the HTML5 and CSS3 standards.

3.3 Data Collection

The core data comprise CoD data gathered from online databases that allow access to demographic information; these data are then manipulated for the visualization.

3.3.1 INDEPTH Data Repository

The International Network for the Demographic Evaluation of Populations and Their Health (INDEPTH) Data Repository provides the demographic data which are manipulated for the visualization tool. By providing anonymised longitudinal microdata of low- and middle-income countries, INDEPTH is a valuable resource for analysing different aspects of health, e.g. causes of death. The web repository is accessible for public use once the registration process has been completed (http://www.indepth-ishare.org/index.php/about). The dataset used for the visualization tool contains CoD data based on VA interviews, released in 2014 from 14 countries within Africa, Asia and Oceania (Sankoh et al., 2014). The study type is demographic surveillance, and the collected data encompass cause-specific mortality information from 1992 to 2012. Vital events were registered by trained lay field-workers, who recorded any deaths and followed them up with VA interviews. The provided data had to be altered before uploading the Excel spreadsheet to Google Fusion. The microdata file was downloaded as .csv, so the data needed to be parsed. The data file CODA_2013_v7_Anonymised contains 17 variables (Appendix B, Figure 32). The INDEPTH Centre Code provides information about the site where the data were collected. However, Google Fusion Tables does not recognize this alphanumeric value as a location. Therefore, two columns, "country" and "city", were inserted, e.g. site "ZA031" = country "South Africa", city "Africa Centre" (Appendix B, Figure 33). Even though Fusion Tables can automatically geocode data from the name of the country, latitude and longitude need to be inserted. To avoid overlapping multiple subjects on one map point, the longitude and latitude data are jittered with the random number function in Excel, using the formula Score + (RAND()-1)/10. As a first step, a new column was added for the modified variable.
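The jitter formula quoted above can also be expressed outside Excel. The following JavaScript is a minimal sketch rather than the spreadsheet used in the project: the function name jitter, the injectable rng parameter and the sample coordinates are assumptions made for illustration, while the arithmetic mirrors Score + (RAND()-1)/10.

```javascript
// Hypothetical helper mirroring the spreadsheet formula Score + (RAND() - 1) / 10.
// rng is injectable so that the behaviour can be checked deterministically.
function jitter(score, rng = Math.random) {
  // rng() lies in [0, 1), so the offset lies in [-0.1, 0).
  return score + (rng() - 1) / 10;
}

// Illustrative coordinates only; they are not taken from the dataset.
const site = { lat: 10.89, lng: -1.09 };
const jittered = { lat: jitter(site.lat), lng: jitter(site.lng) };
console.log(jittered);
```

Because RAND()-1 is never positive, this formula moves every point towards smaller coordinate values by up to 0.1 degrees. A symmetric spread would use RAND()-0.5 instead; that would be a design choice, not something the cited instructions are known to prescribe.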
Then, the variable Score is substituted with the appropriate cells in the COD data, where column E = latitude and column F = longitude. The random number represents the jitter. Following the instructions of Nikki Marinsek, Dynamical Neuroscience Graduate Student, the location data are spread out and ready to use (Marinsek, 2014).

3.3.2 UNESCO Institute for Statistics (UIS)

After preparing the mortality data, they can be merged with data from different online databases so that the data can be explored in detail. The idea is to answer questions like: "How do GDP and mortality connect?" or "If you are illiterate, is the risk of contracting specific diseases higher?" The UN repository UIS provides cross-nationally comparable statistics to data experts and data browsers.


Therefore, the UIS GDP data and education statistics serve as the data source for analysing the cause-specific mortality data. The collected information consists of data from official administrative sources at the national level (UNESCO Institute For Statistics, 2016). The INDEPTH data cover the time frame from 1992 to 2012, so the same years were selected for GDP and literacy. GDP is measured in current US Dollars (US$) and exported from UIS Stats as a .csv file. The literacy data, on the other hand, concentrate on adult literacy (population 15+ years, both sexes). Different symbols are used within the data set. Empty cells are displayed as (..), because no data was available for those years. The symbol (+) indicates a national estimation, while (‡) indicates a UIS estimation due to incomplete country coverage. When the data are imported into Google Fusion, these symbols are suppressed.

3.3.3 KML files

Google Fusion Tables can incorporate Keyhole Markup Language (.kml) files to specify geospatial information. For the project, the researcher wants to outline the 14 INDEPTH countries and the sites where the data originated. In order to visualize this specific information, country and city and/or area boundaries need to be uploaded to Google Fusion. The geometry column is then automatically detected as a column of type Location. Simon Rogers, a data journalist, provided a collection of 16 Google Fusion border files (Rogers, 2013a). The file “World borders (inc South Sudan)” on Rogers’ website (https://simonrogers.net/2013/01/28/borders-and-boundaries16-google-fusion-border-files-for-you-to-use/) was merged with the anonymised CoD data (Figure 6).

Figure 6: World borders (inc South Sudan) (Rogers, 2013a) displayed in Google Fusion

Outlining the city borders was a more complex task due to the lack of available online .kml data. As Rogers describes on his website, polygons can be created via Google Earth. The researcher located the sites in Google Earth and then drew polygons following the border lines. While e.g. Nairobi (Kenya) has clear border lines, the site Bandafassi (Senegal) is not shown on regular maps, so the nearest larger town, Kedougou, is visualized for the purposes of the project (Table 1). After the polygons were drawn, they were saved in Google Earth and exported as a .kml file so that they could be merged with the other acquired data.


Country         Demographic Surveillance Site   KML
Bangladesh      ICDDR-B : Bandarban             Bandarban
Bangladesh      ICDDR-B : AMK                   Abhoynagar, Mirsarai, Kamalapur
Bangladesh      ICDDR-B : Matlab                Matlab
Burkina Faso    Nouna                           Boucle du Moukoun
Burkina Faso    Ouagadougou                     Ouagadougou
Cote d'Ivoire   Taabo                           Lagunes
Ethiopia        Kilite Awlaelo                  Tigray
Ghana           Navrongo                        Upper East: Navrongo, Bolgatanga, Garu, Bawku
Ghana           Dodowa                          Greater Accra
India           Vadu                            Constanta
India           Ballabgarh                      Haryana
Indonesia       Purworejo                       Purworejo: Regency
Kenya           Nairobi                         Nairobi
Kenya           Kilifi                          Kilifi
Kenya           Kisumu                          Kisumu
Malawi          Karonga                         Karonga
Senegal         Bandafassi                      Kedougu
South Africa    Africa Centre                   KwaZulu-Natal
South Africa    Agincourt                       Mpumalanga
The Gambia      Farafenni                       North Bank Division

Table 1: KML City Boundaries

The next step was to draw a radius around the map markers of the COD data to visualize their uncertainty. Free Map Tools (https://www.freemaptools.com/radius-around-point.htm), a free online resource, helps to overlay multiple mark-up elements on a map. The idea is that the more certain the cause of death, the larger the radius and hence the circle of the marker. For example, ID: 6, Country: Ghana, City: Navrongo, Cause: 99 Indeterminate has a likelihood (lik) of 0.01%; we therefore assigned the radius (in km) by multiplying lik by 10 and moving the decimal point one place to the left (0.1 km). The weighted likelihood (wt) was ignored for the data demonstration. The map elements can be placed on the map individually, with the radius displayed around the longitude and latitude of each point. However, the COD data consist of 65536 lines, which would make this impractical. Therefore, the CSV option was used, which uploads points to the map in bulk when each line follows the convention latitude, longitude, radius(km), label(1 text character / optional) (Free Map Tools, 2007). A separate spreadsheet containing this information was drafted (Table 2) and uploaded to the website. After the multiple radii have been created on the map, the tool allows all radii to be exported as Google Earth KML output.
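The bulk-upload convention described above can be sketched as a small formatting step. This JavaScript is illustrative only: the record fields (lat, lng, lik), the helper names radiusKm and toCsvRow, the three-decimal rounding and the sample values are assumptions. The radius rule is taken from the worked example, in which a lik of 0.01 yields 0.1 km.

```javascript
// Sketch: build bulk-upload rows in the order
// latitude,longitude,radius(km),label used by the CSV option.
function radiusKm(lik) {
  // Assumed rule, following the worked example: lik 0.01 -> 0.1 km.
  return lik * 10;
}

function toCsvRow(record, label = "") {
  const fields = [record.lat, record.lng, radiusKm(record.lik).toFixed(3)];
  // The label is optional and limited to one text character.
  if (label) fields.push(label.charAt(0));
  return fields.join(",");
}

// Illustrative records; the coordinates are not taken from the dataset.
const rows = [
  { lat: 10.885, lng: -1.092, lik: 0.01 },
  { lat: 10.901, lng: -1.105, lik: 0.85 },
].map((record) => toCsvRow(record, "N"));

console.log(rows.join("\n"));
```

Generating the rows programmatically keeps the radius rule in one place, so a different scaling could be tried without editing tens of thousands of spreadsheet cells.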


Table 2: CSV Upload Free Map Tools

3.4 Creating Google Fusion Map

Three different .csv files and three .kml files were required for the VA Map to identify the geographical information appropriately. First, the UIS GDP file and the adult literacy file are uploaded separately, and then both are merged by a common criterion, an assigned ID number. At this point, several small red placemarks appear scattered across the map. The newly created file is then merged with the world borders inc. South Sudan file. As a second step, the city boundary file was uploaded and visualized on the map. The third and main file, CODA_2013_v7_Anonymised, was altered by adding description and geometry. As described in section 3.3.3 KML files, the location data of the COD data were assigned geometry information that can be merged with the original file by their kml_id. After the three different tables have been designed, the default info window and feature styles can be altered. We assigned a different color to each of the 60 causes of death. Colorbrewer offers color schemes that are colorblind safe. The nature of the COD data is qualitative; however, there are no color schemes that match these criteria for the number of data classes. Therefore, a diverging 11-class RdYlBu color scheme was chosen. The colors for this scheme as a JavaScript array are: ['#a50026','#d73027','#f46d43','#fdae61','#fee090','#ffffbf','#e0f3f8','#abd9e9','#74add1','#4575b4','#313695']. The reference device profile is sRGB IEC61966-2.1 on the Windows Colour System.
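Since eleven colours cannot distinguish 60 causes on their own, the report varies the opacity within each colour class. One possible way to generate such an assignment is sketched below. The function buildPalette, the even split of causes across classes and the opacity steps (1.0, 0.85, 0.70, ...) are assumptions for illustration; the actual values used for the map are those listed in the appendix, not these.

```javascript
// The 11-class RdYlBu scheme quoted in the text.
const RD_YL_BU_11 = [
  "#a50026", "#d73027", "#f46d43", "#fdae61", "#fee090", "#ffffbf",
  "#e0f3f8", "#abd9e9", "#74add1", "#4575b4", "#313695",
];

// Assign each of `count` causes a base colour and an opacity step so that
// every (colour, opacity) pair is used only once.
function buildPalette(count, colours = RD_YL_BU_11) {
  const usedPerClass = new Array(colours.length).fill(0);
  const palette = [];
  for (let i = 0; i < count; i++) {
    // Spread the causes as evenly as possible over the colour classes.
    const cls = Math.floor((i * colours.length) / count);
    const step = usedPerClass[cls]++;
    // Assumed opacity ladder: each step is 0.15 more transparent.
    const opacity = Math.round((1 - step * 0.15) * 100) / 100;
    palette.push({ colour: colours[cls], opacity });
  }
  return palette;
}

const palette = buildPalette(60);
console.log(palette[0], palette[59]);
```

With 60 causes and 11 classes, this split gives every class either five or six opacity steps.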

Due to the number of causes, we altered the opacity of each class color 6-7 times to achieve greater differentiation. Making the tables accessible is the last step before layering the different tables. By default, every Google Fusion Table created is private. To call the Fusion Tables data, the visibility option was changed to "Anyone with the link".

3.4.1 Fusion Tables Layer Wizard

The Fusion Tables Layer Wizard automates the creation of multiple layers with Google Fusion Tables. Map layers are added by embedding the individual Fusion Tables links. Each layer can contain one search function that can be either select-based or text-based. Three select-based search options can be enabled, with the labels referring to the columns of the CoD data. These options allow the map's users to query the results and refine the data shown on the map. The wizard customizes the map size, zoom and base style of the map as needed. The wizard then creates HTML code that uses the Google Maps JavaScript API v3 (Appendix C).

3.5 Customizing visualization with Google Maps JavaScript API

After using the wizard, the embeddable HTML code was altered with the Google Maps JavaScript API v3. We added a legend to the map explaining the colours of the map markers. We have 60 causes of death and 11 classes in the color scheme (Appendix C, Table 6). Each class was divided into 5 or 6 hues by changing the opacity, and these were assigned to the causes of death. The Maps API provided code that could potentially introduce a second select-based search on the third map layer. The whereClause (see Chapter 4 Results) was therefore altered and the code was run.

3.6 Evaluation

For the evaluation, two questionnaires were drafted. The first survey relates to the paper prototyping and the other provides feedback on the implemented website (Table 3 and 4). The first questionnaire, with 4 questions, consisted purely of open questions to deliberately engage the participant in the creative design process. The future user’s needs and expectations would be reflected in the development of the web-based application. The second questionnaire balanced scaled questions with a 5-point response system, summarizing the progress, with open questions which keep the user thinking. This questionnaire has nine questions in total. After the survey responses have been collected, they will be visualized with the visualization tools in Excel.

Questions Paper Prototyping

Question 1: What do you think this page is made for? Would you stay at the page? Why (not)?
Question 2: What is your opinion about this page? What do you recognize?
Question 3: What can you make with this page? What are the single elements made for? Optional: What do you like? What you don't like? What is confusing? Do you miss something?
Question 4: Any further comments?

Table 3: Questionnaire paper prototyping


QUALITY Scale Survey: VA Map Questionnaire

For each item identified below, circle the number to the right that best fits your judgment of its quality. Use the rating scale to select the quality number.

Scale: 1 = Not at all, 2 = Slightly, 3 = Moderately, 4 = Very, 5 = Extremely

1. How good was your knowledge of cause-specific mortality data before being introduced to the verbal autopsy tool? (1 2 3 4 5)
2. How accurate do you think the information is on the website? (1 2 3 4 5)
3. How easy is it to understand the information on the website? (1 2 3 4 5)
4. Did the visualization tool give you any new insights?
5. What new knowledge did you gain?
6. How visually appealing is our website? (1 2 3 4 5)
7. Overall, how well does our website meet your needs? (1 2 3 4 5)
8. How likely is that you would recommend this website to a friend or colleague? (1 2 3 4 5)
9. Do you have any other comments about how we can improve our website? (Yes / No)

Table 4: Quality Scale Survey VA Map


4 Results

4.1 Paper prototyping evaluation

The three participants answered the questionnaire during the paper prototyping session. Each session took 7 to 10 minutes on average, including answering the questionnaire. The participants’ answers are summarized below:

• All participants recognized the page as an interactive map. However, they noted the absence of a container or window around the map and the action buttons, which would have given the illusion of a web page.
• On average, 6 tasks (Task 1-2 and 4-7) were recognized immediately and intuitively on the paper prototype (see chapter 3, Figure 5 for reference).
• 1 participant liked the design of the website. 1 other participant liked the proportion of the map: the map should occupy most of the website.
• 2 out of 3 participants thought task 3 was confusing. It was not clear that it was a drag-and-drop element along the line which represents time in years. They suggested a search filter instead of a timeline slider.
• 1 participant did not click on the map markers because it was not obvious that they contained info boxes. The participant only interacted with the buttons.
• None of the three testers recognized the function of the legend, but they tried to interact with it.
• As further comments, 2 participants suggested fewer buttons because the page looked cluttered and unorganized. 1 participant asked what the settings button included and whether the button could be removed.

4.2 Data Collection

While collecting and sorting the data for the map creation with Google Fusion Tables, two Excel spreadsheet files were produced. CODA_2013_v7_Anonymised_jittered_location.xlsm contains the causes of death, their likelihood, jittered location data and geometry information for the Google Fusion Tables map. The Adult Literacy and GDP files were merged into one file containing the literacy rate and the GDP in USD (DEMO_DS_Adult_Literacy.xlsm).
Input:
• CODA_2013_v7_Anonymised.csv (Appendix B, Figure 32)
• EDULIT_DS_09012017045046355.csv: Adult Literacy INDEPTH Countries (Appendix B, Figure 34)
• DEMO_DS_30012017101402059.csv: GDP INDEPTH Countries (Appendix B, Figure 35)

Output:
• 153MB CODA_2013_v7_Anonymised_jittered_location_final.xlsm (Appendix B, Figure 33)
• DEMO_LITERACY_UIS_DATA.xlsm (Appendix B, Figure 36)


4.3 Maps Creation

4.3.1 Layer 0

The first layer consists of the World Boundaries (incl. South Sudan) Google Fusion border file that was merged with DEMO_LITERACY_UIS_DATA.xlsm (Figure 7). The border width was increased to 3px and the border colour set to a grey shade (#666666). The fill colour has been made nearly transparent so that the terrain details can still be read.

Figure 7: Merge of DEMO_LITERACY_UIS_DATA and World Boundaries incl. South Sudan

While merging the Excel data (UIS GDP and Literacy) with World Boundaries in Google Fusion, two country shapes were not displayed: Cote d'Ivoire and Vietnam were excluded from the map. As a result, the data could not be filtered by these two countries. The name of the country is the common criterion for merging both tables. However, the World Boundaries file contained the English name of Cote d'Ivoire (Ivory Coast), so the rows did not merge (Appendix B, Figure ). The UIS data spelt Vietnam as "Viet Nam", so it was not recognized in Google Fusion either.
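A mismatch of this kind can be caught before uploading by harmonising the country names in both tables. The sketch below assumes an alias table built from the two mismatches reported above; the names COUNTRY_ALIASES, normaliseCountry and unmatched are hypothetical and not part of the project's workflow.

```javascript
// Alias table built from the two reported mismatches (an assumption, not an
// exhaustive list). Keys are lower-cased spellings; values are the spellings
// chosen here as canonical.
const COUNTRY_ALIASES = new Map([
  ["ivory coast", "Cote d'Ivoire"],
  ["viet nam", "Vietnam"],
]);

function normaliseCountry(name) {
  const trimmed = name.trim();
  return COUNTRY_ALIASES.get(trimmed.toLowerCase()) || trimmed;
}

// List the names from the left table that have no partner in the right table
// after normalisation, i.e. the rows a merge on country name would drop.
function unmatched(leftNames, rightNames) {
  const right = new Set(rightNames.map(normaliseCountry));
  return leftNames.map(normaliseCountry).filter((name) => !right.has(name));
}

const uisNames = ["Cote d'Ivoire", "Viet Nam", "Ghana"];
const borderNames = ["Ivory Coast", "Vietnam", "Ghana"];
console.log(unmatched(uisNames, borderNames)); // prints []
```

Running such a check on the key column of both tables before merging shows which spellings still need an alias.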

4.3.2 Layer 1

After creating the polygons of the INDEPTH sites in Google Earth, the file INDEPTH sites.kml was produced (Figure 8). That file was then uploaded to Google Fusion Tables (Figure 9) and serves as the second layer on the map. The border colour and width follow the styling of Layer 0.


Figure 8: KML demographic surveillance site boundaries Google Earth

Figure 9: KML demographic surveillance site boundaries displayed in Google Fusion Tables

4.3.3 Layer 2

Uploading the main data file CODA_2013_v7_Anonymised_jittered_location to Google Fusion Tables displayed the causes of death data on the map (Figure 10).


Figure 10: Causes of death data markers displayed in Google Fusion Tables

After selecting the geometry in map configurations, the polygons are shown. The size of the polygons refers to the uncertainty of each cause of death (Figure 11). The fill colour of the polygons ranges from red to blue as described in chapter 3. Google Fusion allows you to apply icons based on data. We selected the column “causen” (numeric value of each cause of death) containing data from CODA_2013_v7_Anonymised. So the background colours were divided into 61 buckets where every single bucket represented a different cause of death (Figure 12).

Figure 11: Cause of death polygons in India displayed before changing feature styles in Google Fusion Tables


Figure 12: Cause of death polygons in Bangladesh displayed in Google Fusion Tables

4.3.3.1 Markers

The map containing the CoD data shows excessive overlapping of the markers due to multiple clusters in a region. The different sizes of the markers are visible, yet the user would need to zoom in very closely to access every marker and open every info window. Alternatively, more filters need to be implemented to reduce the information on the map, or less data has to be made available for analysis. While visualizing the data points on the map, a multitude of markers happened to lie outside the demographic surveillance site. At least two different reasons are responsible:

• The markers extend beyond the site boundaries because their radius determines the marker size.
• The random noise added prevents overplotting to some extent (specifically on the designed map). However, the formula jittering the data does not take the site boundaries into account. Some markers are accidentally positioned in inshore waters or surrounding oceans.

4.3.4 Fusion Tables Layer Wizard

After creating the layers in Google Fusion, we uploaded them to the layer wizard to build a multi-layered map. We needed either the link of each of the three publicly accessible maps or their Table ID. We added the layers in the sequence of their creation (Figure 13):

1. Merge of DEMO_LITERACY_UIS_DATA and World Boundaries incl. South Sudan
   Encrypted Table ID: 1YALUlxvmOVFzqDOLldtxiVuBSbvpEFBrPw1bGu4z
2. INDEPTH surveillance sites
   Encrypted Table ID: 1tm6tas4tRuAy9A4sphVvpbQI4nFkuXuYSx1DqzBN
3. CODA_2013_v7_Anonymised_jittered_location_corrected_final
   Encrypted Table ID: 1xFpREagaatky3vHFWC5ndMs6g5uoNFsxyCt-_vwY

Then we added a select-based search feature for each layer.


Figure 13: Creating multiple layers with Fusion Tables Layer Wizard

4.3.5 Legend

After creating the map, a legend explaining what each marker represents was placed on the map (Figure 14). The JavaScript API was used to style the legend. However, due to the number of causes of death, we used the numeric cause of death codes and grouped several causes within one hue of colour.


Figure 14: Legend on layered map

We grouped the causes of death in descending order of their numerical value and assigned a colour (Appendix, Table ).

4.4 Website Implementation

The local WAMP environment was successfully installed on the Windows 10 operating system that was provided for this project. The header contained information about the template used for the website, with a link to the CSS style sheet (Figure 15). A time value presenting the current date was placed on the website.

Figure 15: Website header


The body contained a link to another element on the website (Find out more about…) (Figure 16), the customized legend (Figure 17) and the definition of the Google Maps JavaScript API (Figure 18).

Figure 16: Link to website element

Figure 17: Legend style

Figure 18: Defining the Google Maps API JavaScript

Then the variables for map, layer and legend were declared (Figure 19) and the function for the base map was initialized in the script (Figure 20). The map was centred on an arbitrarily chosen INDEPTH site (Ballabgarh, India), and a zoom level and map type were set.


Figure 19: Declaration of variables

Figure 20: Initialize function

Each layer function queries the specific Google Fusion Tables column with the corresponding Table ID (Figure 21).


Figure 21: Layer function

Then each layer encompasses a changeMap function to search the layer and filter the information from the Fusion Tables (Figure 22).

Figure 22: Select based search function

After loading the functioning code, the website was ready to use. However, it has not been launched or made publicly accessible (Figure 23 and 24).


Figure 23: VA Map v1

Figure 24: VA Map v2

4.5 Evaluation

4.5.1 Testing

After creating the interactive map and implementing the website, the application was tested in the City University giCentre under the supervision of Dr Jon Bird. We decided to change two aspects:

• reducing the overplotting of the markers in the demographic surveillance sites
• adding a filter for the time period of each cause of death at specific sites.


4.5.1.1 Solution for overplotting: successful

The goal is to distribute the markers evenly within a polygon boundary. In geometry, if the length and width of a regular shape are known, its surface area and circumference can be calculated. However, the shapes of the demographic site polygons are irregular. Therefore, we overlay them with known shapes such as rectangles (Figure 25). The rectangle (x, y, width, height) would be identified as a new kml boundary, and a marker point (px, py) would lie within the boundary if (px > x && px < x + w) && (py > y && py < y + h).

Figure 25: Geometrical concept of dispersing the data randomly within a polygon (rectangle with corners (x,y), (x+w,y), (x,y+h) and (x+w,y+h), containing the point (px,py))
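The containment condition stated above translates directly into a small predicate. The function name insideRectangle and the sample rectangle are illustrative assumptions; the comparison itself is the one given in the text.

```javascript
// Returns true when the point (px, py) lies strictly inside the rectangle
// with origin (x, y), width w and height h.
function insideRectangle(px, py, rect) {
  const { x, y, w, h } = rect;
  return px > x && px < x + w && py > y && py < y + h;
}

// Illustrative rectangle; not a real site boundary.
const rect = { x: 0, y: 0, w: 2, h: 1 };
console.log(insideRectangle(1, 0.5, rect)); // true
console.log(insideRectangle(3, 0.5, rect)); // false
```

Because the comparisons are strict, points lying exactly on an edge count as outside.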

In Google Earth, we manually drew multiple rectangle polygons covering most of the surface area within each site (Figure 26). Some rectangles cover more surface area than others, so we had to apportion the data of each site among them. Therefore, we created a pivot table in Excel showing the number of causes in each country and, as a subcategory, in each surveillance site (Figure 27). Then we spread the count of causes across the rectangles. For example, Nouna in Burkina Faso contains 3782 causes and we drew 4 rectangles. We estimated that rectangles 1 and 2 each contain 1500 markers, while rectangle 3 contains 700 and the last rectangle 83 markers (Table 5).

Figure 26: Rectangle polygons within KML boundary of Burkina Faso created in Google Earth


Figure 27: Pivot table count of causes of death in INDEPTH countries

Table 5: Pivot table for rectangle polygons

We jittered the coordinates according to the width and height of each rectangle and inserted them in Excel as longitude and latitude with the formula: rand()*(b-a)+a.


This formula constrains the distribution of the markers to the rectangle shape by generating a random real number between a and b. For the purposes of our project we inserted the following values into the formula, where each corner stands for its relevant coordinate value:

For latitude: rand()*((x+w,y)-(x,y))+(x,y)

For longitude: rand()*((x+w,y+h)-(x+w,y))+(x+w,y)
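The same jitter can be expressed outside of Excel. The following is a minimal sketch rather than the project's implementation: the bounds object with south, north, west and east fields is an assumed representation of one rectangle, and the sample coordinates are illustrative only.

```javascript
// Returns a random real number in [a, b), the JavaScript counterpart of the
// spreadsheet formula rand()*(b-a)+a.
function randomBetween(a, b, rand = Math.random) {
  return rand() * (b - a) + a;
}

// Draws one jittered marker position inside a rectangle described by its
// latitude and longitude bounds (hypothetical field names).
function jitterInBounds(bounds, rand = Math.random) {
  return {
    lat: randomBetween(bounds.south, bounds.north, rand),
    lng: randomBetween(bounds.west, bounds.east, rand),
  };
}

// Illustrative values only, not taken from the project data.
const bounds = { south: 12.6, north: 12.9, west: -4.0, east: -3.6 };
console.log(jitterInBounds(bounds));
```

Passing the random source as a parameter makes the jitter reproducible when a seeded generator is supplied instead of Math.random.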

Lastly, the coordinates were created in the Excel file CODA_2013_v7_Anonymised and the geometry was added before uploading the file to Google Fusion Tables (Figures 28 and 29).

Figure 28: Distributed Markers in India

Figure 29: Geometrical view: Markers jittered according to RAND() formula and rectangle kml boundaries


4.5.1.2 Solution for searching layer by two columns: Unsuccessful

The search function in the Fusion Tables Layer Wizard uses a JavaScript function to query one of the columns in the Fusion Table. To query an additional column, we altered the where clause in line 150 of the code. Here is an excerpt of the JavaScript code:

    function changeMap_2() {
      var whereClause;
      // Escape single quotes in the selected drop-down value.
      var searchString = document.getElementById('search-string_2').value.replace(/'/g, "\\'");
      // Only build a filter when a real option has been selected.
      if (searchString != '--Select--') {
        whereClause = "'year' = '" + searchString + "'";
      }
      layer_2.setOptions({
        query: {
          select: "col23",
          from: "1xFpREagaatky3vHFWC5ndMs6g5uoNFsxyCt-_vwY",
          where: whereClause
        }
      });
    }

However, the search functionality broke down completely: points that should have been on the map were not shown, or the map disappeared from the webpage. Google Fusion Tables does not allow an OR in the where clause, which ruled out the initial idea of a combined filter search. An alternative was to create a fourth layer, make two separate queries and combine the results. Unfortunately, when one filter was applied the map did not change because the second layer covered the results.
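One way the two-column filter might have been approached is to combine the selected values into a single where clause joined with AND. The sketch below only builds the query string and was not tested against a Fusion Tables layer. The helper names and the 'country' column are hypothetical, and it is an assumption that the layer would have accepted an AND-joined clause.

```javascript
// Escapes single quotes the same way as the excerpt above.
function escapeValue(value) {
  return String(value).replace(/'/g, "\\'");
}

// Builds a where clause from column/value pairs joined with AND.
// Filters still set to the placeholder '--Select--' are skipped.
function buildWhereClause(filters) {
  return Object.entries(filters)
    .filter(([, value]) => value && value !== '--Select--')
    .map(([column, value]) => `'${column}' = '${escapeValue(value)}'`)
    .join(' AND ');
}

// 'year' comes from the excerpt; 'country' is a hypothetical column name.
const whereClause = buildWhereClause({ year: '2005', country: 'Burkina Faso' });
console.log(whereClause); // 'year' = '2005' AND 'country' = 'Burkina Faso'
```

When no filter is selected the function returns an empty string, which the calling code would need to treat as "no filter" rather than pass on as an undefined where clause.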

4.5.2 Questionnaire VA Map

While developing the prototype involved three testers, the final evaluation was held with 8 users. Three participants were students, two with a health science background and one a computer scientist. One participant was a colour scientist. The four others were non-specialists who work with the researcher in the NHS. The questionnaire was anonymised and did not collect any demographic information. The implemented website was provided on the researcher’s laptop. Each user was placed in front of the screen and left to explore the page without any further instructions, although questions could be asked. Once every function had been found and tested, the questionnaire was filled out. The evaluation took at least 2 minutes and at most 5 minutes. The questionnaire results were analysed, visualized and summarized with Excel charts (Figures 30 and 31). In stating the results, the Likert scale questions are shown first with reference to Figure 30, and then the open-ended responses are summarized.

For question 1, two said their knowledge was not good at all, four said slightly good, one had moderate knowledge and one had extremely good knowledge of cause-specific mortality data. For question 2, three users said the information on the website was moderately accurate, while four users found the information very accurate and one extremely accurate. For question 3, one said the information was not easy to understand, two said slightly, two moderately and one very easy. For question 6, one said the website is slightly visually appealing while seven others said it is very visually appealing. For question 7, one said that overall the website slightly met his or her needs, four said moderately and three very well. For question 8, one would not recommend the website, two would slightly, three moderately and two would be extremely likely to recommend it.

[Bar chart "Questionnaire VA Map": count of answers per question for the response levels Not at all, Slightly, Moderately, Very and Extremely. Questions: 1. How good was your knowledge of cause-specific mortality data before being introduced to the verbal autopsy tool? 2. How accurate do you think the information is on the website? 3. How easy is it to understand the information on the website? 6. How visually appealing is our website? 7. Overall, how well does our website meet your needs? 8. How likely is that you would recommend this website to a friend or colleague?]

Figure 30: Responses from the questionnaire Questions 1-3 and 6-8 Part 1
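The answer counts can be converted into percentage shares, which is how the results are expressed in the discussion in chapter 5.1. The sketch below uses only counts stated in the text for questions 2, 6 and 7. The object layout and the function name are illustrative assumptions, since the original analysis was carried out in Excel.

```javascript
// Answer counts per question as reported in the text (8 participants).
const responses = {
  q2: { moderately: 3, very: 4, extremely: 1 },
  q6: { slightly: 1, very: 7 },
  q7: { slightly: 1, moderately: 4, very: 3 },
};

// Percentage of respondents who chose any of the given answer levels.
function share(question, levels) {
  const counts = responses[question];
  const total = Object.values(counts).reduce((sum, c) => sum + c, 0);
  const chosen = levels.reduce((sum, level) => sum + (counts[level] || 0), 0);
  return (100 * chosen) / total;
}

console.log(share('q6', ['very']));              // 87.5
console.log(share('q2', ['very', 'extremely'])); // 62.5
console.log(share('q7', ['moderately']));        // 50
```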

Question 4 was not Likert scaled, in order to get a more general consensus on whether any knowledge was gained from the website. 7 out of 8 said yes, they gained new insights, and one said no.


[Bar chart: count of Yes and No answers to question 4. Did the visualization tool give you any new insights?]

Figure 31: Responses from the questionnaire Question 4 Part 2

Question 5 dealt with the knowledge gained. Five participants learned how the causes of death were broken down in the country researched. They said the VA tool helped analyse the cause and likelihood of deaths in low-income countries. Two users discovered in particular that, even though the data is marked as having been collected between 1992 and 2012, only South Africa had information from 1992 until 1997. West Africa followed from 1998, Oceania from 2000 and Asia from 2002. One participant therefore said that the data collection in sub-Saharan Africa was not consistent.

Summarizing the answers to question 9, most users wished to filter the causes of death individually, in addition to the colour scheme separating each cause. All users mentioned that the info boxes and the legend were not informative enough and only became clear after explanation from the researcher. They wished for an additional page explaining the key indicators. One individual said that the app would be friendlier to use if the country and site filters actually jumped to the location instead of just highlighting it. Another user mentioned that the surveillance site filter does not indicate where the sites are: once a country is chosen, only the sites within it should be highlighted for selection.


5 Discussion

5.1 Objectives

The objectives introduced in chapter 1 are revisited here and examined against the results, each one individually. Some objectives were mirrored in the questionnaire and are rated successful or unsuccessful on the number of responses, so their success criteria can be taken directly from the results of the quality survey. The achievement of other objectives stems from the literature review and the knowledge gained through the stages of the project.

Objective 1, “Strive for form and function”, refers to the aesthetic appearance of the tool and how well it functions. Question 6 asks “How visually appealing is our website?” and 87.5% responded “very”.

Objective 2, “Justifying the selection of everything we do”, requires that every function on the website can be explained and fulfils its intention. A detailed literature review helped delimit what the application should consist of. Developing the prototype was the first step in avoiding redundant or unnecessary functions. The Google Maps JavaScript API documentation on Google Fusion Tables summarized the possibilities of map customization by adding multiple layers. The Fusion Tables Layer Wizard assisted in setting up the map and the desired filter before embedding it in the web page. The functions that the researcher could not implement were explained in chapter 4.

Objective 3, “Creating accessibility through intuitive design”, concerns the user friendliness of the application. During prototyping, 8 tasks were initially proposed for the website. However, feedback from the testers reduced these to four, as the consensus was that the interface appeared cluttered and confusing. The implemented website therefore encompassed the home page, the “Find out more about” link, 3 search drop-down buttons and map markers with info boxes containing the mortality data.
The fourth and last objective, “Never deceive the receiver”, is essential so that visualization ethics are obeyed (Kirk, 2012). The display of the markers was not completely as desired, as mentioned in chapters 4.5.1.1 and 4.5.1.2. Yet the entire data set was published within the application, although not every marker can be seen individually and the search function cannot narrow the data further due to the restrictions of the Google Maps JavaScript API. The survey responses showed that 62.5% of the respondents think that the information on the website is very or extremely accurate. 50% said that the website meets their needs only moderately. This result means that there is room for improvement. Even though the information seems valuable to most users, a better web design would increase the overall rating.

In summary, the four objectives led to achieving the goal of interpreting mortality data on a global and cross-regional scale by plotting latitude and longitude coordinates on a map. Even though the user group was small, the sample generally welcomed the tool.

5.2 Academic Context

Within the academic context, studying verbal autopsy and its data was useful for gaining insight into the matter before building the tool and for understanding where the difficulties of displaying the data lay. The literature on mobile VA data prompted the idea of simulating GPS coordinates, which can easily be collected through a mobile phone’s network. Reviewing the power of data visualization led to knowledge about the use of colour, which has been reflected in the map marker and legend colours. Different users should be able to analyse the data even if they are visually impaired. The literature suggested making the application colour-blind safe and using yellow-blue dimensions. If the evaluation were repeated, emphasis should be placed on users with vision deficiency and on whether the map could be improved for them.

Paper prototyping was very effective because users criticized constructively and showed the researcher how limited the initial design idea was. However, the literature found did not mention how specific the drawings need to be and whether that would have an effect on the user.

Researching Google Fusion Tables and the Google Maps JavaScript API helped develop most functions of the current application. Yet not all coding problems were covered in the documentation. Therefore Google Fusion Tables help groups and Google’s issue tracker were consulted. Fusion Tables had a large community of users that helped in understanding which steps needed to be altered.

The literature review on evaluation methods helped to gain rather positive feedback from the users through the right questioning. However, it is arguable how biased the feedback was and whether a thematic analysis would give a different view of the whole project.

5.3 Generalisation and validity

Building the web-based tool did not demand expensive resources. The prototyping required paper, creativity and a test group to develop ideas. The preparation, collection and final analysis of the data used the common spreadsheet application Excel, which is preinstalled on many operating systems nowadays or available in the Microsoft Office package. Google Fusion Tables served as the data management platform without requiring training in creating and managing databases. Creating a Fusion Table made gathering and visualizing data easy. Yet the whole process was rather time-consuming because we were plotting a large data set on the map. The CODA_2013_v7_Anonymised.csv file contained 65536 rows of CoD data without any indication of specific location data. Automating the process could be considered in software production. Latitude, longitude and geometry were allocated to each row.
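Since a geometry had to be allocated to every row, automating the process could start with the circle geometry itself. The following is a minimal sketch, not the web tool used in this project. It approximates a circle around a point on a spherical Earth and outputs KML-style coordinates. The function name, the number of steps, the sample coordinates and the way an uncertainty value would be mapped to a radius in metres are all assumptions.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius, spherical approximation

// Returns a closed ring of [lng, lat] pairs approximating a circle of
// radiusMeters around the point (lat, lng).
function circleRing(lat, lng, radiusMeters, steps = 36) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const toDeg = (rad) => (rad * 180) / Math.PI;
  const phi1 = toRad(lat);
  const lambda1 = toRad(lng);
  const delta = radiusMeters / EARTH_RADIUS_M; // angular distance
  const ring = [];
  for (let i = 0; i <= steps; i++) {
    // Bearing of the current vertex; the last vertex repeats the first.
    const theta = (2 * Math.PI * (i % steps)) / steps;
    const phi2 = Math.asin(
      Math.sin(phi1) * Math.cos(delta) +
        Math.cos(phi1) * Math.sin(delta) * Math.cos(theta)
    );
    const lambda2 =
      lambda1 +
      Math.atan2(
        Math.sin(theta) * Math.sin(delta) * Math.cos(phi1),
        Math.cos(delta) - Math.sin(phi1) * Math.sin(phi2)
      );
    ring.push([toDeg(lambda2), toDeg(phi2)]);
  }
  return ring;
}

// KML lists coordinates as "lng,lat" pairs separated by spaces.
// The centre point and the 5000 m radius are illustrative values only.
const kmlCoordinates = circleRing(12.73, -3.86, 5000)
  .map(([lon, la]) => `${lon},${la}`)
  .join(' ');
console.log(kmlCoordinates);
```

A larger radius could then stand for a higher uncertainty, in line with the marker-size encoding of the map, provided a suitable scaling is chosen.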
The coordinates were jittered to minimise overplotting and to actually show the distribution within the surveillance site. The geometry was created by using a different web tool that draws a radius around each point and marks the uncertainty. As mentioned in chapter 2.2.4, if VA data can be collected via mobile phones, the GPS coordinates could be included and immediately visualized.

The provided data has been verified by the INDEPTH network and made accessible in their data repository. But for some users, seeing the data on the map did not mean that the data is automatically valid and accurate. Linking the data files on the website might improve the perception of validity and authenticity of the data. Additionally, all key indicators need to be explained. Once the accuracy of the map markers is secured, the next step would be rebuilding the website.

Throughout the project, objective 1 was considered vital for success. However, function was prioritized over aesthetics. Even though most users found the website appealing, more could be implemented without cluttering the page. While the non-specialist testers were easily pleased by the tool, the experts wanted more features. One health population scientist who liked the tool wished that maps could be saved and eventually shared with colleagues or on social media platforms. In order to save maps, user accounts should be considered, with personal account information registered and entered into a separate database.

5.4 Recommendations

Health organisations would benefit from the tool and should be contacted for sponsorship. However, the organisations in high-income countries push to get the necessary data for their health interventions. Middle- and low-income countries benefit the most from VA and cause of death data. The designed tool could be an alternative to expensive software products for health population scientists to visualize data. The tool was specifically customized for the INDEPTH Network cause-specific mortality data released in 2014. Before recommending the application, a different data set might need to be plotted on the map and then evaluated. A larger testing group would also be necessary to get a normal distribution.


6 Evaluation, Reflections, and Conclusions

6.1 Critical review

The literature review proved very effective at the start of the project for keeping in mind the goals to be achieved. The knowledge about mortality and verbal autopsy was important to explain the purpose of this project to the users. Participants with no background in health population sciences or epidemiology in particular were quite interested to know why this tool should matter to them. One participant, a colour scientist who was an expert in his field, knew doctors, dentists and former colleagues who specialized in social demographics and diseases in tropical and low-income countries. He would recommend the visualization tool to them after some alterations.

Paper prototyping as a usability testing technique was a good choice of method because there were no limits to imagination and communication with the observers/testers was key. By sketching the interfaces, buttons and action boxes on paper, the researcher had to visualize the future website, which was the first design process. Proprietary software might facilitate building the website later, but it would restrict certain aspects of design, as the different elements could not be moved around on the interface.

Even though the Google Maps JavaScript API was part of the literature review, its limitations should have been known beforehand, before attempting to build the tool. The researcher did not know initially how many Fusion Tables layers can be added to a map. In general, one can add up to five, one of which can be styled with up to five styling rules. Before using the Fusion Tables Layer Wizard, advanced techniques using the Google Maps API FusionTablesLayer were considered to display the map. However, the map would accept styling either for points or for polygons, but not both. Therefore, the wizard was a good choice to go beyond the built-in embeddable code. Simon Rogers (2013) made the decision easy with his step-by-step tutorial on making the map.
The Google Fusion Tables documentation helped in altering the script code, but it was not as well established. Google Fusion Tables is an experimental application from Google Research whose continuation depends on funding. Alternatively, when drafting the proposal for the project, the flexible sketchbook software Processing could have been considered. Yet Processing was quickly ruled out because learning the programming language would have taken up too much of the project time. Google Fusion Tables was more appealing due to the ease of creating a map and the familiarity of Google Maps. For mobile use of the app, Google Fusion Tables was the preferred choice as well because Google Maps is already supported on Android, iOS and Windows.

For the evaluation, the qualitative answers could have been quantified using thematic analysis instead of summarizing the responses. However, the multi-method approach seemed more appropriate for a testing group with no expert knowledge. The Likert scale questions made the process of analysing the data uncomplicated. If the evaluation were repeated, a larger user group would be involved in analysing the website and giving feedback. Two groups, one specialist and one non-specialist, would evaluate the tool and the results would be compared. Yet the time frame of three months was not enough to build the tool and recruit adequate testers.

6.2 Personal Reflections on the project

The initial submission deadline for this project was 23 September 2016. Due to extenuating circumstances around illness and a hospital stay, the project was paused multiple times, which disrupted the design process. Even though one could say there was more time to achieve more goals, the problem lay in finding one's way back into the project. The last steps or stages were forgotten and steps needed to be repeated. A project diary would have been useful, as would writing down relevant JavaScript API code in order to continue coding from each exit point.

Building the tool gave insight into web development and design. Instead of using a template, conceptualizing the electronic files and determining the layout from scratch could improve the site’s interactive features. However, the functionality of the tool was prioritized.

6.3 Conclusions

6.3.1 Contribution to health information science and future work

During the project proposal stage, verbal autopsy was described by various academic sources as a valid and reliable method. The difficulty consisted of creating a comprehensive picture of epidemic threats on a cross-regional level. Anyone wanting to contribute to public health interventions would see the necessity of such a tool. Not just specialists but lay persons as well would be interested in knowing what happens to their next of kin's information and eventually, one day, their own. As Bird and Fottrell (2014) explained, it is challenging to communicate causes of death and their levels of uncertainty to different groups. The VA Map evaluation showed that visualizing the data on a map helped to bridge the differences. Apart from HealthMap.org, which functions rather as an alert system, there are not many existing web visualization tools plotting mortality data on a map with geographical coordinates.

In order to improve the website, more manpower would be necessary. The researcher built the tool without much knowledge of web design and JS coding. Recreating the web-based tool for a broader audience would be easier in a multi-disciplinary project group with different professional backgrounds (e.g. in web design, epidemiology, human-centred design, et cetera) and a longer project period.

For future projects, larger data sets could be analysed. Another tool would need to be developed to automatically assign the geometry to each marker. However, data storage, hardware and performance speed need to be considered. The researcher used a Windows 10 laptop to manipulate all the data, which consumed a lot of space.

6.3.2 Summary

The entire project can serve as a guideline for building an interactive visualization tool with vital information and low resources. Referring to the initial research question, "How can verbal autopsy data be visualized on a global scale to provide insights into patterns of causes of death and their associated uncertainty?", the report describes one possible course of action. The feedback was rather positive, with almost all participants understanding sensitive mortality patterns.


7 Glossary

CoD    Cause of Death
ICD    the International Statistical Classification of Diseases and Related Health Problems
JS     JavaScript
VA     Verbal Autopsy


8 References

Alexander, C., 2014. Healthmap. Ref. Rev. 28, 30–31. doi:10.1108/RR-06-2013-0162
Anderson, R.N., Miniño, A.M., Hoyert, D.L., Rosenberg, H.M., 2001. Comparability of cause of death between ICD-9 and ICD-10: preliminary estimates. Natl. Vital Stat. Rep. Cent. Dis. Control Prev. Natl. Cent. Health Stat. Natl. Vital Stat. Syst. 49, 1–32.
Bird, J., Byass, P., Kahn, K., Mee, P., Fottrell, E., 2013. A Matter of Life and Death: Practical and Ethical Constraints in the Development of a Mobile Verbal Autopsy Tool, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’13. ACM, New York, NY, USA, pp. 1489–1498. doi:10.1145/2470654.2466198
Bird, J., Fottrell, E., 2014. The Challenge of Communicating Causes of Death and their Uncertainty [WWW Document]. URL http://cognitivegiscience.psu.edu/uncertainty2014/papers/bird_communicating.pdf (accessed 4.19.16).
Brewer, C.A., Hatchard, G.W., Harrower, M.A., 2003. ColorBrewer in Print: A Catalog of Color Schemes for Maps. Cartogr. Geogr. Inf. Sci. 30, 5–32. doi:10.1559/152304003100010929
Brownstein, J., Freifeld, C., European Centre for Disease Prevention and Control (ECDC) Health Communication Unit - Eurosurveillance editorial, 2007. HealthMap: the development of automated real-time internet surveillance for epidemic intelligence [WWW Document]. URL http://www.eurosurveillance.org/viewarticle.aspx?articleid=3322 (accessed 4.24.16).
Byass, P., Chandramohan, D., Clark, S.J., D’Ambruoso, L., Fottrell, E., Graham, W.J., Herbst, A.J., Hodgson, A., Hounton, S., Kahn, K., Krishnan, A., Leitao, J., Odhiambo, F., Sankoh, O.A., Tollman, S.M., 2012. Strengthening standardised interpretation of verbal autopsy data: the new InterVA-4 tool. Glob. Health Action 5. doi:10.3402/gha.v5i0.19281
Byass, P., Herbst, K., Fottrell, E., Ali, M.M., Odhiambo, F., Amek, N., Hamel, M.J., Laserson, K.F., Kahn, K., Kabudula, C., Mee, P., Bird, J., Jakob, R., Sankoh, O., Tollman, S.M., 2015. Comparing verbal autopsy cause of death findings as determined by physician coding and probabilistic modelling: a public health analysis of 54 000 deaths in Africa and Asia. J. Glob. Health 5. doi:10.7189/jogh.05.010402
Byass, P., Kahn, K., Fottrell, E., Collinson, M.A., Tollman, S.M., 2010. Moving from Data on Deaths to Public Health Policy in Agincourt, South Africa: Approaches to Analysing and Understanding Verbal Autopsy Findings. PLOS Med 7, e1000325. doi:10.1371/journal.pmed.1000325
Cossman, R.E., Cossman, J.S., Jackson, R., Cosby, A., 2003. Mapping high or low mortality places across time in the United States: a research note on a health visualization and analysis project. Health Place 9, 361–369. doi:10.1016/S1353-8292(03)00017-0
Fantahun, M., Fottrell, E., Berhane, Y., Wall, S., Högberg, U., Byass, P., 2006. Assessing a new approach to verbal autopsy interpretation in a rural Ethiopian community: the InterVA model. Bull. World Health Organ. 84, 204–210.
Fottrell, E., Byass, P., 2010. Verbal Autopsy: Methods in Transition. Epidemiol. Rev. 32, 38–55. doi:10.1093/epirev/mxq003
Free Map Tools, 2007. Radius Around a Point on a Map [WWW Document]. URL https://www.freemaptools.com/radius-around-point.htm (accessed 3.20.17).
Fry, B., 2007. Visualizing data, in: The Seven Stages of Visualizing Data. O’Reilly, Sebastopol, Calif, Farnham, p. 366.
Google, 2016. Create: a map (classic) - Fusion Tables Help [WWW Document]. URL https://support.google.com/fusiontables/answer/1244603?hl=en (accessed 4.24.16).
Henderson, S., Segal, E.H., 2013. Visualizing Qualitative Data in Evaluation Research. New Dir. Eval. 2013, 53–71. doi:10.1002/ev.20067
Ingleshwar, V.V., 2007. Usablity Testing for the Web. Queue 5, 34–37. doi:10.1145/1281881.1281891

Joshi, R., Praveen, D., Jan, S., Raju, K., Maulik, P., Jha, V., Lopez, A.D., 2015. How Much Does a Verbal Autopsy Based Mortality Surveillance System Cost in Rural India? PLoS ONE 10. doi:10.1371/journal.pone.0126410
King, C., Hall, J., Banda, M., Beard, J., Bird, J., Kazembe, P., Fottrell, E., 2014. Electronic data capture in a rural African setting: evaluating experiences with different systems in Malawi. Glob. Health Action 7. doi:10.3402/gha.v7.25878
Kirk, A., 2012. Data Visualization. Packt Publishing.
Lavrakas, P., 2008. Encyclopedia of Survey Research Methods. Sage Publications, Inc., 2455 Teller Road, Thousand Oaks California 91320 United States of America.
Leitao, J., Desai, N., Aleksandrowicz, L., Byass, P., Miasnikof, P., Tollman, S., Alam, D., Lu, Y., Rathi, S.K., Singh, A., Suraweera, W., Ram, F., Jha, P., 2014. Comparison of physician-certified verbal autopsy with computer-coded verbal autopsy for cause of death assignment in hospitalized patients in low- and middle-income countries: systematic review. BMC Med. 12, 22. doi:10.1186/1741-7015-12-22
Linek, S.B., Tochtermann, K., 2015. Paper Prototyping: The Surplus Merit of a Multi-Method Approach. Forum Qual. Soc. Res. 16, 1–26.
Marinsek, N., 2014. How to jitter overlapping data points in Excel. Nikki Mar.
McFarland, D.S., 2010. Dreamweaver CS5. O’Reilly.
Meyer, G.W., Greenberg, D.P., 1988. Color-defective vision and computer graphics displays. IEEE Comput. Graph. Appl. 8, 28–40. doi:10.1109/38.7759
Mifsud, J., 2012. Paper Prototyping As A Usability Testing Technique [WWW Document]. Usability Geek. URL http://usabilitygeek.com/paper-prototyping-as-a-usability-testing-technique/ (accessed 4.24.16).
Mikkelsen, L., Phillips, D.E., AbouZahr, C., Setel, P.W., de Savigny, D., Lozano, R., Lopez, A.D., 2015. A global assessment of civil registration and vital statistics systems: monitoring data quality and progress. The Lancet 386, 1395–1406. doi:10.1016/S0140-6736(15)60171-4
Murray, C.J., Lozano, R., Flaxman, A.D., Serina, P., Phillips, D., Stewart, A., James, S.L., Vahdatpour, A., Atkinson, C., Freeman, M.K., Ohno, S.L., Black, R., Ali, S.M., Baqui, A.H., Dandona, L., Dantzer, E., Darmstadt, G.L., Das, V., Dhingra, U., Dutta, A., Fawzi, W., Gómez, S., Hernández, B., Joshi, R., Kalter, H.D., Kumar, A., Kumar, V., Lucero, M., Mehta, S., Neal, B., Praveen, D., Premji, Z., Ramírez-Villalobos, D., Remolador, H., Riley, I., Romero, M., Said, M., Sanvictores, D., Sazawal, S., Tallo, V., Lopez, A.D., 2014. Using verbal autopsy to measure causes of death: the comparative performance of existing methods. BMC Med. 12, 5. doi:10.1186/1741-7015-12-5
Newsom, S.W.B., 2006. Pioneers in infection control: John Snow, Henry Whitehead, the Broad Street pump, and the beginnings of geographical epidemiology. J. Hosp. Infect. 64, 210–216. doi:10.1016/j.jhin.2006.05.020
Oates, B.J., 2005. Researching Information Systems and Computing. SAGE.
Ohemeng-Dapaah, S., Pronyk, P., Akosa, E., Nemser, B., Kanter, A.S., 2010. Combining vital events registration, verbal autopsy and electronic medical records in rural Ghana for improved health services delivery. Stud. Health Technol. Inform. 160, 416–420.
Refsnes, H., Refsnes, S., Refsnes, K.J., 2010. Learn HTML and CSS with w3Schools [WWW Document]. URL https://www.dawsonera.com/readonline/9780470880876 (accessed 3.14.17).
Rogers, S., 2013a. Borders and boundaries: 16 Google Fusion border files for you to use. Simon Rogers.
Rogers, S., 2013b. How to make a map with Google Fusion tables. Simon Rogers.
Samarasundera, E., Walsh, T., Cheng, T., Koenig, A., Jattansingh, K., Dawe, A., Soljak, M., 2012. Methods and tools for geographical mapping and analysis in primary health care. Prim. Health Care Res. Dev. 13, 10–21. doi:10.1017/S1463423611000417
Sankoh, O., Byass, P., 2012. The INDEPTH Network: filling vital gaps in global epidemiology. Int. J. Epidemiol. 41, 579–588. doi:10.1093/ije/dys081


Sefelin, R., Tscheligi, M., Giller, V., 2003. Paper Prototyping - What is It Good for?: A Comparison of Paper- and Computer-based Low-fidelity Prototyping, in: CHI ’03 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’03. ACM, New York, NY, USA, pp. 778–779. doi:10.1145/765891.765986
Serina, P., Riley, I., Stewart, A., James, S.L., Flaxman, A.D., Lozano, R., Hernandez, B., Mooney, M.D., Luning, R., Black, R., Ahuja, R., Alam, N., Alam, S.S., Ali, S.M., Atkinson, C., Baqui, A.H., Chowdhury, H.R., Dandona, L., Dandona, R., Dantzer, E., Darmstadt, G.L., Das, V., Dhingra, U., Dutta, A., Fawzi, W., Freeman, M., Gomez, S., Gouda, H.N., Joshi, R., Kalter, H.D., Kumar, A., Kumar, V., Lucero, M., Maraga, S., Mehta, S., Neal, B., Ohno, S.L., Phillips, D., Pierce, K., Prasad, R., Praveen, D., Premji, Z., Ramirez-Villalobos, D., Rarau, P., Remolador, H., Romero, M., Said, M., Sanvictores, D., Sazawal, S., Streatfield, P.K., Tallo, V., Vadhatpour, A., Vano, M., Murray, C.J.L., Lopez, A.D., 2015. Improving performance of the Tariff Method for assigning causes of death to verbal autopsies. BMC Med. 13, 291. doi:10.1186/s12916-015-0527-9
Snow, J., 1855. On the Mode of Communication of Cholera. John Churchill.
Snyder, C., 2004. Chapter 1 - Introduction, in: Paper Prototyping, Interactive Technologies. Morgan Kaufmann, Burlington, pp. 3–23.
Snyder, C., 2003. Paper Prototyping: The Fast and Easy Way to Design and Refine User Interfaces. Morgan Kaufmann.
Soleman, N., Chandramohan, D., Shibuya, K., 2006. Verbal autopsy: current practices and challenges. Bull. World Health Organ. 84, 239–245. doi:10.1590/S0042-96862006000300020
Soy, A., Check, A.L.H., News, B.W., 2013. Phone app offers “verbal autopsies” to improve death records [WWW Document]. BBC News. URL http://www.bbc.co.uk/news/health-24164824 (accessed 4.4.16).
Stansfield, S.K., Walsh, J., Prata, N., Evans, T., 2006. Information to Improve Decision Making for Health.
The World Bank, 2015. The World Bank - Millennium Development Goals - Reduce Child Mortality by 2015 [WWW Document]. URL http://www.worldbank.org/mdgs/child_mortality.html (accessed 8.9.16).
Ware, C., 2013. Chapter Four - Color, in: Information Visualization (Third Edition), Interactive Technologies. Morgan Kaufmann, Boston, pp. 95–138.
WHO, 2016a. WHO | Using mobile technology to support vital registration and verbal autopsy in community: Bonsaaso Millennium Villages Project, Ghana [WWW Document]. WHO. URL http://www.who.int/maternal_child_adolescent/epidemiology/maternal-death-surveillance/case-studies/ghana/en/ (accessed 4.19.16).
WHO, 2016b. WHO | World Health Organization [WWW Document]. WHO. URL http://www.who.int/about/en/ (accessed 4.22.16).
WHO, 2014. WHO | Verbal autopsy standards: ascertaining and attributing causes of death [WWW Document]. WHO. URL http://www.who.int/healthinfo/statistics/verbalautopsystandards/en/ (accessed 4.22.16).
Wood, J., 2016a. MDL_IN3030-INM402_PRD2_2015-16: Lecture notes [WWW Document]. Sess. 1 Build. Data Vis. Appl. URL http://moodle.city.ac.uk/mod/page/view.php?id=578877 (accessed 4.22.16).
Wood, J., 2016b. MDL_IN3030-INM402_PRD2_2015-16: Lecture notes [WWW Document]. Sess. 3 Represent. Data Colour. URL http://moodle.city.ac.uk/mod/page/view.php?id=580569 (accessed 8.23.16).
World Health Organization, 2016. International statistical classification of diseases and related health problems.
You, D., Hug, L., Ejdemyr, S., Idele, P., Hogan, D., Mathers, C., Gerland, P., New, J.R., Alkema, L., 2015. Global, regional, and national levels and trends in under-5 mortality between 1990 and 2015, with scenario-based projections to 2030: a systematic analysis by the UN Inter-agency Group for Child Mortality Estimation. The Lancet 386, 2275–2286. doi:10.1016/S0140-6736(15)00120-8
Young, T.K., 2004. Population Health. Oxford University Press.


Appendices

Appendix A: Project Proposal for MSc in Health Informatics

Name: Ewoma Obaro
E-mail address: [email protected]
Contact Phone number: 07462529790
Project Title: How can verbal autopsy data be visualized on a global scale to provide insights into patterns of causes of death and their associated uncertainty?
Supervisor: Dr. Jon Bird

Introduction

Understanding causes of death (CoD) in a population is crucial for public health planning and interventions (Soleman et al., 2006). Yet, mostly in low- and middle-income countries (LMICs), there are no official records of deaths that occur outside hospital. The lack of CoD knowledge makes it difficult to allocate resources and financial aid for improvements (Joshi et al., 2015). Verbal Autopsy (VA) is a method widely used to determine probable causes of death in high-mortality settings where access to accurate and comprehensive medical certification is restricted (Serina et al., 2015). Even though VA is a reliable and established method, and the World Health Organization (WHO) has developed standardized tools for VA (Stansfield et al., 2006), there is no existing tool for viewing the data collected in different regions worldwide (Soleman et al., 2006).

In 2014 the author joined the Classifications, Terminologies and Standards team (CTS) of WHO for 6 months and became familiar with classification work and tooling. One aspect of the work dealt with updating the current version of the WHO VA instrument (WHO, 2014). In recent discussions with the author's former supervisor, Dr. Robert Jakob, WHO Medical Officer, interest was shown in a visualization tool for probable causes of death. He supports the proposal and will provide feedback as the project goes along. The main objective is to provide a comprehensive picture of epidemic threats to global health by analysing high-mortality areas using collected VA data.
Representing complex datasets geographically helps in studying health-related aspects of populations and in recognizing infectious agents in specific epidemiological findings. John Snow, a pioneer of epidemiology, demonstrated by plotting cases of cholera on a map that a water pump in Broad Street (now Broadwick Street), London was the source of the disease (Snow, 1855). 700 deaths were located within a 250-yard radius of the well (Newsom, 2006). As a consequence the pump handle was removed and the spread of cholera was prevented.
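
Once deaths carry coordinates, a radius analysis of the kind Snow performed can be approximated numerically. The following is a minimal sketch only: the coordinates are invented, the function names are hypothetical, and the haversine great-circle formula is an assumption of this example rather than a method named in the proposal.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
YARD_IN_M = 0.9144          # one yard in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_within(centre, points, radius_m):
    """Count the points lying within radius_m metres of centre."""
    return sum(
        1 for lat, lon in points
        if haversine_m(centre[0], centre[1], lat, lon) <= radius_m
    )


# Hypothetical coordinates, for illustration only.
pump = (51.5133, -0.1367)
deaths = [(51.5134, -0.1366), (51.5140, -0.1370), (51.5300, -0.1000)]

nearby = count_within(pump, deaths, 250 * YARD_IN_M)
print(nearby)
```

For the small distances involved, a planar approximation would serve equally well; the haversine form is used here because the same function remains valid when comparing sites on a global scale.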

The project proposes to generate a data visualization tool and possibly present a solution for immediate comparison of cross-regional and global CoD. The tool would be useful to spot
outbreaks of a disease and how CoD changes on a large scale over time. The other essential aspect will be showing the uncertainty in mobile VA data. A level of uncertainty is associated with each report of probable CoD (Bird et al., 2013). If the VA data are physician-coded case by case, assessing physicians can reach indeterminate results because of incomplete information or difficulty in reaching a consensus (Byass et al., 2010). VA interpretation tools like InterVA-4, developed by Umeå University in Sweden, are also a source of uncertainty, because the information is provided by next of kin and the application offers limited categories of CoD (Bird and Fottrell, 2014). Such input can lead to an incorrect calculation of CoD and, as a consequence, to the preparation of unsuitable health interventions.

The locations of death can be plotted on a map in order to design more sophisticated programmes that target specific interventions. Emerging diseases and conditions can be identified visually; changes in the burden of disease can then be tracked in different population groups and alerts can be sent out. Functionalities like filters will be part of the design. For example, a user could view causes of death in different regions on a map and filter by socio-economic situation. If areas with deprived resources have more events of a specific disease than a wealthier population, governmental bodies or organizations can allocate resources accordingly. Simple queries will be developed to present correlations between demographic and socio-economic variables. The aggregated data will possibly show disease progression over time.

This project is directed at health population scientists because of the lack of complete coverage of population data for epidemiological research (Sankoh and Byass, 2012). The products and results of this project would also help decision-makers to identify patterns of epidemic outbreaks, so that interventions can be carefully planned.
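
The filtering idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are invented, and the field names (`site`, `setting`, `cod`) are hypothetical rather than the project's schema. The share of "Indeterminate" results in a filtered subset is shown only as one crude indicator of uncertainty.

```python
from collections import Counter

# Hypothetical VA records; field names and values are illustrative only.
records = [
    {"site": "A", "setting": "rural", "cod": "01.05 Malaria"},
    {"site": "A", "setting": "urban", "cod": "04.02 Stroke"},
    {"site": "B", "setting": "rural", "cod": "01.05 Malaria"},
    {"site": "B", "setting": "rural", "cod": "99 Indeterminate"},
]


def cause_fractions(rows, **filters):
    """Return each cause's share of the deaths that match every filter."""
    selected = [
        r for r in rows
        if all(r.get(key) == value for key, value in filters.items())
    ]
    counts = Counter(r["cod"] for r in selected)
    total = sum(counts.values())
    if total == 0:
        return {}  # no matching deaths, so no fractions to report
    return {cause: n / total for cause, n in counts.items()}


rural = cause_fractions(records, setting="rural")
for cause, share in sorted(rural.items()):
    print(f"{cause}: {share:.2f}")
```

Passing the filters as keyword arguments keeps the function independent of any particular set of filter variables, so further demographic or socio-economic fields could be added without changing it.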
Since Snow's map, data visualization has become more interactive and user friendly, encouraging users from different disciplines to engage with and explore data (Wood, 2016a). Data visualization has been described as a “kind of narrative providing a clear answer to a question without extraneous details” (Fry, 2007, p. 4). The research question this project addresses is: How can verbal autopsy data be visualized on a global scale to provide insights into patterns of causes of death and their associated uncertainty? The following proposal explains in detail why the 3-month project is relevant, whom it addresses and which approaches will be considered to create the visualization.

Critical Context

This section briefly describes the need for a VA data visualisation tool and its beneficiaries. Verbal autopsy can be used to provide CoD information in a population. Two thirds of global deaths, approximately 35 million per year, occur at home unmonitored by the health care system due to a lack of civil registration and medical certification (Soy et al., 2013). Designated field agents interview next of kin or close caregivers (relatives, friends or witnesses), following guidelines that specify the questions to ask in order to identify the signs and symptoms that enable the cause of death to be determined. The interviewer uses a standardised VA questionnaire and the data are used to identify the cause of death,
either by a medical expert or using automated probabilistic approaches (Fottrell and Byass, 2010). Limitations apply to analysing VA information, and these will be addressed in this visualisation project. “[P]hysician-interpreted and probabilistically modelled” CoD data can differ in interpretation, which could lead to “substantially different public health policy programmes” (Byass et al., 2010, p.1). Even though probabilistic approaches are preferred for reasons of cost, time and consistency, there is an uncertainty associated with each CoD (Bird and Fottrell, 2014).

Verbal autopsy is a method that has been established for over 20 years. There are different approaches to coding the data to predict the probable causes of death (Figure 1). In the last 5-6 years, time-consuming paper-based systems have been replaced with mobile-phone-based systems which send data via SMS text messages to a data repository. Mobile InterVA (MIVA) comprises more than 200 interview questions with skip patterns (King et al., 2014). In comparison to physician-coded verbal autopsy (PCVA), which can take years to be analysed after the information is collected, VA data collected with MIVA are analysed instantly (Ohemeng-Dapaah et al., 2010). Real-time CoD data can be visualised by collecting geocodes with the longitude and latitude of where the death occurred.

Figure 32: Classification of verbal autopsy interpretation methods (Leitao et al., 2014)

The World Health Organization, in its mission as a leader of global health matters, is dedicated to monitoring global health trends and coordinating people-centred health services (WHO, 2016b). WHO establishes global health policies and collaborates with different bodies to help them detect epidemic threats to global health security. WHO reports major outbreaks by gathering global epidemic intelligence from formal and informal sources (Brownstein et al., 2007). With the emerging use of mobile technologies for collecting VA data in a standardised way, health organizations and health population scientists could gain visualizations of CoD on a global scale. The dataset used for this tool stems from five countries in Africa and Asia and, once analysed, can be forwarded to the institutions that initially gathered the data (Byass et al., 2015).

Approaches

Principles of system development are the foundation for planning and conducting a design and build project (Oates, 2005). This proposal is for a “Design and Build” project, which is a problem-solving approach. Five components will be addressed and then carried out during the process. These are:

Literature review
Requirements and analysis
Design
Implementation and Testing
Evaluation

Literature review

The search for this literature review initially involved finding academic papers and projects on VA, and establishing which work has been done and which still needs to be done in the domain of visualization for CoD. Reviewing the literature and speaking to my supervisor defined the research question. City University Library, ACM Digital Library and Google Scholar are the primary search engines. Further articles and public statements were gathered from the WHO website. Literature and manuals were consulted on how to research information systems and computing (Oates, 2005). During the project, papers related to the topic will be reviewed further.

Search: Verbal Autopsy, VA tool, Causes of Death, WHO, InterVA, Data Visualization, User Interfaces, Design and Build

Requirements and analysis

The first stage involves setting up requirements for the visualization tool by getting usability feedback from target groups. Those groups consist of health organizations like the WHO, health policy makers, health population scientists and epidemiologists. Robert Jakob and individuals from the UCL Global Health Institute will be interviewed and their feedback will be incorporated into the prototyping. The data visualization will be based on 54,182 deaths from five African and Asian countries (Byass et al., 2015). These records have been both physician-coded and analysed with a Bayesian probabilistic model (InterVA) (Byass et al., 2012). The data are complete, have been validated and are available for use in this project.

From a technical point of view, data visualization specialists at the giCentre at City University London will be consulted by showing them the prototype. The goal is to gain expert feedback, to translate the demands of the target group and then to represent them accordingly on a map.

Design

The prototype will be developed iteratively, repeating sequences of operations until the desired result is reached. A small portion of the data set will be visualized and initial prototype user interfaces created. The domain experts who were consulted for the requirements will evaluate the prototype(s). A strategy is best learned by making, which also leads to a better understanding of the problem (Oates, 2005). There are different methods for designing the user interface and functions of a future system. Most methods are computer-based, but paper prototyping is another option that focuses on users' needs (Snyder, 2003) and will be part of the design process for this project. Paper prototyping creates user interfaces quickly and inexpensively (Mifsud, 2012). Hand drawings of user interfaces are used to simulate and test computer-based systems. Users can provide a great deal of input, so time must be managed carefully to ensure that the project time frame of three months is not exceeded.

Initially, a part of the data set will be visualised in Google Fusion Tables, which provides data management functionality. Possibly, data can be merged to view connections between CoD and other factors, which are represented as interactive filters in the developed tool. The information for the filters is gathered from online databases that give access to demographic information (for example the African Development Bank) and from the literature review. The filters could include:

time period
population density
wet season – dry season
urban – suburban – rural
wealthy – poor
educated – uneducated
vaccine status

Google Fusion Tables is an experimental application linked to Google Maps that allows a map to be generated with place markers at the locations where the data were collected (Google, 2016). A more elaborate system will be constructed after testing.
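
The idea of merging CoD data with other factors can be sketched independently of any particular tool. The example below is an illustration under stated assumptions: the figures are invented, the join key (country name) and the helper `left_join` are hypothetical, and the code does not reproduce how Google Fusion Tables performs a merge.

```python
# Hypothetical country-level tables; the figures are invented for illustration.
deaths_by_country = {"Kenya": 120, "Ghana": 80, "India": 200}
adult_literacy = {"Kenya": 0.78, "India": 0.72}


def left_join(left, right, left_name, right_name):
    """Join two country-keyed dicts, keeping every country in `left`.

    Countries missing from `right` get None, so gaps in the second
    table stay visible instead of being silently dropped.
    """
    return [
        {"country": country, left_name: value, right_name: right.get(country)}
        for country, value in sorted(left.items())
    ]


merged = left_join(deaths_by_country, adult_literacy, "deaths", "adult_literacy")
for row in merged:
    print(row)
```

Keeping unmatched rows matters in practice because country names often differ between sources, and a row with a missing value signals that a name needs to be harmonised before merging.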
Disadvantages and advantages will be listed to decide whether to programme a bespoke visualization system (for example with Processing, a Java-based programming environment) or to use an existing programme (Google Maps) and add features relevant to the health domain.

Implementation and Testing

The results of the prototyping are the basis for an overview of the large data set on a map. A database that the programme can access will be created with all the relevant information gathered. Further research will be conducted on how to make the information available on a web server. The created programme will be able to plot CoD locations on a map and to filter them. The domain experts will be consulted and will test the tool repeatedly with different input to minimise errors and support debugging.

Evaluation

The data visualization tool will be assessed on functionality because the finished product is tangible; the end users will give feedback in the form of a survey covering functions and interface. They can also point out problems that occurred. At the end of the project, the final report will be created, including a substantial analysis.

Work Plan

The plan of work consists of the preparation phase and the actual delivery of the project. The plan does not contain meetings with the supervisor and domain expert because they will not be scheduled until the proposal has been accepted and graded.

[Figure: work breakdown structure for the Data Visualization Project. Node labels recovered from the diagram: Literature review; Literature for research strategies; Literature for system development; Requirements & analysis; Interviewing target group; Review of current solutions (ex. Google Fusion); Design; Prototyping; Implementing & Testing; Building the tool; Programming; Plotting location on maps; Filtering; White box/black box testing; Evaluation; Defining evaluation criteria; Survey for (end) user; Report.]

Figure 33: Work Breakdown Structure

Figure 34: Gantt Chart


Risks

Envisaged risks are listed in a risk register to show possible implications for the project and mitigations, so that the project can still be accomplished successfully.

ID | Description | Likelihood | Impact (1-5) | Policy | Mitigation
1 | Domain experts have limited time or they are not physically present | Medium | 4 | Accept | Making appointments early enough; feedback over email and Skype
2 | Not having the necessary technical skills | Medium | 3 | Accept | Work with pre-existing software
3 | No access to data | Low | 3 | Avoid | Taking information from different papers; requesting data from health institutions
4 | Biased evaluation because experts have been following development process | High | 4 | Avoid | Different group for evaluation: (end) users
5 | Code is accidentally overwritten | Medium | 5 | Avoid | Use of version control software

Table 6: Risk Register

References

Alexander, C., 2014. Healthmap. Ref. Rev. 28, 30–31. doi:10.1108/RR-06-2013-0162 Anderson, R.N., Miniño, A.M., Hoyert, D.L., Rosenberg, H.M., 2001. Comparability of cause of death between ICD-9 and ICD-10: preliminary estimates. Natl. Vital Stat. Rep. Cent. Dis. Control Prev. Natl. Cent. Health Stat. Natl. Vital Stat. Syst. 49, 1–32. Bird, J., Byass, P., Kahn, K., Mee, P., Fottrell, E., 2013. A Matter of Life and Death: Practical and Ethical Constraints in the Development of a Mobile Verbal Autopsy Tool, in: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’13. ACM, New York, NY, USA, pp. 1489–1498. doi:10.1145/2470654.2466198 Bird, J., Fottrell, E., 2014. The Challenge of Communicating Causes of Death and their Uncertainty [WWW Document]. URL http://cognitivegiscience.psu.edu/uncertainty2014/papers/bird_communicating.pdf (accessed 4.19.16). Brewer, C.A., Hatchard, G.W., Harrower, M.A., 2003. ColorBrewer in Print: A Catalog of Color Schemes for Maps. Cartogr. Geogr. Inf. Sci. 30, 5–32. doi:10.1559/152304003100010929 Brownstein, J., Freifeld, C., European Centre for Disease Prevention and Control (ECDC)Health Comunication Unit- Eurosurveillance editorial, 2007. HealthMap: the development of automated real-time internet surveillance for epidemic intelligence [WWW Document]. URL http://www.eurosurveillance.org/viewarticle.aspx?articleid=3322 (accessed 4.24.16).

Byass, P., Chandramohan, D., Clark, S.J., D’Ambruoso, L., Fottrell, E., Graham, W.J., Herbst, A.J., Hodgson, A., Hounton, S., Kahn, K., Krishnan, A., Leitao, J., Odhiambo, F., Sankoh, O.A., Tollman, S.M., 2012. Strengthening standardised interpretation of verbal autopsy data: the new InterVA-4 tool. Glob. Health Action 5. doi:10.3402/gha.v5i0.19281 Byass, P., Herbst, K., Fottrell, E., Ali, M.M., Odhiambo, F., Amek, N., Hamel, M.J., Laserson, K.F., Kahn, K., Kabudula, C., Mee, P., Bird, J., Jakob, R., Sankoh, O., Tollman, S.M., 2015. Comparing verbal autopsy cause of death findings as determined by physician coding and probabilistic modelling: a public health analysis of 54 000 deaths in Africa and Asia. J. Glob. Health 5. doi:10.7189/jogh.05.010402 Byass, P., Kahn, K., Fottrell, E., Collinson, M.A., Tollman, S.M., 2010. Moving from Data on Deaths to Public Health Policy in Agincourt, South Africa: Approaches to Analysing and Understanding Verbal Autopsy Findings. PLOS Med 7, e1000325. doi:10.1371/journal.pmed.1000325 Cossman, R.E., Cossman, J.S., Jackson, R., Cosby, A., 2003. Mapping high or low mortality places across time in the United States: a research note on a health visualization and analysis project. Health Place 9, 361–369. doi:10.1016/S1353-8292(03)00017-0 Fantahun, M., Fottrell, E., Berhane, Y., Wall, S., Högberg, U., Byass, P., 2006. Assessing a new approach to verbal autopsy interpretation in a rural Ethiopian community: the InterVA model. Bull. World Health Organ. 84, 204–210. Fottrell, E., Byass, P., 2010. Verbal Autopsy: Methods in Transition. Epidemiol. Rev. 32, 38– 55. doi:10.1093/epirev/mxq003 Free Map Tools, 2007. Radius Around a Point on a Map [WWW Document]. URL https://www.freemaptools.com/radius-around-point.htm (accessed 3.20.17). Fry, B., 2007. Visualizing data, in: The Seven Stages of Visualizing Data. O’Reilly, Sebastol, Calif, Farnham, p. 366. Google, 2016. Create: a map (classic) - Fusion Tables Help [WWW Document]. 
URL https://support.google.com/fusiontables/answer/1244603?hl=en (accessed 4.24.16). Henderson, S., Segal, E.H., 2013. Visualizing Qualitative Data in Evaluation Research. New Dir. Eval. 2013, 53–71. doi:10.1002/ev.20067 Ingleshwar, V.V., 2007. Usablity Testing for the Web. Queue 5, 34–37. doi:10.1145/1281881.1281891 Joshi, R., Praveen, D., Jan, S., Raju, K., Maulik, P., Jha, V., Lopez, A.D., 2015. How Much Does a Verbal Autopsy Based Mortality Surveillance System Cost in Rural India? PLoS ONE 10. doi:10.1371/journal.pone.0126410 King, C., Hall, J., Banda, M., Beard, J., Bird, J., Kazembe, P., Fottrell, E., 2014. Electronic data capture in a rural African setting: evaluating experiences with different systems in Malawi. Glob. Health Action 7. doi:10.3402/gha.v7.25878 Kirk, A., 2012. Data Visualization. Packt Publishing. Lavrakas, P., 2008. Encyclopedia of Survey Research Methods. Sage Publications, Inc., 2455 Teller Road, Thousand Oaks California 91320 United States of America. Leitao, J., Desai, N., Aleksandrowicz, L., Byass, P., Miasnikof, P., Tollman, S., Alam, D., Lu, Y., Rathi, S.K., Singh, A., Suraweera, W., Ram, F., Jha, P., 2014. Comparison of physician-certified verbal autopsy with computer-coded verbal autopsy for cause of death assignment in hospitalized patients in low- and middle-income countries: systematic review. BMC Med. 12, 22. doi:10.1186/1741-7015-12-22 Linek, S.B., Tochtermann, K., 2015. Paper Prototyping: The Surplus Merit of a Multi-Method Approach. Forum Qual. Soc. Res. 16, 1–26. Marinsek, N., 2014. How to jitter overlapping data points in Excel. Nikki Mar. McFarland, D.S., 2010. Dreamweaver CS5. O’Reilly. Meyer, G.W., Greenberg, D.P., 1988. Color-defective vision and computer graphics displays. IEEE Comput. Graph. Appl. 8, 28–40. doi:10.1109/38.7759 Mifsud, J., 2012. Paper Prototyping As A Usability Testing Technique [WWW Document]. Usability Geek. 
URL http://usabilitygeek.com/paper-prototyping-as-a-usability-testingtechnique/ (accessed 4.24.16).

Mikkelsen, L., Phillips, D.E., AbouZahr, C., Setel, P.W., de Savigny, D., Lozano, R., Lopez, A.D., 2015. A global assessment of civil registration and vital statistics systems: monitoring data quality and progress. The Lancet 386, 1395–1406. doi:10.1016/S0140-6736(15)60171-4 Murray, C.J., Lozano, R., Flaxman, A.D., Serina, P., Phillips, D., Stewart, A., James, S.L., Vahdatpour, A., Atkinson, C., Freeman, M.K., Ohno, S.L., Black, R., Ali, S.M., Baqui, A.H., Dandona, L., Dantzer, E., Darmstadt, G.L., Das, V., Dhingra, U., Dutta, A., Fawzi, W., Gómez, S., Hernández, B., Joshi, R., Kalter, H.D., Kumar, A., Kumar, V., Lucero, M., Mehta, S., Neal, B., Praveen, D., Premji, Z., Ramírez-Villalobos, D., Remolador, H., Riley, I., Romero, M., Said, M., Sanvictores, D., Sazawal, S., Tallo, V., Lopez, A.D., 2014. Using verbal autopsy to measure causes of death: the comparative performance of existing methods. BMC Med. 12, 5. doi:10.1186/17417015-12-5 Newsom, S.W.B., 2006. Pioneers in infection control: John Snow, Henry Whitehead, the Broad Street pump, and the beginnings of geographical epidemiology. J. Hosp. Infect. 64, 210–216. doi:10.1016/j.jhin.2006.05.020 Oates, B.J., 2005. Researching Information Systems and Computing. SAGE. Ohemeng-Dapaah, S., Pronyk, P., Akosa, E., Nemser, B., Kanter, A.S., 2010. Combining vital events registration, verbal autopsy and electronic medical records in rural Ghana for improved health services delivery. Stud. Health Technol. Inform. 160, 416–420. Refsnes, H., Refsnes, S., Refsnes, K.J., 2010. Learn HTML and CSS with w3Schools [WWW Document]. URL https://www.dawsonera.com/readonline/9780470880876 (accessed 3.14.17). Rogers, S., 2013a. Borders and boundaries: 16 Google Fusion border files for you to use. Simon Rogers. Rogers, S., 2013b. How to make a map with Google Fusion tables. Simon Rogers. Samarasundera, E., Walsh, T., Cheng, T., Koenig, A., Jattansingh, K., Dawe, A., Soljak, M., 2012. 
Methods and tools for geographical mapping and analysis in primary health care. Prim. Health Care Res. Dev. 13, 10–21. doi:10.1017/S1463423611000417 Sankoh, O., Byass, P., 2012. The INDEPTH Network: filling vital gaps in global epidemiology. Int. J. Epidemiol. 41, 579–588. doi:10.1093/ije/dys081 Sefelin, R., Tscheligi, M., Giller, V., 2003. Paper Prototyping - What is It Good for?: A Comparison of Paper- and Computer-based Low-fidelity Prototyping, in: CHI ’03 Extended Abstracts on Human Factors in Computing Systems, CHI EA ’03. ACM, New York, NY, USA, pp. 778–779. doi:10.1145/765891.765986 Serina, P., Riley, I., Stewart, A., James, S.L., Flaxman, A.D., Lozano, R., Hernandez, B., Mooney, M.D., Luning, R., Black, R., Ahuja, R., Alam, N., Alam, S.S., Ali, S.M., Atkinson, C., Baqui, A.H., Chowdhury, H.R., Dandona, L., Dandona, R., Dantzer, E., Darmstadt, G.L., Das, V., Dhingra, U., Dutta, A., Fawzi, W., Freeman, M., Gomez, S., Gouda, H.N., Joshi, R., Kalter, H.D., Kumar, A., Kumar, V., Lucero, M., Maraga, S., Mehta, S., Neal, B., Ohno, S.L., Phillips, D., Pierce, K., Prasad, R., Praveen, D., Premji, Z., Ramirez-Villalobos, D., Rarau, P., Remolador, H., Romero, M., Said, M., Sanvictores, D., Sazawal, S., Streatfield, P.K., Tallo, V., Vadhatpour, A., Vano, M., Murray, C.J.L., Lopez, A.D., 2015. Improving performance of the Tariff Method for assigning causes of death to verbal autopsies. BMC Med. 13, 291. doi:10.1186/s12916-015-0527-9 Snow, J., 1855. On the Mode of Communication of Cholera. John Churchill. Snyder, C., 2004. Chapter 1 - Introduction, in: Paper Prototyping, Interactive Technologies. Morgan Kaufmann, Burlington, pp. 3–23. Snyder, C., 2003. Paper Prototyping: The Fast and Easy Way to Design and Refine User Interfaces. Morgan Kaufmann. Soleman, N., Chandramohan, D., Shibuya, K., 2006. Verbal autopsy: current practices and challenges. Bull. World Health Organ. 84, 239–245. doi:10.1590/S004296862006000300020


Soy, A., Check, A.L.H., News, B.W., 2013. Phone app offers “verbal autopsies” to improve death records [WWW Document]. BBC News. URL http://www.bbc.co.uk/news/health-24164824 (accessed 4.4.16). Stansfield, S.K., Walsh, J., Prata, N., Evans, T., 2006. Information to Improve Decision Making for Health. The World Bank, 2015. The World Bank - Millennium Development Goals - Reduce Child Mortality by 2015 [WWW Document]. URL http://www.worldbank.org/mdgs/child_mortality.html (accessed 8.9.16). Ware, C., 2013. Chapter Four - Color, in: Information Visualization (Third Edition), Interactive Technologies. Morgan Kaufmann, Boston, pp. 95–138. WHO, 2016a. WHO | Using mobile technology to support vital registration and verbal autopsy in community: Bonsaaso Millennium Villages Project, Ghana [WWW Document]. WHO. URL http://www.who.int/maternal_child_adolescent/epidemiology/maternal-deathsurveillance/case-studies/ghana/en/ (accessed 4.19.16). WHO, 2016b. WHO | World Health Organization [WWW Document]. WHO. URL http://www.who.int/about/en/ (accessed 4.22.16). WHO, 2014. WHO | Verbal autopsy standards: ascertaining and attributing causes of death [WWW Document]. WHO. URL http://www.who.int/healthinfo/statistics/verbalautopsystandards/en/ (accessed 4.22.16). Wood, J., 2016a. MDL_IN3030-INM402_PRD2_2015-16: Lecture notes [WWW Document]. Sess. 1 Build. Data Vis. Appl. URL http://moodle.city.ac.uk/mod/page/view.php?id=578877 (accessed 4.22.16). Wood, J., 2016b. MDL_IN3030-INM402_PRD2_2015-16: Lecture notes [WWW Document]. Sess. 3 Represent. Data Colour. URL http://moodle.city.ac.uk/mod/page/view.php?id=580569 (accessed 8.23.16). World Health Organization, 2016. International statistical classification of diseases and related health problems. You, D., Hug, L., Ejdemyr, S., Idele, P., Hogan, D., Mathers, C., Gerland, P., New, J.R., Alkema, L., 2015. 
Global, regional, and national levels and trends in under-5 mortality between 1990 and 2015, with scenario-based projections to 2030: a systematic analysis by the UN Inter-agency Group for Child Mortality Estimation. The Lancet 386, 2275–2286. doi:10.1016/S0140-6736(15)00120-8 Young, T.K., 2004. Population Health. Oxford University Press.


Ethics Review Form: BSc, MSc and MA Projects

Computer Science Research Ethics Committee (CSREC)

A.1 If your answer to any of the following questions (1 – 3) is YES, you must apply to an appropriate external ethics committee for approval. (Delete as appropriate.)

1. Does your project require approval from the National Research Ethics Service (NRES)? For example, because you are recruiting current NHS patients or staff? If you are unsure, please check at http://www.hra.nhs.uk/research-community/before-you-apply/determinewhich-review-body-approvals-are-required/.
Answer: No

2. Does your project involve participants who are covered by the Mental Capacity Act? If so, you will need approval from an external ethics committee such as NRES or the Social Care Research Ethics Committee http://www.scie.org.uk/research/ethics-committee/.
Answer: No

3. Does your project involve participants who are currently under the auspices of the Criminal Justice System? For example, but not limited to, people on remand, prisoners and those on probation? If so, you will need approval from the ethics approval system of the National Offender Management Service.
Answer: No

A.2 If your answer to any of the following questions (4 – 11) is YES, you must apply to the City University Senate Research Ethics Committee (SREC) for approval (unless you are applying to an external ethics committee). (Delete as appropriate.)

4. Does your project involve participants who are unable to give informed consent? For example, but not limited to, people who may have a degree of learning disability or mental health problem, that means they are unable to make an informed decision on their own behalf?
Answer: No

5. Is there a risk that your project might lead to disclosures from participants concerning their involvement in illegal activities?
Answer: No

6. Is there a risk that obscene and or illegal material may need to be accessed for your project (including online content and other material)?
Answer: No

7. Does your project involve participants disclosing information about sensitive subjects? For example, but not limited to, health status, sexual behaviour, political behaviour, domestic violence.
Answer: No

8. Does your project involve you travelling to another country outside of the UK, where the Foreign & Commonwealth Office has issued a travel warning? (See http://www.fco.gov.uk/en/)
Answer: No

9. Does your project involve physically invasive or intrusive procedures? For example, these may include, but are not limited to, electrical stimulation, heat, cold or bruising.
Answer: No

10. Does your project involve animals?
Answer: No

11. Does your project involve the administration of drugs, placebos or other substances to study participants?
Answer: No

A.3 If your answer to any of the following questions (12 – 18) is YES, you must submit a full application to the Computer Science Research Ethics Committee (CSREC) for approval (unless you are applying to an external ethics committee or the Senate Research Ethics Committee). Your application may be referred to the Senate Research Ethics Committee. (Delete as appropriate.)

12. Does your project involve participants who are under the age of 18?
Answer: No

13. Does your project involve adults who are vulnerable because of their social, psychological or medical circumstances (vulnerable adults)? This includes adults with cognitive and / or learning disabilities, adults with physical disabilities and older people.
Answer: No

14. Does your project involve participants who are recruited because they are staff or students of City University London? For example, students studying on a specific course or module. (If yes, approval is also required from the Head of Department or Programme Director.)
Answer: No

15. Does your project involve intentional deception of participants?
Answer: No

16. Does your project involve participants taking part without their informed consent?
Answer: No

17. Does your project pose a risk to participants or other individuals greater than that in normal working life?
Answer: No

18. Does your project pose a risk to you, the researcher, greater than that in normal working life?
Answer: No

A.4 If your answer to the following question (19) is YES and your answer to all questions 1 – 18 is NO, you must complete part B of this form.

19. Does your project involve human participants or their identifiable personal data? For example, as interviewees, respondents to a survey or participants in testing.
Answer: No

Appendix B: Data preparation (Excel)

Figure 32: COD 2013 anonymised data raw

Figure 33: CODA 2013 anonymised separated and altered


Figure 34: Adult Literacy, population 15+ years, in INDEPTH countries

Figure 35: GDP (in USD) in INDEPTH Countries


Figure 36: Translating country before merging data with boundary file


Appendix C: Google Fusion Tables

Figure 35: Fusion Table Layer Wizard


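Explanation of legend

| CoD | Numerical CoD code | Count of causes | Legend range | Color |
|---|---|---|---|---|
| 01.01 Sepsis (non-obstetric) | 10 | 296 | 10-14 | #a50026 |
| 01.02 Acute resp infect incl pneumonia | 11 | 4418 | 10-14 | #a50026 |
| 01.03 HIV/AIDS related death | 12 | 4354 | 10-14 | #a50026 |
| 01.04 Diarrhoeal diseases | 13 | 996 | 10-14 | #a50026 |
| 01.05 Malaria | 14 | 2924 | 10-14 | #a50026 |
| 01.06 Measles | 15 | 97 | 15-19 | #d73027 |
| 01.07 Meningitis and encephalitis | 16 | 479 | 15-19 | #d73027 |
| 01.08 & 10.05 Tetanus | 17 | 2 | 15-19 | #d73027 |
| 01.09 Pulmonary tuberculosis | 18 | 5005 | 15-19 | #d73027 |
| 01.10 Pertussis | 19 | 71 | 15-19 | #d73027 |
| 01.11 Haemorrhagic fever | 20 | 7 | 20-24 | #f46d43 |
| 01.99 Other and unspecified infect dis | 21 | 451 | 20-24 | #f46d43 |
| 02.01 Oral neoplasms | 22 | 76 | 20-24 | #f46d43 |
| 02.02 Digestive neoplasms | 23 | 1772 | 20-24 | #f46d43 |
| 02.03 Respiratory neoplasms | 24 | 1031 | 20-24 | #f46d43 |
| 02.04 Breast neoplasms | 25 | 105 | 25-29 | #fdae61 |
| 02.05 & 02.06 Reproductive neoplasms MF | 26 | 476 | 25-29 | #fdae61 |
| 02.99 Other and unspecified neoplasms | 27 | 753 | 25-29 | #fdae61 |
| 03.01 Severe anaemia | 28 | 102 | 25-29 | #fdae61 |
| 03.02 Severe malnutrition | 29 | 390 | 25-29 | #fdae61 |
| 03.03 Diabetes mellitus | 30 | 670 | 30-35 | #fee090 |
| 04.01 Acute cardiac disease | 31 | 717 | 30-35 | #fee090 |
| 04.03 Sickle cell with crisis | 32 | 109 | 30-35 | #fee090 |
| 04.02 Stroke | 33 | 2427 | 30-35 | #fee090 |
| 04.99 Other and unspecified cardiac dis | 34 | 1581 | 30-35 | #fee090 |
| 05.01 Chronic obstructive pulmonary dis | 35 | 725 | 30-35 | #fee090 |
| 05.02 Asthma | 36 | 429 | 36-41 | #ffffbf |
| 06.01 Acute abdomen | 37 | 1583 | 36-41 | #ffffbf |
| 06.02 Liver cirrhosis | 38 | 303 | 36-41 | #ffffbf |
| 07.01 Renal failure | 39 | 267 | 36-41 | #ffffbf |
| 08.01 Epilepsy | 40 | 182 | 36-41 | #ffffbf |
| 98 Other and unspecified NCD | 41 | 350 | 36-41 | #ffffbf |
| 10.06 Congenital malformation | 42 | 71 | 42-47 | #e0f3f8 |
| 10.01 Prematurity | 43 | 284 | 42-47 | #e0f3f8 |
| 10.02 Birth asphyxia | 44 | 388 | 42-47 | #e0f3f8 |
| 10.03 Neonatal pneumonia | 45 | 483 | 42-47 | #e0f3f8 |
| 10.04 Neonatal sepsis | 46 | 162 | 42-47 | #e0f3f8 |
| 10.99 Other and unspecified neonatal CoD | 47 | 246 | 42-47 | #e0f3f8 |
| 11.01 Fresh stillbirth | 48 | 0 | 48-54 | #abd9e9 |
| 11.02 Macerated stillbirth | 49 | 0 | 48-54 | #abd9e9 |
| 12.01 Road traffic accident | 50 | 691 | 48-54 | #abd9e9 |
| 12.02 Other transport accident | 51 | 38 | 48-54 | #abd9e9 |
| 12.03 Accid fall | 52 | 253 | 48-54 | #abd9e9 |
| 12.04 Accid drowning and submersion | 53 | 281 | 48-54 | #abd9e9 |
| 12.05 Accid expos to smoke, fire & flame | 54 | 99 | 48-54 | #abd9e9 |
| 12.06 Contact with venomous plant/animal | 55 | 68 | 55-60 | #74add1 |
| 12.10 Exposure to force of nature | 56 | 12 | 55-60 | #74add1 |
| 12.07 Accid poisoning and noxious subs | 57 | 33 | 55-60 | #74add1 |
| 12.08 Intentional self-harm | 58 | 314 | 55-60 | #74add1 |
| 12.09 Assault | 59 | 12 | 55-60 | #74add1 |
| 12.99 Other and unspecified external CoD | 60 | 136 | 55-60 | #74add1 |
| 09.01 Ectopic pregnancy | 61 | 8 | 61-66 | #4575b4 |
| 09.02 Abortion-related death | 62 | 25 | 61-66 | #4575b4 |
| 09.03 Pregnancy-induced hypertension | 63 | 44 | 61-66 | #4575b4 |
| 09.04 Obstetric haemorrhage | 64 | 91 | 61-66 | #4575b4 |
| 09.05 Obstructed labour | 65 | 8 | 61-66 | #4575b4 |
| 09.06 Pregnancy-related sepsis | 66 | 80 | 61-66 | #4575b4 |
| 09.07 Anaemia of pregnancy | 67 | 16 | 67-71 | #313695 |
| 09.08 Ruptured uterus | 68 | 2 | 67-71 | #313695 |
| 09.99 Other and unspecified maternal CoD | 69 | 34 | 67-71 | #313695 |
| 99 Indeterminate | 70 | 21406 | 67-71 | #313695 |
| XX Va not completed | 71 | 6525 | 67-71 | #313695 |

Table 7: CoD and the according color hue VA Map Legend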
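The legend ranges in Table 7 amount to a lookup from a numerical CoD code to a marker color. A minimal sketch of that lookup in JavaScript is given here for illustration. The names `LEGEND_BINS` and `colorForCode` are assumptions made for this example and are not taken from the project's HTML code; only the ranges and hex values come from Table 7.

```javascript
// Legend ranges and hex colors copied from Table 7.
// LEGEND_BINS and colorForCode are hypothetical names used only in this sketch.
const LEGEND_BINS = [
  { min: 10, max: 14, color: "#a50026" },
  { min: 15, max: 19, color: "#d73027" },
  { min: 20, max: 24, color: "#f46d43" },
  { min: 25, max: 29, color: "#fdae61" },
  { min: 30, max: 35, color: "#fee090" },
  { min: 36, max: 41, color: "#ffffbf" },
  { min: 42, max: 47, color: "#e0f3f8" },
  { min: 48, max: 54, color: "#abd9e9" },
  { min: 55, max: 60, color: "#74add1" },
  { min: 61, max: 66, color: "#4575b4" },
  { min: 67, max: 71, color: "#313695" },
];

// Return the legend color for a numerical CoD code,
// or null when the code lies outside the 10-71 range of Table 7.
function colorForCode(code) {
  const bin = LEGEND_BINS.find((b) => code >= b.min && code <= b.max);
  return bin ? bin.color : null;
}
```

For example, `colorForCode(18)` (pulmonary tuberculosis) falls in the 15-19 range and yields `#d73027`. Keeping the ranges in one array means the legend and the marker styling could, in principle, be driven from the same definition.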


Figure 36: Data Dictionary


Appendix D: HTML Code


Figure 37: HTML Code


Appendix E: Examples of filled in questionnaires


Appendix F: Table of Figures

Figure 1: Work Plan Individual Project
Figure 2: Verbal autopsy process and factors influencing cause-specific mortality fractions (Soleman et al., 2006)
Figure 3: Level of Measurement
Figure 4: Paper prototyping Part 1
Figure 5: Paper Prototyping Part 2
Figure 6: World borders (inc South Sudan) (Rogers, 2013a) displayed in Google Fusion
Figure 7: Merge of DEMO_LITERACY_UIS_DATA and World Boundaries incl. South Sudan
Figure 8: KML demographic surveillance site boundaries Google Earth
Figure 9: KML demographic surveillance site boundaries displayed in Google Fusion Tables
Figure 10: Causes of death data markers displayed in Google Fusion Tables
Figure 11: Cause of death polygons in India displayed before changing feature styles in Google Fusion Tables
Figure 12: Cause of death polygons in Bangladesh displayed in Google Fusion Tables
Figure 13: Creating multiple layers with Fusion Tables Layer Wizard
Figure 14: Legend on layered map
Figure 15: Website header
Figure 16: Link to website element
Figure 17: Legend style
Figure 18: Defining the Google Maps API JavaScript
Figure 19: Declaration of variables
Figure 20: Initialize function
Figure 21: Layer function
Figure 22: Select based search function
Figure 23: VA Map v1
Figure 24: VA Map v2
Figure 25: Geometrical concept of dispersing the data randomly within a polygon
Figure 26: Rectangle polygons within KML boundary of Burkina Faso created in Google Earth
Figure 27: Pivot table count of causes of death in INDEPTH countries
Figure 28: Distributed Markers in India
Figure 29: Geometrical view: Markers jittered according to RAND() formula and rectangle kml boundaries
Figure 30: Responses from the questionnaire Questions 1-3 and 6-8 Part 1
Figure 31: Responses from the questionnaire Question 4 Part 2
Figure 32: Classification of verbal autopsy interpretation methods (Leitao et al., 2014)
Figure 33: Work Breakdown Structure
Figure 34: Gantt Chart
Figure 35: Fusion Table Layer Wizard
Figure 36: Data Dictionary
Figure 37: HTML Code
