Visualizing public health data for communicable disease management and control
Anamaria Crisan
PhD Candidate, Computer Science
The University of British Columbia
Data visualization in the current GenEpi paradigm of scientific research = Communication
[Flowchart, built up over several slides. Source comic: https://www.ratbotcomics.com/comics/pgrc_2014/1/1.html]
Do you have an Outbreak?
  Yes. -> Do all the Science!
  No. -> But you want to Monitor right? -> Duh.
Do all the Science! -> Inform the masses!
Inform the masses! -> Maybe data Visualization? (Infographics are pretty)
Did it work?
  Yes! (maybe?) -> Declare Victory
  No :( -> Different Infographics?
Challenge: Multiple Alternatives
• Many different visualization designs
• Design impacts data interpretation
• How to choose which is best?
  • Feelings (ad hoc)
  • Impressions (ad hoc)
  • Systematic assessment (lacking)
Data Visualization!
Challenge: Multiple Alternatives
OPTION A, OPTION B, OPTION C
• Same objective (understand treatment efficacy), same data, different visualizations
• Tested accuracy, timeliness, and preference with 2,038 participants
• Option A was the most accurate, the easiest (fastest) to read, and the most preferred
www.vishealth.org
Lack of Systematic Thinking about Data Visualization

Bioinformatics Methods                    GenEpi Data Visualization
§ Peer-reviewed, systematic approaches    § Ad hoc: visualization mainly by intuition, trial & error
§ Automated systems & packages            § Some automated systems & packages
§ Benchmark comparisons / evaluation      § No real comparison / evaluation
§ Attempts to standardize                 § No attempts to standardize
§ Formal instruction                      § No formal instruction
§ Community dialogue                      § Some community dialogue
  § papers, reviews, blog posts             § blog posts (kind of), twitter
Introducing GEviT
§ GEviT = Genomic Epidemiology Visualization Typology
§ A way to describe data visualization for analysis
§ Organizes qualitative descriptors into a typology
§ What does GEviT do and not do?

GEviT provides a base
§ Deliverables: 1. Typology 2. Interactive Gallery

GEviT does not evaluate
§ Massive undertaking that would take many years
§ Needs GEviT to conduct evaluations

Preliminary
Systematic Completion: Fall 2017
Evaluation Completion: Fall 2117?
How does GEviT do this?
§ Create a why-what-how typology
§ Uses methods from qualitative and infovis research
§ Typology, not ontology: typologies are lighter weight than ontologies
§ May be used in conjunction with an ontology
§ Blame epistemologists for their systems of classification

§ Why are data being visualized?
  § e.g. show transmission in a hospital
§ What data are being visualized?
  § e.g. patient location, duration in hospital, test outcomes, SNPs, clusters
§ How are data being visualized?
  § e.g. timeline, phylogenetic tree (high-level)
  § e.g. test outcome = shape; patient location = colour; cluster = spatial arrangement (low-level)
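The why-what-how description above could be captured as a small data structure. The following is only a minimal sketch: the class and field names (`Encoding`, `VisDescription`, `how_high_level`) are hypothetical and not part of GEviT itself; the values are the hospital-transmission example from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Encoding:
    """One low-level 'how': a data attribute mapped to a visual channel."""
    what: str     # the data being shown, e.g. "patient location"
    channel: str  # the visual channel used, e.g. "colour"


@dataclass
class VisDescription:
    """Hypothetical why-what-how record for a single visualization."""
    why: str                   # purpose of the visualization
    how_high_level: List[str]  # chart types, e.g. "timeline"
    encodings: List[Encoding] = field(default_factory=list)

    def what(self) -> List[str]:
        """The 'what' is derived from the data attributes that are encoded."""
        return sorted({e.what for e in self.encodings})


# Values taken from the example on the slide.
example = VisDescription(
    why="show transmission in a hospital",
    how_high_level=["timeline", "phylogenetic tree"],
    encodings=[
        Encoding("test outcome", "shape"),
        Encoding("patient location", "colour"),
        Encoding("cluster", "spatial arrangement"),
    ],
)

print(example.why)
print(example.what())
```

Keeping the low-level "how" as explicit (what, channel) pairs means two visualizations with the same "why" can later be compared attribute by attribute.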
GEviT Development
GEviT Development (Example)
WHY: Show within-hospital transmission (OPTION A vs. OPTION B)

Same why, different high-level how
OPTION A: HOW (high-level): Timeline; HOW (high-level): Phylogeny
OPTION B: HOW (high-level): Timeline; HOW (high-level): Node-link graph

Same why, same what, same how
OPTION A: WHAT: Location [HOW: Colour]
OPTION B: WHAT: Location [HOW: Colour]

Same why, sameish what, different how
OPTION A: WHAT: Test Performed [HOW: Glyph]
OPTION B: WHAT: Test Performed [HOW: Line]; WHAT: Test Result [HOW: Colour]

Same why, different what and how (OPTION A vs. OPTION B)
WHAT: Clusters [HOW: Colour]
WHAT: Transmission Confidence [HOW: Colour]
GEviT Development (Example)
How is what you see

What (high-level)  What (low-level)         How (high-level)  How (low-level)       Option A  Option B
Admin              Patient ID               Timeline          Annotation            X         X
Genomic            P. Sample ID             Phylogeny         Annotation, Colour    X
Genomic            SNP Distance             Phylogeny         Annotation, Position  X
Genomic            Clusters                 Phylogeny         Position, Colour      X
Spatial            Location                 Timeline          Colour                X         X
Laboratory         Test Performed           Timeline          Glyph                 X
Laboratory         Test Performed           Timeline          Line, Colour                    X
Laboratory         Test Result              Timeline          Colour                          X
Temporal           Admission Date           Timeline          Position              X         X
Temporal           Episode Duration         Timeline          Bar                   X         X
Temporal           Test Date                Timeline          Position              X         X
Transmission       Transmission Confidence  Node-link graph   Colour                          X
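Once two designs are described with the same what/how vocabulary, comparing them becomes a mechanical step. The sketch below is illustrative only: the dictionaries are a hand transcription of the example table, the assignment of marks to Option A and Option B is an assumption based on the preceding example slides, and the `compare` helper is hypothetical rather than part of GEviT.

```python
# Low-level WHAT -> low-level HOW for each option.
# Assumption: transcribed by hand from the example table; the A/B
# assignment is inferred from the earlier example slides.
OPTION_A = {
    "Patient ID": "Annotation",
    "P. Sample ID": "Annotation, Colour",
    "SNP Distance": "Annotation, Position",
    "Clusters": "Position, Colour",
    "Location": "Colour",
    "Test Performed": "Glyph",
    "Admission Date": "Position",
    "Episode Duration": "Bar",
    "Test Date": "Position",
}
OPTION_B = {
    "Patient ID": "Annotation",
    "Location": "Colour",
    "Test Performed": "Line, Colour",
    "Test Result": "Colour",
    "Admission Date": "Position",
    "Episode Duration": "Bar",
    "Test Date": "Position",
    "Transmission Confidence": "Colour",
}


def compare(a, b):
    """Group the 'what' items by how the two designs treat them."""
    shared = set(a) & set(b)
    return {
        "same what, same how": sorted(w for w in shared if a[w] == b[w]),
        "same what, different how": sorted(w for w in shared if a[w] != b[w]),
        "only in first": sorted(set(a) - set(b)),
        "only in second": sorted(set(b) - set(a)),
    }


result = compare(OPTION_A, OPTION_B)
for label, whats in result.items():
    print(f"{label}: {', '.join(whats)}")
```

The four groups mirror the slide headings ("same why, same what, same how", and so on), which suggests how a typology could support side-by-side comparison of design alternatives without judging which design is better.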
What is GEviT’s status?
Systematic Portion of Project
Anticipated Completion: Fall 2017
Prototype Gallery: https://amcrisan.shinyapps.io/gevit_gallery_prototype/
GEviT Takeaways
§ Data visualization rigor is possible
  § Rigor is also desirable
  § Analytic, rather than ad hoc, process to data visualization
§ Rigor will be necessary
  § Clinical application of GenEpi
  § Ensuring visualization is interpretable
§ GEviT provides a basis
  § Evolving framework
  § Enables evaluation
Data Visualization!
The Bigger Picture
Same stats, different graphs
Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats
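A compact way to see "same stats, different graphs" in numbers is Anscombe's quartet, swapped in here as a smaller, classic illustration (it is not the dataset from the Autodesk publication). The sketch below computes summary statistics for the four datasets; the variable names are illustrative.

```python
from statistics import mean, variance

# Anscombe's quartet: four small datasets with near-identical summary
# statistics that look completely different when plotted.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# (mean x, sample variance x, mean y, sample variance y), rounded.
summary = {
    name: (
        round(mean(x), 2),
        round(variance(x), 2),
        round(mean(y), 2),
        round(variance(y), 1),
    )
    for name, (x, y) in QUARTET.items()
}

for name, stats in summary.items():
    print(name, stats)
```

All four rows print the same rounded statistics, yet a scatter plot of each dataset shows a linear trend, a curve, a line with one outlier, and a vertical cluster with one extreme point. Summary statistics alone would not reveal those differences.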
Contact
@amcrisan
http://cs.ubc.ca/~acrisan
Thanks! Dr. Gardy, Dr. Munzner, & the UBC infovis group

Extra Slides
Missed EDA Opportunity
• EDA = Exploratory Data Analysis
• Incorporate data visualization as a part of the scientific process
  • Science != just statistics + wetlab
  • Visualization != just pretty pictures
• Visualization can …
  • Check methodological assumptions
  • Show unanticipated patterns
  • Generate new hypotheses
[Flowchart labels: Do all the Science!; Data Visualization!; Inform the masses!]
How does GEviT add Rigor to Data Visualization in GenEpi?
§ Ad hoc: visualization mainly by intuition, trial & error
  GEviT introduces design alternatives, systematic reasoning
§ No real comparison / evaluation
  GEviT enables comparison for design & evaluation
§ No attempts to standardize
  GEviT standardizes descriptions of elements in visualization design
§ No formal instruction
  GEviT bridges GenEpi with infovis pedagogy
§ Some community dialogue
  GEviT is intended to start a bigger dialogue about data visualization practices
Design Trajectory of GEviT Gallery:
Treevis.net
Setviz.net
Vishealth.org
EXAMPLE: Info about vis