Talk Slides

Visualizing public health data for communicable disease management and control
Anamaria Crisan
PhD Candidate, Computer Science
The University of British Columbia

Data visualization in the current GenEpi paradigm of scientific research = Communication

2

Do you have an Outbreak?

Yes. Do all the Science! Inform the masses!

No. Duh. But you want to Monitor, right?

3 https://www.ratbotcomics.com/comics/pgrc_2014/1/1.html

Do you have an Outbreak?

Yes. Do all the Science! Inform the masses!

No. Duh. But you want to Monitor, right?

Inform the masses: Infographics are pretty. Maybe data Visualization?

Did it work?

No : ( Different Infographics? Maybe data Visualization?

Yes! (maybe?) Declare Victory

7

Challenge : Multiple Alternatives
• Many different visualization designs
• Design impacts data interpretation
• How to choose which is best?
  • Feelings (ad hoc)
  • Impressions (ad hoc)
  • Systematic assessment (lacking)

Data Visualization!

8

Challenge : Multiple Alternatives
OPTION A
OPTION B
OPTION C
• Same objective (understand treatment efficacy), same data, different visualizations
• Tested accuracy, timeliness, and preference with 2,038 participants
• Option A was most accurate, easier (faster) to read, and preferred
www.vishealth.org

9

Challenge : Multiple Alternatives

10

Lack of Systematic Thinking about Data Visualization

Bioinformatics Methods
§ Peer-reviewed, systematic approaches
§ Automated systems & packages
§ Benchmark comparisons / evaluation
§ Attempts to standardize
§ Formal instruction
§ Community dialogue
  § papers, reviews, blog posts

GenEpi Data Visualization
§ Ad hoc – visualization mainly by intuition, trial & error
§ Some automated systems & packages
§ No real comparison / evaluation
§ No attempts to standardize
§ No formal instruction
§ Some community dialogue
  § blog posts (kind of), Twitter

11

Introducing GEviT

12

Introducing GEviT
§ GEviT = Genomic Epidemiology Visualization Typology
§ A way to describe data visualization for analysis
§ Organizes qualitative descriptors into a typology
§ What does GEviT do and not do?

GEviT provides a base
§ Deliverables : 1. Typology 2. Interactive Gallery
§ Preliminary
§ Systematic Completion: Fall 2017

GEviT does not evaluate
§ Massive undertaking that would take many years
§ Needs GEviT to conduct evaluations
§ Evaluation Completion: Fall 2117?

14



How does GEviT do this?
§ Create a why-what-how typology
§ Uses methods from qualitative & infovis research
§ Typology, not ontology : typologies are lighter weight than ontologies
  § May be used in conjunction with an ontology
  § Blame epistemologists for their systems of classification

§ Why are data being visualized?
  § e.g. show transmission in a hospital

§ What data are being visualized?
  § e.g. patient location, duration in hospital, test outcomes, SNPs, clusters

§ How are data being visualized?
  § e.g. timeline, phylogenetic tree (high-level)
  § e.g. test outcome = shape; patient location = colour; cluster = spatial arrangement (low-level)

20
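
The why-what-how description can be written down as a small record. The following is a minimal sketch in Python, not GEviT's actual implementation: the class name `VisDescription` and its fields (`why`, `what`, `how_high`, `how_low`) are hypothetical, and the values come from the hospital transmission example on this slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VisDescription:
    """One visualization described by why, what, and how (hypothetical structure)."""

    why: str                  # purpose of the visualization
    what: list[str]           # data being shown
    how_high: list[str]       # high-level how: chart types
    how_low: dict[str, str] = field(default_factory=dict)  # low-level how: data -> encoding

    def unencoded(self) -> list[str]:
        """Return 'what' items that have no low-level encoding recorded."""
        return [item for item in self.what if item not in self.how_low]


# Values taken from the slide's example; the key spellings are illustrative.
example = VisDescription(
    why="show transmission in a hospital",
    what=["patient location", "duration in hospital", "test outcomes", "SNPs", "clusters"],
    how_high=["timeline", "phylogenetic tree"],
    how_low={
        "test outcomes": "shape",
        "patient location": "colour",
        "clusters": "spatial arrangement",
    },
)

print(example.unencoded())
```

Keeping the low-level how as a mapping from data item to visual encoding makes it easy to ask which data items a design leaves without an explicit encoding.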

GEviT Development

21

GEviT Development (Example)
WHY: Show within-hospital transmission
OPTION A
OPTION B

22

GEviT Development (Example)
Same why, different high-level how
OPTION A
HOW (high-level): Timeline
OPTION B
HOW (high-level): Timeline
HOW (high-level): Phylogeny
HOW (high-level): Node-link graph

23

GEviT Development (Example)
Same why, same what, same how
OPTION A
WHAT: Location [ HOW: Colour ]
OPTION B
WHAT: Location [ HOW: Colour ]

24

GEviT Development (Example)
Same why, sameish what, different how
OPTION A
WHAT: Test Performed [ HOW: Glyph ]
OPTION B
WHAT: Test Performed [ HOW: Line ]
WHAT: Test Result [ HOW: Colour ]

25

GEviT Development (Example)
Same why, different what and how
OPTION A
OPTION B
WHAT: Clusters [ HOW: Colour ]

26

GEviT Development (Example)
Same why, different what and how
OPTION A
OPTION B
WHAT: Transmission Confidence [ HOW: Colour ]

27

GEviT Development (Example)
How is what you see

What (high-level): Admin, Genomic, Spatial, Laboratory, Temporal, Transmission
What (low-level): Patient ID, P. Sample ID, SNP Distance, Clusters, Location, Test Performed, Test Result, Admission Date, Episode Duration, Test Date, Transmission Confidence
How (high-level): Timeline, Phylogeny, Node-link graph
How (low-level): Annotation; Annotation, Colour; Annotation, Position; Position, Colour; Colour; Glyph; Line, Colour; Position; Bar
Options: A, B (X marks the option that uses an encoding)

28
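
Once two designs are described with the same what-to-how vocabulary, comparing them becomes a mechanical step. The sketch below is an illustration in Python rather than part of GEviT: the function name `compare_encodings` is hypothetical, and the two dictionaries hold only the encodings spelled out on the earlier example slides (Location, Test Performed, Test Result), not the full table.

```python
def compare_encodings(a: dict, b: dict) -> dict:
    """Group 'what' items by how two designs encode them."""
    both = a.keys() & b.keys()
    return {
        "same how": sorted(k for k in both if a[k] == b[k]),
        "different how": sorted(k for k in both if a[k] != b[k]),
        "only in A": sorted(a.keys() - b.keys()),
        "only in B": sorted(b.keys() - a.keys()),
    }


# what -> low-level how, as listed on the example slides
option_a = {"Location": "Colour", "Test Performed": "Glyph"}
option_b = {"Location": "Colour", "Test Performed": "Line", "Test Result": "Colour"}

result = compare_encodings(option_a, option_b)
for label, items in result.items():
    print(f"{label}: {items}")
```

The grouping mirrors the slide titles: "same what, same how" items land in the first group, and "same what, different how" items in the second.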


What is GEviT’s status?
Systematic Portion of Project
Anticipated Completion: Fall 2017

Prototype Gallery: https://amcrisan.shinyapps.io/gevit_gallery_prototype/

34

GEviT Takeaways
§ Data visualization rigor is possible
§ Rigor is also desirable
  § Analytic, rather than ad hoc, process to data visualization
§ Rigor will be necessary
  § Clinical application of GenEpi
  § Ensuring visualization is interpretable
§ GEviT provides a basis
  § Evolving framework
  § Enables evaluation

Data Visualization!

35

The Bigger Picture
Same stats, different graphs

Autodesk Research (2017). Same Stats, Different Graphs: https://www.autodeskresearch.com/publications/samestats
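
The point of this slide can be checked numerically. The sketch below uses Anscombe's quartet, a classic public dataset, as a stand-in; it is not the data from the cited Autodesk Research work. It computes the mean and sample variance of the y-values in each of the four sets. The summaries agree to one decimal place even though scatter plots of the four sets look very different.

```python
from statistics import mean, variance

# y-values of Anscombe's quartet (stand-in data, not the Autodesk datasets)
ANSCOMBE_Y = {
    "I": [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68],
    "II": [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74],
    "III": [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73],
    "IV": [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89],
}

# Summarize each set by (mean, sample variance), rounded to one decimal place.
summary = {
    name: (round(mean(ys), 1), round(variance(ys), 1))
    for name, ys in ANSCOMBE_Y.items()
}

for name, (m, v) in summary.items():
    print(f"set {name}: mean={m}, variance={v}")
```

Summary statistics alone cannot tell these four sets apart, which is the argument for plotting the data before trusting the numbers.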

36

Contact
@amcrisan
http://cs.ubc.ca/~acrisan

Thanks! Dr. Gardy, Dr. Munzner, & the UBC infovis group

37

Extra Slides 38

Missed EDA Opportunity
• EDA = Exploratory Data Analysis
• Incorporate data visualization as a part of scientific process
  • Science != just statistics + wetlab
  • Visualization != just pretty pictures
• Visualization can …
  • Check methodological assumptions
  • Show unanticipated patterns
  • Generate new hypotheses

Do all the Science!
Data Visualization!
Inform the masses!

39

How does GEviT add Rigor to Data Visualization in GenEpi?

§ Ad hoc – visualization mainly by intuition, trial & error

GEviT introduces design alternatives, systematic reasoning

§ No real comparison / evaluation

GEviT enables comparison for design & evaluation

§ No attempts to standardize

GEviT standardizes descriptions of elements in visualization design

§ No formal instruction

GEviT bridges GenEpi with infovis pedagogy

§ Some community dialogue

GEviT is intended to start a bigger dialogue about data visualization practices

40

Design Trajectory of GEviT
Gallery: Treevis.net, Setviz.net, Vishealth.org
EXAMPLE: Info about vis

41