Deep Learning – Big Data
SMRT, Paris 2018

Kim Mouridsen, PhD, Associate Professor, Head of Neuroimaging Methods, Aarhus University

Disclosures: Cercare Medical (shares and salary)

Introduction

Deep Learning
▪ Buzz: artificial intelligence solutions surpassing human performance in most tasks
▪ Reality: an extended regression model, but with the potential to make many workflows more efficient

Big Data
▪ Buzz: massive, organized data volumes from interconnected devices, facilitating discovery of unimaginably complex or surprising relations
▪ Reality: use of other data that cannot necessarily be organized in a spreadsheet, such as images, free text and speech

Projections for AI in Healthcare
▪ By 2020, each person is expected to generate 1.7 MB of data per second
▪ Healthcare data projected to grow from 153 exabytes in 2013 to 2,314 exabytes in 2020
▪ Artificial intelligence projected to save 150 billion USD in 2026

Stanford Medicine 2017 Health Trends Report

1 Exabyte = 1 billion gigabytes

Deep Learning from Basic Principles

A very basic take on AI
Population (classical): measure a marker in groups A and B and ask whether the group difference is zero; this models the marker given the group.
Individual (’big data’): given a marker measurement, ask whether patient X belongs to group A or B; this models the group given the marker.

Traditional Regression
outcome = weight1•feature1
outcome = weight1•feature1 + weight2•feature2
outcome = weight1•feature1 + weight2•feature2 + … + weightK•featureK
[Figure: measurement plotted against the feature, with the fitted regression line]
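To make the regression slide concrete, here is a minimal sketch (not from the talk) of estimating the weights in outcome = weight1•feature1 + … + weightK•featureK by least squares; the data, weights and noise level are placeholder values.

```python
# Minimal least-squares sketch of the regression on the slide:
# outcome = weight1*feature1 + ... + weightK*featureK.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 100, 3
features = rng.normal(size=(n_samples, n_features))
true_weights = np.array([1.5, -2.0, 0.5])          # placeholder "true" weights
outcome = features @ true_weights + 0.1 * rng.normal(size=n_samples)

# Estimate the weights from data and predict the outcome.
weights, *_ = np.linalg.lstsq(features, outcome, rcond=None)
predicted = features @ weights
print("estimated weights:", weights)
```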

Classification
outcome = weight1•feature1 + weight2•feature2 + … + weightK•featureK
Probability = f(weight1•feature1 + weight2•feature2 + …), i.e. the weighted sum of the features is mapped to the probability of a category.
[Figure: category plotted against the feature]

One ’neuron’ is one model
Feature 1, Feature 2, …, Feature K are combined as ∑ weighti•featurei, and an activation is generated to produce the response.

Same as regression!
The neuron’s weighted sum followed by an activation is exactly the model above: Probability = f(weight1•feature1 + weight2•feature2 + …).
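As a concrete illustration of ’one neuron is one model’, the sketch below (not from the talk) combines the features as a weighted sum and generates a sigmoid activation; the sigmoid is one common choice of activation, assumed here, and it makes the neuron identical to the logistic classification model above.

```python
# One artificial 'neuron': combine features as sum_i weight_i * feature_i,
# then generate an activation (sigmoid) that can be read as a probability.
import numpy as np

def neuron(features, weights, bias=0.0):
    combined = np.dot(weights, features) + bias    # combine features
    return 1.0 / (1.0 + np.exp(-combined))         # generate activation

features = np.array([0.8, -1.2, 0.3])              # Feature 1 ... Feature K
weights = np.array([2.0, 1.0, -0.5])               # Weight 1 ... Weight K
print("response (probability):", neuron(features, weights))
```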

Why not combine two models?
Two neurons each combine the input features (∑ weighti•featurei) and generate an activation; their outputs together form the response.

Or create a hierarchy of models
A first layer of neurons combines the input features (Feature 1, Feature 2, …, Feature K) and generates activations; further neurons combine those activations, and a final neuron produces the response.

Critical: Layers generate features
The activations of the first layer act as automatically generated features (Auto Feature 1, 2, 3) that the next layer combines.

Critical: Layers generate features
Deeper layers combine these into higher-level auto features (Higher auto Feature 1, 2) before generating the response.
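The hierarchy of models can be sketched in a few lines; layer sizes and the random weights below are purely illustrative. Each layer turns its inputs into new ’auto features’ for the next layer, and the final neuron produces the response.

```python
# Forward pass through a tiny hierarchy of neurons: layers generate features.
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

features = rng.normal(size=5)                        # Feature 1 ... Feature K
W1 = rng.normal(size=(4, 5))                         # first layer: 4 neurons
W2 = rng.normal(size=(3, 4))                         # second layer: 3 neurons
w_out = rng.normal(size=3)                           # output neuron

auto_features = sigmoid(W1 @ features)               # layer 1: auto features
higher_auto_features = sigmoid(W2 @ auto_features)   # layer 2: higher auto features
response = sigmoid(w_out @ higher_auto_features)     # final response
print("response:", response)
```

In a trained network the weight matrices are learned from data rather than drawn at random; the structure of the forward pass is the same.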

Examples

Prediction with original features: regression on Feature 1 and Feature 2.

Invent a new feature: regression plus one engineered additional feature.

Invent a second feature: regression plus two engineered additional features.

Automatically establish features with a neural network: a ’deep’ neural network learns the additional features from Feature 1 and Feature 2 itself.

Clinical Value

Applications
▪ Image reconstruction
▪ Intelligent user interfaces
▪ Automatic reporting
▪ Disease detection
▪ Prediction of disease progression
▪ Prediction of response to treatment

Ischemic Stroke: Time is Brain
[Figure: acute imaging and outcome for Patient A and Patient B]
▪ Prolonged decision time
▪ Expert stroke physician unavailable at the time
▪ Stroke unit not available at the hospital

Patient Triaging
Perfusion/diffusion mismatch is the state-of-the-art approach for triaging patients to active or supportive treatment: the model separates permanently damaged from salvageable tissue.
However, mismatch volume detection requires a time-consuming and expert-dependent examination.
[Pipeline: scanner → perfusion and diffusion imaging → model for triaging → permanent vs. salvageable tissue]

Image Preprocessing and Manual Feature Engineering
Preprocessing: whole-brain mask, CSF mask, base slice elimination, coregistration, lesion laterality.
Feature engineering: PWI and DWI are thresholded to a mask, followed by grayscale morphological reconstruction, seed point detection, initial masks, morphological reconstruction, level sets and removal of the mirror component, yielding the penumbra.
Courtesy Kartheeban Nagenthiraja
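As a hedged sketch of one step of this kind of manual pipeline (assuming scikit-image and placeholder data; the threshold choice and seed coordinate are illustrative, not the published method): threshold a DWI slice and keep only the lesion component connected to a seed point via morphological (geodesic) reconstruction.

```python
# Threshold mask + seed-point-driven morphological reconstruction (sketch).
import numpy as np
from skimage.filters import threshold_otsu
from skimage.morphology import reconstruction

dwi = np.random.rand(128, 128)            # placeholder for a DWI slice
mask = dwi > threshold_otsu(dwi)          # threshold mask (binary)

seed = np.zeros_like(mask, dtype=float)   # seed image: zero everywhere ...
seed[64, 64] = mask[64, 64]               # ... except at a hypothetical seed point

# Geodesic dilation grows the seed inside the mask, keeping only the
# connected component that contains the seed point.
lesion = reconstruction(seed, mask.astype(float), method="dilation") > 0
print("lesion voxels:", lesion.sum())
```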

Automated Mismatch Identification
[Pipeline: scanner → perfusion and diffusion imaging → automated mismatch identification → permanent vs. salvageable tissue → follow-up]
Automated identification of salvageable tissue in 1 minute, working with routine vendor images. Validated in over 220 patients from 5 different countries and various scanner vendors: 93% sensitivity and 95% specificity relative to expert consensus, with a mean difference in mismatch volume of 4 ml between Cercare and experts.
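For reference, the quoted validation metrics can be computed voxel-wise as in the sketch below; the masks and voxel volume are placeholders, not the study data.

```python
# Voxel-wise sensitivity, specificity and mismatch-volume difference between
# an automated mask and an expert consensus mask (illustrative data).
import numpy as np

auto = np.random.rand(64, 64, 20) > 0.7       # automated mismatch mask (placeholder)
expert = np.random.rand(64, 64, 20) > 0.7     # expert consensus mask (placeholder)
voxel_ml = 0.008                              # assumed voxel volume, e.g. 2x2x2 mm

tp = np.logical_and(auto, expert).sum()
tn = np.logical_and(~auto, ~expert).sum()
fp = np.logical_and(auto, ~expert).sum()
fn = np.logical_and(~auto, expert).sum()

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
volume_difference_ml = (auto.sum() - expert.sum()) * voxel_ml
print(sensitivity, specificity, volume_difference_ml)
```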

Deep Learning Hierarchical Categorization of visual stimuli in the Brain


Image Classification – the AI Big Bang
ImageNet database: 1,000,000 images in 1,000 categories.
On this benchmark, deep networks became better than the best human-engineered solutions and eventually better than human performance.

Deep Computational Architecture
Input images pass through a deep network to produce an estimated outcome, which is compared with the actual outcome during training.
CED (convolutional encoder-decoder) picture: Badrinarayanan et al., 2015
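A tiny sketch of a convolutional encoder-decoder of this general kind is shown below; this is an assumption based on the SegNet-style reference, and the channel counts, depth and training step are illustrative rather than the architecture actually used.

```python
# Minimal convolutional encoder-decoder: downsample the input, upsample back
# to a voxel-wise estimated outcome, compare against the actual outcome.
import torch
import torch.nn as nn

class TinyEncoderDecoder(nn.Module):
    def __init__(self, in_channels=1, out_channels=1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                                     # downsample
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),  # upsample
            nn.ConvTranspose2d(16, out_channels, 2, stride=2),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyEncoderDecoder()
acute = torch.randn(1, 1, 64, 64)        # acute imaging input (placeholder)
outcome = torch.rand(1, 1, 64, 64)       # actual outcome (placeholder)
loss = nn.functional.binary_cross_entropy_with_logits(model(acute), outcome)
loss.backward()
```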

Identification of Irreversibly Damaged Tissue
N=847 acute ischemic stroke patients from three different studies; training data: 719 patients (85%). Inputs: DWI, ADC, T2-FLAIR.
Scanners: GE Signa Genesis, HDx 1.5 and 3.0T, Excite 1.5T and 3T; Siemens Avanto and Sonata; Philips Intera, Gyroscan and Achieva.
[Figure: estimated volume (ml) versus expert-based volume (ml), on linear (0–250 ml) and logarithmic (0.1–50 ml) scales, for the deep neural network and for an expert]
Courtesy Anne Nielsen, CFIN

Step 1: Nonlinear ASL signal denoising using non-local means (NLM) and a multi-contrast guided filter.
The Nex=1 low-SNR raw ASL image (w_ASL = ∞, the original reconstruction) is denoised at several regularization levels (w_ASL = 100, 50, 20, 10, 5), with smaller w_ASL corresponding to more regularization from the nonlinear denoising, and is shown alongside the Nex=6 high-SNR reference ASL.

Step 2: Generate patches from the high-SNR reference ASL (Nex=6), the low-SNR raw ASL (Nex=1), the denoised ASL at different w_ASL levels and anatomical MR images (T2-weighted FSE and PD-weighted).
The multi-contrast input patches feed a deep network with bypass connections and multiple layers.
Slide courtesy Greg Zaharchuk

Step 3: Train a deep convolutional-deconvolutional neural network to learn the nonlinear image restoration from the multi-contrast patches. The cost function compares the network output with the high-SNR reference; the output is the restored high-SNR image.
Gong, Pauly, Zaharchuk, Stanford, Proc ISMRM 2017
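The patch-based training of Steps 2–3 can be sketched as follows, assuming PyTorch; the patch size, number of contrasts, the tiny stand-in network and the L2 cost are illustrative, not the published convolutional-deconvolutional architecture.

```python
# Multi-contrast patches in, high-SNR reference patches out, L2 cost.
import torch
import torch.nn as nn

patch, n_contrasts = 32, 8                      # e.g. raw + denoised levels + T2w + PDw
inputs = torch.randn(16, n_contrasts, patch, patch)   # multi-contrast input patches
reference = torch.randn(16, 1, patch, patch)          # matching high-SNR ASL patches

net = nn.Sequential(                            # stand-in for the conv-deconv network
    nn.Conv2d(n_contrasts, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

cost = nn.functional.mse_loss(net(inputs), reference)  # compare output vs. reference
cost.backward()
optimizer.step()
```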

Improved Perfusion MRI
The deep learning model takes the low-SNR ASL, T2-weighted and proton density images and produces a synthetic ASL image, which is compared with the high-SNR ASL.
The normalized root-mean-squared error (RMSE) against the high-SNR reference drops from 29% for the low-SNR input to 10% for the synthetic ASL: a 3-fold RMSE improvement together with a 4-fold reduction in acquisition time. Error maps are shown versus the high-SNR ASL.
Slide courtesy Greg Zaharchuk
Gong, Pauly, Zaharchuk, Stanford, Proc ISMRM 2017
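For clarity, a normalized RMSE of the kind quoted here can be computed as in this sketch; the images are placeholders, and normalizing by the reference signal range is one common convention, assumed here.

```python
# Normalized root-mean-squared error between synthetic and reference ASL.
import numpy as np

reference = np.random.rand(64, 64)                   # high-SNR ASL (placeholder)
synthetic = reference + 0.05 * np.random.randn(64, 64)

rmse = np.sqrt(np.mean((synthetic - reference) ** 2))
nrmse = rmse / (reference.max() - reference.min())   # normalized RMSE
print(f"normalized RMSE: {100 * nrmse:.1f}%")
```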

Super-Resolution for 3D Neuroimaging
Deep learning model:
• 3x resolution improvement
• Better diagnostic quality
• More confident clinical decisions
Slide courtesy Greg Zaharchuk
Chaudhari, Gong, Subtle Medical, Proc Nvidia GTC 2018

Types of Artificial Intelligence
Within artificial intelligence, machine learning comprises:
Supervised learning (task-driven), for instance: image reconstruction, disease identification, prediction of treatment response.
Unsupervised learning (data-driven), for instance: segmentation, anomaly detection, sub-group identification.

Unsupervised Learning – See the Pattern?
K-Means Clustering
[Figure: cluster assignments from the starting configuration through iterations 1–8]
Immediate application: tissue segmentation
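A minimal k-means sketch for the tissue-segmentation use case, assuming scikit-learn; the single intensity feature and the choice of three clusters are illustrative.

```python
# K-means clustering of voxel intensities into tissue classes.
import numpy as np
from sklearn.cluster import KMeans

image = np.random.rand(128, 128)               # placeholder intensity image
features = image.reshape(-1, 1)                # one feature (intensity) per voxel

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
segmentation = labels.reshape(image.shape)     # voxel-wise tissue class map
print(np.bincount(labels))                     # voxels per tissue class
```

Multi-contrast inputs work the same way: stack one column per contrast in the feature matrix.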

Second Application: Arterial Supply
1. Input: DCE-MRI (arteries and veins; abdominal DCE-MRI)
2. Cluster analysis
3. Vessels found
Agreement with experts is comparable to the agreement between experts (automated vs. expert inter-rater agreement).
Mouridsen et al., MRM 2006

Can Deep Learning Sneak Beyond Human Performance?

WO2012079593, WO2014044284

The Challenge with Mismatch
[Perfusion and diffusion imaging → expert mismatch assessment → follow-up]
The mismatch concept is likely too simplistic to predict actual tissue outcome; accurate prediction of progression should be ’learned’ from the development observed in all previous patients.
WO2014044284

Learning From Every Previous Patient
Perfusion and diffusion imaging, expert mismatch assessments and follow-up scans from previous patients are collected in a database, which is used to compute an individual prediction: a voxel-wise predicted risk map for the new patient.

Imaging Biomarkers and Patients
Biomarkers (acute and follow-up):
▪ Mean transit time (MTT)
▪ Cerebral blood volume (CBV)
▪ Cerebral blood flow (CBF)
▪ Cerebral metabolism of oxygen (CMRO2): oxygen availability
▪ Relative transit time heterogeneity (RTH)
▪ Time-point of the maximum of the residue function (Tmax)
▪ Diffusion-weighted imaging (DWI)
▪ Apparent diffusion coefficient (ADC)
▪ T2 FLAIR

Studies:
▪ IKnow multicenter study (Denmark, UK, France, Germany, Spain) – IKNOW, 2006
▪ Remote Ischemic Perconditioning Trial – Hougaard et al., Stroke 2013
Scanners: Philips Gyroscan NT, Intera 1.5T, Achieva 3T; GE Signa Excite, Signa HDx, Signa Genesis 1.5T, Signa Excite 3.0T; Siemens Avanto, Sonata 1.5T, TrioTim 3.0T

Data Sources
[Deep learning model: predicted outcome compared with actual outcome]
Courtesy Anne Nielsen, Cercare Medical

Example Case
[Figure: acute biomarker maps (MTT, CBV, CBF, CMRO2, RTH, Tmax, DWI, ADC, T2-FLAIR) and predicted risk maps (scale 0.0–1.0) from a generalized linear model and from a deep neural network, compared with the follow-up lesion]
Courtesy Anne Nielsen, Cercare Medical
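As a rough sketch of the generalized-linear-model baseline in this comparison, assuming a voxel-wise logistic regression with scikit-learn; the arrays are random placeholders, not the study data, and the actual model follows Nielsen et al.

```python
# Voxel-wise GLM: biomarker values per voxel in, follow-up infarct risk out.
import numpy as np
from sklearn.linear_model import LogisticRegression

n_voxels, n_biomarkers = 10000, 9              # 9 biomarkers: MTT ... T2-FLAIR
X = np.random.randn(n_voxels, n_biomarkers)    # biomarker values per voxel
y = np.random.rand(n_voxels) > 0.8             # follow-up infarct label per voxel

glm = LogisticRegression(max_iter=1000).fit(X, y)
risk = glm.predict_proba(X)[:, 1]              # voxel-wise predicted infarct risk
print("mean predicted risk:", round(risk.mean(), 3))
```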

Example Cases

WO2014044284

Optimal Treatment Decisions?
[Diagram: acute patient + database → possible treatments and predicted outcomes, e.g. thrombolysis: high risk; stent: low risk]
Instantly predict the outcome of different treatments for the individual patient, based on imaging and clinical data from all previous patients, and improve the prediction with every new patient.

Optimal Treatment Decision Cases
[Predicted outcomes under active therapy vs. conservative treatment]
Courtesy Anne Nielsen, Cercare Medical
WO2014044284

Differentiating Disease Progression with Alternative Treatment Options
[Figure: predicted outcome with and without rtPA versus follow-up; Patient A shows a treatment effect, Patient B shows no treatment effect]
Incorporate therapeutic options into the predictive model: the model differentiates outcomes and can guide intervention by estimating the response to treatment.
Nielsen et al., in submission

Remote ischemic perconditioning trial:
▪ N=443 stroke patients
▪ Intervention: remote ischemic perconditioning in the ambulance, hypothesized to promote neuroprotection
▪ Three-month clinical outcome was not significantly different
▪ ’Big-data’ analysis at the voxel level revealed significant differences in progression
[Figure: voxel-level progression in controls vs. perconditioning]
Hougaard et al., Stroke 2013
