Deep Learning – Big Data SMRT, Paris 2018
Kim Mouridsen, PhD, Associate Professor, Head of Neuroimaging Methods Aarhus University
Disclosures: Cercare Medical, shares and salary
Introduction
Deep Learning
Buzz: Artificial intelligence solution surpassing human performance in most tasks
Reality: Extended regression model, but with potential to make many workflows more efficient
Big Data
Buzz: Massive organized data volumes from interconnected devices, facilitating discovery of unimaginably complex or surprising relations
Reality: Use of other data which cannot necessarily be organized in Excel, such as images, free text and speech
Projections for AI in Healthcare
• In 2020, each person is expected to generate 1.7 MB of data per second
• Increase in healthcare data from 153 exabytes in 2013 to 2314 exabytes in 2020
• Artificial intelligence projected to save 150 billion USD in 2026
Stanford Medicine 2017 Health Trends Report
1 Exabyte = 1 billion gigabytes
Deep Learning from Basic Principles
A very basic take on AI
Population (classical): marker given group. Question: Group difference = 0?
Individual ('Big data'): group given marker. Question: Is patient X in group A or B?
[Figure: marker values for groups A and B]
Traditional Regression
Regression (outcome is a measurement):
outcome = weight1•feature1
outcome = weight1•feature1 + weight2•feature2
outcome = weight1•feature1 + weight2•feature2 + … + weightK•featureK
Classification (outcome is a category):
Probability = 1 / (1 + exp(−weight1•feature1 − weight2•feature2 − …))
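The two formulas can be sketched in a few lines of Python. This is a minimal illustration, assuming the classification formula is the standard logistic function; the weights and feature values are hypothetical and not taken from the talk.

```python
import math

def weighted_sum(weights, features):
    """outcome = weight1*feature1 + ... + weightK*featureK"""
    return sum(w * f for w, f in zip(weights, features))

def probability(weights, features):
    """Classification: squash the weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-weighted_sum(weights, features)))

# Hypothetical weights and features, for illustration only.
weights = [0.5, -1.0, 2.0]
features = [2.0, 1.0, 0.5]

outcome = weighted_sum(weights, features)   # 0.5*2.0 - 1.0*1.0 + 2.0*0.5
risk = probability(weights, features)       # same sum, mapped to a probability
```

In practice the weights are estimated from data rather than chosen by hand; the sketch only shows how a prediction is formed once the weights are known.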
One 'neuron' is one model
Inputs: Feature 1, Feature 2, …, Feature K
Combine features: ∑ Weighti•Featurei
Generate activation → Response
Same as regression!
Probability = 1 / (1 + exp(−weight1•feature1 − weight2•feature2 − …))
Why not combine two models
[Diagram: Feature 1 … Feature K feed two neurons; each combines features (∑ Weighti•Featurei) and generates an activation; together they produce the Response]
Or create a hierarchy of models
[Diagram: Feature 1 … Feature K feed a first layer of neurons; their activations feed further neurons, each combining its inputs (∑ Weighti•Featurei) and generating an activation, ending in the Response]
Critical: Layers generate features
First layer: each neuron combines features (∑ Weighti•Featurei) and generates an activation, giving Auto Feature 1, Auto Feature 2, Auto Feature 3
Next layer: each neuron combines the auto features (∑ Weighti•Featurei) and generates an activation, giving Higher auto Feature 1, Higher auto Feature 2
Response: generated from the higher auto features
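A minimal sketch of this hierarchy, assuming logistic activations and no bias terms. The layer sizes (3 auto features, 2 higher auto features) follow the slide labels; all weights and input values are hypothetical.

```python
import math

def neuron(weights, inputs):
    """Combine features (weighted sum), then generate an activation (logistic)."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))

def layer(weight_rows, inputs):
    """A layer is several neurons that all look at the same inputs."""
    return [neuron(row, inputs) for row in weight_rows]

# Hypothetical weights: 2 features -> 3 auto features -> 2 higher auto features -> response.
layer1_weights = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
layer2_weights = [[1.0, 1.0, -1.0], [-0.5, 2.0, 0.5]]
output_weights = [1.5, -1.0]

features = [0.2, 0.7]
auto_features = layer(layer1_weights, features)
higher_auto_features = layer(layer2_weights, auto_features)
response = neuron(output_weights, higher_auto_features)
```

Each intermediate list is itself a set of features for the next layer, which is the point of the slide: the network builds its own features instead of having them engineered by hand.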
Examples
Prediction with original features: Regression
[Plot axes: Feature 1, Feature 2]
Invent a new feature: Regression + engineer additional feature
[Plot axes: Feature 1, Feature 2, Additional Feature 1]
Invent a second feature: Regression + invent additional feature + invent second feature
[Plot axes: Feature 1, Feature 2, Additional Feature 1, Additional Feature 2]
Automatically establish features with neural network: 'Deep' neural network
[Plot axes: Feature 1, Feature 2]
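Why an invented feature can help is easiest to see on a toy problem. The sketch below uses an XOR-like data set, not the data behind the plots in the talk, and assumes the additional feature is the product of the two original features. It searches a coarse grid of weights for a linear rule that classifies every sample correctly.

```python
from itertools import product

# XOR-like toy problem: the category is 1 when exactly one feature is "on".
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def classify(weights, bias, features):
    """Linear rule: category 1 if the weighted sum plus bias exceeds zero."""
    return int(sum(w * f for w, f in zip(weights, features)) + bias > 0)

def separable(feature_map, n_features):
    """Brute-force search over a coarse weight grid for a perfect linear rule."""
    grid = [g / 2 for g in range(-6, 7)]   # -3.0, -2.5, ..., 3.0
    for *weights, bias in product(grid, repeat=n_features + 1):
        if all(classify(weights, bias, feature_map(x)) == y for x, y in samples):
            return True
    return False

def original(x):
    """Feature 1 and Feature 2 only."""
    return x

def engineered(x):
    """Original features plus an invented one (assumed here: their product)."""
    return (x[0], x[1], x[0] * x[1])

original_ok = separable(original, 2)       # no linear rule on the original features
engineered_ok = separable(engineered, 3)   # e.g. weights (1, 1, -2) with bias -0.5
```

A neural network aims to find such an additional feature automatically instead of relying on someone to invent it.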
Clinical Value
Applications
• Image reconstruction
• Intelligent user interface
• Automatic reporting
• Disease detection
• Prediction of disease progression
• Prediction of response to treatment
Ischemic Stroke
Time is Brain
Patient A: Acute → Outcome
Patient B: Acute → Outcome
▪ Prolonged decision time
▪ Expert stroke physician unavailable at the time
▪ Stroke unit not available at hospital
Patient Triaging
Perfusion/diffusion mismatch: state-of-the-art triaging for active or supportive treatment
However, mismatch volume detection requires time-consuming and expert-dependent examination
[Diagram: SCANNER → IMAGING (Perfusion, Diffusion) → MODEL FOR TRIAGING → PERMANENT / SALVAGEABLE]
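As a rough illustration of the mismatch concept, the sketch below treats the mismatch as the voxels that show a perfusion deficit but no diffusion lesion. It assumes both lesions are already available as binary masks; the masks and the voxel size are hypothetical.

```python
def lesion_volume_ml(mask, voxel_volume_ml):
    """Volume of a binary lesion mask given as a flat list of 0/1 voxels."""
    return sum(mask) * voxel_volume_ml

def mismatch_mask(perfusion_mask, diffusion_mask):
    """Voxels with a perfusion deficit but no diffusion lesion."""
    return [int(p and not d) for p, d in zip(perfusion_mask, diffusion_mask)]

# Hypothetical 10-voxel example; real masks would be 3D volumes from PWI and DWI.
perfusion = [0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
diffusion = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]
voxel_ml = 0.008   # assumed 2 x 2 x 2 mm voxels = 8 mm^3 = 0.008 ml

mismatch = mismatch_mask(perfusion, diffusion)
mismatch_ml = lesion_volume_ml(mismatch, voxel_ml)
```

The hard part in practice is producing the two lesion masks, which is where the time-consuming expert examination comes in.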
Image preprocessing
Manual feature engineering
Whole brain mask, CSF mask, Base slice elimination, Coregistration, Lesion laterality
Courtesy Kartheeban Nagenthiraja
[Pipeline diagram, inputs PWI and DWI: Threshold mask; Grayscale morph. recon.; Seed point detection; Initial masks; Morph. reconstruction (twice); Level-sets; Removal of mirror comp.; output: Penumbra]
Automated Mismatch Identification
[Diagram: SCANNER → IMAGING (Perfusion, Diffusion) → Mismatch Identification (CFIN) → MODEL FOR TRIAGING → PERMANENT / SALVAGEABLE; Follow-Up]
▪ Automated identification of salvageable tissue in 1 minute
▪ Works with routine vendor images
▪ Validated in over 220 patients from 5 different countries and various scanner vendors
▪ 93% sensitivity and 95% specificity relative to expert consensus
▪ Mean difference in mismatch volume of 4 ml between Cercare and experts
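Sensitivity and specificity against an expert reference can be computed as in the sketch below. Whether the reported figures were obtained per patient or per voxel is not stated on the slide, and the labels here are hypothetical.

```python
def sensitivity_specificity(predicted, reference):
    """Compare binary calls (1 = positive) against a reference standard."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)   # true positives
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)   # true negatives
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)   # false positives
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)   # false negatives
    sensitivity = tp / (tp + fn)   # share of reference positives that were found
    specificity = tn / (tn + fp)   # share of reference negatives that were left alone
    return sensitivity, specificity

# Hypothetical calls for 8 cases: automated method vs. expert consensus.
automated = [1, 1, 1, 0, 0, 0, 1, 0]
experts = [1, 1, 0, 0, 0, 0, 1, 1]

sensitivity, specificity = sensitivity_specificity(automated, experts)
```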
Deep Learning Hierarchical Categorization of visual stimuli in the Brain
Image Classification – the AI Big Bang
ImageNet Database
• 1,000,000 images
• 1,000 categories
Better than human engineered solution
Better than human performance
Deep Computational Architecture
[Diagram: Input → convolutional encoder-decoder (CED) → Estimated Outcome, compared with Actual Outcome]
CED picture: Badrinarayanan et al., 2015
Identification of irreversibly damaged tissue
N=847 acute ischemic stroke patients from three different studies
Training data: 719 patients (85%)
Images: DWI, ADC, T2-FLAIR
Scanners: GE Signa Genesis, -HDx 1.5 and 3.0T, -Excite 1.5T and 3T. Siemens Avanto and Sonata. Philips Intera, Gyroscan and Achieva
[Plots: Estimated volume (ml) vs. Expert based volume (ml), shown on a 0 to 250 ml axis and on a 0.1 to 50 ml axis; example DWI outlines: Deep NN, Expert]
Courtesy Anne Nielsen, CFIN
Step 1: Nonlinear ASL signal denoising using Non-Local Means (NLM) and Multi-contrast Guided Filter
Nex=1: Low SNR ASL raw
Nex=6: High SNR Ref ASL
[Image series: original recon; w_ASL = ∞, 100, 50, 20, 10, 5: more regularization using nonlinear denoising]
Step 2: Generate patches from High-SNR Ref. ASL, Low-SNR raw ASL, multi-level denoised ASL and anatomical MR images
[Image panels: Nex=6 High SNR Ref ASL; Nex=1 Low SNR ASL; Denoised ASL with different w_ASL; T2w FSE; PDw]
Multi-contrast patches
Slide courtesy Greg Zaharchuk
Step 3: Training deep network to learn the nonlinear image restoration from multi-contrast patches
[Diagram: Input Patches → Deep Convolutional-Deconvolutional Neural Network (more layers, by-pass connections) → Output Patches; cost function: compare Output vs. Ref]
Output: restored high-SNR ref
Gong, Pauly, Zaharchuk/Stanford/Proc ISMRM 2017
Improved Perfusion MRI
Slide courtesy Greg Zaharchuk
Deep Learning Model: Low SNR ASL + T2 weighted + Proton density → Synthetic ASL (reference: High SNR ASL)
Error map vs High SNR: RMSE 29% (Low SNR ASL), RMSE 10% (Synthetic ASL)
4-fold time reduction + 3-fold RMSE improvement
Note: RMSE = Root-Mean-Squared Error (normalized)
Gong, Pauly, Zaharchuk/Stanford/Proc ISMRM 2017
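For reference, a normalized RMSE can be computed as in the sketch below. The slide does not say how the error was normalized; dividing by the intensity range of the reference image is an assumption, and the pixel values are hypothetical.

```python
import math

def normalized_rmse(image, reference):
    """RMSE between two images (flat lists), divided by the reference intensity range."""
    mse = sum((a - b) ** 2 for a, b in zip(image, reference)) / len(reference)
    return math.sqrt(mse) / (max(reference) - min(reference))

# Hypothetical 5-pixel "images".
reference = [0.0, 1.0, 2.0, 3.0, 4.0]   # stands in for the high-SNR image
noisy = [0.4, 0.6, 2.4, 2.6, 4.4]       # stands in for the low-SNR image
restored = [0.1, 0.9, 2.1, 2.9, 4.1]    # stands in for the network output

nrmse_noisy = normalized_rmse(noisy, reference)
nrmse_restored = normalized_rmse(restored, reference)
```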
Super-Resolution for 3D Neuroimaging
Deep Learning Model
• 3x resolution improvement
• Better diagnostic quality
• More confident clinical decision
Slide courtesy Greg Zaharchuk
Chaudhari, Gong/Subtle Medical/Proc Nvidia GTC 2018
Types of Artificial Intelligence
Artificial Intelligence
Machine Learning
Supervised Learning: task-driven, for instance
• Image reconstruction
• Disease identification
• Prediction of treatment response
Unsupervised Learning: data-driven, for instance
• Segmentation
• Anomaly detection
• Sub-group identification
Unsupervised Learning – See the Pattern?
K-Means Clustering
[Animation frames: Start, Iteration 1 through Iteration 8]
Immediate application: Tissue segmentation
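The iterations shown in the animation alternate between two steps: assign every point to its nearest cluster center, then move each center to the mean of its points. A minimal sketch on hypothetical 2D points, with hand-picked starting centers:

```python
def kmeans(points, centers, iterations=8):
    """Plain k-means on 2D points: alternate assignment and center update."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its points (keep it if empty).
        centers = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
    return centers, clusters

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
start = [(0.0, 0.0), (6.0, 6.0)]

centers, clusters = kmeans(points, start)
```

For tissue segmentation, the points would be voxel intensities (possibly from several image contrasts) rather than 2D coordinates.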
Second Application: Arterial Supply
1. Input: DCE-MRI (Arteries & Veins; Abdominal)
2. Cluster analysis
3. Vessels found
Inter-rater agreement (Automated vs. Experts): agreement with experts is comparable to agreement between experts
Mouridsen et al, MRM 2006
Can Deep Learning Sneak Beyond Human Performance?
WO2012079593, WO2014044284
The Challenge with Mismatch
[Diagram: Perfusion and Diffusion → Experts → Mismatch, compared with Follow-Up]
▪ Likely too simplistic to predict actual tissue outcome
▪ Accurate prediction of progression should be 'learned' from development in all previous patients
WO2014044284
Learning From Every Previous Patient
[Diagram: Database of previous patients (Perfusion, Diffusion, Experts' Mismatch, Follow-Up) → Individual Prediction → Predicted Risk]
Data Sources: Imaging Biomarkers and Patients
Biomarkers (acute and follow-up imaging)
• Mean transit time (MTT)
• Cerebral blood volume (CBV)
• Cerebral blood flow (CBF)
• Cerebral metabolism of oxygen (CMRO2): Oxygen availability
• Relative transit time heterogeneity (RTH)
• Time-point for the maximum of the residue function (Tmax)
• Diffusion Weighted Imaging (DWI)
• Apparent Diffusion Coefficient (ADC)
• T2 FLAIR
Patients
• IKnow multicenter study (IKNOW, 2006)
▪ Denmark
▪ UK
▪ France
▪ Germany
▪ Spain
• Remote Ischemic Perconditioning Trial (Hougaard et al., Stroke 2013)
Scanners
• Philips Gyroscan NT, Intera 1.5T, Achieva 3T
• GE Signa Excite, Signa HDx, Signa Genesis 1.5T, Signa Excite 3.0T
• Siemens Avanto, Sonata 1.5T, TrioTim 3.0T
Deep Learning Predicted Outcome
Actual Outcome
Courtesy Anne Nielsen, Cercare Medical
Example Case
[Image panels: MTT, CBV, CBF, CMRO2, RTH, Tmax, DWI, ADC, T2-FLAIR; predictions from Generalized Linear Model and Deep Neural Network (scale 0.0 to 1.0); Follow-up]
Courtesy Anne Nielsen, Cercare Medical
Example Cases
WO2014044284
Optimal Treatment Decisions?
[Diagram: Acute patient + Database → Possible treatments (Thrombolysis, Stent) → Predicted Outcome (High risk / Low risk!)]
▪ Instantly predict outcome with different treatments for individual patient, based on imaging and clinical data from all previous patients
▪ Improve prediction with every new patient
Optimal Treatment Decision Cases
[Image panels: Active Therapy, Conservative]
Courtesy Anne Nielsen, Cercare Medical
WO2014044284
Differentiating Disease Progression with Alternative Treatment Options
[Image panels for Patient A and Patient B: +rtPA, -rtPA, Follow Up, Treatment Effect]
▪ Incorporate therapeutic options into predictive model
▪ Model differentiates outcomes
▪ Guide intervention by estimating response to treatment
Nielsen et al. In submission
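One simple way to incorporate a therapeutic option into a predictive model is to add the treatment as an input feature and then evaluate the model twice, with and without treatment. The sketch below is an illustration only, not the method of Nielsen et al.: it uses a logistic model with hypothetical weights, including an assumed interaction between perfusion deficit and rtPA.

```python
import math

def predicted_risk(weights, bias, features):
    """Logistic model: risk of tissue loss from a weighted sum of features."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights for: perfusion deficit, diffusion lesion, rtPA given, perfusion x rtPA.
weights = [2.0, 3.0, 0.0, -1.5]
bias = -2.0

def risk_with_and_without_rtpa(perfusion, diffusion):
    """Evaluate the same tissue under both treatment options (0 = no rtPA, 1 = rtPA)."""
    risks = {}
    for rtpa in (0, 1):
        features = [perfusion, diffusion, rtpa, perfusion * rtpa]
        risks[rtpa] = predicted_risk(weights, bias, features)
    return risks

tissue_a = risk_with_and_without_rtpa(perfusion=1.0, diffusion=0.0)  # deficit, no diffusion lesion
tissue_b = risk_with_and_without_rtpa(perfusion=0.0, diffusion=1.0)  # diffusion lesion, normal perfusion
```

With these made-up weights, the predicted risk for the first tissue drops when rtPA is given, while the second shows no difference between the two options.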
No Treatment Effect
• N=443 stroke patients
• Intervention: remote ischemia in ambulance, hypothesized to promote neuroprotection
• Three-month clinical outcome was not significantly different
• 'Big-data' analysis at voxel level revealed significant differences in progression
[Image panels: Controls, Perconditioning]
Hougaard et al., Stroke 2013