Voxel by voxel statistics…

Multiple comparison problem

DISCOS SPM course, CRC, Liège, 2009
C. Phillips, Centre de Recherches du Cyclotron, ULg, Belgium
Based on slides from: T. Nichols

Contents
• Recap & Introduction
• Inference & multiple comparison
• « Take home » message

Voxel by voxel statistics…

[Figure: the SPM analysis pipeline. Image data → realignment & motion correction → normalisation (anatomical reference) → smoothing (kernel) → model specification (design matrix) and parameter estimation with the General Linear Model → parameter estimates for a single voxel time series → statistic image (Statistical Parametric Map) → hypothesis test → correction for multiple comparisons → corrected p-values. Related topics: random effect analysis; dynamic causal modelling; functional & effective connectivity; PPI. The result of model fitting is an intensity statistic image, or SPM.]

Contents
• Recap & Introduction
• Inference & multiple comparison
• Single/multiple voxel inference
• Family wise error rate (FWER)
• False Discovery rate (FDR)
• SPM results
• « Take home » message

General Linear Model (in SPM)

[Figure: auditory words every 20 s; (orthogonalised) gamma functions ƒi(u) of peristimulus time u, sampled every TR = 1.7 s; design matrix X = [ƒ1(u)⊗x(t) | ƒ2(u)⊗x(t) | ...]; resulting SPM{F}; time axis 0–30 s.]

Inference at a single voxel

NULL hypothesis, H: activation is zero

α = p(t > u | H)

[Figure: t-distribution with the threshold u = 2 marked; α is the upper-tail area above u.]

p-value: probability of getting a value of t at least as extreme as u. If α is small we reject the null hypothesis H.
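The single-voxel quantities above are easy to reproduce; the following minimal sketch (illustrative only, not SPM code) uses scipy with assumed degrees of freedom and an observed t equal to the u = 2 of the figure.

```python
# Illustrative single-voxel inference: p-value for an observed t, and the
# threshold u that gives a chosen voxel-wise alpha. df and t_obs are made up.
from scipy import stats

df = 40        # error degrees of freedom at this voxel (assumed)
t_obs = 2.0    # observed t value, i.e. u in the figure

p_uncorrected = stats.t.sf(t_obs, df)   # P(T > u | H), upper-tail probability
u_alpha = stats.t.isf(0.05, df)         # threshold with voxel-wise alpha = 0.05
print(f"p = {p_uncorrected:.3f}, u(alpha=0.05) = {u_alpha:.2f}")
```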

Hypothesis Testing

• Null Hypothesis H0
• Test statistic T
  – t: observed realization of T
• α level
  – Acceptable false positive rate α
  – Level α = P(T > uα | H0)
  – Threshold uα controls the false positive rate at level α
• P-value
  – Assessment of t assuming H0: P(T > t | H0)
  – Prob. of obtaining a stat. as large or larger in a new experiment
  – P(Data | Null), not P(Null | Data)

[Figure: null distribution of T, with the threshold uα, the observed t, and the P-value as the upper-tail area beyond t.]

Inference at a single voxel

NULL hypothesis, H: activation is zero

α = p(t > u | H)

[Figure: t-distribution with the threshold u = 2 marked.]

We can choose u to ensure a voxel-wise significance level of α. This is called an 'uncorrected' p-value, for reasons we'll see later. We can then plot a map of above-threshold voxels.

What we need
• Need an explicit spatial model
• No routine spatial modeling methods exist
  – High-dimensional mixture modeling problem
  – Activations don't look like Gaussian blobs
  – Need realistic shapes, sparse representation
• Some work by Hartvig et al., Penny et al.

What we'd like
• Don't threshold, model the signal!
  – Signal location?
    • Estimates and CI's on (x,y,z) location
  – Signal magnitude?
    • CI's on % change
  – Spatial extent?
    • Estimates and CI's on activation volume
    • Robust to choice of cluster definition
• ...but this requires an explicit spatial model

[Figure: schematic signal over space with estimated magnitude θ̂Mag., location θ̂Loc. and extent θ̂Ext.]

Real-life inference: What we get
• Signal location
  – Local maximum – no inference
  – Center-of-mass – no inference
    • Sensitive to blob-defining threshold
• Signal magnitude
  – Local maximum intensity – P-values (& CI's)
• Spatial extent
  – Cluster volume – P-value, no CI's
    • Sensitive to blob-defining threshold

Voxel-level Inference
• Retain voxels above α-level threshold uα
• Gives best spatial specificity
  – The null hypothesis at a single voxel can be rejected
[Figure: statistic values over space; voxels above uα are marked as significant, and in a second example no voxels reach the threshold.]

Cluster-level Inference
• Two-step process
  – Define clusters by an arbitrary threshold uclus
  – Retain clusters larger than the α-level threshold kα
• Typically better sensitivity
• Worse spatial specificity
  – The null hypothesis for the entire cluster is rejected
  – Only means that one or more voxels in the cluster are active
[Figure: clusters above uclus; a cluster smaller than k is not significant, a cluster larger than k is significant.]

Set-level Inference
• Count number of blobs c
  – Minimum blob size k
• Worst spatial specificity
  – Can only reject the global null hypothesis
[Figure: here c = 1; only 1 cluster is larger than k.]

Sensitivity and Specificity

TRUTH versus ACTION:
                  Don't Reject    Reject
  H True (o)          TN            FP
  H False (x)         FN            TP

Sensitivity = TP/(TP+FN) = β (power)
Specificity = TN/(TN+FP) = 1 − α
FP = Type I error, or 'error'
FN = Type II error
α = p-value / FP rate / error rate / significance level
β = power

E.g. t-scores from regions that truly do (x) and do not (o) activate, with two thresholds u1 and u2:
oooooooxxxooxxxoxxxx
At u1: Sens = 7/10 = 70%, Spec = 9/10 = 90%
At u2: Sens = 10/10 = 100%, Spec = 7/10 = 70%

fMRI Multiple Comparisons Problem
• 4-Dimensional Data
  – 1,000 multivariate observations, each with 100,000 elements
  – 100,000 time series, each with 1,000 observations
• Massively Univariate Approach
  – 100,000 hypothesis tests
• Massive MCP!
[Figure: a stack of brain volumes numbered 1, 2, 3, …, 1,000.]

Inference for Images
[Figure: Noise, Signal, and Signal+Noise example images.]

Multiple comparison problem
Use of 'uncorrected' p-value, α = 0.1
[Figure: ten thresholded null images; percentage of null pixels that are false positives: 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%.]
Using an 'uncorrected' p-value of 0.1 will lead us to conclude on average that 10% of voxels are active when they are not. This is clearly undesirable: the multiple comparison problem. To correct for this we can define a null hypothesis for images of statistics.
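A quick simulation makes the point concrete; this is only an illustrative sketch (independent null voxels, arbitrary image size and degrees of freedom), not the smoothed images of the figure.

```python
# Threshold pure-noise t images at an uncorrected p of 0.1: on average about
# 10% of the null voxels end up declared "active".
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
df = 40                              # assumed degrees of freedom
u = stats.t.isf(0.1, df)             # uncorrected threshold for alpha = 0.1

for _ in range(5):                   # a few independent null images
    null_t = stats.t.rvs(df, size=(100, 100), random_state=rng)
    frac = np.mean(null_t > u)       # fraction of null voxels above threshold
    print(f"{100 * frac:.1f}% of null voxels exceed the threshold")
```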

Multiple Comparisons Problem
• Which of 100,000 voxels are sig.?
  – α = 0.05 ⇒ 5,000 false positive voxels
• Which of (a random number, say 100) clusters are significant?
  – α = 0.05 ⇒ 5 false positive clusters

Assessing Statistic Images: Where's the signal?
[Figure: a t59 statistic image (Gaussian noise, 10 mm FWHM, 2 mm pixels) thresholded at t > 0.5, 1.5, 2.5, 3.5, 4.5, 5.5 and 6.5.]
• High threshold (t > 5.5): good specificity, poor power (risk of false negatives)
• Medium threshold (t > 3.5)
• Low threshold (t > 0.5): good power, poor specificity (risk of false positives)

Multiple comparisons…
• Threshold at p?
  – expect (100 × p)% of voxels above threshold by chance (e.g. p = 0.05)
• Surprise?
  – extreme voxel values → voxel-level inference
  – big suprathreshold clusters → cluster-level inference
  – many suprathreshold clusters → set-level inference
• Power & localisation → sensitivity, spatial specificity

Solutions for the Multiple Comparison Problem
• A MCP solution must control "False Positives"
  – How to measure multiple false positives?
• Familywise Error Rate (FWER)
  – Chance of any false positives
  – Controlled by Bonferroni, Random Field methods, non-parametric methods (SnPM)
• False Discovery Rate (FDR)
  – Proportion of false positives among rejected tests

Contents
• Recap & Introduction
• Inference & multiple comparison
• Single/multiple voxel inference
• Family wise error rate (FWER)
  – Bonferroni correction / Random Field Theory
  – Non-parametric approach
• False Discovery rate (FDR)
• SPM results
• « Take home » message

Family-wise Null Hypothesis

FAMILY-WISE NULL HYPOTHESIS: activation is zero everywhere.
• Family of hypotheses
  – Hk, k ∈ Ω = {1, …, K}
  – HΩ = H1 ∩ H2 ∩ … ∩ Hk ∩ … ∩ HK
If we reject the null hypothesis at any voxel, we reject the family-wise null hypothesis.

FWE MCP Solutions: Controlling FWE with the Maximum

• FWE & distribution of the maximum
  FWE = P(FWE) = P( ∪i {Ti ≥ u} | Ho ) = P( maxi Ti ≥ u | Ho )
• The 100(1−α)%ile of the max distribution controls FWE
  FWE = P( maxi Ti ≥ uα | Ho ) = α, where uα = F⁻¹max(1−α)
[Figure: null distribution of the maximum statistic; uα cuts off an upper-tail area of α.]

A false positive anywhere gives a Family Wise Error (FWE).
Family-wise error rate = 'corrected' p-value.

Multiple comparison problem…

Example: experiment with 100,000 « voxels » and 40 d.f.
Type I error α = 0.05 (5% risk) ⇒ tα = 1.68
100,000 t values ⇒ 5,000 t values > 1.68 just by chance!

Familywise Error test, PFWE: find a threshold tα such that, in a family of 100,000 t statistics, there is only a 5% probability of one or more t values above that threshold.

Bonferroni correction: simple method to find the new threshold.
Random field theory: more accurate for functional imaging.

The "Bonferroni" correction…

Given
• a family of N independent voxels, and
• a voxel-wise error rate v,
the probability that all tests are below the threshold, i.e. that Ho is true, is (1 − v)^N.
The Family-Wise Error rate (FWE), or 'corrected' error rate, α is
  α = 1 − (1 − v)^N ≈ Nv (for small v)
Therefore, to ensure a particular FWE we choose v = α/N.
A Bonferroni correction is appropriate for independent tests.

[Figure: the same statistic image thresholded with an 'uncorrected' p-value of 0.1 and with a 'corrected' (FWE) p-value of 0.1.]

The "Bonferroni" correction…

Experiment with N = 100,000 « voxels » and 40 d.f.
– v = unknown corrected probability threshold
– find v such that the family-wise error rate α = 0.05
Bonferroni correction:
– probability that all tests are below the threshold
– use v = α / N
– here v = 0.05/100000 = 0.0000005 ⇒ threshold t = 5.77
Interpretation: the Bonferroni procedure gives a corrected p-value, i.e. for a t statistic of 5.77,
– uncorrected p-value = 0.0000005
– corrected p-value = 0.05
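A short check of these numbers (illustrative only, standard scipy calls with the slide's N and d.f.):

```python
# Uncorrected and Bonferroni-corrected t thresholds for 100,000 voxels, 40 d.f.
from scipy import stats

N, df, alpha = 100_000, 40, 0.05
t_uncorrected = stats.t.isf(alpha, df)       # ~1.68 -> ~5,000 expected false positives
t_bonferroni = stats.t.isf(alpha / N, df)    # v = 0.05/100000 = 5e-7 -> ~5.77
print(f"uncorrected t = {t_uncorrected:.2f}, "
      f"expected false positives = {alpha * N:.0f}")
print(f"Bonferroni t = {t_bonferroni:.2f}")
```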

"Bonferroni" correction & independent observations

Case 1: 100 by 100 voxels, each with a z value; 10,000 independent measures.
Fix PFWE = 0.05: what z threshold?
Bonferroni: v = 0.05 / 10000 = 0.000005 ⇒ threshold z = 4.42

Case 2: 100 by 100 voxels, each with a z value; only 100 independent measures.
Fix PFWE = 0.05: what z threshold?
Bonferroni: v = 0.05 / 100 = 0.0005 ⇒ threshold z = 3.29

In general v = α / ni, where ni is the number of independent observations.
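The two z thresholds can be verified with one line per case (again just an illustration of v = α/ni for Gaussian statistics):

```python
# Bonferroni z thresholds for different numbers of independent observations.
from scipy.stats import norm

for n_independent in (10_000, 100):
    v = 0.05 / n_independent
    print(f"ni = {n_independent}: v = {v:g}, z threshold = {norm.isf(v):.2f}")
# expected: ni = 10000 -> z ~ 4.42, ni = 100 -> z ~ 3.29
```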

"Bonferroni" correction & smoothed observations

Unsmoothed: 100 by 100 voxels, each with a z value; 10,000 independent measures.
Fix PFWE = 0.05, z threshold? Bonferroni: v = 0.05/10000 = 0.000005 ⇒ threshold z = 4.42

Smoothed: 100 by 100 voxels, each with a z value; but how many independent measures?
Fix PFWE = 0.05, z threshold? Bonferroni?

Random Field Theory
• Consider a statistic image as a lattice representation of a continuous random field
• Use results from continuous random field theory
[Figure: lattice representation of a continuous random field.]

Euler Characteristic (EC)
Topological measure:
– threshold an image at u
– excursion set Au
– χ(Au) = # blobs − # holes
– at high u, χ(Au) = # blobs
Reject HΩ if the Euler characteristic is non-zero.

α ≈ Pr( χ(Au) > 0 )
α ≈ E[ χ(Au) ]   (at high u)
Expected Euler characteristic ≈ p-value

Euler characteristic…

Euler characteristic (EC) ≈ # blobs in a thresholded image (true only for a high threshold).

EC is a function of
• the threshold used, and
• the number of resels, where resels (« resolution elements ») ~ number of independent observations

⇒ E[EC] ≈ PFWE

[Figure: the same z-map thresholded at 2.50 (EC = 3) and at 2.75 (EC = 1). For a threshold zt of 2.50, E[EC] = 1.9; at 2.75, E[EC] = 1.1.]
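An empirical feel for this can be had by counting connected blobs in smooth noise; the sketch below is illustrative only (arbitrary field size and smoothness, and a simple connected-component labelling rather than a true EC computation) and mimics the thresholding shown in the figure.

```python
# Count blobs in a thresholded smooth 2-D noise image; at high thresholds
# the number of blobs approximates the Euler characteristic.
import numpy as np
from scipy.ndimage import gaussian_filter, label

rng = np.random.default_rng(0)
fwhm = 10.0                                   # assumed smoothness in pixels
sigma = fwhm / np.sqrt(8 * np.log(2))         # kernel sd for that FWHM

field = gaussian_filter(rng.standard_normal((256, 256)), sigma)
field /= field.std()                          # crude rescaling to ~unit variance

for u in (2.50, 2.75):
    _, n_blobs = label(field > u)             # connected components above u
    print(f"threshold {u}: {n_blobs} blobs (~ EC at high thresholds)")
```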

Expected Euler characteristic…

E[χ(Au)] ≈ λ(Ω) √|Λ| (u² − 1) exp(−u²/2) / (2π)²

– Ω → large search region, Ω ⊂ R³
– λ(Ω) → volume
– √|Λ| → smoothness
– Au → excursion set, Au = {x ∈ R³ : Z(x) > u}
– Z(x) → Gaussian random field, x ∈ R³
  + multivariate Normal finite-dimensional distributions
  + continuous
  + strictly stationary
  + marginal N(0,1)
  + continuously differentiable
  + twice differentiable at 0
  + Gaussian ACF (at least near local maxima)

Unified Theory
• General form for the expected Euler characteristic
• χ², F, & t fields
• restricted search regions

  α = Σd ρd(u) Rd(Ω)

Rd(Ω): RESEL count, the d-dimensional Minkowski functional of Ω; it depends on the search region: how big, how smooth, what shape?
• R0(Ω) = χ(Ω), the Euler characteristic of Ω
• R1(Ω) = resel diameter
• R2(Ω) = resel surface area
• R3(Ω) = resel volume

ρd(u): d-dimensional EC density; it depends on the type of field (e.g. Gaussian, t) and the threshold u.
E.g. Gaussian RF:
  ρ0(u) = 1 − Φ(u)
  ρ1(u) = (4 ln 2)^1/2 exp(−u²/2) / (2π)
  ρ2(u) = (4 ln 2) u exp(−u²/2) / (2π)^3/2
  ρ3(u) = (4 ln 2)^3/2 (u² − 1) exp(−u²/2) / (2π)²
  ρ4(u) = (4 ln 2)² (u³ − 3u) exp(−u²/2) / (2π)^5/2

Estimated component fields
[Figure: the GLM at every voxel. Data matrix (scans × voxels) = design matrix × parameters + errors; parameter estimates β̂; residuals ÷ estimated variance = estimated component fields; each row is an estimated component field.]

Worsley et al. (1996), HBM
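The unified formula is short enough to evaluate directly. The sketch below (a minimal illustration, not SPM's implementation) plugs the Gaussian EC densities listed above into α = Σd ρd(u) Rd(Ω); the resel counts are made-up example values for a hypothetical search region.

```python
# Unified RFT: expected EC as a function of threshold, and the u giving
# an expected EC (~FWE rate) of 0.05 for a Gaussian field.
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

LN2 = np.log(2.0)

def rho(d, u):
    """d-dimensional EC density of a Gaussian random field."""
    if d == 0:
        return norm.sf(u)
    if d == 1:
        return (4 * LN2) ** 0.5 * np.exp(-u**2 / 2) / (2 * np.pi)
    if d == 2:
        return (4 * LN2) * u * np.exp(-u**2 / 2) / (2 * np.pi) ** 1.5
    if d == 3:
        return (4 * LN2) ** 1.5 * (u**2 - 1) * np.exp(-u**2 / 2) / (2 * np.pi) ** 2
    raise ValueError(d)

R = [1.0, 30.0, 300.0, 500.0]      # hypothetical resel counts R0..R3

def expected_ec(u):
    return sum(rho(d, u) * R[d] for d in range(4))

u_fwe = brentq(lambda u: expected_ec(u) - 0.05, 2.0, 8.0)
print(f"RFT threshold for expected EC = 0.05: u = {u_fwe:.2f}")
```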

Random Field Theory: Smoothness Parameterization

Smoothness, PRF, resels...
• Smoothness √|Λ|
  – Λ: variance-covariance matrix of the partial derivatives of the component fields (possibly location dependent)

      | var(∂e/∂x)          cov(∂e/∂x, ∂e/∂y)   cov(∂e/∂x, ∂e/∂z) |
  Λ = | cov(∂e/∂x, ∂e/∂y)   var(∂e/∂y)          cov(∂e/∂y, ∂e/∂z) |
      | cov(∂e/∂x, ∂e/∂z)   cov(∂e/∂y, ∂e/∂z)   var(∂e/∂z)        |

• Point Response Function (PRF)
• Gaussian PRF
  – Σ: kernel var/cov matrix
  – ACF: 2Σ
  – Λ = (2Σ)⁻¹ (if strictly stationary)
  – FWHM f = σ √(8 ln 2), so Σ = diag(fx², fy², fz²) / (8 ln 2), ignoring covariances
  ⇒ √|Λ| = (4 ln 2)^3/2 / (fx × fy × fz)
• Full Width at Half Maximum (FWHM): approximate the peak of the covariance function with a Gaussian
• Resolution Element (RESEL)
  – Resel dimensions (fx × fy × fz)
  – R3(Ω) = λ(Ω) / (fx × fy × fz)

• RESELS (Resolution Elements)
  – 1 RESEL = FWHMx × FWHMy × FWHMz
  – RESEL count R = λ(Ω) √|Λ| = (4 log 2)^3/2 λ(Ω) / (FWHMx × FWHMy × FWHMz)
  – volume of the search region in units of smoothness
  – e.g.: 10 voxels at 2.5 voxel FWHM ⇒ 4 RESELS
  [Figure: a line of 10 voxels measured against a ruler of 4 resel units.]
• Beware RESEL misinterpretation
  – RESELS are not the "number of independent 'things' in the image"
  – See Nichols & Hayasaka, 2003, Stat. Meth. in Med. Res.

E[χ(Au)] = R3(Ω) (4 ln 2)^3/2 (u² − 1) exp(−u²/2) / (2π)² ≈ PFWE for high thresholds u
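Resel counting is just "volume divided by FWHM in each dimension"; a tiny sketch (illustrative values only, reproducing the 1-D example above and adding a hypothetical 3-D region):

```python
# Resel counts as volume in units of smoothness.
def resel_count_1d(n_voxels, fwhm_vox):
    return n_voxels / fwhm_vox

def resel_count_3d(n_voxels, fwhm_vox_xyz):
    fx, fy, fz = fwhm_vox_xyz
    return n_voxels / (fx * fy * fz)             # R3 = lambda(Omega) / (fx*fy*fz)

print(resel_count_1d(10, 2.5))                   # 4.0 resels, as in the 1-D example
print(resel_count_3d(100_000, (3.0, 3.0, 3.0)))  # hypothetical 3-D search region
```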

RFT Assumptions
• Model fit & assumptions
  – valid distributional results
• Multivariate normality
  – of component images
• Covariance function of component images must be
  – twice differentiable
  – (it can be nonstationary)
• Smoothness
  – smoothness » voxel size
  – practically: FWHM ≥ 3 × VoxDim (for the lattice approximation and smoothness estimation); otherwise conservative

"Typical" applied smoothing:
  Single Subj fMRI: 6 mm; PET: 12 mm
  Multi Subj fMRI: 8-12 mm; PET: 16 mm
The level of smoothing should actually depend on what you're looking for…

Random Field Intuition
• Corrected P-value for voxel value t
  Pc = P(max T > t) ≈ E(χt) ≈ λ(Ω) |Λ|^1/2 t² exp(−t²/2)
• Statistic value t increases
  – Pc decreases (but only for large t)
• Search volume increases (bigger Ω)
  – Pc increases (more severe MCP)
• Smoothness increases (roughness |Λ|^1/2 decreases)
  – Pc decreases (less severe MCP)

Small Volume Correction
SVC = correction for multiple comparisons in a user-defined volume 'of interest'.
Shape and size of the volume become important for small or oddly shaped volumes!
Example of SVC (900 voxels):
• compact volume: samples from a maximum of 16 resels
• spread-out volume: samples from up to 36 resels
⇒ the threshold is higher for the spread-out volume than for the compact one.

Resel Counts for Brain Structures
[Figure: resel counts for various brain structures at FWHM = 20 mm.]
(1) The threshold depends on the search volume.
(2) Surface area makes a large contribution.

Summary
• We should correct for multiple comparisons
  – We can use Random Field Theory (RFT) or other methods
• RFT requires
  – a good lattice approximation to the underlying multivariate Gaussian fields,
  – that these fields are continuous with a twice differentiable correlation function
• To a first approximation, RFT is a Bonferroni correction using RESELS.
• We only need to correct for the volume of interest.
• Depending on the nature of the signal we can trade off anatomical specificity for signal sensitivity with the use of cluster-level inference.

Contents
• Recap & Introduction
• Inference & multiple comparison
• Single/multiple voxel inference
• Family wise error rate (FWER)
  – Bonferroni correction / Random Field Theory
  – Non-parametric approach
• False Discovery rate (FDR)
• SPM results
• « Take home » message

Nonparametric Permutation Test
• Parametric methods
  – Assume the distribution of the statistic under the null hypothesis
• Nonparametric methods
  – Use the data to find the distribution of the statistic under the null hypothesis
  – Any statistic!
[Figure: a parametric null distribution with its 5% cut-off versus a nonparametric (empirical) null distribution with its 5% cut-off.]

Permutation Test: Toy Example
• Data from a V1 voxel in a visual stimulation experiment
  A: Active, flashing checkerboard
  B: Baseline, fixation
  6 blocks, ABABAB; just consider block averages...

    A        B       A       B       A       B
    103.00   90.48   99.93   87.83   99.76   96.06

• Null hypothesis Ho
  – No experimental effect, A & B labels arbitrary
• Statistic
  – Mean difference

Permutation Test: Toy Example
• Under Ho
  – Consider all equivalent relabelings: the 20 ways of assigning three A's and three B's to the six blocks
    (AAABBB, AABABB, AABBAB, AABBBA, ABAABB, ABABAB, ABABBA, ABBAAB, ABBABA, ABBBAA,
     BAAABB, BAABAB, BAABBA, BABAAB, BABABA, BABBAA, BBAAAB, BBAABA, BBABAA, BBBAAA)
  – Compute all possible statistic values:
    −9.45, −6.97, −6.86, −4.82, −3.25, −3.15, −1.48, −1.38, −1.10, −0.67,
    0.67, 1.10, 1.38, 1.48, 3.15, 3.25, 4.82, 6.86, 6.97, 9.45
  – Find the 95%ile of the permutation distribution
• The observed labelling, ABABAB, gives 9.45, the largest of the 20 values (p = 1/20 = 0.05).
[Figure: histogram of the 20 permutation statistic values, roughly spanning −8 to 8, with the observed value in the extreme upper tail.]
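The whole toy example fits in a few lines of code. The sketch below (illustrative only) recomputes the permutation distribution from the six block averages; because those averages are rounded to two decimals, the recomputed values can differ from the slide's in the last digit.

```python
# Permutation distribution of the mean A-B difference for the toy example:
# all 20 relabelings with three A's and three B's among the six blocks.
from itertools import combinations
import numpy as np

blocks = np.array([103.00, 90.48, 99.93, 87.83, 99.76, 96.06])
observed_a = (0, 2, 4)                        # the ABABAB labelling

def mean_diff(a_idx):
    a = blocks[list(a_idx)]
    b = np.delete(blocks, list(a_idx))
    return a.mean() - b.mean()

perm_stats = np.array([mean_diff(a) for a in combinations(range(6), 3)])
observed = mean_diff(observed_a)              # ~9.45

crit = np.percentile(perm_stats, 95)          # 95%ile of the permutation distribution
p_value = np.mean(perm_stats >= observed)     # 1/20 = 0.05 here
print(f"observed = {observed:.2f}, 95%ile = {crit:.2f}, p = {p_value:.3f}")
```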

Controlling FWER: Permutation Test
• Parametric methods
  – Assume the distribution of the max statistic under the null hypothesis
• Nonparametric methods
  – Use the data to find the distribution of the max statistic under the null hypothesis
  – Again, any max statistic!
[Figure: parametric null max distribution versus nonparametric null max distribution, each with its 5% cut-off.]

Permutation Test & Exchangeability
• Exchangeability is fundamental
  – Def: the distribution of the data is unperturbed by permutation
  – Under H0, exchangeability justifies permuting the data
  – Allows us to build the permutation distribution
• Subjects are exchangeable
  – Under Ho, each subject's A/B labels can be flipped
• Are fMRI scans exchangeable under Ho?
  – If there is no signal, can we permute over time?

Permutation Test & Exchangeability
• fMRI scans are not exchangeable
  – Permuting disrupts order, temporal autocorrelation
• Intrasubject fMRI permutation test
  – Must decorrelate data and model before permuting
  – What is the correlation structure?
  – Usually must use a parametric model of correlation
  – E.g. use wavelets to decorrelate (Bullmore et al. 2001, HBM 12:61-78)
• Intersubject fMRI permutation test
  – Create a difference image for each subject
  – For each permutation, flip the sign of some subjects

Permutation Test: Example
• fMRI study of working memory: item recognition (Marshuetz et al., 2000)
  – 12 subjects, block design
  – Active: view five letters (e.g. UBKDA), 2 s pause, view a probe letter (e.g. D), respond yes/no
  – Baseline: view XXXXX, 2 s pause, view Y or N, respond
• Second level RFX
  – Difference image, A−B, constructed for each subject
  – One sample, smoothed variance t test
• Permute!
  – 2¹² = 4,096 ways to flip the 12 A/B labels
  – For each, note the maximum of the t image
[Figure: permutation distribution of the maximum t; maximum intensity projection of the thresholded t image.]
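A hedged sketch of this sign-flipping max-t procedure is given below: under Ho each subject's A−B difference image may have its sign flipped, and the maximum one-sample t over voxels is recorded for every flip. It is only an illustration: diff_imgs is a hypothetical (subjects × voxels) array of simulated data, and a plain t is used instead of the smoothed-variance t mentioned above.

```python
# Sign-flipping permutation test for a second-level one-sample design,
# controlling FWE via the distribution of the maximum t statistic.
from itertools import product
import numpy as np

def max_t(data):
    """One-sample t at each voxel (column); return the maximum over voxels."""
    n = data.shape[0]
    t = data.mean(0) / (data.std(0, ddof=1) / np.sqrt(n))
    return t.max()

def sign_flip_threshold(diff_imgs, alpha=0.05):
    n_subj = diff_imgs.shape[0]
    max_ts = [max_t(diff_imgs * np.array(signs)[:, None])
              for signs in product([1.0, -1.0], repeat=n_subj)]   # 2**n_subj flips
    return np.quantile(max_ts, 1 - alpha)                         # FWE threshold

# Example with simulated null data: 12 subjects, 500 voxels (4,096 flips).
rng = np.random.default_rng(1)
u_perm = sign_flip_threshold(rng.standard_normal((12, 500)))
print(f"permutation FWE threshold: {u_perm:.2f}")
```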

Does this Generalize? RFT vs Bonf. vs Perm.

Item Recognition example (t11 statistic):
• Nonparametric threshold: uPerm = 7.67 → 58 significant voxels
• RF & Bonferroni thresholds: uRF = 9.87, uBonf = 9.80 → 5 significant voxels
• Compare with Bonferroni (α = 0.05/110,776) and with parametric RFT:
  110,776 2×2×2 mm voxels; 5.1×5.8×6.9 mm FWHM smoothness; 462.9 RESELs

Test level vs. t11 threshold

t Threshold (0.05 Corrected)
Study                 df    RF        Bonf     Perm
Verbal Fluency         4    4701.32   42.59    10.14
Location Switching     9    11.17     9.07     5.83
Task Switching         9    10.79     10.35    5.10
Faces: Main Effect    11    10.43     9.07     7.92
Faces: Interaction    11    10.70     9.07     8.26
Item Recognition      11    9.87      9.80     7.67
Visual Motion         11    11.07     8.92     8.40
Emotional Pictures    12    8.48      8.41     7.15
Pain: Warning         22    5.93      6.05     4.99
Pain: Anticipation    22    5.87      6.05     5.05

RFT vs Bonf. vs Perm.

No. Significant Voxels (0.05 corrected t)
Study                 df    RF     Bonf    Perm
Verbal Fluency         4    0      0       0
Location Switching     9    0      0       158
Task Switching         9    4      6       2241
Faces: Main Effect    11    127    371     917
Faces: Interaction    11    0      0       0
Item Recognition      11    5      5       58
Visual Motion         11    626    1260    1480
Emotional Pictures    12    0      0       0
Pain: Warning         22    127    116     221
Pain: Anticipation    22    74     55      182

Performance Summary
• Bonferroni
  – Not adaptive to smoothness
  – Not so conservative for low smoothness
• Random Field
  – Adaptive
  – Conservative for low smoothness & df
• Permutation
  – Adaptive (exact)

"Old" Conclusions
• t random field results are conservative for
  – low df & smoothness
  – 9 df & ≤ 12 voxel FWHM; 19 df & < 10 voxel FWHM
• Bonferroni is surprisingly satisfactory for low smoothness
• Nonparametric methods perform well overall

Contents
• Recap & Introduction
• Inference & multiple comparison
• Single/multiple voxel inference
• Family wise error rate (FWER)
• False Discovery rate (FDR)
• SPM results
• « Take home » message

False Discovery Rate

TRUTH versus ACTION:
                  Don't Reject    Reject
  H True (o)          TN            FP
  H False (x)         FN            TP

FDR = FP/(FP+TP)
α = FP/(FP+TN)

E.g. t-scores from regions that truly do (x) and do not (o) activate, with two thresholds u1 and u2:
oooooooxxxooxxxoxxxxxxxx
At the two thresholds: FDR = 1/12 = 8% with α = 1/10 = 10%, and FDR = 3/17 = 18% with α = 3/10 = 30%.

False Discovery Rate Illustration

[Figure: ten realisations of Noise, Signal and Signal+Noise images thresholded under three procedures.]
• Control of the per comparison rate at 10%: percentage of null pixels that are false positives is 11.3%, 11.3%, 12.5%, 10.8%, 11.5%, 10.0%, 10.7%, 11.2%, 10.2%, 9.5%
• Control of the familywise error rate at 10%: the occurrence of a familywise error (FWE) is marked on one realisation
• Control of the false discovery rate at 10%: percentage of activated (rejected) pixels that are false positives is 6.7%, 10.4%, 14.9%, 9.3%, 16.2%, 13.8%, 14.0%, 10.5%, 12.2%, 8.7%

Benjamini & Hochberg Procedure
• Select the desired limit q on E(FDR)
• Order the p-values, p(1) ≤ p(2) ≤ ... ≤ p(V)
• Let r be the largest i such that p(i) ≤ (i/V) × q
• Reject all hypotheses corresponding to p(1), ..., p(r)
[Figure: ordered p-values p(i) plotted against i/V (both from 0 to 1), with the line (i/V) × q/c(V).]
JRSS-B (1995) 57:289-300
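The step-up procedure above is easy to implement; here is a minimal sketch for the c(V) = 1 case (independent or positively dependent tests), with made-up p-values.

```python
# Benjamini & Hochberg step-up FDR procedure.
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q."""
    p = np.asarray(pvals, dtype=float)
    V = p.size
    order = np.argsort(p)                        # p(1) <= p(2) <= ... <= p(V)
    thresholds = (np.arange(1, V + 1) / V) * q   # (i/V) * q
    below = p[order] <= thresholds
    reject = np.zeros(V, dtype=bool)
    if below.any():
        r = np.max(np.nonzero(below)[0])         # largest i with p(i) <= (i/V) q
        reject[order[:r + 1]] = True             # reject p(1), ..., p(r)
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.490]
print(benjamini_hochberg(pvals, q=0.05))         # rejects the two smallest here
```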

Benjamini & Hochberg: Varying Signal Extent

[Figure: seven simulated datasets, all with Signal Intensity 3.0 and Noise Smoothness 3.0, with the Signal Extent increasing across panels; each panel reports the BH-FDR cut-off p and the corresponding z threshold (no p/z values are shown for the three smallest extents).]
1. Signal Extent 1.0
2. Signal Extent 2.0
3. Signal Extent 3.0
4. Signal Extent 5.0: p = 0.000252, z = 3.48
5. Signal Extent 9.5: p = 0.001628, z = 2.94
6. Signal Extent 16.5: p = 0.007157, z = 2.45
7. Signal Extent 25.0: p = 0.019274, z = 2.07

Benjamini & Hochberg: Properties
• Adaptive
  – The larger the signal, the lower the threshold
  – The larger the signal, the more false positives
    • False positives remain constant as a fraction of rejected tests
    • Not a problem with imaging's sparse signals
• Smoothness OK
  – Smoothing introduces positive correlations

Contents
• Recap & Introduction
• Inference & multiple comparison
• Single/multiple voxel inference
• Family wise error rate (FWER)
• False Discovery rate (FDR)
• SPM results
• « Take home » message

Summary: Levels of inference & power
[Figure: SPM intensity plotted against SPM position, with the height threshold u and peak height h; cluster L1 exceeds the spatial extent threshold, cluster L2 does not; markers indicate significance at the set level, at the cluster level, and at the voxel level; a sensitivity axis is shown.]

Levels of inference…

voxel level:   P(c ≥ 1 | n ≥ 0, t ≥ 4.37) = 0.048 (corrected)
               P(t ≥ 4.37) = 1 − Φ{4.37} < 0.001 (uncorrected)
cluster level: P(c ≥ 1 | n ≥ 82, t ≥ 3.09) = 0.029 (corrected)
               P(n ≥ 82 | t ≥ 3.09) = 0.019 (uncorrected)
set level:     P(c ≥ 3 | n ≥ 12, t ≥ 3.09) = 0.019
omnibus:       P(c ≥ 7 | n ≥ 0, t ≥ 3.09) = 0.031

[Figure: glass-brain SPM with clusters of n = 12, 32 and 82 voxels and a peak of t = 4.37.]

Parameters: u = 3.09, k = 12 voxels, S = 323 voxels, FWHM = 4.7 voxels, D = 3

SPM results...

Test based on                                              Parameters set by the user
the intensity of a voxel                                   • low pass filter
the spatial extent above u, or the spatial extent          • low pass filter • intensity threshold u
  and the maximum peak height
the number of clusters above u with size greater than n    • low pass filter • intensity threshold u
                                                           • spatial threshold n
the sum of squares of the SPM, or a MANOVA                 • low pass filter

(Regional specificity is highest at the top of this table; sensitivity increases towards the bottom.)

SPM results...
[Figure: fMRI example; activations significant at the voxel and cluster level.]

Contents
• Recap & Introduction
• Inference & multiple comparison
• « Take home » message

Conclusions: FWER vs FDR
• Must account for multiplicity
  – Otherwise we have a "fishing expedition"
• FWER
  – Very specific, less sensitive
• FDR
  – Less specific, more sensitive
  – Trouble with cluster inference…

More Power to Ya!
Statistical Power
• the probability of rejecting the null hypothesis when it is actually false
• "if there's an effect, how likely are you to find it?"
Effect size
• bigger effects, more power
• e.g., an MT localizer (moving rings minus stationary): 1 run is usually enough; looking for activation during imagined motion might require many more runs
Sample size
• larger n, more power
• more subjects, longer runs, more runs
Signal:Noise Ratio
• better SNR, more power
• stronger, cleaner magnet, more focal coil, fewer artifacts, more filtering