A General(ised) Linear Model. – Effects are linear ... SPM. Choose your model. “
All models are wrong, but some are useful” ... Question: Is there a change in the
BOLD ..... Do movement parameters (or other confounds) account for anything?
The The general general linear linear model model and and Statistical Statistical Parametric Parametric Mapping Mapping I: I: Introduction Introduction to to the the GLM GLM Alexa Morcom
and Stefan Kiebel, Rik Henson, Andrew Holmes & J-B Poline
Overview • Introduction • Essential concepts – – – –
Modelling Design matrix Parameter estimates Simple contrasts
• Summary
Some terminology • SPM is based on a mass univariate approach that fits a model at each voxel – Is there an effect at location X? Investigate localisation of function or functional specialisation – How does region X interact with Y and Z? Investigate behaviour of networks or functional integration
• A General(ised) Linear Model – Effects are linear and additive – If errors are normal (Gaussian), General (SPM99) – If errors are not normal, Generalised (SPM2)
Classical statistics... • Parametric – – – – – – – – – –
one sample t-test two sample t-test paired t-test ANOVA ANCOVA correlation linear regression multiple regression F-tests etc…
all cases of the (univariate) General Linear Model Or, with non-normal errors, the Generalised Linear Model
Classical statistics... • Parametric – – – – – – – – – –
one sample t-test two sample t-test paired t-test ANOVA ANCOVA correlation linear regression multiple regression F-tests etc…
all cases of the (univariate) General Linear Model Or, with non-normal errors, the Generalised Linear Model
Classical statistics... • Parametric – – – – – – – – – –
one sample t-test two sample t-test paired t-test ANOVA ANCOVA correlation linear regression multiple regression F-tests etc…
• Multivariate?
all cases of the (univariate) General Linear Model Or, with non-normal errors, the Generalised Linear Model
→ PCA/ SVD, MLM
Classical statistics... • Parametric – – – – – – – – – –
one sample t-test two sample t-test paired t-test ANOVA ANCOVA correlation linear regression multiple regression F-tests etc…
• Multivariate? • Non-parametric?
all cases of the (univariate) General Linear Model Or, with non-normal errors, the Generalised Linear Model
→ PCA/ SVD, MLM → SnPM
Why modelling? Why?
Make Makeinferences inferencesabout abouteffects effectsof ofinterest interest
How?
1.1. Decompose Decomposedata datainto intoeffects effectsand anderror error 2.2. Form Formstatistic statisticusing usingestimates estimatesof ofeffects effectsand anderror error
Model?
Use Useany anyavailable availableknowledge knowledge
effects effectsestimate estimate data data
model model
statistic statistic error errorestimate estimate
Choose your model “All models are wrong, but some are useful” George E.P.Box
AAvery verygeneral general model model
Variance Variance
Bias Bias
No Nonormalisation normalisation
Lots Lotsofofnormalisation normalisation
No Nosmoothing smoothing Massive Massivemodel model
default default SPM SPM
Lots Lotsof ofsmoothing smoothing Sparse Sparsemodel model
Captures signal
High sensitivity
but but... ...not notmuch much sensitivity sensitivity
but but......may maymiss miss signal signal
Modelling with SPM Functional Functionaldata data
Design Designmatrix matrix
Smoothed Smoothed normalised normalised data data
Preprocessing Preprocessing
Templates Templates
Generalised Generalised linear linear model model
Contrasts Contrasts
Parameter Parameter estimates estimates
Variance Variancecomponents components
SPMs SPMs
Thresholding Thresholding
Modelling with SPM Design Designmatrix matrix
Contrasts Contrasts
Preprocessed Preprocessed data: data:single single voxel voxel
Generalised Generalised linear linear model model
Parameter Parameter estimates estimates
Variance Variancecomponents components
SPMs SPMs
fMRI example One Onesession session
Time Timeseries seriesofofBOLD BOLD responses responsesininone onevoxel voxel
Passive Passiveword wordlistening listening versus versusrest rest 77cycles cyclesof of rest restand andlistening listening Each Eachepoch epoch66scans scans with with77sec secTR TR
Question: Question:IsIsthere thereaachange changeininthe theBOLD BOLD response between listening and rest? response between listening and rest?
Stimulus Stimulusfunction function
GLM essentials • The model – Design matrix: Effects of interest – Design matrix: Confounds (aka effects of no interest) – Residuals (error measures of the whole model)
• Estimate effects and error for data – Parameter estimates (aka betas) – Quantify specific effects using contrasts of parameter estimates
• Statistic – Compare estimated effects – the contrasts – with appropriate error measures – Are the effects surprisingly large?
GLM essentials • The model – Design matrix: Effects of interest – Design matrix: Confounds (aka effects of no interest) – Residuals (error measures of the whole model)
• Estimate effects and error for data – Parameter estimates (aka betas) – Quantify specific effects using contrasts of parameter estimates
• Statistic – Compare estimated effects – the contrasts – with appropriate error measures – Are the effects surprisingly large?
Regression model
Intensity
+ β2
x1
Question: Question:IsIsthere thereaachange changeininthe theBOLD BOLD response responsebetween betweenlistening listeningand andrest? rest?
+
x2
error
Time
= β1
General case
ε∼Ν(0, σ2Ι) (error is normal and independently and identically distributed)
ε Hypothesis test: β1 = 0? (using t-statistic)
Regression model
= βˆ1
Y
=
βˆ1 x1
General case
+ βˆ2
+
ˆ β + 2 x2
+
εˆ
Model Modelisisspecified specifiedby by 1.1. Design Designmatrix matrixXX 2.2. Assumptions Assumptionsabout aboutεε
Matrix formulation Yi = β1 xi + β2 + εi
i = 1,2,3
Y1 = β1 x1 + β2 × 1 + ε1 Y2 = β1 x2 + β2 × 1 + ε2 Y3 = β1 x3 + β2 × 1 + ε3 dummy variables
Y1 x1 1 Y2 = x2 1 Y3 x3 1
β1
Y =
β+ ε
X
ε1 + ε2 β1 ε 3
Matrix formulation
^ Hats Hats== estimates estimates
Linear regression Yi = β1 xi + β2 + εi
i = 1,2,3
Y1 = β1 x1 + β2 × 1 + ε1 Y2 = β1 x2 + β2 × 1 + ε2
Yi
y
× ^ β
2
×
1
×
Y3 = β1 x3 + β2 × 1 + ε3
^ β
1
^ Yi x
dummy variables
Y1 x1 1 Y2 = x2 1 Y3 x3 1
β1
Y =
β+ ε
X
ε1 + ε2 β1 ε 3
Fitted values
^ ^ β1 & β2 ^ ^ &Y Y
Residuals
^ ε1 & ^ε2
Parameter estimates
1
2
Geometrical perspective Y1 x1 1 Y2 = x2 1 Y3 x3 1
β1
ε1 + ε2 β2 ε3
DATA (Y1, Y2, Y3) Y
Y = β1 × X1 + β2 × X2 + ε
X1 (x1, x2, x3) (1,1,1) O
X2
design space
GLM essentials • The model – Design matrix: Effects of interest – Design matrix: Confounds (aka effects of no interest) – Residuals (error measures of the whole model)
• Estimate effects and error for data – Parameter estimates (aka betas) – Quantify specific effects using contrasts of parameter estimates
• Statistic – Compare estimated effects – the contrasts – with appropriate error measures – Are the effects surprisingly large?
Parameter estimation
Ordinary least squares
Y = Xβ + ε T −1 T ˆ β = (X X ) X Y
εˆ = Y − Xβˆ residuals residuals
Parameter Parameter estimates estimates
Estimate Estimateparameters parameters such suchthat that
N
∑ εˆ
2 t
t =1
minimal minimal
Least Leastsquares squares parameter parameterestimate estimate
Parameter estimation
Ordinary least squares
Y = Xβ + ε T −1 T ˆ β = (X X ) X Y
Parameter Parameter estimates estimates
εˆ = Y − Xβˆ residuals residuals==rr
2 Error Errorvariance varianceσσ2== (sum (sumof) of)squared squaredresiduals residuals standardised standardisedfor fordf df ==rrTTrr//df df (sum (sumof ofsquares) squares) …where …wheredegrees degreesof offreedom freedomdf df(assuming (assumingiid): iid): ==NN--rank(X) rank(X) (=N-P (=N-PififXXfull fullrank) rank)
Estimation (geometrical) Y1 x1 1 Y2 = x2 1 Y3 x3 1
β1
ε1 + ε2 β2 ε3
DATA (Y1, Y2, Y3) Y RESIDUALS ^ ε = (ε^1,ε^2,ε^3)T
Y = β1 × X1 + β2 × X2 + ε
X1 (x1, x2, x3)
^ Y
(1,1,1) O
X2
design space
^ ^ ,Y^ ,Y (Y 1 2 3)
GLM essentials • The model – Design matrix: Effects of interest – Design matrix: Confounds (aka effects of no interest) – Residuals (error measures of the whole model)
• Estimate effects and error for data – Parameter estimates (aka betas) – Quantify specific effects using contrasts of parameter estimates
• Statistic – Compare estimated effects – the contrasts – with appropriate error measures – Are the effects surprisingly large?
Inference - contrasts Y = Xβ + ε
A contrast = a linear combination of parameters: c´ x β à spm_con*img
t-test: one-dimensional contrast, difference in means/ difference from zero boxcar boxcarparameter parameter>>00?? Null Nullhypothesis: hypothesis: β1
=0
à spmT_000*.img SPM{t} map
F-test: tests multiple linear hypotheses – does subset of model account for significant variance Does Doesboxcar boxcarparameter parameter model modelanything? anything? Null Nullhypothesis: hypothesis:variance variance of oftested testedeffects effects==error error variance variance
à ess_000*.img SPM{F} map
t-statistic - example Y = Xβ + ε
c βˆ t= Stˆd (cT βˆ ) T
cc==1100000000000000000000
X
Contrast Contrastof ofparameter parameter estimates estimates Variance Varianceestimate estimate
Standard Standarderror errorof ofcontrast contrastdepends dependson onthe thedesign, design, and andisislarger largerwith withgreater greaterresidual residualerror errorand and ‚greater‘ ‚greater‘covariance/ covariance/autocorrelation autocorrelation Degrees Degreesof offreedom freedomd.f. d.f.then then==n-p, n-p,where wherenn observations, observations,ppparameters parameters Tests Testsfor foraadirectional directional difference differenceininmeans means
F-statistic - example Do Domovement movementparameters parameters(or (orother otherconfounds) confounds)account accountfor foranything? anything?
H0: True model is X0 H0: β3-9 = (0 0 0 0 ...) X0
X1 (β3−9)
This model ?
test H0 : c´ x β = 0 ?
X0
Or this one ?
Null Nullhypothesis hypothesisHH00:: That are Thatall allthese thesebetas betasββ3-9 3-9 are zero, zero,i.e. i.e.that thatno nolinear linear combination combinationof ofthe theeffects effects accounts accountsfor forsignificant significant variance variance This Thisisisaanon-directional non-directionaltest test
F-statistic - example Do Domovement movementparameters parameters(or (orother otherconfounds) confounds)account accountfor foranything? anything?
H0: True model is X0 H0: β3-9 = (0 0 0 0 ...) X0
X1 (β3−9)
X0 c’ =
This model ?
Or this one ?
test H0 : c´ x β = 0 ?
00100000 00010000 00001000 00000100 00000010 00000001
SPM{F}
Summary so far • The essential model contains – Effects of interest
• A better model? – A better model (within reason) means smaller residual variance and more significant statistics – Capturing the signal – later – Add confounds/ effects of no interest – Example of movement parameters in fMRI – A further example (mainly relevant to PET)…
Example PET experiment β2
β3
β4
rank(X)=3
β1
12 12scans, scans,33conditions conditions(1-way (1-wayANOVA) ANOVA) yyj ==xx1j bb1 ++xx2j bb2 ++xx3j bb3 ++xx4j bb4 ++eej j 1j 1 2j 2 3j 3 4j 4 j
where where(dummy) (dummy)variables: variables: xx1j ==[0,1] [0,1]==condition conditionAA(first (first44scans) scans) 1j xx2j ==[0,1] [0,1]==condition conditionBB(second (second44scans) scans) 2j xx3j ==[0,1] [0,1]==condition conditionCC(third (third44scans) scans) 3j xx4j ==[1] [1]== grand grandmean mean(session (sessionconstant) constant) 4j
xx x x oo xx xx x x oo o xx o o xx x oo o o x oo o o o global o gCBF o
rCBF
x x
xx
x xo o x o xo o
rCBF
o
AnCova
xx
xx
α k2
xx
xx
α k1
xx 1 oo
ζ k
oo
xx oo
oo
oo oo
global gCBF
x x x x x x o o o o o o
rCBF
g.. rCBF (adj)
•• May Maybe bevariation variationin inPET PETtracer tracerdose dose from fromscan scanto toscan scan •• Such Such“global” “global”changes changesin inimage image intensity intensity(gCBF) (gCBF)confound confoundlocal local// regional regional(rCBF) (rCBF)changes changesof of experiment experiment •• Adjust Adjustfor forglobal globaleffects effectsby: by: --AnCova AnCova(Additive (AdditiveModel) Model)--PET? PET? --Proportional --fMRI? ProportionalScaling Scaling fMRI? •• Can Canimprove improvestatistics statisticswhen when orthogonal orthogonalto toeffects effectsof ofinterest… interest… •• …but …butcan canalso alsoworsen worsenwhen wheneffects effects of ofinterest interestcorrelated correlatedwith withglobal global
rCBF
Global effects
x
Scaling
xx o x o x x xx o oo oo o o o o x
x xo
x
global 0
0
50
gCBF
Global effects (AnCova) β2
β3
β4
β5
rank(X)=4
β1
12 12scans, scans,33conditions, conditions,11confounding confoundingcovariate covariate yyj ==xx1j bb1 ++xx2j bb2 ++xx3j bb3 ++xx4j bb4 ++xx5j bb5 ++eej j 1j 1 2j 2 3j 3 4j 4 5j 5 j
where where(dummy) (dummy)variables: variables: xx1j ==[0,1] [0,1]==condition conditionAA(first (first44scans) scans) 1j xx2j ==[0,1] [0,1]==condition conditionBB(second (second44scans) scans) 2j xx3j ==[0,1] [0,1]==condition conditionCC(third (third44scans) scans) 3j xx4j ==[1] [1]== grand grandmean mean(session (sessionconstant) constant) 4j
xx5j ==global globalsignal signal(mean (meanover overall allvoxels) voxels) 5j
Global effects (AnCova) β3
β4
β1
•• Global Globaleffects effectsnot not accounted accountedfor for •• Maximum Maximumdegrees degrees of offreedom freedom(global (global uses usesone) one)
β2
β3
β4
β5
Correlated Correlatedglobal global β1
rank(X)=4
β2
rank(X)=3
β1
Orthogonal Orthogonalglobal global
•• Global Globaleffects effects independent independentof of effects effectsof ofinterest interest •• Smaller Smallerresidual residual variance variance •• Larger LargerTTstatistic statistic •• More Moresignificant significant
β2
β3
β4
β5
rank(X)=4
No NoGlobal Global
•• Global Globaleffects effects correlated correlatedwith with effects effectsof ofinterest interest •• Smaller Smallereffect effect&/or &/or larger largerresiduals residuals •• Smaller SmallerTTstatistic statistic •• Less Lesssignificant significant
Global effects (scaling) •• Two Twotypes typesof ofscaling: scaling:Grand GrandMean Meanscaling scalingand andGlobal Globalscaling scaling -- Grand GrandMean Meanscaling scalingisisautomatic, automatic,global globalscaling scalingisisoptional optional -- Grand GrandMean Meanscales scalesby by100/mean 100/meanover overall allvoxels voxelsand andALL ALLscans scans (i.e, (i.e,single singlenumber numberper persession) session) -- Global Globalscaling scalingscales scalesby by100/mean 100/meanover overall allvoxels voxelsfor forEACH EACHscan scan (i.e, (i.e,aadifferent differentscaling scalingfactor factorevery everyscan) scan) •• Problem Problemwith withglobal globalscaling scalingisisthat thatTRUE TRUEglobal globalisisnot not(normally) (normally) known… known…… …only onlyestimated estimatedby bythe themean meanover overvoxels voxels -- So Soififthere thereisisaalarge largesignal signalchange changeover overmany manyvoxels, voxels,the theglobal global estimate estimatewill willbe beconfounded confoundedby bylocal localchanges changes -- This Thiscan canproduce produceartifactual artifactualdeactivations deactivationsininother otherregions regionsafter after global globalscaling scaling •• Since Sincemost mostsources sourcesof ofglobal globalvariability variabilityin infMRI fMRIare arelow lowfrequency frequency (drift), (drift),high-pass high-passfiltering filteringmay maybe besufficient, sufficient,and andmany manypeople peopledo do not notuse useglobal globalscaling scaling
Summary • General(ised) linear model partitions data into – Effects of interest & confounds/ effects of no interest – Error
• Least squares estimation – Minimises difference between model & data – To do this, assumptions made about errors – more later
• Inference at every voxel – Test hypothesis using contrast – more later – Inference can be Bayesian as well as classical
• Next: Applying the GLM to fMRI