
Are semantic representations of words radically distributed?

Christopher Cox¹, Mark Seidenberg¹, Jeffery Binder², Rutvik Desai² & Timothy Rogers¹

1-University of Wisconsin—Madison, 2-Medical College of Wisconsin

Introduction

• Neuroimaging studies of semantic memory often suggest localization of conceptual domains in cortex.
• Connectionist models of semantic memory propose:
  ‣ A voxel can encode information about >1 domain.
  ‣ Adjacent voxels may encode different information.
  ‣ Distal regions can contribute to one representation.
  ‣ Experience determines what a given voxel encodes.
• We argue that univariate analyses are ill suited to detect such radically distributed signals should they exist in fMRI datasets. [1]
• We use sparse logistic regression (LASSO) to test the localist and distributed hypotheses with less bias.

fMRI Dataset: Methods

Participants: 14 right-handed subjects scanned at the Medical College of Wisconsin.

Stimuli: 1200 letter strings (900 words, 300 nonwords obeying English phonology). 600 words were concrete, 300 abstract. Of the concrete nouns, 94 were animals and 145 were artifacts.

Task: Respond to the question "Is it something physical you can experience with your senses?" by button-press for every stimulus.

Procedure: Letter strings were presented one at a time at jittered intervals in random order over the course of 10 scanner runs, with no repetitions, in a rapid event-related design.

Imaging and Preprocessing: Participants were scanned in a GE 3T scanner with a 2-second TR. The first 6 TRs from each run were discarded. Functional data were time-shifted, registered to the anatomy, and scaled prior to deconvolution.

Cortical Masks: Surface reconstructions for each subject's anatomy were created (Freesurfer[3]) and used to select only cortical voxels (i.e. the space between the pial and white matter ribbons) from the volumes of beta coefficients for each word and subject.

Multivariate: Methods

Deconvolution: Multiple linear regression with a gamma variate HRF. The GLM analysis included a regressor for each word and nonword stimulus, resulting in 1200 volumes with beta coefficients at each voxel for each subject.

Classification Tasks: In separate tasks, LASSO[4] is used to discover a sparse solution given λ to classify 900 words as animal/non-animal or artifact/non-artifact.

Voxel Pre-selection: Applied the cortical mask to the beta-volumes for each participant.

Cross Validation: The words were separated into 10 blocks, each balanced for number of animals, artifacts, and abstract words. Cross validation was performed at two levels: one to pick a good value of λ from 10 pre-specified choices without peeking, and another to evaluate the chosen λ for all train-test combinations of the 10 blocks of words.
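The two-level cross validation described above can be sketched as follows. This is a minimal illustration, not the poster's pipeline: the data are synthetic, the classifier is a ridge-style stand-in (`fit_ridge`, a hypothetical helper) rather than LASSO, and the block assignment and λ grid are assumptions chosen only to mirror "10 blocks" and "10 pre-specified choices".

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Stand-in linear classifier (ridge on +/-1 targets); NOT the poster's LASSO."""
    t = 2.0 * y - 1.0
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ t)

def accuracy(w, X, y):
    """Fraction of items whose predicted class (sign of X @ w) matches y."""
    return float(np.mean((X @ w > 0).astype(int) == y))

def nested_cv(X, y, block, lambdas):
    """Outer loop tests each block once; inner loop picks lambda from training blocks only."""
    ids = np.unique(block)
    outer_scores, chosen = [], []
    for test_id in ids:
        train_ids = ids[ids != test_id]
        inner = np.zeros(len(lambdas))
        # Inner loop: hold out one training block at a time to score each lambda.
        for val_id in train_ids:
            fit_mask = np.isin(block, train_ids[train_ids != val_id])
            val_mask = block == val_id
            for i, lam in enumerate(lambdas):
                w = fit_ridge(X[fit_mask], y[fit_mask], lam)
                inner[i] += accuracy(w, X[val_mask], y[val_mask])
        best = lambdas[int(np.argmax(inner))]
        # Refit on all training blocks with the chosen lambda, then test once.
        train_mask = block != test_id
        w = fit_ridge(X[train_mask], y[train_mask], best)
        test_mask = block == test_id
        outer_scores.append(accuracy(w, X[test_mask], y[test_mask]))
        chosen.append(best)
    return np.array(outer_scores), chosen

# Synthetic example (all sizes are assumptions, not the poster's data).
rng = np.random.default_rng(0)
n, p = 200, 20
y = np.tile([0, 1], n // 2)                   # alternating labels
X = rng.normal(size=(n, p))
X[:, :3] += 0.75 * (2 * y - 1)[:, None]       # three informative features
block = np.repeat(np.arange(10), n // 10)     # 10 blocks, each class-balanced
lambdas = np.logspace(-2, 3, 10)              # 10 pre-specified candidates
scores, chosen = nested_cv(X, y, block, lambdas)
print("mean held-out accuracy:", scores.mean())
```

The test block never influences the choice of λ, which is what "without peeking" requires; each outer fold may therefore settle on a different λ.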

Least Absolute Shrinkage and Selection Operator (LASSO)[5] regression applies an L1-norm regularizer to find a sparse solution.

$$\arg\min_{\beta} \; l(\beta) + \lambda \|\beta\|_1, \quad \text{where } \|\beta\|_1 = \sum_{j=1}^{p} |\beta_j|$$

SUMMARY: LASSO satisfies two conditions, which can be parametrically prioritized: (1) minimize prediction error (2) while keeping the sum of the absolute values of β low. For large enough λ many β_j's are zero, so the solution is sparse.
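To make the effect of λ concrete, here is a minimal sketch of L1-penalized logistic regression solved by proximal gradient descent (ISTA). It is an illustration under stated assumptions, not the solver used on the poster (which cites [4]): the loss l(β) is taken to be the mean logistic loss, there is no intercept, and the data, sizes, and λ values are synthetic choices.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink toward zero, clipping small values to 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_logistic(X, y, lam, n_iter=500):
    """Minimize mean logistic loss + lam * ||beta||_1 with ISTA (no intercept)."""
    n, p = X.shape
    # The gradient of the mean logistic loss is Lipschitz with constant
    # at most sigma_max(X)^2 / (4 n); use the reciprocal as the step size.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (prob - y) / n
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

# Synthetic data: 5 informative features out of 50 (assumed sizes).
rng = np.random.default_rng(1)
n, p = 200, 50
y = np.tile([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[:, :5] += 1.0 * (2 * y - 1)[:, None]

beta_sparse = lasso_logistic(X, y, lam=0.1)   # moderate penalty
beta_zero = lasso_logistic(X, y, lam=1.0)     # penalty large enough to zero everything
print("nonzero coefficients at lam=0.1:", np.count_nonzero(beta_sparse))
print("nonzero coefficients at lam=1.0:", np.count_nonzero(beta_zero))
```

The soft-threshold step is where exact zeros arise: any coefficient whose gradient magnitude stays below λ is never moved away from zero.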

Simulation: Methods

Created 1-D brains with 128 voxels for 30 “subjects”, with three structures.

[Figure: signal layout across space for Subj 1 and the subjects that follow. Visible panel labels: "Distributed", "Diff. Distrib."]

‣ Noise added to each voxel: N(µ = 0, σ = 1).
‣ Signal strength = 2.
‣ When applying a blur, FWHM = 4 voxels.
‣ Analyzed either by univariate method (t-test) or LASSO.
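A simulation of this kind might look like the sketch below. The voxel count, subject count, noise, signal strength, and FWHM come from the text; everything else is an assumption made for illustration: the number of trials (`N_TRIALS`), the number of signal-carrying voxels (`N_ACTIVE`), and the reading of "Distributed" as the same scattered voxels in every subject and "Diff. Distrib." as a different scattered set per subject. Only the univariate group analysis is shown.

```python
import numpy as np

N_VOX, N_SUBJ = 128, 30        # from the poster
SIGNAL, FWHM = 2.0, 4.0        # from the poster
N_TRIALS, N_ACTIVE = 40, 8     # assumptions for this sketch

def gaussian_kernel(fwhm):
    """Normalized 1-D Gaussian kernel; sigma = FWHM / (2 * sqrt(2 * ln 2))."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = int(np.ceil(3 * sigma))
    x = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def simulate_subject(active, rng, blur=False):
    """Return (trials x voxels) data and 0/1 condition labels for one subject."""
    labels = np.tile([0, 1], N_TRIALS // 2)
    data = rng.normal(0.0, 1.0, size=(N_TRIALS, N_VOX))   # N(0, 1) noise
    data[np.ix_(labels == 1, active)] += SIGNAL            # add signal to active voxels
    if blur:
        k = gaussian_kernel(FWHM)
        data = np.apply_along_axis(
            lambda row: np.convolve(row, k, mode="same"), 1, data)
    return data, labels

def group_t(shared, same_layout, rng):
    """One-sample t across subjects on each voxel's condition difference."""
    maps = []
    for _ in range(N_SUBJ):
        active = shared if same_layout else rng.choice(N_VOX, N_ACTIVE, replace=False)
        data, labels = simulate_subject(active, rng)
        maps.append(data[labels == 1].mean(0) - data[labels == 0].mean(0))
    maps = np.array(maps)
    return maps.mean(0) / (maps.std(0, ddof=1) / np.sqrt(N_SUBJ))

rng = np.random.default_rng(0)
shared = rng.choice(N_VOX, N_ACTIVE, replace=False)
t_same = group_t(shared, same_layout=True, rng=rng)    # same voxels in every subject
t_diff = group_t(shared, same_layout=False, rng=rng)   # different voxels per subject
blurred, _ = simulate_subject(shared, rng, blur=True)
print("min group t at shared signal voxels:", t_same[shared].min())
print("max group t with subject-specific layouts:", t_diff.max())
```

Under these assumptions the group-level t statistic is large wherever subjects share signal voxels and stays small when each subject carries the same amount of signal in different voxels, which is the situation a voxel-wise group test cannot see.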

[Figure: Simulation results. Panels: "Distributed", "Diff. Distributed". Axes: Hit Rate and False Rate (0 to 0.8). Conditions: Ind. vs. Group, Raw vs. Smooth. Legend: Univariate, LASSO.]

[Figure: legend labels "Max", "best", "Sparsity Helped", "Sparsity Hurt".]

t(13)=6.03,  sd=.16,  p 2.16, k > 20

References:
1. Rogers, Timothy T., Matthew A. Lambon Ralph, Peter Garrard, Sasha Bozeat, James L. McClelland, John R. Hodges, and Karalyn Patterson. “Structure and Deterioration of Semantic Memory: A Neuropsychological and Computational Investigation.” Psychological Review 111(1): 205–235, 2004.
2. Cox, RW. AFNI: Software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical Research, 29:162-173, 1996.
3. Cortical reconstruction and volumetric segmentation was performed with the Freesurfer image analysis suite, which is documented and freely available for download online (http://surfer.nmr.mgh.harvard.edu/).
4. Tomioka, R., Suzuki, T., Sugiyama, M. and Kashima, H. A Fast Augmented Lagrangian Algorithm for Learning Low-Rank Matrices. Proc. of the 27th Annual International Conference on Machine Learning (ICML 2010), Haifa, Israel, 2010.
5. Hastie, T., Tibshirani, R., and Friedman, J. “The Elements of Statistical Learning: Data Mining, Inference, and Prediction”. Second Edition, 2009.
6. Chao, Haxby, and Martin. “Attribute-based neural substrates in temporal cortex for perceiving and knowing about objects”. Nature Neuroscience, 2(10):913-919, 1999.

• LASSO provides evidence of radically distributed and overlapping code in areas implicated by disorder.[1]
• Each solution is likely a subset of the full system.
• Elastic nets blend LASSO and ridge regression, sparsity and inclusiveness.
• Constraining regression in different ways changes the preferred solution; different brain regions may encode semantics in different ways.
• Tuning elastic nets to prefer solutions of different types should identify regions thought to represent semantics in a corresponding way.

[Figure: two panels. Left: adapted from Chao, Haxby, and Martin (1999) [6]. Right: our results, including voxels in the solution of >3 participants.]
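The blend of LASSO and ridge regression mentioned above is commonly written with a mixing weight. The poster does not give a parameterization, so the form below, with the assumed symbol α ∈ [0, 1], is one standard convention rather than the authors' stated objective; l(β) and λ are as in the LASSO objective.

```latex
\arg\min_{\beta} \; l(\beta)
  + \lambda \left( \alpha \, \|\beta\|_1
  + \frac{1 - \alpha}{2} \, \|\beta\|_2^2 \right),
\qquad
\|\beta\|_2^2 = \sum_{j=1}^{p} \beta_j^2
```

Setting α = 1 recovers the LASSO penalty (sparse solutions), and α = 0 gives the ridge penalty (no coefficients driven exactly to zero). Intermediate values trade sparsity against inclusiveness, tending to keep groups of correlated voxels together rather than selecting one representative.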