Using genomics to unravel diversity and population structure of MRSA USA300 to an unprecedented depth

Janina DORDEL1, Matthew TG HOLDEN1,2, Brian G SPRATT3, Anne-Catrin UHLEMANN4, Susan S HUANG5, Julian PARKHILL1

1 Pathogen Genomics, Sanger Institute, Cambridge, UK
2 School of Medicine, University of St Andrews, St Andrews, UK
3 Department of Infectious Disease Epidemiology, Imperial College London, London, UK
4 Department of Medicine, Division of Infectious Diseases, Columbia University Medical Center, New York, NY, USA
5 Division of Infectious Diseases and Health Policy Research Institute, University of California Irvine School of Medicine, Irvine, CA, USA

INTRODUCTION

Over the past two decades, community-associated methicillin-resistant Staphylococcus aureus (CA-MRSA) strains have dramatically increased the global burden of S. aureus infections. The pandemic sequence type ST8/USA300 is the dominant CA-MRSA clone in the United States, but its evolutionary history and the basis for its biological success are incompletely understood. Here we present a detailed analysis of the distribution and diversity of mobile genetic elements and their associated virulence and resistance factors to reveal how these shape the population of USA300 and may have promoted its spread.

PHYLOGENY OF ST8

ISOLATE COLLECTION. In total, 1,453 community- and hospital-acquired ST8 isolates from Orange County, CA; Northern Manhattan, NY; San Diego, TX; San Francisco, CA; and Houston, TX were collected between 2003 and 2011.

Figure legend (Fig. 1). Geographical origin: Orange County, CA; San Francisco, CA; San Diego, TX; Houston, TX. SCCmec: IVa, IVb, IVc, IVg, MSSA. ACME.

DISTRIBUTION OF PLASMIDS

➩ rep16 and rep21 are most prevalent in USA300-like isolates, indicating clonal propagation
➩ no significant geographical prevalence in USA300
➩ sporadic acquisition and loss of all plasmids
➩ cluster A: lacks the prevalent rep21 plasmid
➩ cluster H: acquisition of rep7 plasmid (pUSA02)


PLASMID CLASSIFICATION. Plasmids were classified using conserved regions of the replication-initiating genes (rep) (Lozano et al. 2012).


Fig. 1: Phylogenetic tree of 1,453 ST8 isolates based on 37,323 core genome SNPs, rooted using the distantly related S. aureus isolate COL as an outgroup. The USA300 core clade is shaded in black, and the non-USA300 ST8 clades are in a lighter shade. Colors of branches and the inner ring indicate the geographical origin of the isolates; the following rings indicate SCCmec type and presence of ACME. Interesting clusters are shaded and named A - H.

➩ “bush” isolates are distantly related (includes MSSA and USA500)
➩ geographical clustering

The core clade: USA300-like isolates
➩ ~94 % of isolates cluster within a closely related core clade representing USA300-like isolates
➩ SCCmec IVa and ACME positive
➩ limited geographical and HA/CA clustering
➩ clusters A - H represent possibly successful and more recent clusters (all linked to the acquisition of mobile genetic elements)

Phylogenomics and phylogeography indicate a highly dynamic USA300-like MRSA population, even over very long distances.

DISTRIBUTION  AND  DIVERSITY  OF  PROPHAGES  

Fig. 4: Phylogenetic tree as in Fig. 1. Colors in the rings represent presence and absence of 11 plasmid/rep gene types.

DRUG RESISTANCE

IN SILICO ANTIBIOTIC RESISTANCE TESTING. Resistance against 16 antibiotic families was tested by detection of 53 acquired resistance determinants and 12 SNPs in housekeeping genes.
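The in silico test described above combines two kinds of evidence: presence of acquired determinants and resistance-conferring SNPs in housekeeping genes. The following is a minimal Python sketch of that idea, not the workflow used for this poster. The gene names are taken from the panels of this poster, but both lookup tables are hypothetical and heavily abbreviated, and the codon positions and residues are illustrative assumptions rather than the 12 SNPs actually screened.

```python
# Hypothetical, abbreviated lookup tables (assumptions for illustration only).
# The poster screened 53 acquired determinants and 12 SNPs; neither list is given.
ACQUIRED = {
    "aph3": "kanamycin",
    "msrA": "macrolide",
    "tetK": "tetracycline",
}

# (gene, codon position, resistant residue) -> antibiotic class.
# Positions and residues are illustrative placeholders.
POINT_MUTATIONS = {
    ("grlA", 80, "F"): "fluoroquinolone",
    ("gyrA", 84, "L"): "fluoroquinolone",
}


def predict_resistance(genes_present, residues):
    """Predict resistance classes for one isolate.

    genes_present: set of acquired gene names detected in the assembly.
    residues: dict mapping (gene, codon position) -> observed amino acid.
    Returns a sorted list of predicted antibiotic classes.
    """
    predicted = set()
    # Evidence 1: an acquired determinant is present.
    for gene in genes_present:
        if gene in ACQUIRED:
            predicted.add(ACQUIRED[gene])
    # Evidence 2: a housekeeping gene carries a resistance-associated residue.
    for (gene, position, residue), drug in POINT_MUTATIONS.items():
        if residues.get((gene, position)) == residue:
            predicted.add(drug)
    return sorted(predicted)


if __name__ == "__main__":
    isolate_genes = {"aph3", "msrA"}
    isolate_residues = {("grlA", 80): "F", ("gyrA", 84): "S"}
    print(predict_resistance(isolate_genes, isolate_residues))
```

Keeping the two evidence types in separate tables mirrors the split in the method description and makes it easy to report which kind of evidence supports each call.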

➩ drug resistance does not differ by geographical origin or CA/HA status
➩ sporadic acquisition of resistance determinants without further propagation for most antibiotic classes (for exceptions see box and text below)

Acquisition and subsequent clonal propagation of kanamycin, macrolide & fluoroquinolone resistance, together with core genome changes, may have promoted the success of the USA300-like isolates.

PROPHAGE CLASSIFICATION. Prophages were classified using conserved regions of the integrase genes (int) (Goerke et al. 2013). Presence and absence of the PVL genes were detected using an in silico PCR approach.
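An in silico PCR, as mentioned above, asks whether a primer pair would yield an amplicon on a given sequence. A minimal Python sketch under simplifying assumptions: exact primer matches only, forward strand only, and a hypothetical maximum product length (`max_len`). The primer sequences in the usage example are invented and are not real PVL primers.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template, fwd_primer, rev_primer, max_len=3000):
    """Return the predicted amplicon, or None if no product is expected.

    Simplifications (assumptions): exact matching, forward strand only,
    first forward site and first downstream reverse site are used.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the forward
    # strand we search for its reverse complement downstream of the forward site.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    amplicon = template[start:end + len(rev_site)]
    return amplicon if len(amplicon) <= max_len else None


if __name__ == "__main__":
    # Toy template and invented primers, for illustration only.
    toy_template = "CCCATGCCATTTTTGATTACAGGG"
    product = in_silico_pcr(toy_template, "ATGCCA", "TGTAATC")
    print("present" if product else "absent", product)
```

A production tool would also search the reverse strand and tolerate mismatches; both are left out here to keep the logic visible.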

Fig. 2: Phylogenetic tree as in Fig. 1. Colors in the outer rings represent presence and absence of 7 prophage types.

Fig. 3A: Distribution of SNPs along the φSa2int haplotype 7 prophage (shaded in red) with surrounding core genome areas (shaded in grey). The asterisk highlights the position of the PVL locus. B: Representation of the φSa2int genome. Prophage genes are colored according to their function.

➩ no significant geographical prevalence
➩ sporadic acquisition and loss of all prophages
➩ cluster A: acquisition of φSa5int
➩ subcluster within B: acquisition of φSa1int
➩ subcluster within C: acquisition of φSa5int
➩ cluster F: acquisition of φSa1int

The PVL-carrying prophage φSa2int
➩ diversity analysis reveals seven haplotypes, which match the phylogenetic tree of the ST8 host
➩ only haplotype 7 carries the PVL genes
➩ presence of haplotype 7 matches core clade and progenitor isolates
➩ high SNP density in φSa2int haplotype 7 compared to the core genome
➩ the PVL locus shows the lowest number of SNPs, highlighting the selective advantage of this virulence factor
➩ gene boundaries show lower SNP density, indicating the transfer of whole genes from a prophage gene pool
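The SNP density comparisons in the list above (prophage versus core genome, PVL locus versus the rest of the prophage) can be made concrete by counting SNPs in fixed windows. A minimal Python sketch, assuming SNP positions are available as 0-based integer coordinates; the window size and the toy coordinates are hypothetical and not taken from the poster.

```python
def snp_density(snp_positions, region_start, region_end, window=1000):
    """Compute SNPs per kb in consecutive windows over [region_start, region_end).

    Returns a list of (window_start, window_end, snps_per_kb) tuples.
    The last window may be shorter than `window`; its density is scaled
    to its actual length.
    """
    densities = []
    for w_start in range(region_start, region_end, window):
        w_end = min(w_start + window, region_end)
        count = sum(1 for pos in snp_positions if w_start <= pos < w_end)
        densities.append((w_start, w_end, 1000.0 * count / (w_end - w_start)))
    return densities


if __name__ == "__main__":
    # Toy SNP coordinates (assumption): dense in the first kb, sparse after.
    toy_snps = [5, 10, 120, 640, 1500]
    for w_start, w_end, per_kb in snp_density(toy_snps, 0, 2000):
        print(f"{w_start}-{w_end}: {per_kb:.1f} SNPs/kb")
```

Running the same function over the prophage interval and over flanking core genome intervals gives directly comparable per-kb values.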

TAKE HOME MESSAGES ABOUT USA300-LIKE ISOLATES
➩ highly dynamic population without geographical structure
➩ MGEs provide USA300-like isolates with resistance and virulence factors, which may shape the population and promote its spread
➩ no significant correlation between geography and the presence/absence or degree of diversification of MGEs
➩ examples of possibly successful clusters within the USA300-like clade show accumulation and further diversification of MGEs

Fig. 5: Phylogenetic tree as in Fig. 1. Colors in the outer rings represent, from inside to outside, the SCCmec type and determinants conferring resistance against antibiotics.

➩ kanamycin (aph3) and macrolide (msrA) resistance clusters in ST8 SCCmec-IVg and the USA300-like core clade, inferred to result from uptake of p18805-P03 (rep16, see plasmids)
➩ fluoroquinolone resistance clusters in ST8 SCCmec-IVg and a cluster within the USA300-like core clade, defined through different alleles in grlA and gyrA, respectively
➩ cluster F: additional fluoroquinolone resistance-conferring SNP in grlB
➩ subcluster within H: tetracycline resistance (tetK) through acquisition of pUSA02 (rep7, see plasmids)

PATHOGENICITY ISLAND 5

DETECTION OF SaPI5. SaPI5 was extracted from the whole genome alignment of mapped reads against USA300 FPR3757. A threshold of 85 % mapping coverage was used to detect the presence of SaPI5. The associated enterotoxin genes sek2 and seq2 were detected using an in silico PCR approach.
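The 85 % threshold above can be read as a simple presence call on the mapped island region. A minimal Python sketch, assuming that "mapping coverage" means the percentage of SaPI5 reference positions covered by at least `min_depth` reads; the poster does not define the measure, so both that definition and the `min_depth` parameter are assumptions.

```python
def mapping_coverage(depths, min_depth=1):
    """Percentage of positions with read depth >= min_depth.

    depths: per-position read depths across the SaPI5 reference region.
    """
    if not depths:
        return 0.0
    covered = sum(1 for depth in depths if depth >= min_depth)
    return 100.0 * covered / len(depths)


def sapi5_present(depths, threshold=85.0, min_depth=1):
    """Call SaPI5 present when mapping coverage reaches the threshold (in %)."""
    return mapping_coverage(depths, min_depth) >= threshold


if __name__ == "__main__":
    # Toy depth profiles (assumption): 90 % and 80 % of positions covered.
    mostly_covered = [12] * 90 + [0] * 10
    partly_covered = [12] * 80 + [0] * 20
    print(mapping_coverage(mostly_covered), sapi5_present(mostly_covered))
    print(mapping_coverage(partly_covered), sapi5_present(partly_covered))
```

Reporting the percentage alongside the presence call keeps the information needed to spot degraded or diverged islands among isolates that still pass the threshold.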

Fig. 6: Phylogenetic tree as in Fig. 1. Colors in the inner ring represent the presence and diversity (inferred by mapping %) of SaPI5. Rings 2 and 3 show the presence of the associated enterotoxins.

➩ presence of SaPI5 matches core clade and progenitor isolates
➩ sporadic loss of SaPI5 and enterotoxins sek2 and seq2
➩ degree of mapping coverage reveals degradation and/or further diversification from the reference for clusters A, F & G and a subcluster within B

The combination of mobile genetic elements and core genome SNPs provides USA300 with selective advantages and promotes its spread.