Using the 90k Buffalo SNP array

Ezequiel Luis Nicolazzi, Curtis P. VanTassell, Daniela Iamartino, James M. Reecy, Eric Fritz-Waters, Tad S. Sonstegard, James E. Koltes, Steven G. Schroeder, Ali Ahmad, Jose Fernando Garcia, Luigi Ramunno, Gianfranco Cosenza, John Williams and the International Buffalo Consortium

From SNP array data to application
• International effort to produce the first Buffalo SNP array (Affymetrix Axiom® technology) in the framework of the International Buffalo Consortium, coordinated by PTP. Presented at PAG XXI.
• And then?
  – Genotyping of (a lot of) individuals
  – Extraction of SNP genotypes from raw Affymetrix files
  – QC
  – Genomic analyses

Starting data
A total of 1605 individuals genotyped with the Affymetrix Axiom Buffalo Genotyping Array.
Array: 90k SNPs, 123k probes genotyped (~33k SNPs covered by two probes)
Starting dataset:
  – 1376 River Buffalo (Bubalus bubalis)
      • 10 countries (ITA, BRA, COL, EGY, IRN, MOZ, PHL, PAK, ROM, TUR)
  – 206 Swamp Buffalo (Bubalus bubalis)
      • 6 countries (IDN, PHL, BRA, CHN, PHL, THA)
  – 15 South African Cape Buffalo (Syncerus caffer)
  – 15 Indonesian Anoa (Bubalus depressicornis)

Extract SNP array data
• Affymetrix provides raw intensity files (named “CEL” files) that have to be QC’d to obtain genotype files.
• Software is/may be an issue
  – Windows: Affymetrix Genotyping Console (GUI), automated.
  – Linux/Unix/Mac: Affymetrix Power Tools (command line) + SNPolisher R package. Step-by-step procedure, not automated.
  – Probably not an issue anymore: see Affy’s workshop (MON) for news about this!

AffyPipe
https://github.com/nicolazzie/AffyPipe
Nicolazzi et al. (2014) Bioinformatics

• Created for Buffalo, but extended to any species genotyped with Axiom technology
• Open-source (many improvements came from the Affy-users community!)
• Features:
  – Avoids step-by-step procedure
  – Does not require ANY programming skills
  – Standardizes the workflow
  – Automatic editing of individuals
  – Automatic editing of SNP probes
  – Output file in PLINK format (A/B or A/C/G/T)
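To illustrate what the two PLINK output codings mean, here is a minimal sketch that turns A/B calls into nucleotide alleles and writes one .ped row. It is not AffyPipe's own code: the `ALLELE_MAP` annotation, the SNP names and both helper functions are hypothetical, and only the .ped column layout (six leading columns, `0` for a missing allele) follows the PLINK text format.

```python
# Hypothetical annotation: the nucleotides behind each SNP's A and B alleles.
ALLELE_MAP = {
    "snp_1": ("A", "G"),
    "snp_2": ("C", "T"),
}


def recode_ab(snp_id, call, missing="0"):
    """Translate an A/B genotype call ('AA', 'AB', 'BB') into two nucleotides."""
    allele_a, allele_b = ALLELE_MAP[snp_id]
    lookup = {"A": allele_a, "B": allele_b}
    if call not in ("AA", "AB", "BB"):
        # Anything else (e.g. a no-call) becomes PLINK's missing allele code.
        return (missing, missing)
    return (lookup[call[0]], lookup[call[1]])


def ped_line(family, sample, calls):
    """Build one PLINK .ped row: six leading columns, then two alleles per SNP."""
    # Unknown parents (0 0), unknown sex (0), missing phenotype (-9).
    fields = [family, sample, "0", "0", "0", "-9"]
    for snp_id, call in calls:
        fields.extend(recode_ab(snp_id, call))
    return " ".join(fields)


print(ped_line("RIVER", "ind1", [("snp_1", "AB"), ("snp_2", "BB")]))
```

The A/B coding is the safer choice when no reliable allele annotation is at hand, since the recoding step depends entirely on that annotation being correct.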

 

First QC editing (AffyPipe)
Default thresholds (Affymetrix Best Practice)

Class    River     Swamp     Cape      Anoa
PHR      74.76%    48.34%     0.85%     2.55%
MHR       6.83%     6.65%    72.26%    55.89%
VINO      0.79%     2.1%      1.19%     1.49%
NMH       1.23%    23.85%     3.43%     8.32%
CRBT      4.55%     3.41%     9.43%    20.72%
Other    11.83%    15.65%    12.84%    11.03%

“High quality” probe classes were shown in green. From: Affymetrix Best Practice manual
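The table can be summarised as a share of “high quality” probes per species. The sketch below assumes that the three “high quality” classes are PHR, MHR and NMH; the slides mention three such classes but the colour coding did not survive, so treat that set as an assumption. The percentages are copied from the table, while the variable and function names are illustrative.

```python
# Percentages copied from the QC table (probe classification per species).
CLASS_PERCENT = {
    "River": {"PHR": 74.76, "MHR": 6.83, "VINO": 0.79, "NMH": 1.23, "CRBT": 4.55, "Other": 11.83},
    "Swamp": {"PHR": 48.34, "MHR": 6.65, "VINO": 2.10, "NMH": 23.85, "CRBT": 3.41, "Other": 15.65},
    "Cape":  {"PHR": 0.85, "MHR": 72.26, "VINO": 1.19, "NMH": 3.43, "CRBT": 9.43, "Other": 12.84},
    "Anoa":  {"PHR": 2.55, "MHR": 55.89, "VINO": 1.49, "NMH": 8.32, "CRBT": 20.72, "Other": 11.03},
}

# Assumption: these are the three "high quality" classes.
HIGH_QUALITY = ("PHR", "MHR", "NMH")


def high_quality_share(species):
    """Sum the percentages of the assumed high-quality classes for one species."""
    row = CLASS_PERCENT[species]
    return round(sum(row[cls] for cls in HIGH_QUALITY), 2)


for species, row in CLASS_PERCENT.items():
    total = sum(row.values())  # sanity check: each column should add up to ~100%
    print(f"{species:6s} high quality: {high_quality_share(species):6.2f}%  (total {total:.2f}%)")
```

Under that assumption the usable share is similar across species, but its composition differs: mostly polymorphic (PHR) probes in River Buffalo, mostly monomorphic (MHR) probes in Cape Buffalo and Anoa.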

Problems encountered
• Unstable number of “high quality” SNP probes obtained from extractions on different plates and batches (e.g. more plates processed at different times)
• Two probes corresponding to the same SNP giving inconsistent (sometimes opposite) genotype calls.
Both were studied in River Buffalo (largest number of samples).

1) Unstable extraction of genotypes
• Not an “issue” per se. Intrinsic to the Affymetrix procedure of SNP extraction (genotypes are assigned using a Bayesian model)
• Low number of individuals per plate (max 96)
• Repeatability: 99.99%
• Much more stable “consensus” results (e.g. the same probes falling in the 3 “high quality” classes) after >500 individuals

SOLUTION: Combine multiple plates in one single extraction … use as many samples as possible!
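One way to quantify how stable the probe classification is across extractions is to intersect the sets of “high quality” probes from each run. This is a sketch on toy data, not the procedure used for the slides: the probe names and batches are invented, and the `HIGH_QUALITY` set (PHR, MHR, NMH) is an assumption.

```python
# Assumption: the three "high quality" classes.
HIGH_QUALITY = {"PHR", "MHR", "NMH"}


def high_quality_set(classification):
    """Probes whose class is in the assumed high-quality set."""
    return {probe for probe, cls in classification.items() if cls in HIGH_QUALITY}


def consensus(extractions):
    """Return (probes high quality in every extraction, stable share).

    The share is the number of always-high-quality probes divided by the
    number of probes that were high quality in at least one extraction.
    """
    sets = [high_quality_set(c) for c in extractions]
    if not sets:
        return set(), 0.0
    stable = set.intersection(*sets)
    ever = set.union(*sets)
    share = len(stable) / len(ever) if ever else 0.0
    return stable, share


# Toy example: two extractions (e.g. two batches of plates) of the same probes.
batch_1 = {"p1": "PHR", "p2": "MHR", "p3": "Other", "p4": "NMH"}
batch_2 = {"p1": "PHR", "p2": "CRBT", "p3": "Other", "p4": "NMH"}
stable, share = consensus([batch_1, batch_2])
print(sorted(stable), round(share, 2))
```

A stable share close to 1 would indicate that adding further plates no longer changes which probes pass, which is the behaviour reported here once more than 500 individuals are extracted together.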

2) Inconsistent probes
Using full data (all individuals) extraction.

Concordant probe pair              Discordant probe pair
PROBE 1.a => genotypes “BB”        PROBE 2.a => genotypes “AA”
PROBE 1.b => genotypes “BB”        PROBE 2.b => genotypes “BB”

Most cases are within-class (mainly monomorphic SNPs).
Few are “across genotype class” (1 probe PHR, 1 probe MHR or NMH).

Not an issue in nearly all genomic analyses.
BIG issue in biodiversity analyses, especially across buffalo species/populations!
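The concordant/discordant distinction can be checked per probe pair by comparing calls sample by sample. The following is a minimal sketch with invented sample names and calls; `None` stands for a missing call, and “opposite” means one probe calls AA where the other calls BB.

```python
def compare_probe_pair(calls_a, calls_b):
    """Count compared, discordant and opposite-homozygote calls for two probes.

    Each argument maps sample id -> 'AA', 'AB', 'BB' or None (no call).
    Samples with a missing call on either probe are skipped.
    """
    compared = discordant = opposite = 0
    for sample, g_a in calls_a.items():
        g_b = calls_b.get(sample)
        if g_a is None or g_b is None:
            continue
        compared += 1
        if g_a != g_b:
            discordant += 1
            if {g_a, g_b} == {"AA", "BB"}:
                opposite += 1
    return {"compared": compared, "discordant": discordant, "opposite": opposite}


# Toy data for a discordant pair such as PROBE 2.a / PROBE 2.b.
probe_2a = {"ind1": "AA", "ind2": "AA", "ind3": "AB", "ind4": None}
probe_2b = {"ind1": "BB", "ind2": "AA", "ind3": "BB", "ind4": "BB"}
result = compare_probe_pair(probe_2a, probe_2b)
print(result)
```

Counting opposite homozygotes separately is useful because those are the calls that distort allele frequencies most, and therefore matter most in comparisons across species or populations.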

2) Inconsistent probes
Extent of the problem:
• From the “original” 33k double-probes/SNP, 20k had both probes with a “high quality” classification.
• From these, […] 5 individual genotypes.

Again, lots of work/conference calls with Affymetrix R&D people.
Some better results with a more stringent “intensity” cutoff during genotype extraction.
Still, no final solution to this issue.

PATCH: Identify the “bestprobe” of the two in the largest population and consider only that probe throughout the comparison (especially if multiple species/pops are considered!).
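The patch can be sketched as two steps: choose one probe per SNP in the largest population, then reuse that choice for every other population. The slides do not say how the “bestprobe” is ranked, so the criterion below (highest call rate, ties broken alphabetically) is an assumption, as are all names and the toy data.

```python
def call_rate(calls):
    """Fraction of non-missing calls (None marks a missing call)."""
    calls = list(calls)
    if not calls:
        return 0.0
    return sum(c is not None for c in calls) / len(calls)


def pick_bestprobes(reference_calls):
    """Choose one probe per SNP in the reference (largest) population.

    reference_calls maps snp id -> {probe id: list of calls}.
    Assumed criterion: highest call rate; sorted() makes ties deterministic.
    """
    best = {}
    for snp_id, probes in reference_calls.items():
        best[snp_id] = max(sorted(probes), key=lambda p: call_rate(probes[p]))
    return best


def keep_bestprobes(best, population_calls):
    """Reduce another population's data to the same chosen probe per SNP."""
    return {
        snp_id: population_calls[snp_id][probe]
        for snp_id, probe in best.items()
        if snp_id in population_calls and probe in population_calls[snp_id]
    }


# Toy data: River Buffalo as the largest population, Swamp as a second one.
river = {
    "snp_1": {"probe_a": ["AA", "AB", None, "BB"], "probe_b": ["AA", "AB", "AB", "BB"]},
    "snp_2": {"probe_a": ["BB", "BB"], "probe_b": ["BB", "BB"]},
}
swamp = {"snp_1": {"probe_a": ["AA"], "probe_b": ["BB"]}}

best = pick_bestprobes(river)
print(best)
print(keep_bestprobes(best, swamp))
```

Whatever ranking is used, the key point of the patch is the second step: the choice is made once and then applied unchanged to every population in the comparison.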

Conclusions
• SNP chip successfully tested on Buffalo (see analyses later on).
• A bioinformatics pipeline, named AffyPipe, built to automatically extract genotypes in a Linux/Mac environment
• Affymetrix is working on a new (multi-platform?) extraction suite (beta-tested)
• 2 main issues encountered, partly solved.
• Advice to users:
  – Know the technology and read Affy’s documentation carefully!
  – Know your goals (the issues are not a problem for most common analyses)
  – Extract genotypes with the highest number of individuals possible
  – If comparing multiple populations, always consider the same “bestprobe” probes

Thank you for your attention!
