SHORT COMMUNICATION
doi: 10.1111/age.12416
Selection signatures in Shetland ponies
M. Frischknecht*†‡, C. Flury‡§, T. Leeb†‡, S. Rieder*‡ and M. Neuditschko*‡
*Agroscope, Swiss National Stud Farm, Les Longs-Prés, 1580 Avenches, Switzerland.
†Vetsuisse Faculty, Institute of Genetics, University of Bern, Bremgartenstrasse 109a, 3012 Bern, Switzerland.
‡Swiss Competence Center of Animal Breeding and Genetics, University of Bern, Bern University of Applied Sciences HAFL & Agroscope, Bremgartenstrasse 109a, 3001 Bern, Switzerland.
§School of Agricultural, Forest and Food Sciences, Bern University of Applied Sciences, Länggasse 85, 3052 Zollikofen, Switzerland.
Summary
Shetland ponies were selected for numerous traits including small stature, strength, hardiness and longevity. Despite the different selection criteria, Shetland ponies are well known for their small stature. We performed a selection signature analysis including genome-wide SNPs of 75 Shetland ponies and 76 large-sized horses. Based upon this dataset, we identified a selection signature on equine chromosome (ECA) 1 between 103.8 Mb and 108.5 Mb. A total of 33 annotated genes are located within this interval including the IGF1R gene at 104.2 Mb and the ADAMTS17 gene at 105.4 Mb. These two genes are well known to have a major impact on body height in numerous species including humans. Homozygosity mapping in the Shetland ponies identified a region with increased homozygosity between 107.4 Mb and 108.5 Mb. None of the annotated genes in this region have so far been associated with height. Thus, we cannot exclude the possibility that the identified selection signature on ECA1 is associated with some trait other than height, for which Shetland ponies were selected.
Keywords Composite Selection Score, Horse
Address for correspondence
M. Neuditschko, Agroscope, Swiss National Stud Farm, Les Longs-Prés, 1580 Avenches, Switzerland. E-mail: [email protected]
Accepted for publication 13 December 2015
© 2016 Stichting International Foundation for Animal Genetics

Horse breeds are characterized by numerous different traits, including conformation, performance and behavior. Height has been a trait under intense selection from the beginning of domestication until now (Makvandi-Nejad et al. 2012). Consequently, modern horses vary greatly in height, from about 70 cm at the withers (e.g. Mini Shetland ponies) up to 2 m (e.g. Percheron and Shire horses). A genome-wide association study involving multiple breeds including Shetland ponies revealed that four loci located on ECA3, ECA6, ECA9 and ECA11 explain about 83% of the size variation in horses (Makvandi-Nejad et al. 2012). Three of the four reported loci contain candidate genes for height: LCORL on ECA3, HMGA2 on ECA6 and ZFAT on ECA9. The LCORL/NCAPG and ZFAT genes had previously been reported to control size variation in the Franches-Montagnes (FM) horse breed (Signer-Hasler et al. 2012). Additionally, the LCORL/NCAPG locus has been found to be associated with height in Warmblood and Thoroughbred horses (Metzger et al. 2013; Tetens et al. 2013; Boyko et al. 2014). The fourth height QTL was located on ECA11 next to the LASP1 gene, but it is not clear whether LASP1 really is the causative gene.

To identify putative selection signatures associated with height/body size, we collected genome-wide SNP genotypes for a total of 75 Shetland ponies (48 samples genotyped on the Illumina Equine SNP70 chip and an additional 27 samples genotyped on the Illumina Equine SNP50 chip) and 76 large-sized horses consisting of 24 Clydesdale, 22 Percheron and 30 Belgian Draft horses. The 50K genotype information of the 27 Shetland ponies and the 76 large-sized horses was derived from Petersen et al. (2013b) (www.animalgenome.org/repository/pub/UMN2012.1130/). The draft horses included in this study are phylogenetically more closely related to Shetland ponies than other large-sized breeds (e.g. Warmblood and Thoroughbred) (Petersen et al. 2013a). Thus, this selection of horses provided an optimal reference population to investigate selection signatures in Shetland ponies associated with height. We did not observe any stratification between the two independently sampled Shetland pony datasets in a principal components analysis using 30 709 common SNPs (Fig. S1). For the final analysis of the multibreed dataset, we used quality control (QC) filtered SNPs, excluding markers with a minor allele frequency of less than 0.05, a fraction of missing genotypes greater than 0.1 or a deviation from Hardy–Weinberg equilibrium (P < 10⁻⁷). After pruning the dataset, 151 samples and 33 805 SNPs were included in the selection signature analysis. Additionally, we investigated whether SNPs with alternatively fixed alleles within the pony and large horse populations were removed during the QC filtering.

Figure 1 Selection signature analysis: (a) Result of the composite selection score statistics. (b) Interval of homozygosity on ECA1. Homozygous haplotypes are indicated as horizontal black lines. A homozygous interval shared among 50 Shetland ponies is indicated in red and spanned from 107 419 578 to 109 678 693 on chromosome 1. (c) Genes in the region from 103.8 Mb to 109.7 Mb.

In this study, we applied the composite selection score (CSS) method to detect selection signatures within our multibreed dataset (Randhawa et al. 2014). The CSS method takes as input the results of different test statistics that are commonly applied to identify selection signatures. Here, we decided to work with four different test statistics, including FST statistics (Holsinger & Weir 2009), cross-population haplotype homozygosity (XP-EHH) (Sabeti et al. 2007), integrated haplotype score (iHS) (Sabeti et al. 2002; Voight et al. 2006) and composite likelihood ratio (CLR) (Nielsen et al. 2005). It should be noted that three of the four test statistics (FST, XP-EHH and CLR) detect selection signatures between populations (test vs. reference population), whereas the iHS statistic computes selection signatures within populations. Therefore, we applied the iHS statistic only on the Shetland pony dataset, which included 75 samples and
30 709 SNPs. For the other three methods, we included the collection of the 76 large horses as a reference population (Fig. S2). Finally, we combined the results of the four test statistics and calculated the CSS per SNP, as previously described (Randhawa et al. 2014). To minimize the number of false positives and the large standard variations, selection signatures are commonly averaged over windows. Following Petersen et al. (2013b), we used windows of 500 kb and considered only windows containing at least four SNPs. Our dataset contained 5931 overlapping windows.

Applying the CSS method to our filtered multibreed dataset, the top windows were located on ECA1 between 103.8 Mb and 108.5 Mb (Fig. 1a). A total of 33 annotated genes are located within this region. Two of these genes, IGF1R at 104.2 Mb and ADAMTS17 at 105.4 Mb, were previously associated with height (Lango Allen et al. 2010). While running the single FST test statistics on the unfiltered multibreed dataset, we identified an additional selection signature based on two SNPs on ECA3 including the LCORL/NCAPG locus (Fig. S3). Investigating the genotype frequencies of these two SNPs in our samples revealed that both SNPs are close to fixation for alternative alleles within the large horses and Shetland ponies; therefore, these SNPs were removed during the QC filtering.

In order to refine the map position of the selection signature on ECA1, we additionally performed homozygosity mapping based upon unfiltered Shetland pony data using the --homozyg function as implemented in PLINK (Purcell et al. 2007), followed by visual inspection of the top interval. We identified a homozygous region within Shetland ponies between 107.4 Mb and 109.7 Mb on ECA1 (Fig. 1b). Of the 75 Shetland ponies, 50 were homozygous for this region. The region shared between the selection signature and the homozygous interval spans from 107.4 Mb to 108.5 Mb and includes 15 annotated genes (Fig. 1c). None of these genes have so far been associated with height.

Strong selection for small height in Shetland ponies might have led to the identified selection signature. If this is the case, the candidate genes under selection are most likely IGF1R and ADAMTS17, given that both genes have previously been associated with human height (Lango Allen et al. 2010) and IGF1R has additionally been associated with dwarfism in cattle (Blum et al. 2007). However, the region of these two genes does not overlap with the most pronounced homozygosity window. One possibility could be that unknown genes located within the overlapping interval are associated with the small size of Shetland ponies. In the FM horse breed, ECA1 explained a major fraction of the variance observed in height, suggesting that this chromosome harbors not-yet-identified important genes for height (Signer-Hasler et al. 2012).

Another possibility is that the selection signature is associated with another trait. In this context, the FAM189A1 gene would be a perfect candidate for selection. This gene has previously been associated with human body mass index (http://www.ncbi.nlm.nih.gov/gap/phegeni?tab=1&gene=23359). It is known that Shetland ponies are very feed-efficient horses, and therefore, it is plausible that a gene associated with weight gain in humans might have been under selection within this breed. Further research is needed to identify the trait associated with the selection signature with more confidence.
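The window-averaging step used in the analysis (500 kb windows, keeping only windows with at least four SNPs) can be illustrated with a minimal sketch. This is not the CSS script used in the study. The function name `window_means`, the 250 kb step between overlapping windows and the toy positions and scores are all assumptions made for illustration; the text states only the window size and the minimum SNP count.

```python
from statistics import mean

WINDOW_BP = 500_000  # window size given in the text
MIN_SNPS = 4         # minimum number of SNPs per window given in the text
STEP_BP = 250_000    # ASSUMPTION: the step between overlapping windows is not stated


def window_means(positions, scores, window_bp=WINDOW_BP,
                 step_bp=STEP_BP, min_snps=MIN_SNPS):
    """Average per-SNP scores over sliding windows on a single chromosome.

    positions: SNP positions in base pairs
    scores:    one score per SNP (e.g. a per-SNP selection score)
    Returns (start, end, n_snps, mean_score) for every window [start, end)
    that contains at least min_snps SNPs.
    """
    if len(positions) != len(scores):
        raise ValueError("positions and scores must have the same length")
    snps = sorted(zip(positions, scores))
    if not snps:
        return []
    last_pos = snps[-1][0]
    windows = []
    start = 0
    while start <= last_pos:
        end = start + window_bp
        # Simple linear scan per window; adequate for a sketch.
        in_window = [score for pos, score in snps if start <= pos < end]
        if len(in_window) >= min_snps:
            windows.append((start, end, len(in_window), mean(in_window)))
        start += step_bp
    return windows


# Toy data (hypothetical values, not taken from the study).
toy_positions = [100_000, 200_000, 300_000, 400_000, 600_000, 900_000]
toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
for window in window_means(toy_positions, toy_scores):
    print(window)
```

With the toy data, only the first window holds four SNPs, so the sparser windows further along are dropped. This is how the minimum SNP count keeps sparsely covered windows out of the ranking.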
Acknowledgements
The authors would like to thank Dr. Imtiaz Randhawa for providing the CSS script. We would also like to thank Dr. Regula Hauswirth, Dr. Doreen Becker and Dr. Iris Bachmann for the collection of blood samples of the 48 Shetland ponies.
References
Blum J.W., Elsasser T.H., Dreger D.L., Wittenberg S., de Vries F. & Distl O. (2007) Insulin-like growth factor type-1 receptor downregulation associated with dwarfism in Holstein calves. Domestic Animal Endocrinology 33, 245–68.
Boyko A.R., Brooks S.A., Behan-Braman A. et al. (2014) Genomic analysis establishes correlation between growth and laryngeal neuropathy in Thoroughbreds. BMC Genomics 15, 259.
Holsinger K.E. & Weir B.S. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nature Reviews Genetics 10, 639–50.
Lango Allen H., Estrada K., Lettre G. et al. (2010) Hundreds of variants clustered in genomic loci and biological pathways affect human height. Nature 467, 832–8.
Makvandi-Nejad S., Hoffman G.E., Allen J.J. et al. (2012) Four loci explain 83% of size variation in the horse. PLoS One 7, e39929.
Metzger J., Schrimpf R., Philip U. & Distl O. (2013) Expression levels of LCORL are associated with body size in horses. PLoS One 8, e56497.
Nielsen R., Williamson S., Kim Y., Hubisz M.J., Clark A.G. & Bustamante C. (2005) Genomic scans for selective sweeps using SNP data. Genome Research 15, 1566–75.
Petersen J.L., Mickelson J.R., Cothran E.G. et al. (2013a) Genetic diversity in the modern horse illustrated from genome-wide SNP data. PLoS One 8, e54997.
Petersen J.L., Mickelson J.R., Rendahl A.K. et al. (2013b) Genome-wide analysis reveals selection for important traits in domestic horse breeds. PLoS Genetics 9, e1003211.
Purcell S., Neale B., Todd-Brown K. et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559–75.
Randhawa I.A.S., Khatkar M.S., Thomson P.C. & Raadsma H.W. (2014) Composite selection signals can localize the trait specific genomic regions in multi-breed populations of cattle and sheep. BMC Genetics 15, 34.
Sabeti P.C., Reich D.E., Higgins J.M. et al. (2002) Detecting recent positive selection in the human genome from haplotype structure. Nature 419, 832–7.
Sabeti P.C., Varilly P., Fry B. et al. (2007) Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–8.
Signer-Hasler H., Flury C., Haase B., Burger D., Simianer H., Leeb T. & Rieder S. (2012) A genome-wide association study reveals loci influencing height and other conformation traits in horses. PLoS One 7, e37282.
Tetens J., Widmann P., Kühn C. & Thaller G. (2013) A genome-wide association study indicates LCORL/NCAPG as a candidate locus for withers height in German Warmblood horses. Animal Genetics 44, 467–71.
Voight B.F., Kudaravalli S., Wen X. & Pritchard J.K. (2006) A map of recent positive selection in the human genome. PLoS Biology 4, e72.
Supporting information
Additional supporting information may be found in the online version of this article.
Figure S1 Principal components analysis of Shetland pony (SHP) genotypes originating from two different microarrays.
Figure S2 Selection signature analysis following different test statistics.
Figure S3 FST statistics applied on unfiltered data reveals two SNPs on ECA3 in the region of LCORL/NCAPG.