
doi: 10.1111/age.12580

The evolutionary history of the DMRT3 ‘Gait keeper’ haplotype

E. A. Staiger*1, M. S. Almén*, M. Promerová*2, S. Brooks†, E. G. Cothran‡, F. Imsland*, K. Jäderkvist Fegraeus§, G. Lindgren§, H. Mehrabani Yeganeh¶, S. Mikko§, J. L. Vega-Pla**, T. Tozaki††, C. J. Rubin* and L. Andersson*‡§

*Department of Medical Biochemistry and Microbiology, Uppsala University, SE-75123 Uppsala, Sweden.
†Department of Animal Science, University of Florida, Gainesville, FL 32611-0910, USA.
‡Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77843-4458, USA.
§Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, SE-75007 Uppsala, Sweden.
¶Department of Animal Science, University of Tehran, 54500 Tehran, Iran.
**Laboratorio de Investigación Aplicada, Cría Caballar de las Fuerzas Armadas, 14080 Córdoba, Spain.
††Genetic Analysis Department, Laboratory of Racing Chemistry, Tochigi 320-0851, Utsunomiya, Japan.

Summary

A previous study revealed a strong association between the DMRT3:Ser301STOP mutation in horses and alternate gaits as well as performance in harness racing. Several follow-up studies have confirmed a high frequency of the mutation in gaited horse breeds and an effect on gait quality. The aim of this study was to determine when and where the mutation arose, to identify additional potential causal mutations and to determine the coalescence time for contemporary haplotypes carrying the stop mutation. We utilized sequences from 89 horses representing 26 breeds to identify 102 SNPs encompassing the DMRT3 gene that are in strong linkage disequilibrium with the stop mutation. These 102 SNPs were genotyped in an additional 382 horses representing 72 breeds, and we identified 14 unique haplotypes. The results provided conclusive evidence that DMRT3:Ser301STOP is causal, as no other sequence polymorphisms showed an equally strong association to locomotion traits. The low sequence diversity among mutant chromosomes demonstrated that they must have diverged from a common ancestral sequence within the last 10 000 years. Thus, the mutation occurred either just before domestication or more likely some time after domestication and then spread across the world as a result of selection on locomotion traits.

Keywords: domestication, donkey, horse, locomotion, Przewalski’s horse

Address for correspondence
L. Andersson, Department of Medical Biochemistry and Microbiology, Uppsala University, Box 582, SE-75123 Uppsala, Sweden. E-mail: [email protected]
1 Present address: Department of Animal Science, Cornell University, Ithaca, NY 14853, USA
2 Present address: Department of Archaeogenetics, Max Planck Institute for the Science of Human History, 07745 Jena, Germany
Accepted for publication 25 May 2017
© 2017 Stichting International Foundation for Animal Genetics

Introduction

No other species displays the same level of congenital variation in locomotion traits as the horse. Based upon speed and footfall patterns, the horse has four major gait classifications. The fastest gait is the gallop, a four-beat asymmetrical gait with a moment of suspension at speeds of 9–20 m/s (Barrey 2013). A slower variation of the gallop is the canter, a three-beat asymmetrical gait with a moment of suspension at 2.9–9 m/s and considered a separate gait for competitions (Barrey 2013). The slowest gait is the walk, an even four-beat symmetrical gait with no suspension (two or three legs on the ground at any given time) at speeds of 1.2–1.8 m/s (Barrey 2013). Gaits other than the aforementioned gaits, performed at speeds generally faster than the walk, are known as the intermediate gaits; these include among others trot, pace, amble, tölt (or rack), running walk, foxtrot and marcha batida, and they can be performed at speeds ranging from 2.8 to 16 m/s (Barrey 2013). The intermediate gaits form a multidimensional continuum rather than a distinct set of classifications, owing to minute variation in speed, footfall patterns and stance durations (Hildebrand 1965). General characterization of the intermediate gaits typically relies on the footfall sequence, support sequence, footfall timing, cadence and sometimes movement of the head, such as the presence or absence of an evenly timed head shake (Harris 1993; Ziegler 2005). Trot and pace range in speed and are defined by two moments of suspension when all four feet are off the ground simultaneously. Trot is diagonally symmetrical, and pace is laterally symmetrical. Most of the other intermediate gaits are ambling gaits of variable speeds that are defined by symmetry ranging between diagonal and lateral, and lack of suspension, when one or two feet are on the ground at any given time (Nicodemus & Clayton 2003). Both the trot and pace have a brief moment of suspension when all four feet are off the ground before one pair lands at the same time. Any horse able to perform any of the four-beat intermediate gaits is considered a ‘gaited’ horse.

An earlier study identified a single-base substitution in the double-sex and mab-3-related transcription factor 3 (DMRT3) gene that causes a premature stop codon and truncation of the DMRT3 protein (Andersson et al. 2012). The mutation is permissive for the ability to perform alternate gaits and occurs at high frequency in the ‘gaited’ breeds (Promerová et al. 2014). As noted earlier, these breeds also have the ability to extend the speed of their intermediate gaits to those equivalent to, or exceeding, that of canter (Barrey 2013). The mutation also occurs at a high frequency in harness racing trotters and pacers, and it allows them to extend the speed range of their symmetrical gait (trot or pace) instead of switching to the asymmetric canter/gallop (Andersson et al. 2012; Promerová et al. 2014). The mutation occurs at a low frequency in breeds typically considered ‘non-gaited’, and recent evidence suggests that the mutation may have a negative effect on the desired quality of trot and canter in some breeds (Kristjansson et al. 2014; Jäderkvist Fegraeus et al. 2015; Jäderkvist et al. 2015). A worldwide screening of horse populations identified a haplotype consisting of the stop mutation (chr23:g.22 999 655C>A) and a strongly linked tagging SNP (chr23:g.22 967 656C>T), for which the C-C and T-A haplotypes were the most common, followed by a recombinant C-A haplotype and a T-C haplotype that is ancestral to the mutant T-A haplotype (Promerová et al. 2014).
However, these haplotypes were not sufficient for the identification of where or when the mutation first arose. Therefore, the aim of this study was to generate an extended haplotype from targeted re-sequencing of the DMRT3 gene. Horses with the previously identified haplotypes, specifically the predicted ancestral and recombinant haplotypes, were selected from worldwide populations and analyzed to further explore the origin and distribution of mutant DMRT3 haplotypes across breeds.

Materials and methods

Animals and phenotyping

Sixteen horses representing 13 diverse breeds were selected from a previous dataset (Promerová et al. 2014) for target-capture sequencing of a 686-kb region surrounding DMRT3 (Table S1). The selected horses represented the four haplotypes identified by Promerová et al. (2014). We used whole genome sequence (WGS) data (from other projects) from 73 horses, representing 15 breeds, including Przewalski’s horses, a Quarter horse and several Thoroughbreds from the Sequence Read Archive as well as bam-slices from 49 individuals subjected to WGS (Table S1). Seventeen of the horses were homozygous for the DMRT3:Ser301STOP mutation, five were heterozygous and the remaining 67 were homozygous wild-type. We obtained 401 extracted DNA samples (i) from the Animal Genetics Laboratory, SLU, Sweden, (ii) collected by collaborators and (iii) archived at the Veterinary Genetics Laboratory, University of California, Davis, USA and used these for genotyping on a custom SNP panel. Two hundred twenty-nine horses were heterozygous at either the known linked SNP and/or the nonsense stop mutation whereas 172 horses were homozygous for both variants. Three whole-genome sequenced individuals were included as internal genotyping quality controls. Known gait phenotypes determined by video recording were available for only 54 horses; an additional 30 horses had gait phenotypes as determined by a caretaker-completed questionnaire.

Targeted re-sequencing

A 686-kb sequence flanking the DMRT3:Ser301STOP mutation (chr23:22 628 976–23 315 048) from the horse reference genome (Wade et al. 2009) was selected from the University of California Santa Cruz (UCSC) genome browser (Kent et al. 2002) and uploaded to the Agilent SureDesign web service (http://earray.chem.agilent.com) for array design in Agilent’s SureSelect Target Enrichment System. Target enrichment and library preparation were performed following the protocol from Agilent’s SureSelect Paired-End Target Enrichment System at the Science for Life Laboratory at Uppsala University, Sweden. The libraries were sequenced using Illumina HiSeq instruments.

Alignment and SNP calling

The Illumina reads were aligned against the horse reference genome (EquCab2.0) using BWA v0.7.4 (Li & Durbin 2010) with the mem algorithm and default settings. After alignment, the reads were sorted with SAMTOOLS (Li et al. 2009) and read duplicates were identified using PICARD’S MARKDUPLICATES (v. 1.85; https://broadinstitute.github.io/picard/). Finally, the DMRT3 region of interest (chr23:22 628 976–23 315 071) was extracted using SAMTOOLS. Within the DMRT3 region, SNPs were called using GATK’s UNIFIEDGENOTYPER v3.4 (McKenna et al. 2010). After SNP calling, low quality SNPs were identified and removed if they matched any of the following conditions, as recommended by GATK’s best practice: quality by depth less than 2.0, Fisher’s strand bias greater than 60, mapping quality less than 40, mapping quality rank sum less than −12.5, haplotype score greater than 13 or read position rank sum less than −8.0. After filtration, 5555 SNPs remained (Table S2) and were subject to imputation and phasing using BEAGLE v4.0 (Browning & Browning 2007) with default settings.
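The hard-filter conditions listed for SNP calling can be sketched as a simple predicate. This is only an illustration, not the authors' pipeline: the annotation keys (`QD`, `FS`, `MQ`, `MQRankSum`, `HaplotypeScore`, `ReadPosRankSum`) follow common GATK naming and are assumed here, the two rank-sum thresholds are taken to be the negative values (−12.5 and −8.0) from GATK's published hard-filter recommendations, every annotation is assumed to be present for every site, and the example records are invented.

```python
# Thresholds follow the conditions listed in the text; annotation keys are an
# assumption based on common GATK naming, not taken from the paper.
def fails_hard_filter(ann):
    """Return True if a SNP matches any of the removal conditions."""
    return (
        ann["QD"] < 2.0                  # quality by depth
        or ann["FS"] > 60.0              # Fisher's strand bias
        or ann["MQ"] < 40.0              # mapping quality
        or ann["MQRankSum"] < -12.5      # mapping quality rank sum
        or ann["HaplotypeScore"] > 13.0  # haplotype score
        or ann["ReadPosRankSum"] < -8.0  # read position rank sum
    )


# Hypothetical annotation records, invented for illustration only.
snps = [
    {"id": "snp_pass", "QD": 15.0, "FS": 2.1, "MQ": 59.0,
     "MQRankSum": 0.3, "HaplotypeScore": 1.2, "ReadPosRankSum": -0.5},
    {"id": "snp_low_qd", "QD": 1.5, "FS": 2.1, "MQ": 59.0,
     "MQRankSum": 0.3, "HaplotypeScore": 1.2, "ReadPosRankSum": -0.5},
    {"id": "snp_strand_bias", "QD": 15.0, "FS": 75.0, "MQ": 59.0,
     "MQRankSum": 0.3, "HaplotypeScore": 1.2, "ReadPosRankSum": -0.5},
]

kept_ids = [s["id"] for s in snps if not fails_hard_filter(s)]
print(kept_ids)
```

A SNP is dropped as soon as any single condition is met, which matches the "any of the following conditions" wording of the filter description.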

Analysis

The observed heterozygosity was calculated for each variant site based on all individuals that were homozygous for the DMRT3 nonsense mutation (chr23:22 999 655). A low heterozygosity region (chr23:22 919 468–23 214 474) was selected for further analysis. By utilizing the phased SNP data, we could identify 178 SNPs within the low heterozygosity region as highly associated with chromosomes carrying either the stop mutation or the known linked SNP (chr23:22 967 656). From these SNPs we selected 102 across the region for genotyping in an extended sample panel using the Sequenom MassARRAY® platform, processed by GeneSeek, Inc. The selected 102-SNP panel spanned the region surrounding the DMRT3 nonsense mutation, from 22 927 387 to 23 210 971 bp with an average intermarker distance of 2882 bp. SNP quality control filters were applied using PLINK v1.07 (Purcell et al. 2007). SNPs with a call rate less than 90% across all individuals (n = 10), minor allele frequency less than 5% (n = 0) or in disagreement with genome-sequencing genotype (n = 5) were excluded. Samples were excluded for a call rate less than 90% across the remaining SNPs (n = 16). After filtration, 87 SNPs remained and were subject to imputation and phasing using BEAGLE v3.0 with default settings. Genotypes from the sequenced samples were pulled for the coordinates of the custom panel SNPs and merged with the custom panel genotypes using PLINK v1.09 (Chang et al. 2015) for a total of 471 horses representing 81 breeds to be included for further analysis. HAPLOVIEW (Barrett et al. 2005) was used to visualize the extent of linkage disequilibrium (LD) among the SNPs, and the region was divided into two blocks of strong LD under the default algorithm taken from Gabriel et al. (2002). The first haplotype block of 39 SNPs was selected for further analysis and filtered down to 25 informative SNPs. Haplotypes were examined individually, and we excluded three horses with ambiguous haplotypes.
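The SNP quality-control thresholds described above (call rate of at least 90%, minor allele frequency of at least 5%) can be illustrated with a minimal sketch. It does not reproduce PLINK; the genotype coding (0, 1 or 2 copies of the alternate allele, `None` for a missing call), the function name `passes_qc` and the example SNPs are all assumptions made for illustration.

```python
def passes_qc(genotypes, min_call_rate=0.90, min_maf=0.05):
    """Apply call-rate and minor-allele-frequency filters to one SNP.

    genotypes: non-empty list with 0/1/2 alternate-allele counts,
    or None where the genotype call is missing.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if call_rate < min_call_rate:
        return False
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf


# Hypothetical genotypes for three SNPs across ten horses.
panel = {
    "snp_a": [0, 1, 2, 1, 0, 0, 1, 2, 1, 0],        # complete, polymorphic
    "snp_b": [0, None, None, 1, 0, 2, 1, 0, 0, 1],  # call rate 80%
    "snp_c": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],        # monomorphic
}
kept = [name for name, g in panel.items() if passes_qc(g)]

# Panel arithmetic reported in the text: 102 SNPs minus the three exclusion counts.
remaining_after_qc = 102 - 10 - 0 - 5
print(kept, remaining_after_qc)
```

The final line only restates the exclusion counts given in the text, which leave the 87 SNPs carried forward to phasing.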
We calculated nucleotide diversity of the two most common haplotypes in this sample of horses, W1 and M1, as an average of the number of heterozygous SNPs over the distance of the sequence-targeted 686-kb region in individuals identified as homozygous for either haplotype from the WGS and target sequenced individuals. The time since divergence (t) of haplotypes/alleles was calculated for the 295-kb region using t = d_A/(2k), where k is the genomic substitution rate and d_A is the net frequency of nucleotide substitutions calculated according to the method of Nei (1987). We used the genomic substitution rate of 7.242 × 10⁻⁹ per site per generation as reported by Schubert et al. (2014). The maximum time since divergence for the M1 haplotype was determined by combining all identified heterozygous SNPs (n = 44) across individuals to calculate d_A, whereas the minimum time since divergence was determined from an M1 haplotype individual with the fewest observed heterozygous SNPs (n = 2).

NETWORK software (Bandelt et al. 1999) was used to construct a median-joining network to visualize how the genotyped haplotypes relate to one another, as not all haplotypes were detected in the sequenced individuals. The MEGA v6 program (Tamura et al. 2013) was used to generate a maximum-likelihood tree from the sequences of a donkey, a Przewalski’s horse, a homozygous W1 horse (Twilight) and a homozygous M1 horse over the 83-kb core haplotype region.
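The divergence-time relation t = d_A/(2k) can be sketched numerically. The example below is a simplification under stated assumptions: it uses the raw number of differing sites divided by the region length as a stand-in for d_A, it reads the substitution rate as 7.242 × 10⁻⁹ per site per generation, and it takes the region length from the low-heterozygosity coordinates given in the text. Because Nei's net d_A is not a raw count, the output should not be expected to match the range reported in the paper.

```python
def divergence_time(d_a, k):
    """Time since divergence, t = d_A / (2k), in the time unit of k."""
    return d_a / (2.0 * k)


def per_site_difference(n_differences, region_length_bp):
    """Crude proxy for d_A: differing sites per base pair (an assumption)."""
    return n_differences / region_length_bp


K = 7.242e-9                          # substitutions per site per generation
REGION_BP = 23_214_474 - 22_919_468   # low-heterozygosity region from the text

# Fewest (n = 2) and pooled (n = 44) heterozygous SNP counts from the text.
t_min = divergence_time(per_site_difference(2, REGION_BP), K)
t_max = divergence_time(per_site_difference(44, REGION_BP), K)
print(round(t_min), round(t_max))
```

Since k is expressed per generation, the result is in generations; converting to years would require a generation interval, which this sketch does not assume.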

Results

Definition of a 295-kb extended haplotype associated with the DMRT3 stop mutation

We used target-capture sequences from 89 horses, representing 26 breeds, for a 686-kb region overlapping the DMRT3 gene to reveal sequence polymorphisms showing strong association with the DMRT3 stop mutation and to precisely define the haplotype block associated with different gait phenotypes. Horses carrying both the DMRT3 stop codon and wild-type alleles were included in the analysis. A large number of sequence variants associated with the DMRT3 stop mutation were identified (Table S2). We calculated the nucleotide diversity in sliding windows of 10 kb in horses homozygous for either the DMRT3 stop mutation or the wild-type allele in this region (Fig. 1) and revealed an extended haplotype within a 295-kb region associated with the stop mutation. This 295-kb region showed a nearly 20-fold reduction in nucleotide diversity in horses homozygous for the stop codon compared with horses homozygous for the wild-type allele (Fig. 1). Nucleotide diversity among wild-type chromosomes was similar to the genome average (Wade et al. 2009). This result shows that the Gait keeper haplotype recently underwent a selective sweep.
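The windowed diversity scan described above can be sketched as follows. This is a rough illustration rather than the authors' code: it assumes non-overlapping 10-kb windows (the text says "sliding" without giving a step size), it measures diversity as heterozygous genotype calls per individual per base pair, and the positions are invented.

```python
def windowed_diversity(het_positions, n_individuals, start, end, window=10_000):
    """Heterozygous sites per individual per bp in consecutive windows.

    het_positions: one entry per heterozygous genotype call, pooled over
    all individuals (a position appears once for each heterozygous horse).
    Returns a list of (window_start, window_end, diversity) tuples.
    """
    results = []
    for w_start in range(start, end, window):
        w_end = min(w_start + window, end)
        n_het = sum(1 for pos in het_positions if w_start <= pos < w_end)
        results.append((w_start, w_end, n_het / (n_individuals * (w_end - w_start))))
    return results


# Hypothetical example: two individuals, a 30-kb region.
windows = windowed_diversity(
    het_positions=[100, 5_000, 5_000, 12_000],
    n_individuals=2,
    start=0,
    end=30_000,
)
for w in windows:
    print(w)
```

Running the same function separately on horses homozygous for the stop mutation and on wild-type homozygotes would give two profiles that can be compared window by window.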

Figure 1 Genetic variation for the 686-kb region surrounding the DMRT3:Ser301STOP mutation. Average nucleotide diversity (π) for the two most common haplotypes, W1 and M1. The average observed nucleotide diversity of 10-kb windows was calculated among homozygote mutant M1 haplotype carriers (yellow) and homozygote wild-type W1 haplotype carriers (blue). The region showing very low heterozygosity (between red arrows) among animals homozygous for the DMRT3 stop mutation was selected for further analysis. Gene track images are from the UCSC genome browser (http://genome.ucsc.edu, accessed 2/12/2017).

DMRT3:Ser301STOP mutation is a recent mutation

From the 295-kb extended haplotype region, we selected and genotyped 87 SNPs linked to the stop mutation in 385 horses representing 72 breeds. Using data on these 87 SNPs, we identified two blocks of SNPs in high LD with the stop mutation (Fig. 2). The first block was pruned to 25 SNPs and spanned an 83-kb region encompassing the DMRT3 gene, whereas the second block consisted of seven SNPs and spanned a 44-kb region encompassing the DMRT3–DMRT2 intergenic region and the 5′ end of the DMRT2 gene (Fig. 2). The second block did not show the same level of LD with the stop mutation as the first block did (Fig. 2). Further analysis of the first block distinguished 14 haplotypes, of which 10 haplotypes (W1–W10) had the wild-type C allele at position 22 999 655 bp and the remaining four (M1–M4) had the mutant A allele (Fig. 3). The most common haplotype across our sample of horses was the W1 haplotype (62%) followed by the M1 haplotype (26%; Fig. 3). We identified two rare recombinant haplotypes (M3 and M4) carrying the stop mutation (see Table 1 for breed origin). This analysis revealed only one sequence polymorphism (chr23:22 977 337) found in approximately 15% of the haplotypes carrying the stop mutation but absent among wild-type haplotypes. Thus, this sequence variant, which defines the M2 haplotype, is a mutation that must have occurred subsequent to the stop mutation (Fig. 3).

We constructed a phylogenetic tree from the eight nonrecombinant haplotypes (Fig. S1a). In this tree, there are two major clusters defining the haplotypes that are most similar to either the most common mutant-bearing haplotype (M1) or the common wild-type haplotype (W1; Fig. S1a). The W5 haplotype and the haplotypes carrying the DMRT3 stop mutation (M1 and M2) appear to be derived from a W6 haplotype (Fig. 3 & Fig. S2a).

We used the pattern of sequence variation in the DMRT3 region to explore the evolutionary history of the haplotypes. Sequences were available from a donkey and three Przewalski’s horses, and all carried the same alleles as found on the W1 haplotype across the 25 SNPs included in Fig. 3; however, the W1 and M1 haplotypes in modern horses share a common ancestor after the split of horses from both donkeys and Przewalski’s horses (Fig. S1b). This pattern indicates that the majority of the SNPs included in Fig. 3 represent derived alleles that hitch-hiked to high frequency due to the close linkage with the DMRT3 stop mutation. The lack of the DMRT3 mutation in the Przewalski’s horse was expected, as recent genetic evidence indicates that modern horses are not direct descendants of the Przewalski’s horse, as the lineages diverged about 45 000 years ago (YA) (Der Sarkissian et al. 2015). Przewalski’s horses have never been domesticated and have been observed showing only walk, trot, canter and gallop and not any of the other intermediate gaits (Boyd & Houpt 1994).

Sequence data were also available for two ancient horses from the late Pleistocene era (Schubert et al. 2014). The data were consistent with the presence of W1-like haplotypes not carrying the DMRT3 stop mutation, but due to gaps and low read depth coverage in the region, the exact haplotype identity is ambiguous. Among all the available whole genome sequences (Table S1), we observed only the W1, W5, W9, M1 and M2 haplotypes. We calculated the average nucleotide diversity within the W1 and M1 haplotypes based on all available sequence data (Fig. 1) and then applied the equine genomic substitution rate reported by Schubert et al. (2014) to estimate the coalescence time for the M1 haplotype over the 295-kb low heterozygosity region. The very low nucleotide diversity among the M1 haplotypes (Fig. 1) indicates that the M1 haplotypes sampled in this study diverged from a common ancestral sequence approximately 467–9595 YA, encompassing the time period estimated for horse domestication approximately 5000–6000 YA (Ludwig et al. 2009; Schubert et al. 2014).
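The LD blocks reported in this section are defined from pairwise D′ values between SNPs. As a hedged illustration of what that statistic measures, the sketch below computes a point estimate of D′ for two biallelic SNPs from phased haplotypes. It is not HAPLOVIEW's implementation (which also derives confidence intervals following Gabriel et al. 2002); the 0/1 allele coding and the example haplotype counts are invented.

```python
def d_prime(haplotypes):
    """Point estimate of |D'| for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, alleles coded 0/1 at SNP A and SNP B,
    one tuple per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # at least one SNP is monomorphic; D' is undefined
    return abs(d) / d_max


# Hypothetical chromosome samples of size ten.
two_haplotypes = [(1, 1)] * 3 + [(0, 0)] * 7
three_haplotypes = [(1, 1)] * 3 + [(0, 0)] * 6 + [(0, 1)] * 1
four_haplotypes = [(1, 1)] * 2 + [(1, 0)] * 1 + [(0, 1)] * 1 + [(0, 0)] * 6

print(d_prime(two_haplotypes), d_prime(three_haplotypes), d_prime(four_haplotypes))
```

In this toy data D′ stays at 1 as long as at most three of the four possible two-SNP haplotypes are observed and drops once the fourth appears, which is why D′ is often read as an indicator of historical recombination between two sites.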

Breed distribution of haplotypes and genotype–phenotype associations

To determine the likely origin of the DMRT3 nonsense mutation, we examined the breed distribution of the different haplotypes. The W1 haplotype was present in 90% of the breeds sampled, the M1 in 60% and the M2 in 10% of the breeds. Within breed, the haplotype number was the greatest in Mongolian local horses (Table 1). We identified the ancestral wild-type W6 haplotype in the Sierra Tarahumara, Turkoman, Mongolian local and Shackelford Banks horses (Table 1). The M2 haplotype, carrying the variant allele at SNP position chr23:22 977 337 (Fig. 3), occurred at a high frequency in four breeds: Marwari horses from India, Mangalarga Marchador from Brazil and Tennessee Walking and Rocky Mountain horses, both from the USA.

The great majority of individuals confirmed to be gaited by video recording (n = 54) were either homozygous for the M1 or M2 haplotypes or heterozygous M1/M2, M1/W1, M2/W1 or M3/W1. The result for the recombinant M3 haplotype is particularly interesting as it provides further support for the notion that the DMRT3 nonsense mutation is a major causal mutation for locomotion differences in horses; the haplotype carries the same alleles as the M1 haplotype across only seven SNPs linked to the DMRT3 stop mutation (Fig. 3). The W5 haplotype, identical to the M1 except for two SNP positions (Fig. 3), had the lowest frequency in gaited breeds (Fig. S2), but precise gait phenotype information was missing for individuals carrying the haplotype, and therefore, its precise effect on gait cannot be determined. However, the results of the W5 haplotype combined with the narrowed region highlighted by the recombinant M3 haplotype provide substantial evidence that the DMRT3 stop mutation must be a causal mutation for gaitedness. Across the breed-reported gait phenotypes, the frequency of the mutant-bearing haplotypes was the lowest in the breeds classified as not gaited (Table 1, Fig. S2). However, we did identify five lateral-gaited individuals, confirmed by video recording, that are homozygous for the W1 haplotype from the Icelandic (n = 1), Mangalarga Marchador (n = 3) and Spotted Saddle Horse (n = 1) breeds.

Figure 2 Two blocks of linkage disequilibrium in the vicinity of the DMRT3 gene, identified from the custom genotyping panel. Gene track images are from the UCSC genome browser (http://genome.ucsc.edu, accessed 4/28/2016). In the LD plot, red diamonds represent D′ values equal to 1 as a percentage, lower values of D′ are given within the boxes in shades of pink to white; non-significant associations are blue. The DMRT3 STOP mutation is highlighted in green. Blocks are defined by confidence intervals estimated as described in Gabriel et al. (2002) and are marked by black border lines. SNP coordinates are shown in bold in Table S2.

Figure 3 Haplotypes as determined by 25 SNPs encompassing the DMRT3 gene. The 14 haplotypes defined in this study are shown. Blue boxes represent the reference allele, and white boxes represent the alternate allele based on alignment to EquCab2.0.

Table 1 DMRT3 haplotype distribution by breed of the 468 horses analyzed.

Breed                           Origin          Gait phenotype²  n¹   W1
Akhal-Teke                      Turkmenistan    No               28   25
Altai                           Russia          No               10   6
American Curly Horse            USA             Some             18   10
American Miniature Horse        USA             Some             4    3
Andalusian (PRE)                Spain           Some             10   9
Appaloosa                       USA             Some             8    4
Arabian                         Middle East     Some             4    4
Ardennes³                       Belgium         No               4    4
Asturcon                        Spain           Yes              20   19
Brazilian Criollo               Brazil          Some             6    4
Campolina                       Brazil          Yes              20   6
Canadian Horse                  Canada          Some             4    4
Caspian                         Iran            No               28   17
Coldblooded Trotter             Sweden          Harness          18   8
Colombian Criollo               Colombia        Yes              2    .
Dartmoor Pony                   UK              No               4    4
Faeroese                        Faroe Islands   Some             6    4
Finnhorse                       Finland         Harness          10   4
Florida Cracker                 USA             Yes              8    .
French Trotter                  France          Harness          18   5
Galiceño                        Mexico          No               8    5
Gotland Russ                    Sweden          No               28   28
Greek Gaited                    Greece          Yes              2    .
Hackney Pony                    UK              Harness          14   8
Hokkaido                        Japan           Some             16   3
Icelandic                       Iceland         Yes              42   17
Irish Cob (Tinker)              UK              No               2    2
Israeli Local                   Israel          Unknown          12   11
Jordanian Local                 Jordan          Unknown          10   9
Kentucky Mountain Saddle Horse  USA             Yes              6    3
Knabstrup                       Denmark         No               2    2
Konik (Polish Primitive)        Poland          No               8    5
Kurd                            Iran            No               10   9
Losino                          Spain           No               14   14
Mangalarga                      Brazil          No               6    3
Mangalarga Marchador            Brazil          Yes              76   39
Marajo                          Brazil          Some             6    4
Marwari                         India           Some             20   17
Missouri Fox Trotter            USA             Yes              2    .
Missouri Fox Trotter cross      USA             Some             2    1
Mongolian Local                 Mongolia        Some             30   13
Morgan                          USA             Some             12   7
New Forest Pony                 UK              No               24   17
Newfoundland                    Canada          Some             2    1
North Swedish Draft             Sweden          No               8    8
Norwegian Fjord                 Norway          No               2    2
Paint                           USA             Some             8    5
Peneia                          Greece          Yes              2    .
Percheron                       France          No               6    4
Peruvian Paso                   Peru            Yes              12   .
Pindos                          Greece          Some             8    3
Polish Heavy Horse              Poland          No               10   7
Potoka                          Spain           No               14   11
Przewalski’s Horse              Mongolia        No               6    6
Puerto Rican Paso Fino          USA             Yes              4    .
Pura Raza Gala                  Spain           Some             6    3
Quarter Horse                   USA             Some             14   12
Retuertas                       Spain           Unknown          20   18
Rhodes                          Greece          Some             8    3
Rocky Mountain                  USA             Yes              12   1
Romanian Danube Delta           Romania         Unknown          2    2
Saddlebred                      USA             Some             24   17
Shackelford Banks               USA             Some             2    1
Shetland                        UK              No               16   16
Sierra Tarahumara               Mexico          Some             18   8
Spanish Mustang                 USA             Some             6    4
Spotted Saddle Horse            USA             Yes              2    2
Standardbred                    Sweden/USA      Harness          16   3
Sumba                           Indonesia       No               2    2
Swedish Warmblood               Sweden          No               6    5
Tennessee Walking cross         USA             Some             6    3
Tennessee Walking Horse         USA             Yes              12   3
Thessaly                        Greece          No               2    .
Thoroughbred                    UK              No               38   37
Trakehner                       Germany         No               4    4
Turkoman                        Iran            Some             6    4
Tushuri Cxeni                   Georgia         Unknown          16   7
Venezuelan Criollo              Venezuela       Some             10   7
Welsh Pony                      UK              No               14   10
Yakut                           Russia          Some             4    2
Zemaitukai                      Lithuania       No               6    4

[Per-breed counts for the wild-type haplotypes W2–W10 and the mutant haplotypes M1–M4 are not legible in this copy of Table 1.]

¹ n, number of chromosomes. “.” means no observations.
² Gait phenotypes were not collected for all individuals but are based on breed reported characteristics.
³ Ardennes sampled in Sweden.

Discussion

The present study demonstrates that the spectrum of gait phenotypes strongly associated with the DMRT3 stop mutation is a recently derived trait. We also provide conclusive evidence that this mutation is at least the most prominent causal mutation in this region, as no other sequence change detected in this study showed such a consistent association with differences in locomotion as did the DMRT3 stop mutation. The minute nucleotide diversity among M1 haplotypes over more than 150 kb shows that the DMRT3 stop mutation was not a common polymorphism among wild horses. The horse was domesticated between 5000 and 6000 YA (Ludwig et al. 2009; Lippold et al. 2011; Schubert et al. 2014). We estimated the coalescence time for the M1 haplotype to be in the range 467–9595 years. Therefore, we suspect the mutation arose after domestication and subsequently spread with human activity and horse trade.

The present study does not provide any conclusive evidence concerning the geographic origin of the DMRT3 stop mutation. There are two main reasons why genetic analysis of modern DNA cannot resolve this question. First, the W5 haplotype is widespread across breeds from different parts of the world and only differs by one base pair change from the ancestral haplotype (W6). Second, the mutation has been under strong positive selection and could therefore have spread quickly from one geographic region to another. For instance, the mutation could have been spread widely via the military exploits of Alexander the Great (4th century BC), Attila (5th century) and Genghis Khan (13th century). Furthermore, the Romans, and their use of Greek horses, could have helped spread the mutation throughout Europe and the Middle East (Antikas 2015). It appears that the geographic origin of the Gait keeper mutation can be resolved only if studies of ancient DNA can demonstrate that the mutation first became abundant in one specific geographic area. A recent study of ancient DNA demonstrated that the mutation was present in medieval times (850–900 AD) in England and Iceland, and the authors argued that the mutation may have originated in England (Wutke et al. 2016). However, because that study did not include any ancient horses from Asia for the critical period of 0–800 AD, and only two horses from southern Europe for the same time period, it is entirely possible that the mutation arose elsewhere and was brought to England. For instance, Pliny the Elder (23–79 AD) noted that horses from the Asturias region in Northern Spain moved both legs on the same side alternately and did not trot (Hendricks 1995), suggesting that the Gait keeper mutation was already present in Europe 2000 YA.

Selection pressure to cover various terrains quickly without injuring the horse whilst maintaining rider comfort likely enhanced the spread of the DMRT3 mutation. In terms of dynamics, any gait with suspension, such as trot or canter, will lead to increased ground reaction forces at the time of impact on the foot and leg and could cause serious injury or lameness if traveling on hard surfaces (Clayton 2004). With the four-beat gaits, the force of impact is reduced, as there is no suspension, and the risk of injury on hard surfaces is correspondingly lower. Horses able to perform these gaits at high speeds would have been highly valued, as they would be less prone to injury and more comfortable to ride due to the reduction in the amplitude of the dorsoventral displacement experienced by the rider. On the other hand, trot (or any diagonal gait) offers increased stability compared to pace, because the horse’s center of mass is closer to the midline; in pace, the center of mass is closer to the periphery, which increases the chance of rolling to the side (Hildebrand 1965). DMRT3 wild-type horses would be more suitable as pack animals for travel on more stable surfaces and for pulling heavy loads. It is worth noting that the DMRT3 mutation is absent from various breeds of draft horses. Therefore, the frequency of the DMRT3 mutation in various breeds likely reflects local topographies, how the animals were utilized and/or reproductive isolation including drift.

Acknowledgements

We thank Susana Dunner, Petr Hořín, Páll Imsland, Rytis Juras, David Modrý, Cecilia Penedo, Monika Reissmann, Knut Røed, and Navid Yousefi-Mashouf for kindly providing horse samples for this study.

Funding

This study was financially supported by the Knut and Alice Wallenberg Foundation, the World Wildlife Fund and The Swedish Research Council Formas.

Conflict of interest

Leif Andersson and Gabriella Lindgren are co-inventors on a granted patent (EP2542702 B1) concerning commercial testing of the DMRT3 mutation.

References

Andersson L.S., Larhammar M., Memic F. et al. (2012) Mutations in DMRT3 affect locomotion in horses and spinal circuit function in mice. Nature 488, 642–6.
Antikas T. (2015) Native Greek horses: from man- and fish-eaters B.C. to DMRT3 gaiters in modern times. In: Hungarian Grey, Racka, Mangalitsa, pp. 127–34. Museum and Library of Hungarian Agriculture, Budapest.
Bandelt H.J., Forster P. & Röhl A. (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16, 37–48.
Barrett J.C., Fry B., Maller J. & Daly M.J. (2005) HAPLOVIEW: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–5.
Barrey E. (2013) Gaits and interlimb coordination. In: Equine Locomotion (Ed. by W. Back & H.M. Clayton), pp. 85–97. Elsevier, New York, NY.
Boyd L. & Houpt K.A. (1994) Przewalski's Horse: The History and Biology of an Endangered Species. State University of New York Press, Albany, NY.
Browning S.R. & Browning B.L. (2007) Rapid and accurate haplotype phasing and missing data inference for whole genome association studies by use of localized haplotype clustering. American Journal of Human Genetics 81, 1084–97.
Chang C.C., Chow C.C., Tellier L.C., Vattikuti S., Purcell S.M. & Lee J.J. (2015) Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 1–16.
Clayton H.M. (2004) The Dynamic Horse: A Biomechanical Guide to Equine Movement and Performance. Sport Horse Publications, Mason, MI.
Gabriel S.B., Schaffner S.F., Nguyen H. et al. (2002) The structure of haplotype blocks in the human genome. Science 296, 2225–9.
Harris S.E. (1993) Horse Gaits, Balance, and Movement. Howell Book House, New York, NY.
Hendricks B. (1995) International Encyclopedia of Horse Breeds. University of Oklahoma Press, Norman, OK.
Hildebrand M. (1965) Symmetrical gaits of horses. Science 150, 701–8.
Jäderkvist Fegraeus K., Johansson L., Mäenpää M., Mykkänen A., Andersson L.S., Velie B.D., Andersson L., Árnason T. & Lindgren G. (2015) Different DMRT3 genotypes are best adapted for harness racing and riding in Finnhorses. Journal of Heredity 106, 734–40.
Jäderkvist K., Holm N., Imsland F., Árnason T., Andersson L., Andersson L.S. & Lindgren G. (2015) The importance of the DMRT3 'Gait keeper' mutation on riding traits and gaits in Standardbred and Icelandic horses. Livestock Science 176, 33–9.
Kent W.J., Sugnet C.W., Furey T.S., Roskin K.M., Pringle T.H., Zahler A.M. & Haussler D. (2002) The human genome browser at UCSC. Genome Research 12, 996–1006.
Kristjansson T., Bjornsdottir S., Sigurdsson A., Andersson L.S., Lindgren G., Helyar S.J., Klonowski A.M. & Arnason T. (2014) The effect of the 'Gait keeper' mutation in the DMRT3 gene on gaiting ability in Icelandic horses. Journal of Animal Breeding and Genetics 131, 415–25.
Li H. & Durbin R. (2010) Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–95.

Li H., Handsaker B., Wysoker A., Fennell T., Ruan J. & Homer N. (2009) The sequence alignment/map format and SAMTOOLS. Bioinformatics 25, 2078–9.
Lippold S., Knapp M., Kuznetsova T. et al. (2011) Discovery of lost diversity of paternal horse lineages using ancient DNA. Nature Communications 2, 450.
Ludwig A., Pruvost M., Reissmann M. et al. (2009) Coat color variation at the beginning of horse domestication. Science 324, 485.
McKenna A., Hanna M., Banks E. et al. (2010) The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20, 1297–303.
Nei M. (1987) Molecular Evolutionary Genetics. Columbia University Press, New York, NY.
Nicodemus M.C. & Clayton H.M. (2003) Temporal variables of four-beat, stepping gaits of gaited horses. Applied Animal Behaviour Science 80, 133–42.
Promerová M., Andersson L.S., Juras R. et al. (2014) Worldwide frequency distribution of the 'Gait keeper' mutation in the DMRT3 gene. Animal Genetics 45, 274–82.
Purcell S., Neale B., Todd-Brown K. et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81, 559–75.
Der Sarkissian C., Ermini L., Schubert M. et al. (2015) Evolutionary genomics and conservation of the endangered Przewalski's horse. Current Biology 25, 2577–83.
Schubert M., Jónsson H., Chang D. et al. (2014) Prehistoric genomes reveal the genetic foundation and cost of horse domestication. Proceedings of the National Academy of Sciences of the United States of America 111, E5661–9.
Tamura K., Stecher G., Peterson D., Filipski A. & Kumar S. (2013) MEGA6: molecular evolutionary genetics analysis version 6.0. Molecular Biology and Evolution 30, 2725–9.
Wade C.M., Giulotto E., Sigurdsson S. et al. (2009) Genome sequence, comparative analysis, and population genetics of the domestic horse. Science 326, 865–7.
Wutke S., Andersson L., Benecke N. et al. (2016) The origin of ambling horses. Current Biology 26, R697–9.
Ziegler L. (2005) Easy-Gaited Horses. Storey Publishing, North Adams, MA.

Supporting information

Additional supporting information may be found online in the supporting information tab for this article:

Figure S1 Phylogenetic trees of DMRT3 haplotypes.
Figure S2 Frequency of mutant and wild-type DMRT3 haplotypes across breeds classified according to gait phenotypes.
Table S1 List of horses used for whole genome sequencing or capture sequencing to study the DMRT3 target region.
Table S2 List of 5555 SNPs from the target-capture sequencing of a 686-kb region surrounding the DMRT3 gene.
