Development of an unambiguous and discriminatory ... - CiteSeerX

2 downloads 0 Views 259KB Size Report
1Centre for Preventive Medicine, Animal Health Trust, Lanwades Park, Kentford, Newmarket, ... associated with disease in a wide range of other animal hosts ...
Microbiology (2008), 154, 3016–3024

DOI 10.1099/mic.0.2008/018911-0

Development of an unambiguous and discriminatory multilocus sequence typing scheme for the Streptococcus zooepidemicus group Katy Webb,1 Keith A. Jolley,2 Zoe Mitchell,1 Carl Robinson,1 J. Richard Newton,1 Martin C. J. Maiden2 and Andrew Waller1 Correspondence Andrew Waller [email protected]

Received 31 March 2008 Revised 10 June 2008 Accepted 17 June 2008

1

Centre for Preventive Medicine, Animal Health Trust, Lanwades Park, Kentford, Newmarket, Suffolk CB8 7UU, UK

2

The Peter Medawar Building for Pathogen Research and Department of Zoology, University of Oxford, South Parks Road, Oxford OX1 3SY, UK

The zoonotic pathogen Streptococcus equi subsp. zooepidemicus (S. zooepidemicus) is commonly found harmlessly colonizing the equine nasopharynx. Occasionally, strains can invade host tissues or cross species barriers, and S. zooepidemicus is associated with numerous different diseases in a variety of hosts, including inflammatory airway disease and abortion in horses, pneumonia in dogs and meningitis in humans. A biovar of S. zooepidemicus, Streptococcus equi subsp. equi, is the causative agent of strangles, one of the most important infections of horses worldwide. We report here the development of the first multilocus sequence typing (MLST) scheme for S. zooepidemicus and its exploitation to define the population genetic structure of these related pathogens. A total of 130 unique sequence types were identified from 277 isolates of diverse geographical and temporal origin. Isolates of S. equi shared a recent evolutionary ancestor with isolates of S. zooepidemicus that were significantly associated with cases of uterine infection or abortion in horses (P,0.001). Isolates of S. zooepidemicus from three UK outbreaks of acute fatal haemorrhagic pneumonia in dogs during 1999, 2001 and 2008 were found to be related to isolates from three outbreaks of this disease in the USA during 2005, 1993 and 2006, respectively. Our data provide strong evidence that S. equi evolved from an ancestral S. zooepidemicus strain and that certain related strains of S. zooepidemicus have a greater propensity to infect particular hosts and tissues.

INTRODUCTION Streptococcus equi subsp. zooepidemicus (S. zooepidemicus) is the most frequently isolated opportunistic pathogen of horses: it is associated with inflammatory airway disease in thoroughbred racehorses (Wood et al., 1993, 2005), uterine infections in mares (Hong et al., 1993; Smith et al., 2003) and ulcerative keratitis (Brooks et al., 2000). It is also associated with disease in a wide range of other animal hosts, including cattle (Sharp et al., 1995), sheep (Las Heras et al., 2002; Stevenson, 1974), pigs (Salasia et al., 2004; Soedarmanto et al., 1996), monkeys (Salasia et al., 2004; Soedarmanto et al., 1996) and humans (Bradley et al., 1991; Downar et al., 2001; Hashikawa et al., 2004). Recently, several cases of acute fatal haemorrhagic Abbreviations: D–ln L, difference in log-likelihood; AHT, Animal Health Trust; DLV, double locus variant; dN/dS ratios, ratios of non-synonymous to synonymous substitutions; ET, electrophoretic type; ML, maximumlikelihood; MLEE, multi-enzyme electrophoresis; MLST, multilocus sequence typing; PFGE, pulsed-field gel electrophoresis; SLV, single locus variant; ST, sequence type.

3016

pneumonia in dogs have been attributed to S. zooepidemicus infection (Chalker et al., 2003; Pesavento et al., 2008). A biovar of S. zooepidemicus, Streptococcus equi subsp. equi (S. equi), is the causative agent of equine strangles, one of the most frequently diagnosed and important infectious diseases of horses worldwide. This disease is characterized by pyrexia, followed by profuse nasal discharge and abscessation of the lymph nodes of the head and neck. The swelling of lymph nodes in the head and neck may restrict the airway, and it is this clinical feature that gave the disease its name (Timoney, 1993). However, despite the prevalence of this global endemic disease, S. equi infections of other animals are rare, with only a few documented cases in humans (Breiman & Silverblatt, 1986; Elsayed et al., 2003; Popescu et al., 2006) and one ‘strangles-like’ infection in a dog (Ladlow et al., 2006). The variety of distinct infections caused by this group of pathogens suggests that determination of the population genetic structure of the S. zooepidemicus group could yield valuable information towards the identification of the genetic basis for the selection of host and site of infection. 2008/018911 G 2008 SGM Printed in Great Britain

MLST of Streptococcus zooepidemicus

Strains of S. zooepidemicus have previously been discriminated using two different subtyping approaches. Firstly, individual loci or uncharacterized regions of the genome that are highly variable within the bacterial population were exploited. This type of approach is exemplified by pulsed-field gel electrophoresis (PFGE) (Lindmark et al., 1999) and the PCR typing of M-protein hypervariable regions (Walker & Timoney, 1998) or the 16S–23S-RNA intergenic spacer (Chanter et al., 1997). These methods used restriction enzymes or PCR primers to maximize the detected variability within the population. However, the variation that is indexed evolves very rapidly through unknown mechanisms and although useful in short-term epidemiological studies, these methods may be misleading in the study of global epidemiology (Maiden et al., 1998). In addition, PFGE and PCR subtyping methodologies provide little information regarding the genetic relationships of one bacterial strain with another and the results obtained from these analyses are difficult to compare between different laboratories. The second type of approach identifies variation that accumulates very slowly in the population that is likely to be selectively neutral. Multilocus enzyme electrophoresis (MLEE) indexes allelic variation by measuring differences in the mobility of housekeeping enzymes on cellulose acetate strips (Selander et al., 1986). Although only a small number of electromorphs corresponding to distinct alleles can be identified within a population, high levels of discrimination are achieved by analysing many loci. The S. zooepidemicus MLEE scheme utilizes 10 enzymes with an average number of 2.5 alleles per locus (Jorm et al., 1994). This approach identified 41 electrophoretic types (ETs) of S. zooepidemicus, including the ET 12 characteristic of S. equi. However, a major problem with MLEE typing methods is that the results obtained from different laboratories remain difficult to compare. Multilocus sequence typing (MLST) is a development from MLEE, which allows the identification of different alleles directly from the nucleotide sequences of ~400–500 bp internal fragments of usually seven housekeeping genes (Maiden, 2006). This has several advantages over alternative typing methods. Firstly, much more variation can be detected, resulting in many more alleles per locus than can be detected by MLEE (for instance, some of the more established schemes currently have over 400 alleles at most of their loci). Secondly, the majority of the variation detected results in synonymous codon changes and so is not subjected to functional selective pressure. Finally, nucleotide sequence data are fully portable and can easily be compared between different laboratories via electronic databases available on the Internet. These databases are therefore a powerful resource with which to conduct global epidemiological studies (Maiden, 2006). In this report, a MLST scheme using seven housekeeping loci was used to evaluate 253 S. zooepidemicus and 24 S. equi isolates from horses, donkeys and dogs from several continents and spanning a time period of .30 years. http://mic.sgmjournals.org

METHODS Bacterial isolates. Full details of all of the isolates examined in this

study are available on the MLST database (http://pubmlst.org/ szooepidemicus/). A total of 46 S. zooepidemicus isolates from horses, donkeys and dogs in the UK and USA were kindly provided by Professor John Timoney (University of Kentucky). A further seven isolates of S. zooepidemicus from infections of dogs in the UK were kindly provided by Professor Joe Brownlie (Royal Veterinary College). The remaining 200 isolates of S. zooepidemicus and 24 S. equi isolates from the UK, USA, Canada, Ireland, Spain and Australia were obtained from the Animal Health Trust (AHT) collection. The majority of these isolates had been previously submitted to the AHT diagnostic laboratories from cases of clinical disease in horses or dogs. The S. zooepidemicus H70 reference strain was obtained from a healthy thoroughbred racehorse in Newmarket by the AHT and is the focus of the S. zooepidemicus genome-sequencing project at the Sanger Institute (http://www.sanger.ac.uk/Projects/S_zooepidemicus/). The S. equi 4047 reference strain was isolated from a New Forest pony by the AHT and is the focus of the S. equi genome-sequencing project at the Sanger Institute (http://www.sanger.ac.uk/Projects/S_equi/). Preparation of chromosomal DNA. S. zooepidemicus and S. equi strains were grown on COBA strep select plates (bioMe´rieux). DNA was then purified from a single colony using GenElute spin columns according to the manufacturer’s instructions (Sigma). MLST. Internal fragments of the carbamate kinase (arcC), ribonu-

cleoside-diphosphate reductase (nrdE), prolyl-tRNA synthetase (proS), signal peptidase I (spi), thymidylate kinase (tdk), triosephosphate isomerase (tpi) and acetyl-CoA acetyltransferase (yqiL) genes were amplified by PCR using the following primer pairs: arcC1 (59AGC CAT CTA CCG ACT AAC AC-39) and arcC4 (59-TCT GAA AGG GTT TGG CTA GC-39); nrdE1 (59-TTC TCC TTC AGG TGA CAG ATG-39) and nrdE4 (59-AGA CTA GGC GTT TGA ACC TG39); proS1 (59-TTG GTT GGA AAT GAC CAG ATC-39) and proS4 (59-CCT GAT CCT TGA CAT TAA CGG-39); spi1 (59-CTT AGA GCA TTG CGC TAA GC-39) and spi4 (59-TTG CCT GCT ATC TAG GGA AG-39); tdk1 (59-GAA TTC AGG AAA GAC CAT TG-39) and tdk4 (59-TAA TGC TTG CGA CAA ACT GG-39); tpi1 (59-GGC AGT AGT AAG CAA ATT ACC-39) and tpi4 (59-AGC AAG GCA AGG AAG CTA TC-39); and yqiL1 (59-CCA CAT GGG AAT TAC AGC AG-39) and yqiL4 (59-TCA ATA CAG ACG TCC CTT GAC C-39). The PCRs were performed in volumes of 20 ml using Taq DNA polymerase (Sigma) with 34 cycles of 94 uC for 30 s, 55 uC for 30 s and 72 uC for 1 min. The amplified DNA fragments were purified using a PCR purification kit (Qiagen), and the sequence of each fragment was obtained on both strands using an ABI3100 DNA sequencer with BigDye fluorescent terminators and the following primers, which are internal to those used in the initial PCR: arcC2 (59-TGT CGC ACT TGG AGG AAA TG-39) and arcC3 (59-CAC CAC AAC ACC AGA ATC AAC-39); nrdE2 (59-CGC TCT ATT AAT TCT GCC TTG C-39) and nrdE3 (59-GCA TAG GTT GCT CAT GAT GAT G-39); proS2 (59-GAC TTC TTA GAG CCA GCT AG-39) and proS3 (59-AAC GGA GCT AGC TCT TTA GG-39); spi2 (59-CCT ACT TTT TGG ACT CTC ACG-39) and spi3 (59-GAT TTT ATT GAG TGG CCA AAA GCG-39); tdk2 (59-CTT GAA GGT AGC ACA CAA TTA TG-39) and tdk3 (59-GGT CTC ATT GCC ACC AAT TTG-39); tpi2 (59-GGT CAC TAC AAT TGA GGC TG-39) and tpi3 (59-TTC AGG CTT CAC AGA ACC AC-39); and yqiL2 (59-AGT ACC GCA ACG TAA AGG TG-39) and yqiL3 (59-GCC TGA CGC TTT TCC ATC TC-39. Sequence data were assembled using SeqMan 5.03 (DNASTAR), and high-quality double-stranded sequence data were used for further analysis. For each locus, every unique sequence was assigned a distinct allele number, and each sequence type (ST) was defined by a series of seven integers (the allelic profile) corresponding to the alleles at the 3017

K. Webb and others seven loci, in the order (alphabetical) arcC, nrdE, proS, spi, tdk, tpi and yqiL. A MLST database containing the sequences of all alleles, the allelic profiles and information about the origins of each isolate is maintained at the University of Oxford and can be found on the S. zooepidemicus pages of the MLST website (http://pubmlst.org/ szooepidemicus/) (Jolley et al., 2004). Allele sequences for each ST were exported for each locus in turn from the MLST database using the built-in concatenation function. Previously described statistical methods were used to examine the extent of congruence among gene trees inferred from these individual loci (Feil et al., 2001; Holmes et al., 1999). Firstly, the Shimodaira– Hasegawa test was used to determine whether there were significant differences among the tree topologies inferred for each gene. This analysis was undertaken by determining the maximum-likelihood (ML) tree for each of the seven genes and then comparing, in turn, the difference in log-likelihood (D–ln L) between the topologies generated using sequences from each of the other six genes. To further assess the extent of congruence among the seven ML gene trees, we used the randomization test (Feil et al., 2001; Holmes et al., 1999). In this case the D–ln L values of each of the seven ML trees generated using the housekeeping genes were compared to the values calculated for 200 random trees and those calculated from the trees generated for the other housekeeping genes. All of these analyses were carried out using PAUP* version 4.0b10 (Swofford, 2003). The relatedness of isolates was determined using eBURST (http:// eburst.mlst.net) (Feil et al., 2004) and ClonalFrame v1.1 (Didelot & Falush, 2007). The Sequence type Analysis and Recombinational Test (START2) software (Jolley et al., 2001) was used to determine the ratios of non-synonymous to synonymous substitutions (dN/dS ratios) for each locus (Nei & Gojobori, 1986). Fisher’s exact test was used to test the null hypothesis that there was no statistically significant difference in the proportion of S. zooepidemicus isolates recovered from cases of abortion or respiratory infection that clustered together compared with those isolates that did not cluster. Statistical analysis was conducted using Stata 9.2 software (StataCorp LP), with statistical significance set at P¡0.05. PCR and sequencing of the seM gene. The primers ASW73 (59CAG AAA ACT AAG TGC CGG TG-39) and ASW74 (59-ATT CGG TAA GAG CTT GAC GC-39) were used to PCR amplify and sequence the N-terminal variable region of the seM gene unique to S. equi and SeM alleles were assigned as previously described using the website http://pubmlst.org/szooepidemicus/seM/ (Kelly et al., 2006).

RESULTS Development of a MLST scheme for the S. zooepidemicus group Chromosomal DNA was obtained from 253 isolates of S. zooepidemicus and 24 isolates of S. equi. Twelve loci were initially chosen as candidates for the MLST scheme. These either had been used successfully for other streptococci (Coffey et al., 2006; Enright & Spratt, 1998; Enright et al., 2001) or were selected with guidance using data from the publicly available Wellcome Trust Sanger Institute S. equi and S. zooepidemicus genome sequencing projects (http:// www.sanger.ac.uk). Seven of these loci were then chosen based on the observed sequence diversity in pilot studies with a limited set of nine diverse S. zooepidemicus isolates previously characterized by PCR subtyping (Chanter et al., 1997; Walker & Timoney, 1998) and four diverse S. equi strains (Kelly et al., 2006). Loci selected for this study were evenly distributed across the recently completed S. equi strain 4047 and S. zooepidemicus strain H70 genomes with minimum distances of 30 kb (arcC and nrdE) and 55 kb (arcC and nrdE), respectively (Table 1). Population structure The collection of 253 S. zooepidemicus and 24 S. equi isolates was assembled with several goals in mind. Firstly, we evaluated many S. zooepidemicus isolates with large temporal and/or spatial distances from their isolation. Secondly, the isolates had been recovered in association with a variety of host tissues and diseases, including respiratory colonization (108 isolates), pneumonia or purulent rhinitis (35 isolates), acute fatal haemorrhagic pneumonia in dogs (18 isolates), non-strangles lymph node abscesses (13 isolates), abortion or uterine infection (48 isolates), wound infection (19 isolates), strangles (24 isolates) and other diseases such as nephritis, keratitis and peritonitis (12 isolates). One of the main aims of this was to determine whether related strains were associated

Table 1. Characteristics of housekeeping gene loci included in the S. zooepidemicus MLST scheme Alleles of the seven housekeeping loci can be obtained at http://pubmlst.org/szooepidemicus/. Locus

arcC nrdE proS spi tdk tpi yqiL

Size of sequenced fragment (bp)

437 448 435 459 370 424 396

No. of alleles

28 25 33 28 22 27 42

No. of polymorphic nucleotide sites

30 49 51 35 24 38 43

dN/dS

0.0160 0.0274 0.0535 0.0190 0.0472 0.0793 0.0917

Chromosomal location (no. of kb from origin) S. zooepidemicus*

S. equiD

1586 1641 241 1838 1044 736 601

586 556 2028 376 1144 1509 1635

*Based on the recently completed S. zooepidemicus H70 genome at http://sanger.ac.uk. DBased on the recently completed S. equi 4047 genome at http://sanger.ac.uk. 3018

Microbiology 154

MLST of Streptococcus zooepidemicus

The sequences of the seven loci were determined for each of the 277 isolates, and their allelic profiles were assigned based on gene fragments with sequence lengths of between 370 bp (tdk) and 459 bp (spi). The sequences of all seven loci, and the properties of the 277 isolates studied, are available from the MLST website (http://pubmlst.org/ szooepidemicus/). The number of unique alleles identified for each of the seven housekeeping loci ranged from 22 (tdk) to 42 (yqiL) (Table 1). The mean number of alleles per locus was 29.3. The most common allele for each locus was arcC-1 (57 isolates); nrdE-3 (77 isolates); proS-1 (46 isolates); spi-45 (64 isolates); tdk-1 (132 isolates); tpi-1 (57 isolates); and yqiL-1 (59 isolates) (no isolates were identified that had this allelic profile among the 277 studied). One hundred and thirty allelic profiles were found in this study. Twenty-three of the 24 S. equi isolates were found to be ST-179. The remaining S. equi isolate was found to be a single locus variant (SLV) of ST-179, ST-151. The most common S. zooepidemicus ST was ST-108; all 20 of these isolates came from an AHT study of equine respiratory isolates from 1996 and were included for comparison of MLST with the PCR-typing method. The dN/dS ratios for all seven loci were calculated, and all were substantially less than 1 (Table 1). Evidence of recombination

congruent than random trees. However, on comparing spi against other MLST loci, the other loci trees cluster with the random trees (Fig. 1). Lineage assignment Clustering of STs was investigated using eBURST at the default setting (6 out of 7 matches for group definition). This identified one clonal complex of five STs consisting of ST-1, ST-71, ST-104, ST-108 and ST-166, with ST-71 as the founding member, and one clonal complex of four STs, consisting of ST-114, ST-118, ST-139 and ST-146, with ST118 as the founding member (Fig. 2). In addition, nine groups of three STs and 12 pairs of STs clustered together. Overall the S. zooepidemicus population as represented by eBURST is indicative of a diverse species. We further analysed the MLST data using ClonalFrame (Didelot & Falush, 2007). A majority-rules consensus tree was generated from 12 independent runs each with 250 000 iterations (100 000 burn-in iterations, thinning interval of 100) (Fig. 3). When tested with ClonalFrame’s built-in tool, full convergence was not achieved between individual runs so the Markov chains were examined further using Tracer (13) (Tracer v1.4, available from http://beast.bio.ed.ac.uk/Tracer). Although the variance between some chains was significantly greater than the variance within

2000 1600 D–ln L

with a particular disease or host and whether any strains of S. zooepidemicus clustered with S. equi. Finally, we included several S. zooepidemicus and S. equi isolates that had been previously subtyped using existing PCR subtyping (Chanter et al., 1997; Walker & Timoney, 1998) and SeM subtyping methods (Kelly et al., 2006), respectively, to provide validation of the new MLST scheme.

1200

To determine if recombination plays an important role in S. zooepidemicus diversity we performed statistical tests of congruence. The Shimodaira–Hasegawa test compares ML trees constructed using data from each locus and tests if they are significantly different from each other. In all cases, topologies were highly significantly different (P,0.001) (Table 2). The randomization test compares these tree topologies and tests whether any congruence between the loci is better than that seen against random trees. In most cases, trees drawn from any two loci are slightly more

MLST genes Random trees

800 400

20

40

60

80 100 120 140 160 180 200 ML tree

Fig. 1. Congruence of the spi ML tree with trees generated from other MLST loci and 200 random trees.

Table 2. Tests for congruence among S. zooepidemicus isolates Locus

”ln L of ML tree

D”ln L of competing ML trees

P

D”ln L of random ML trees

arcC nrdE proS spi tdk tpi yqiL

928.42 1098.78 1227.43 1073.43 714.64 995.46 1110.06

808.47–956.60 1532.67–2017.40 1803.59–2138.14 1274.76–1684.47 475.26–594.50 1221.58–1565.14 1715.71–1980.95

,0.001 ,0.001 ,0.001 ,0.001 ,0.001 ,0.001 ,0.001

920.30–1116.60 1836.98–2271.61 2069.47–2337.77 1326.50–1955.34 572.09–640.75 1419.31–1667.39 1820.33–2125.06

http://mic.sgmjournals.org

3019

K. Webb and others

Fig. 2. Population snapshot of the S. zooepidemicus group. Clusters of related STs and individual unlinked STs within the study sample are displayed as a single eBURST diagram. Clusters of linked isolates correspond to clonal complexes. Primary founders (blue) are positioned centrally in each cluster, and subgroup founders are shown in red.

chains, combining all the runs produced a good sample of the posterior and a 50 % consensus tree was considered robust. This analysis was in concordance with eBURST, identifying all 23 eBURST complexes. The analysis of the MLST data by ClonalFrame, however, linked several of the groups identified by eBURST. For example, the two SLV pairs ST-16/140 and ST-134/173 now clustered together. Of particular note was the observation that the S. equi SLV pair ST-151/179 now clustered with the S. zooepidemicus ST-19 and ST-137/149 complexes, ST-6, ST-49, ST-56 and ST-133. Unlike with eBURST, ST-18 did not cluster with ST134 and ST-173 in the consensus ClonalFrame analysis, although distant linkage was seen in individual runs. ST-18 has 20 nucleotide differences in its proS-1 relative to the proS-15 of ST-173 and ST-134. Nineteen of these differences occur in the 39 region of the sequenced fragment from nucleotide 200 onwards and a single difference at nucleotide 43. A maximum chi-squared test (performed in START2) between these two alleles identifies a possible recombination event at nucleotide 199, but this is not deemed statistically significant (P50.134). Comparison of MLST with other typing methods PCR subtyping methods (Chanter et al., 1997; Walker & Timoney, 1998) have previously been conducted on a subset of the S. zooepidemicus isolates reported here 3020

(Newton et al., 2008). For isolates of related ST, 39 (57 %) also had concordant 16S–23S-RNA intergenic spacer and M-protein hypervariable region PCR profiles, whereas 30 displayed distinct PCR profiles (http://pubmlst.org/szooepidemicus/). If only the 16S–23S-RNA intergenic spacer PCR was used, 57 isolates (82 %) had concordant related STs, compared with 42 isolates (61 %) if the M-protein hypervariable region PCR was used. Twenty-four S. equi isolates comprising seventeen different SeM subtypes were included in the study. Twenty-three of these were identified as ST-179; the remaining isolate (SeM subtype 6) was found to be a SLV of ST-179, ST-151. S. zooepidemicus associated with disease A total of 145 isolates of S. zooepidemicus associated with disease in the UK, USA, Spain and Ireland between 1976 and 2007, and 24 S. equi isolates from cases of strangles in the UK, USA, Australia, Canada and Ireland between 1981 and 2006, were included in the study. Some trends in the disease association of clusters of STs were apparent following analysis using the ClonalFrame software (Fig. 3). Of 46 independent isolates in the ST-71 complex, 40 (87 %) were isolated from the respiratory tract (P,0.001). This group contained isolates from the UK and USA over a long time period. Other clusters of STs that were also associated with respiratory isolates include 13 of 17 isolates Microbiology 154

MLST of Streptococcus zooepidemicus

Fig. 3. A majority-rules consensus tree generated from 12 independent runs of ClonalFrame, each with 250 000 iterations, and imported into MEGA to display as a radial tree. The ST-151/179 and ST-71 clusters are labelled for clarity. *Isolates from outbreaks of acute fatal haemorrhagic pneumonia in dogs.

(76 %) in the ST-8 cluster, 16 of 17 isolates (94 %) in the ST-106/164 cluster, all 8 isolates in the ST-103 complex, all 11 isolates in the ST-119 complex and all 5 isolates in the ST-117/148 cluster. The S. zooepidemicus STs that were found to cluster with S. equi by ClonalFrame analysis were significantly associated with cases of uterine infection or abortion (P,0.001). Of the 11 S. zooepidemicus isolates in this cluster, two ST-49 isolates originated from repeated swabs of the same wound infection. Taking into account the small number of repeated isolates (six) across the S. zooepidemicus study http://mic.sgmjournals.org

population (253), of 10 independent S. zooepidemicus isolates in this cluster, 8 (80 %) were associated with uterine infection and abortion. This cluster accounted for 18 % (8 of 44) of all independent isolates from uterine infection or abortion in this study compared with only 1 % (2 of 203) of other independent S. zooepidemius isolates studied. Isolates from eight independent outbreaks of acute fatal haemorrhagic pneumonia in dogs were investigated in this study. An isolate from an outbreak in the UK during 1999 (ST-123) was a double locus variant (DLV) of an isolate 3021

K. Webb and others

from similar outbreak of disease in the USA during 2005 (ST-129). Four isolates in the ST-10 cluster from an outbreak in the UK during 2001 were found to be related to an isolate from an outbreak in the USA during 1993 (ST-169) by ClonalFrame analysis. Three isolates from a recent outbreak in the USA during 2006 (ST-173) were single locus variants (SLVs) of four ST-18 isolates recovered from greyhounds affected in a UK outbreak during 2008. Of the canine STs identified in this study, an isolate of ST-2 was also recovered from an equine tracheal wash; an isolate of ST-10 was also recovered from an equine wound infection; an isolate of ST-162 was also recovered from a case of purulent rhinitis in a foal; and a SLV of ST-173 and a DLV of ST-18, ST-134, was recovered from a case of equine mastitis.

DISCUSSION MLST is a well-established molecular typing method that has been used successfully for the determination of the population structure of many bacterial species (Coffey et al., 2006; Enright & Spratt, 1998; Enright et al., 2001; Maiden et al., 1998). Here we report the development of an unambiguous and discriminatory MLST scheme for the S. zooepidemicus group. The data generated from the S. zooepidemicus and S. equi isolates in the present study have been used to construct a database on the main MLST website that permits the direct comparison of data generated in other laboratories (http://pubmlst.org/szooepidemicus/). This database is fully accessible and searchable and can be accessed and added to by other researchers. We believe that such a scheme is of particular value for the study of a zoonotic pathogen such as S. zooepidemicus, which is associated with a variety of diseases in a large number of different animal hosts. The statistical tests of congruence identified that the topologies of ML trees constructed using data from each locus were significantly different (P,0.001). If each gene tree had the same topology (i.e. phylogenetic congruence), as expected under entirely clonal evolution, then they should not differ significantly in likelihood. The data collected in this study suggested that higher levels of recombination may be occurring at the spi locus than at the other six loci examined. This is probably best illustrated by the particularly high prevalence of the S. equi spi allele 45 in the S. zooepidemicus isolates examined in this study (40 of 253 isolates: 16 %), compared with the other six S. equi alleles, which had the following prevalences: arcC (0/253), nrdE (0/253), proS (1/253), tdk (0/253), tpi (8/253: 3 %) and yqiL (0/253). One explanation for this observation is the occurrence of recombination with S. equi around the spi locus. Now that the S. zooepidemicus genome sequence has been completed, further studies are warranted to determine if there are coding sequences near the spi locus of S. equi that may be of value to S. zooepidemicus. In the current study there was no statistically significant 3022

association of the presence of the spi allele 45 and the origin of the isolate. S. zooepidemicus was identified as a diverse species by eBURST analysis. Only two eBURST groups, of respectively five STs (based around ST-71) and four STs (based around ST-118), were identified. In addition, nine groups of three STs and 12 pairs of STs were found to cluster together. ClonalFrame analysis identified all of the 23 eBURST groups; however, the ability of this software to account for the effect of mutation as well as homologous recombination events on the clonal pattern of inheritance generated a majority-rules consensus tree that linked many eBURST groups, including: ST-16/140 and ST-134/173, and the ST-5 complex and ST-7/48; of particular note was the observation that the S. equi SLV pair ST-151/179 now clustered with the S. zooepidemicus ST-19 and ST-137/149 complexes, ST-6, ST-49, ST-56 and ST-133, albeit via deep branches on the radial tree. ClonalFrame added several STs to extend many of the eBURST clusters, including ST-20 with the ST-118 complex, and ST-11, ST-46, ST-96, ST-113 and ST-153 with the ST-8 complex. ClonalFrame did not group ST-18 with the ST134/173 cluster in the consensus tree, although it was distantly linked in trees generated from individual runs. This is probably because, even though ST-18 and ST-173 are SLVs of each other, the pattern of nucleotide differences between the proS alleles of these strains is likely to have arisen through at least two genetic events. Overall, on examination of the sequence differences at each of the seven loci used in the S. zooepidemicus MLST scheme, the ClonalFrame analysis presented in Fig. 3 appears to most accurately represent the S. zooepidemicus population structure and best accounts for the evolutionary events occurring in this bacterial species. The clustering of isolates by MLST was in good agreement with those obtained previously with PCR typing methods. As expected, of the two S. zooepidemicus PCR typing methods, the 16S–23S-RNA intergenic spacer PCR was noticeably more concordant with related STs than the Mprotein hypervariable region PCR scheme. The SeM-typing scheme was found to have a much better ability to differentiate isolates of S. equi than MLST. S. equi is believed to have evolved relatively recently from an ancestral strain of S. zooepidemicus (Jorm et al., 1994) and as MLST schemes identify variation in housekeeping genes that accumulates very slowly in the population it was not surprising to find that the majority of S. equi isolates studied were of a single ST. In contrast, the SeM gene is known to be subjected to strong selective pressure (dN/ dS55.93) (Waller & Jolley, 2007) and is likely to accumulate variation much more quickly, so different strains of S. equi can be differentiated using this method. One of the key objectives of this study was to identify subtypes of S. zooepidemicus that shared a closer evolutionary ancestor with the S. equi biovar than the S. zooepidemicus H70 genome sequencing strain. ClonalFrame analysis identified nine STs that clustered with S. equi as illustrated Microbiology 154

MLST of Streptococcus zooepidemicus

in the majority-rules consensus tree shown in Fig. 3. The determination of the S. zooepidemicus population genetic structure presented here will enable research towards unravelling the key genetic events in the evolution of S. equi to focus on a much smaller number of related S. zooepidemicus subtypes that share a closer evolutionary ancestor. Of particular interest was the observation that rather than S. zooepidemicus STs isolated from cases of nonstrangles lymph node abscesses, the S. zooepidemicus STs that most closely clustered with S. equi were significantly more likely to be associated with cases of uterine infection or abortion in horses (P,0.001). These data provide one possible explanation for earlier observations that found no evidence of colonization of the equine respiratory tract by S. equi prior to invasion (Sweeney et al., 2005). We also found that isolates from six outbreaks of acute fatal haemorrhagic pneumonia in dogs in the UK or USA were genetically related, suggesting that certain subtypes of S. zooepidemicus may be more adept at causing disease in dogs. These data provide the strongest evidence yet that a reclassification of S. equi subsp. zooepidemicus and S. equi subsp. equi is required to accurately reflect the diversity and origins of this important group of pathogens. A reclassification was previously suggested by a study that utilized MLEE to define the genetic structure of 70 isolates of S. equi and 177 isolates of S. zooepidemicus (Jorm et al., 1994). However, the isolates included in this study were of limited geographical diversity and the MLEE scheme developed had only the capability to differentiate 41 different ETs. Sixty-nine of the 70 S. equi subsp. equi isolates fell into ET 12, which suggested that they were members of a single clone and that S. zooepidemicus may be the archetypal species from which the clone designated subspecies equi was derived. The MLST data presented in this report provide strong evidence in agreement with this earlier MLEE study and suggest that these pathogens should be referred to as Streptococcus zooepidemicus and Streptococcus zooepidemicus subsp. equi. In conclusion, we have developed the first unambiguous typing method for the determination of the population structure of S. zooepidemicus and linked several related STs to particular disease states. It is hoped that the data presented here can be used as a framework for the identification of virulence genes that form the genetic basis for the selection of host and site of infection.

sequencing project and the Horserace Betting Levy Board funded the S. zooepidemicus genome sequencing project. The pilot study to select the most suitable seven loci for the S. zooepidemicus MLST scheme was funded by the Animal Health Trust. This work was supported by a project grant from the Horserace Betting Levy Board and the European Breeders Fund.

REFERENCES Bradley, S. F., Gordon, J. J., Baumgartner, D. D., Marasco, W. A. & Kauffman, C. A. (1991). Group C streptococcal bacteremia: analysis of

88 cases. Rev Infect Dis 13, 270–280. Breiman, R. F. & Silverblatt, F. J. (1986). Systemic Streptococcus equi

infection in a horse handler – a case of human strangles. West J Med 145, 385–386. Brooks, D. E., Andrew, S. E., Biros, D. J., Denis, H. M., Cutler, T. J., Strubbe, D. T. & Gelatt, K. N. (2000). Ulcerative keratitis caused by

beta-hemolytic Streptococcus equi in 11 horses. Vet Ophthalmol 3, 121–125. Chalker, V. J., Brooks, H. W. & Brownlie, J. (2003). The association of

Streptococcus equi subsp. zooepidemicus with canine infectious respiratory disease. Vet Microbiol 95, 149–156. Chanter, N., Collin, N., Holmes, N., Binns, M. & Mumford, J. (1997).

Characterization of the Lancefield group C Streptococcus 16S–23S RNA gene intergenic spacer and its potential for identification and sub-specific typing. Epidemiol Infect 118, 125–135. Coffey, T. J., Pullinger, G. D., Urwin, R., Jolley, K. A., Wilson, S. M., Maiden, M. C. & Leigh, J. A. (2006). First insights into the evolution of

Streptococcus uberis: a multilocus sequence typing scheme that enables investigation of its population biology. Appl Environ Microbiol 72, 1420–1428. Didelot, X. & Falush, D. (2007). Inference of bacterial microevolution

using multilocus sequence data. Genetics 175, 1251–1266. Downar, J., Willey, B. M., Sutherland, J. W., Mathew, K. & Low, D. E. (2001). Streptococcal meningitis resulting from contact with an

infected horse. J Clin Microbiol 39, 2358–2359. Elsayed, S., Hammerberg, O., Massey, V. & Hussain, Z. (2003).

Streptococcus equi subspecies equi (Lancefield group C) meningitis in a child. Clin Microbiol Infect 9, 869–872. Enright, M. C. & Spratt, B. G. (1998). A multilocus sequence typing

scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144, 3049–3060. Enright, M. C., Spratt, B. G., Kalia, A., Cross, J. H. & Bessen, D. E. (2001). Multilocus sequence typing of Streptococcus pyogenes and the

relationships between emm type and clone. Infect Immun 69, 2416– 2427. Feil, E. J., Holmes, E. C., Bessen, D. E., Chan, M. S., Day, N. P., Enright, M. C., Goldstein, R., Hood, D. W., Kalia, A. & other authors (2001). Recombination within natural populations of pathogenic

bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc Natl Acad Sci U S A 98, 182–187.

ACKNOWLEDGEMENTS We gratefully acknowledge Professor John Timoney (University of Kentucky) and Professor Joe Brownlie (The Royal Veterinary College), who provided isolates for inclusion in this study. We are also grateful to Dr Julian Parkhill and Dr Matt Holden of the Wellcome Trust Sanger Institute, who provided help and expertise in the use of the Artemis Comparison Tool and the S. equi and S. zooepidemicus genome sequencing data. We thank Xavier Didelot (University of Warwick) for advice and guidance with interpreting the ClonalFrame results. The Horse Trust funded the S. equi genome http://mic.sgmjournals.org

Feil, E. J., Li, B. C., Aanensen, D. M., Hanage, W. P. & Spratt, B. G. (2004). eBURST: inferring patterns of evolutionary descent among

clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol 186, 1518–1530. Hashikawa, S., Iinuma, Y., Furushita, M., Ohkura, T., Nada, T., Torii, K., Hasegawa, T. & Ohta, M. (2004). Characterization of group C and

G streptococcal strains that cause streptococcal toxic shock syndrome. J Clin Microbiol 42, 186–192. Holmes, E. C., Urwin, R. & Maiden, M. C. (1999). The influence of

recombination on the population structure and evolution of 3023

K. Webb and others the human pathogen Neisseria meningitidis. Mol Biol Evol 16, 741– 749.

pneumonia in intensively housed (shelter) dogs caused by Streptococcus equi subsp. zooepidemicus. Vet Pathol 45, 51–53.

Hong, C. B., Donahue, J. M., Giles, R. C., Jr, Petrites-Murphy, M. B., Poonacha, K. B., Roberts, A. W., Smith, B. J., Tramontin, R. R., Tuttle, P. A. & Swerczek, T. W. (1993). Etiology and pathology of equine

Popescu, G. A., Fuerea, R. & Benea, E. (2006). Meningitis due to an

placentitis. J Vet Diagn Invest 5, 56–63.

unusual human pathogen: Streptococcus equi subspecies equi. South Med J 99, 190–191.

Jolley, K. A., Feil, E. J., Chan, M. S. & Maiden, M. C. (2001). Sequence

Salasia, S. I., Wibawan, I. W., Pasaribu, F. H., Abdulmawjood, A. & Lammler, C. (2004). Persistent occurrence of a single Streptococcus

type analysis and recombinational tests (START). Bioinformatics 17, 1230–1231.

equi subsp. zooepidemicus clone in the pig and monkey population in Indonesia. J Vet Sci 5, 263–265.

Jolley, K. A., Chan, M. S. & Maiden, M. C. (2004). mlstdbNet –

Selander, R. K., Caugant, D. A., Ochman, H., Musser, J. M., Gilmour, M. N. & Whittam, T. S. (1986). Methods of multilocus enzyme

distributed multi-locus sequence typing (MLST) databases. BMC Bioinformatics 5, 86. Jorm, L. R., Love, D. N., Bailey, G. D., McKay, G. M. & Briscoe, D. A. (1994). Genetic structure of populations of beta-haemolytic

Lancefield group C streptococci from horses and their association with disease. Res Vet Sci 57, 292–299. Kelly, C., Bugg, M., Robinson, C., Mitchell, Z., Davis-Poynter, N., Newton, J. R., Jolley, K. A., Maiden, M. C. & Waller, A. S. (2006).

Sequence variation of the SeM gene of Streptococcus equi allows discrimination of the source of strangles outbreaks. J Clin Microbiol 44, 480–486. Ladlow, J., Scase, T. & Waller, A. (2006). Canine strangles case reveals

electrophoresis for bacterial population genetics and systematics. Appl Environ Microbiol 51, 873–884. Sharp, M. W., Prince, M. J. & Gibbens, J. (1995). S. zooepidemicus

infection and bovine mastitis. Vet Rec 137, 128. Smith, K. C., Blunden, A. S., Whitwell, K. E., Dunn, K. A. & Wales, A. D. (2003). A survey of equine abortion, stillbirth and neonatal death in

the UK from 1988 to 1997. Equine Vet J 35, 496–501. Soedarmanto, I., Pasaribu, F. H., Wibawan, I. W. & Lammler, C. (1996). Identification and molecular characterization of serological

group C streptococci isolated from diseased pigs and monkeys in Indonesia. J Clin Microbiol 34, 2201–2204.

a new host susceptible to infection with Streptococcus equi. J Clin Microbiol 44, 2664–2665.

Stevenson, R. G. (1974). Streptococcus zooepidemicus infection in

Las Heras, A., Vela, A. I., Fernandez, E., Legaz, E., Dominguez, L. & Fernandez-Garayzabal, J. F. (2002). Unusual outbreak of clinical

Sweeney, C. R., Timoney, J. F., Newton, J. R. & Hines, M. T. (2005).

sheep. Can J Comp Med 38, 243–250.

mastitis in dairy sheep caused by Streptococcus equi subsp. zooepidemicus. J Clin Microbiol 40, 1106–1108.

Streptococcus equi infections in horses: guidelines for treatment, control and prevention of strangles. J Vet Intern Med 19, 123–134.

Lindmark, H., Jonsson, P., Engvall, E. & Guss, B. (1999). Pulsed-field

Swofford, D. L. (2003).

PAUP*: Phylogenetic analysis using parsimony (and other methods). Sunderland, MA: Sinauer Associates.

gel electrophoresis and distribution of the genes zag and fnz in isolates of Streptococcus equi. Res Vet Sci 66, 93–99.

Timoney, J. F. (1993). Strangles. Vet Clin North Am Equine Pract 9,

Maiden, M. C. (2006). Multilocus sequence typing of bacteria. Annu

365–374.

Rev Microbiol 60, 561–588.

Walker, J. A. & Timoney, J. F. (1998). Molecular basis of variation in

Maiden, M. C., Bygraves, J. A., Feil, E., Morelli, G., Russell, J. E., Urwin, R., Zhang, Q., Zhou, J., Zurth, K. & other authors (1998).

protective SzP proteins of Streptococcus zooepidemicus. Am J Vet Res 59, 1129–1133.

Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A 95, 3140–3145.

Waller, A. S. & Jolley, K. A. (2007). Getting a grip on strangles: recent

Nei, M. & Gojobori, T. (1986). Simple methods for estimating the

Wood, J. L., Burrell, M. H., Roberts, C. A., Chanter, N. & Shaw, Y. (1993). Streptococci and Pasteurella spp. associated with disease of

numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3, 418–426. Newton, J. R., Laxton, R., Wood, J. L. & Chanter, N. (2008). Molecular

epidemiology of Streptococcus zooepidemicus infection in naturally occurring equine respiratory disease. Vet J 175, 338–345. Pesavento, P. A., Hurley, K. F., Bannasch, M. J., Artiushin, S. & Timoney, J. F. (2008). A clonal outbreak of acute fatal hemorrhagic

3024

progress towards improved diagnostics and vaccines. Vet J 173, 492– 501.

the equine lower respiratory tract. Equine Vet J 25, 314–318. Wood, J. L., Newton, J. R., Chanter, N. & Mumford, J. A. (2005).

Association between respiratory disease and bacterial and viral infections in British racehorses. J Clin Microbiol 43, 120–126. Edited by: T. J. Mitchell

Microbiology 154