Journal of Medical Microbiology (2006), 55, 1061–1070
DOI 10.1099/jmm.0.46460-0
Genotyping of Campylobacter jejuni using seven single-nucleotide polymorphisms in combination with flaA short variable region sequencing
Erin P. Price,1 Venugopal Thiruvenkataswamy,1 Lance Mickan,2 Leanne Unicomb,3 Rosa E. Rios,4 Flavia Huygens1 and Philip M. Giffard1
1 Cooperative Research Centre for Diagnostics, Queensland University of Technology (Gardens Point Campus), GPO Box 2434, Brisbane, Queensland 4001, Australia
2 Institute of Medical and Veterinary Science, Adelaide, Australia
3 OzFoodNet, Hunter New England Population Health, Wallsend, Australia and the National Centre for Epidemiology and Population Health, Australian National University, Canberra, Australia
4 Microbiological Diagnostic Unit, University of Melbourne, Melbourne, Australia
Correspondence: Philip M. Giffard, [email protected]
Received 8 December 2005; Accepted 26 April 2006
This investigation describes the development of a generally applicable, bioinformatics-driven, single-nucleotide polymorphism (SNP) genotyping assay for the common bacterial gastrointestinal pathogen Campylobacter jejuni. SNPs were identified in silico using the program ‘Minimum SNPs’, which selects for polymorphisms providing the greatest resolution of bacterial populations based on Simpson’s index of diversity (D). The high-D SNPs identified in this study were derived from the combined C. jejuni/Campylobacter coli multilocus sequence typing (MLST) database. Seven SNPs were found that provided a D of 0.98 compared with full MLST characterization, based on 959 sequence types (STs). The seven high-D SNPs were interrogated using allele-specific real-time PCR (AS kinetic PCR), which negates the need for expensive labelled primers or probes and requires minimal assay optimization. The total turnaround time of the SNP typing assay was approximately 2 h. Concurrently, 69 C. jejuni isolates were subjected to MLST and flagellin A short variable region (flaA SVR) sequencing and combined with a population of 84 C. jejuni and C. coli isolates previously characterized by these methods. Within this collection of 153 isolates, 19 flaA SVR types (D=0.857) were identified, compared with 40 different STs (D=0.939). When MLST and flaA SVR sequencing were used in combination, the discriminatory power was increased to 0.959. In comparison, SNP typing of the 153 isolates alone provided a D of 0.920 and was unable to resolve a small number of unrelated isolates. However, addition of the flaA SVR locus to the SNP typing procedure increased the resolving power to 0.952 and clustered isolates similarly to MLST/flaA SVR. This investigation has shown that a seven-member C. jejuni SNP typing assay, used in combination with sequencing of the flaA SVR, efficiently discriminates C. jejuni isolates.
INTRODUCTION
Acute gastrointestinal infection with Campylobacter jejuni, and to a lesser extent Campylobacter coli, continues to be the leading cause of bacterial diarrhoeal illness worldwide. The majority of gastroenteritis cases caused by C. jejuni are sporadic, with few identified outbreaks (Pebody et al., 1997). Transmission to humans has been linked to multiple sources, including contaminated or undercooked food, domestic animals and livestock, and contaminated water (Altekruse et al., 1999; Kapperud et al., 2003). The sporadic nature of campylobacter enteritis, the high diversity of C. jejuni, and the incomplete understanding of associations between genotype and virulence potential conspire to make the utility of Campylobacter typing a matter of ongoing debate. Currently, molecular typing of C. jejuni is seen as useful in two broad contexts: firstly, to further understand the population structure, epidemiology and transmission of C. jejuni; and secondly, to routinely monitor this organism within diagnostic or food production and processing facilities (van Belkum, 2003).
Abbreviations: AS, allele-specific; AS kinetic PCR, allele-specific real-time PCR; CC, clonal complex; D, Simpson’s index of diversity; flaA SVR, flagellin A short variable region; gDNA, genomic DNA; MLST, multilocus sequence typing; SLV, single-locus variant; SNP, single-nucleotide polymorphism; ST, sequence type; Tm, melting temperature.
46460 © 2006 SGM Printed in Great Britain
E. P. Price and others
Multilocus sequence typing (MLST) has in recent years emerged as a powerful tool for determining the population structure of many bacterial pathogens, including C. jejuni and C. coli (Maiden et al., 1998; Dingle et al., 2001a). MLST-based studies have shown that C. jejuni is highly diverse, revealing a weakly clonal population structure as a result of high levels of intra-locus recombination (Dingle et al., 2001a; Suerbaum et al., 2001). Despite continued advances in DNA-sequencing technology and robotics, MLST remains expensive and impractical for routine monitoring of C. jejuni outside major research facilities. To combat these shortfalls, the use of informative single-nucleotide polymorphisms (SNPs) has been described as a cost-effective alternative to full MLST characterization (Best et al., 2004, 2005; Robertson et al., 2004). SNPs are readily amenable to high-throughput real-time PCR or medium-density array methodologies, as these platforms facilitate rapid and single-step interrogation of multiple targets (Shi, 2001).
The aims of this study were twofold. Firstly, MLST and flagellin A short variable region (flaA SVR) sequencing were performed on 69 Australian C. jejuni and C. coli isolates to gain further insight into the population structure of Australian C. jejuni. Addition of the flaA SVR locus to MLST for outbreak investigations provides high discriminatory power that is comparable to that of PFGE (Sails et al., 2003; Mellmann et al., 2004; Clark et al., 2005). Secondly, a single-step SNP-based genotyping method was developed using the approach described by Robertson et al. (2004). A set of seven SNPs was derived from the C. jejuni MLST database that provided a high Simpson’s index of diversity (D), calculated with respect to MLST. An allele-specific real-time PCR (AS kinetic PCR) methodology was developed to interrogate these high-D SNPs.
METHODS
Campylobacter isolates. A collection of 151 C. jejuni and two C. coli strains, including 84 previously characterized by MLST and flaA SVR (O’Reilly et al., 2006), was used in the current study. Isolates were obtained from the stools of human patients with sporadic gastroenteritis within the Hunter region of New South Wales, Australia, between 1999 and 2001. Within the 153 isolates were representatives of sequence type (ST)-21, ST-22, ST-42, ST-45, ST-48, ST-52, ST-61, ST-206, ST-257, ST-353, ST-354, ST-443, ST-460 and ST-658 MLST clonal complexes, and six singleton STs, ST-449 (n=1), ST-525 (n=6), ST-526 (n=1), ST-530 (n=7), ST-531 (n=10) and ST-555 (n=2).
Preparation of genomic DNA. Both boiled cell and kit-based extraction methodologies were employed to prepare genomic DNA (gDNA) templates for sequencing and kinetic PCR assays. All isolates were subcultured onto Campylobacter medium containing lysed horse blood and Campylobacter antibiotic supplement (Oxoid) and incubated at 38–40 °C for 48 h under microaerophilic conditions. For boiled cell extracts, a small loopful of culture was resuspended in 400 µl double-distilled H2O, boiled for 6–10 min at 100 °C in a thermal cycler and centrifuged at 13 400 g for 5–10 min to remove particulate debris. The supernatant was used for subsequent PCR amplification. For kit-extracted gDNA, template was extracted using the DNeasy Tissue kit (Qiagen) as per the manufacturer’s instructions. All templates were stored at −20 °C until use.
Nucleotide sequencing. MLST was performed as previously described (Dingle et al., 2001a; O’Reilly et al., 2006) and made use of the database at http://pubmlst.org/campylobacter/. flaA SVR sequencing was carried out on all isolates in this study according to methods described elsewhere (Nachamkin et al., 1993; Meinersmann et al., 1997). flaA sequences were assigned allelic designations based on previous ST submissions to the flaA SVR database (http://outbreak.ceid.ox.ac.uk/campylobacter/).
Identifying generalized high-D SNPs in the C. jejuni MLST database. Highly informative, generalized C. jejuni SNPs were identified using the ‘Minimum SNPs’ software package (versions 2.034 and 2.0415), which has been described in detail elsewhere (Robertson et al., 2004). The D function of Minimum SNPs applies Simpson’s index of diversity (D) (Hunter & Gaston, 1988) to identify and score outputted SNP sets. Alleles and corresponding STs from the C. jejuni MLST database (http://pubmlst.org/campylobacter/) were used as input for the Minimum SNPs software. As part of the current study, an updated version of the Minimum SNPs software was developed. The revised program (version 2.042) allows the user to specify SNPs to be included or excluded from the SNP set being assembled.
AS kinetic PCR assay design. Both allele-specific (AS) and consensus primers were designed using Primer Express version 2.0 software (Applied Biosystems) and the CLUSTALX version 1.8 alignment tool. The specificity of amplification was conferred by placement of the 3′ end of the AS primers directly over and matching the polymorphic base. To enhance allele specificity, additional subterminal and antepenultimate mismatches were incorporated into the AS primers (Newton et al., 1989; Hézard et al., 1997). All primers were designed around a calculated melting temperature (Tm) of 59.0 °C and minimal potential for primer-dimer formation. Primer sequences were aligned against the two C. jejuni genomes NCTC 11168 (GenBank accession NC_002163) and RM1221 (GenBank accession NC_003912) using the BLASTN tool (http://www.ncbi.nih.gov/), to ensure gene-specific amplification. The AS and common primer sequences used for interrogating the seven high-D SNPs are listed in Table 1. AS kinetic PCR assays were performed using a Rotor-Gene 3000 real-time PCR apparatus (Corbett Robotics). Each reaction contained 5 pmol of each primer, 1 µl gDNA, 1× Platinum SYBR Green qPCR SuperMix-UDG (Invitrogen) and distilled water to a total volume of 10 µl. Cycle conditions were 95 °C for 2 min, followed by 40 cycles of (95 °C for 1 s; combined annealing and extension at 61 °C for 10 s). Dissociation curves spanning 61–95 °C were generated after all runs to detect primer-dimer formation (seen as a distinct peak at or below 75 °C) and to confirm correct melting profiles for all products. Negative controls containing H2O in place of gDNA template were used for each primer set. For each AS reaction, all available MLST alleles (Table 2) were tested in duplicate.
eBURST analysis. The web-based eBURST (based upon related sequence types) version 2 program, accessed at http://eburst.mlst.net/ (Feil et al., 2004; Spratt et al., 2004), was used to aid visualization of the relationship between high-D SNP and MLST profiles generated for the 153 C. jejuni/C. coli isolates in this study.
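The single-locus variant (SLV) relationship that eBURST uses to link STs can be illustrated with a short Python sketch. This is not the eBURST program itself, only a minimal illustration of the SLV definition (two STs whose seven-locus allelic profiles differ at exactly one locus). The ST names and allele numbers below are hypothetical and are not taken from the MLST database.

```python
from itertools import combinations

# The seven C. jejuni MLST housekeeping loci named in this paper.
LOCI = ("aspA", "glnA", "gltA", "glyA", "pgm", "tkt", "uncA")


def loci_differing(profile_a, profile_b):
    """Count the loci at which two allelic profiles carry different alleles."""
    return sum(1 for a, b in zip(profile_a, profile_b) if a != b)


def single_locus_variants(profiles):
    """Return every pair of STs whose profiles differ at exactly one locus."""
    pairs = []
    for (st_a, prof_a), (st_b, prof_b) in combinations(sorted(profiles.items()), 2):
        if loci_differing(prof_a, prof_b) == 1:
            pairs.append((st_a, st_b))
    return pairs


# Hypothetical allelic profiles, ordered as in LOCI (illustration only).
example_profiles = {
    "ST-X": (1, 1, 1, 1, 1, 1, 1),
    "ST-Y": (1, 1, 1, 1, 1, 1, 2),   # differs from ST-X at uncA only
    "ST-Z": (1, 2, 3, 1, 1, 1, 1),   # differs from ST-X at glnA and gltA
}

print(single_locus_variants(example_profiles))
```

In this toy input only ST-X and ST-Y form an SLV pair; ST-Z differs from both at more than one locus and would not be joined to them by a line in an SLV diagram.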
RESULTS AND DISCUSSION
flaA SVR typing of C. jejuni and C. coli isolates
The utility of flaA SVR typing in combination with MLST was assessed on the 153 C. jejuni and C. coli isolates used in this study. Within the 40 different STs from this collection, 19 flaA types were identified (Table 3). Addition of the flaA SVR locus to ST identity did not subdivide the most abundant ST, ST-48 (n=23), with all isolates harbouring the flaA1 allele (peptide 67). Similarly, STs 5 (n=3), 21 (n=2), 25 (n=2), 52 (n=5), 197 (n=2), 451 (n=2), 524 (n=2), 525 (n=6), 527 (n=2), 530 (n=7) and 532 (n=2) were unable to be further resolved at the flaA SVR locus. Conversely, STs 42 (n=4), 50 (n=7), 161 (n=4), 227 (n=5), 257 (n=17), 354 (n=3), 523 (n=7), 528 (n=18) and 531 (n=10) contained two or more flaA SVR genotypes within each ST. In combination, the MLST and flaA SVR typing generated 63 different genotypes within the collection (D=0.959). The incomplete concordance between the MLST and flaA SVR types concurs with other investigations (Dingle et al., 2001b, 2002; Duim et al., 2003) and is likely a result of both the high recombination frequency of C. jejuni and the hypervariable nature of the flaA locus (Wassenaar et al., 1995; Duim et al., 2003). Dingle et al. (2002) have previously examined the association between MLST CCs and flaA SVR types in a large collection of European C. jejuni isolates. The relationship between CCs and flaA SVR types in the current study is largely consistent with the European collection, suggesting that the population structure of Australian C. jejuni is similar to that found in Europe.
Table 1. AS primers used for interrogation of high-D SNPs in C. jejuni
All primers were designed with a Tm of 59±1 °C. Primers lacking a base in bold type at the 3′ end are common to the AS reactions; nucleotides in lower case indicate mismatches in certain alleles; underlined bases indicate deliberately incorporated mismatches. R=purine (A or G); Y=pyrimidine (C or T).
SNP | Cumulative D | Primer name | Primer sequence (5′→3′) | Primer length (bp) | Amplicon size (bp)
aspA174 | 0.543 | aspA174-A | CGTATCTTGAGTCGCCTCCATT | 22 | 123
 | | aspA174-G | CcGTaTCtTGAGTcGCcTCTATC | 22 | 123
 | | aspA174-T | CtCCtGTATCTTGAGTtGCTTCGATA | 26 | 127
 | | aspA174-For | GCTATtGGAACgGGtATTAATTCtCA | 26 | –
glyA267 | 0.766 | glyA267-A | TgcGGAAAAGGACTTGGATGT | 21 | 143
 | | glyA267-G | TgtGGaAAaGgacTTGGaTGC | 21 | 143
 | | glyA267-For | TGtGGaGCgAGtGCtTATGC | 20 | –
glnA369 | 0.875 | glnA369-C | CGTTCTGAAATGGtGCAAACC | 21 | 80
 | | glnA369-T | CGcTCTGAAATGgTGCAAACT | 21 | 80
 | | glnA369-Rev | gCTTGcCCTTGtGCAACTTC | 20 | –
gltA12 | 0.929 | gltA12-A1 | gCATACCTTCATGGATAAAAGAACGT | 26 | 149
 | | gltA12-A2 | TGcATACCCTCATGtATATAAGAGCGT | 27 | 150
 | | gltA12-G | CATACCTTCATGgATAAAAGAACGC | 25 | 148
 | | gltA12-For | CCGTGGCTATCCTATAGAGTGGC | 23 | –
uncA189 | 0.953 | uncA189-C | aGGtCGtGAaGCTTATCCaTGC | 22 | 95
 | | uncA189-T1 | CAgGTCGTGAAGCTTATCCAAGT | 23 | 95
 | | uncA189-T2 | GTCGCGAGGCTTAtCCTTGT | 20 | 88
 | | uncA189-Rev1 | AgaaCCaGCACCTAATTCATcATTTAG | 27 | –
 | | uncA189-Rev2 | CCcGCaCCTAAcTCATCATTTAA | 23 | –
pgm348 | 0.971 | pgm348-A | AATGGTGGAAATTTtGGtGGaTAA | 24 | 110
 | | pgm348-G | ggtGGAAAtTTtGGYGGaTAG | 21 | 107
 | | pgm348-Rev | gaaAgCATtAAaGCacTAAATTGcAA | 26 | –
tkt297 | 0.980 | tkt297-C | GCAGGaCTTCACAAACTTGATCAC | 24 | 103
 | | tkt297-T | CTTTAgCAGGaCTTCAtAaaCTTgAGAAT | 29 | 108
 | | tkt297-Rev | GCAtTTTTAcATYTTCRttAAAGGCTA | 27 | –
Table 2. Alleles at C. jejuni MLST loci used for validation of the AS kinetic PCR assay
Housekeeping locus | Alleles present in 40 STs used in this study
aspA | 1, 2, 4, 7, 8, 9, 14, 33, 74, 75
glnA | 1, 2, 3, 4, 7, 10, 17, 21, 25, 30, 39, 45, 71, 77, 79, 80
gltA | 1, 2, 3, 4, 5, 6, 10, 12, 21, 65
glyA | 1, 2, 3, 4, 10, 15, 27, 62, 79, 93, 94
pgm | 1, 2, 3, 4, 5, 6, 7, 10, 11, 13, 19, 22, 42, 86, 89, 95, 105, 111
tkt | 1, 3, 5, 7, 9, 12, 25, 37, 51, 59, 67, 80, 81
uncA | 1, 3, 5, 6, 12, 17, 23
Identification of generally applicable high-D SNPs from the C. jejuni MLST database
It has been previously demonstrated that a small number of SNPs derived from MLST data may be used to define either entire bacterial populations, such as Staphylococcus aureus and Neisseria meningitidis (Robertson et al., 2004; Huygens
NA, Not applicable.
ST*
Clonal complex
No. of isolates
Polymorphic profiles
flaA SVR types
aspA174 A/G/T
ST SNP profiles 5 ST-353 527 ST-353 524 ST-353 537 ST-353 21 ST-21 53 ST-21 190 ST-21 569 ST-21 50 ST-21
3 2 2 1 2 1 1 1 7
451 536 25 45 529 233 538 567 42 48 51 52 161 70 61 197 257
ST-21 ST-21 ST-45 ST-45 ST-45 ST-45 ST-45 ST-22 ST-42 ST-48 ST-443 ST-52 ST-52 ST-52 ST-61 ST-257 ST-257
2 1 2 1 1 1 1 1 4 23 1 5 4 1 1 2 17
532 354 528 533 227
ST-257 ST-354 ST-354 ST-52 ST-206
2 3 18 1 5
11 11 11 11 8 1 1 1 1, 8, 10, 350 1 10 1 5 9 1 12 9 1, 9 1 2 4 2, 4, 10 4 14 12 1, 2, 4, 8, 12, 20 12 18, 20, 37 1, 11, 20 1 1, 10
ΔCT
glyA267 A/G
ΔCT
glnA369† C/T 2 77 17 17 1 1 1 80 1
gltA12†
uncA189†
pgm348
ΔCT
A/G
ΔCT
C/T
ΔCT
A/G
16.64 17.65 2 2 2 2 2 16.82 18.99
A1, 5 A1, 5 A1, 5 A1, 5 A1, 1 A1, 21 A1, 5 A1, 5 G, 12
2 10.39 2 2 2 10.54 2 2 8.12
C, 6 C, 6 C, 1 C, 6 T1, 5 T1, 5 T1, 5 T1, 5 T1, 5
8.62 2 9.3 2 2 2 2 2 2
G, G, G, A, A, A, A, A, A,
tkt297
ΔCT
C/T
ΔCT
10 10 10 11 2 2 2 2 2
2 2 6 2 2 2 2 2 2
C, 3 C, 3 C, 3 C, 81 C, 1 C, 1 C, 3 C, 3 C, 1
2 2 2 17.31 16.65 2 16.2 2 2
A, A, G, A, A, A, A, A, A,
7 7 8 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2
A, A, A, A, G, G, G, G, G,
2 2 2 2 3 3 3 3 3
2 2 2 2 2 2 2 2 14.9
C, C, C, C, C, C, C, C, C,
A, A, G, G, G, A, G, G, G, A, A, G, G, G, G, G, G,
2 2 4 4 4 2 4 1 1 2 7 9 14 9 1 75 9
2 2 10.84 2 2 2 2 2 2 2 6.98 11.67 12.04 2 11.88 8.41 11.74
G, G, G, G, G, G, G, G, G, A, G, G, G, G, A, A, A,
3 3 1 4 4 4 4 4 4 2 15 10 10 10 2 62 62
2 2 11.03 2 2 2 2 2 9.69 2 15 2 15.59 2 2 2 2
C, 1 C, 17 T1, 7 T1, 7 T1, 79 T1, 7 T1, 7 T1, 3 C, 2 T1, 4 C, 17 C, 25 C, 21 C, 25 T1, 4 C, 2 C, 2
2 2 2 4.26 3.19 2 2 4.27 2 3.97 18.78 19.4 19.53 2 2 2 20.9
G, 2 G, 12 A1, 10 A1, 10 A1, 10 A1, 10 A1, 10 A1, 6 A1, 3 A1, 1 G, 2 G, 2 G, 2 G, 2 G, 2 G, 4 G, 4
2 2 2 8.58 2 2 2 11.54 12.41 14.07 2 2 2 2 2 2 13.23
T1, 5 T1, 5 C, 1 C, 1 C, 1 C, 1 C, 1 C, 1 C, 3 T1, 5 C, 12 C, 6 C, 6 C, 6 T2, 17 C, 6 C, 6
6.7 2 10.34 9.5 2 2 2 2 9.06 6.61 8.85 8.95 9.56 2 21.8 9.11 2
A, A, G, G, G, G, G, G, G, G, A, A, A, A, A, A, A,
2 2 1 1 42 1 42 3 5 7 23 22 86 95 6 4 4
2 2 9.7 2 9.08 2 2 6.57 5.94 6.18 5.36 5.47 5.8 5.51 5.46 5.62 2
C, 3 C, 1 T, 7 T, 7 T, 51 T, 7 C, 25 C, 3 T, 9 C, 1 C, 3 C, 3 C, 3 C, 3 C, 3 T, 5 T, 5
2 2 2 2 10.28 10.64 17.26 2 10.13 2 2 2 2 2 2 2 2
G, G, G, G, A,
9 8 74 9 2
2 11.59 11.94 2 7.42
A, A, A, A, A,
62 2 2 94 2
2 2 2 7.27 7.34
C, 2 C, 10 C, 10 C, 17 T1, 4
2 19.16 2 2 3.99
G, 4 G, 2 G, 2 G, 2 A1, 5
2 2 9.97 2 2
C, 6 C, 6 C, 6 C, 6 T1, 5
2 2 9.47 2 6.8
A, A, A, A, A,
105 11 11 86 2
5.67 2 2 2 5.85
T, 5 C, 12 C, 12 C, 3 C, 1
9.39 16.09 2 2 2
STs not distinguished (%)
0.63 1.04 1.77 5.74
2.92
5.11
1.56 2.29 2.19 1.46 1.77 2.92
2.61 3.23
1.98
1.46
Table 3. Rotor-Gene 3000 AS kinetic PCR results for the seven high-D C. jejuni and C. coli SNPs
http://jmm.sgmjournals.org
Table 3. cont. ST*
Clonal complex
No. of isolates
flaA SVR types
Polymorphic profiles aspA174 A/G/T
312 523
ST-658 ST-658
1 7
535 449 531
ST-460
1 1 10
NA NA
525 NA 6 526 NA 1 530 NA 7 555 NA 2 Mean±SD ΔCT values A
A/G
glnA369†
gltA12†
uncA189†
ΔCT
C/T
ΔCT
A/G
ΔCT
1 G, 14 2 G, 4 2 1, 2, 11, A, 2 2 G, 93 10.1 71, 90 4 G, 9 2 A, 2 2 33 A, 7 2 A, 62 2 1, 2, 5, A, 2 8.97 A, 62 9.18 20 2 A, 7 6.8 G, 10 2 3 A, 2 2 G, 27 14.14 8 G, 9 2 G, 10 2 16 T, 33 23.15 G, 79 19.79 from pooled results for each polymorphism: 7.54±0.99 7.93±1.08
C, 45 T1, 4
18.91 2
G, 2 A1, 1
2 13.01
T1, 30 T1, 71 T1, 71
3.81 2 3.4
G, 2 A1, 5 A1, 5
C, 2 T1, 15 C, 2 T2, 39
2 2.78 2 14.32
A1, 5 G, 4 G, 2 A2, 65
NA
C/T
pgm348
tkt297
ΔCT
A/G
ΔCT
C/T
ΔCT
C, 6 C, 6
2 9.23
G, 19 A, 11
5.98 2
C, 3 C, 3
2 2
0.52 0.94
2 2 10.49
C, 6 C, 6 C, 6
2 2 7.92
A, 89 A, 11 A, 11
5.49 2 5.72
C, 59 C, 67 C, 67
2 17.01 2
1.88 1.04
2 2 11.59 28.23
C, 1 C, 23 T1, 5 T2, 17
T, C, C, T,
10.38 15.67 17.83 2
0.52 1.36 0.52 4.17
2 9.93 2 28.99
A1: 11.37±1.74 A2: 28.23
NA
A, A, A, G,
11 13 86 111
2 5.52 2 10.33
37 80 59 7
5.59±0.16
NA
16.75±0.73
NA
NA
18.68±1.30
NA
9.26±0.60
NA
11.26±1.21 23.15
13.78±3.38
NA
10.73±2.19
NA
7.47±1.89
NA
NA
T1: 3.71±0.53 T2: 14.32
NA
T1: 6.70±0.10 T2: 25.40±5.08
NA
10.16±0.47
*STs were grouped according to their corresponding SNP profile at the seven high-D SNPs. The number following each nucleotide in the SNP profile indicates the corresponding allelic designations based on the MLST database (http://pubmlst.org/campylobacter/).
†Two allelic states present at the AS primer binding site, indicated by a 1 or 2 following the expected polymorphism T (glnA369), A (gltA12) or T (uncA189).
C G T
D CT
glyA267
STs not distinguished (%)
et al., 2004; Stephens et al., 2006), single hyperinvasive N. meningitidis STs (Robertson et al., 2004) or clonal lineages of C. jejuni associated with human disease (Best et al., 2004, 2005). In the current study, the Minimum SNPs software package (Robertson et al., 2004) was used to extract SNPs from the entire C. jejuni/C. coli MLST database. Minimum SNPs concatenates the sequence of multiple MLST loci to form a single locus alignment for each ST, and outputs sets of SNPs that provide a high D with respect to the sequences in the alignment. In this context, D is calculated as the probability that any two sequences chosen at random are discriminated by the SNP set. Minimum SNPs was used to derive high-D SNP sets from STs 1–959. SNPs within the flaA SVR locus were also identified using Minimum SNPs, but were not considered for further analysis, as the sequence diversity was exceptionally high and assay design proved impractical. Within the MLST loci, AS primer design for certain high-D C. jejuni SNPs was also challenging due to high levels of sequence diversity surrounding the SNP. This prompted the implementation of the ‘include’ and ‘exclude’ functions of Minimum SNPs to facilitate the replacement of SNPs for which assay design was difficult. After taking into account primer design restrictions, the SNP set arrived at (with cumulative D in parentheses) was aspA174 (0.543), glyA267 (0.766), glnA369 (0.875), gltA12 (0.929), uncA189 (0.953), pgm348 (0.971), tkt297 (0.980), uncA3 (0.985), aspA414 (0.988), glnA108 (0.990), gltA320 (0.992), gltA396 (0.993), glnA19 (0.994) and pgm494 (0.995). By excluding certain SNPs that failed the design criteria, a greater number of SNPs were required to achieve a comparable D between the above SNP set and an SNP set assembled without assay design constraints (results not shown).
However, the total number of polymorphisms, and therefore the cost of the assay, was comparable between the two SNP sets due to the high number of tri- and tetra-allelic SNPs selected for in the unconstrained SNP set. It should be noted that for the sixth SNP, uncA189, a third polymorphism (A) is present in allele 56 of uncA. However, sequence alignments of the uncA-56 allele with those of other Campylobacter species (CampyBLAST, http://www.vge.ac.uk/; Miller et al., 2005; French et al., 2005) showed greater homology to Campylobacter lari RM2100 (87 %) than to C. jejuni RM1221 (86 %), C. jejuni NCTC 11168 (86 %) or C. coli (85 %), suggesting that uncA-56 may have originated in a Campylobacter species other than C. jejuni or C. coli. Further, the uncA-56 allele is not present in any ST to date. Therefore, the A polymorphism was excluded from uncA189 assay design.
The SNP assay developed in the current study differs from the assays described by Best et al. (2004, 2005). In the latter studies, 15 SNPs were selected based on their ability to diagnose six major C. jejuni CCs: ST-21, ST-45, ST-48, ST-61, ST-206 and ST-257. Whilst useful for rapidly determining whether isolates belong to the major CC lineages, the assay is limited in its utility for the entire C. jejuni/C. coli population, with 43 C. jejuni/C. coli CCs currently recognized (http://pubmlst.org/campylobacter/, last accessed 3 April 2006). In contrast, the SNPs described in the current investigation are broadly applicable, as they were selected without bias towards particular CCs on the basis of providing a high D across the entire species. The seven-member SNP set allows rapid determination of whether any two isolates are ‘same or different’. Such SNPs facilitate the efficient examination of transmission and epidemiological linkage hypotheses across the breadth of C. jejuni diversity.
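The idea of assembling a SNP set with a high cumulative D, subject to user-specified inclusions and exclusions, can be sketched in Python as follows. This is an illustration only: it assumes a simple greedy forward selection, which may not match how Minimum SNPs works internally, and the function names, the `include` and `exclude` parameters and the four toy sequences are all hypothetical. D is scored exactly as described above, as the fraction of sequence pairs that the chosen positions tell apart.

```python
from itertools import combinations


def pair_resolution(sequences, positions):
    """Fraction of sequence pairs that differ at one or more of `positions`."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    resolved = sum(1 for a, b in pairs if any(a[p] != b[p] for p in positions))
    return resolved / len(pairs)


def greedy_snp_set(sequences, max_snps, include=(), exclude=()):
    """Grow a SNP set one position at a time, taking the best next position.

    Assumption: greedy forward selection, used here purely for illustration.
    """
    chosen = list(include)                    # positions forced into the set
    banned = set(exclude) | set(chosen)       # positions never considered
    candidates = [p for p in range(len(sequences[0])) if p not in banned]
    score = pair_resolution(sequences, chosen)
    while candidates and len(chosen) < max_snps:
        best = max(candidates,
                   key=lambda p: pair_resolution(sequences, chosen + [p]))
        best_score = pair_resolution(sequences, chosen + [best])
        if best_score <= score:               # nothing left adds resolution
            break
        chosen.append(best)
        candidates.remove(best)
        score = best_score
    return chosen, score


# Hypothetical concatenated alignment: one short sequence per ST.
toy_alignment = ["AAGT", "AAGC", "ACGT", "GCTT"]

print(greedy_snp_set(toy_alignment, max_snps=3))
print(greedy_snp_set(toy_alignment, max_snps=3, exclude=(0,)))
```

Excluding a position, as might be done when primer design at that site is impractical, forces the selection to substitute a different position, which mirrors the trade-off between assay design constraints and set size discussed above.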
AS kinetic PCR analysis of the high-D SNPs
The choice of method for SNP interrogation, AS kinetic PCR, was driven in part by the high sequence diversity at the MLST loci. Other real-time PCR-based methods, such as TaqMan (Livak, 1999) or molecular beacons (Mhlanga & Malmberg, 2001), impose constraints on assay design, and require fluorescently labelled oligonucleotides and probes. AS primers were designed for the first seven high-D SNPs (D=0.980). It was considered that seven SNPs provided a good compromise between assay size and resolving power; the addition of an extra seven SNPs increases the resolving power to D=0.994.
The AS kinetic PCR results for the seven high-D SNPs are shown in Table 3. The cycles to threshold (CT) difference between amplicons, termed ΔCT, was used to discriminate between the more efficient matched (lower CT) and less efficient mismatched (higher CT) products in the kinetic PCR assay (Heid et al., 1996; Germer & Higuchi, 2003). Among the 40 different STs and 85 alleles tested, the polymorphisms at each of the seven SNPs were unambiguously determined and fell in the expected orientation (Table 3). For glnA369, uncA189 and gltA12, primer binding site diversity gave differential ΔCT values within certain polymorphisms (Fig. 1). At uncA189 and gltA12, primer degeneracy was accommodated in part by mixing separately synthesized AS primers, and examination of raw CT values showed that the difference in efficiency was conferred by the mismatched reaction (results not shown). For the glnA369 T polymorphism, however, a reproducible distinction between allele 39 and alleles 3, 4, 7, 30, 71 and 79 was observed, despite no degenerate AS primer incorporation (Fig. 1). The only feasible source of this effect is a single mismatch in the 5′ region of both the AS primer and the common primer at positions −19 and −15, respectively. This was an unforeseen finding, as allele specificity is generally only influenced by mismatches at the 3′ end of the AS primer (Wu et al., 1989; Germer & Higuchi, 1999; Papp et al., 2003). By using the ΔCT alone, the varying efficiency of the mismatched reactions for glnA369-T, gltA12-A and uncA189-T permitted subdivision of a single polymorphism at these SNPs into two states.
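The ΔCT logic described above, in which the matched AS reaction reaches threshold earlier than the mismatched one, can be sketched in Python. This is a minimal illustration, not the authors' analysis procedure: the function name, the `min_delta_ct` cut-off of 2 cycles and the CT values in the example are all assumptions made for demonstration, and the source does not state a numerical calling threshold.

```python
def call_genotype(ct_by_allele, min_delta_ct=2.0):
    """Call the allele whose AS reaction has the lowest CT.

    ct_by_allele maps each AS primer's allele (e.g. "A", "G") to its CT.
    Returns (allele, delta_ct); allele is None when the gap between the two
    lowest CT values is below min_delta_ct (a hypothetical cut-off).
    """
    if len(ct_by_allele) < 2:
        raise ValueError("need CT values for at least two AS reactions")
    ranked = sorted(ct_by_allele.items(), key=lambda item: item[1])
    best_allele, best_ct = ranked[0]
    runner_up_ct = ranked[1][1]
    delta_ct = runner_up_ct - best_ct         # mismatched CT minus matched CT
    if delta_ct < min_delta_ct:
        return None, delta_ct                 # too close to call
    return best_allele, delta_ct


# Hypothetical CT values for a bi-allelic SNP interrogated with two AS primers.
print(call_genotype({"A": 15.0, "G": 25.5}))
print(call_genotype({"A": 20.0, "G": 20.5}))
```

The first call returns the A allele with a ΔCT of 10.5 cycles; the second is reported as ambiguous because the two reactions differ by only half a cycle. For tri-allelic SNPs the same function compares the two lowest CT values.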
Fig. 1. Alignments of partial (A) uncA, (B) gltA and (C) glnA allele sequences. The arrows indicate the location of the SNPs (uncA189, gltA12 or glnA369) interrogated by AS kinetic PCR. Shading indicates polymorphisms located in the binding sites of the AS primers, with the sequences of the AS primers shown above the alignment. For glnA369, the entire PCR product is indicated, with both AS and common primers shown above the alignment. The average (Avg.) ΔCT for the polymorphisms of all three SNPs, including sequence variants within primer binding sites of a polymorphism (labelled as A1/A2 or T1/T2), is also indicated.
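Locating primer binding site polymorphisms of the kind shaded in Fig. 1 amounts to comparing a primer with the corresponding stretch of each allele and reporting where they differ relative to the 3′ end. The Python sketch below illustrates this under stated assumptions: the numbering convention (0 for the 3′-terminal base, negative values towards the 5′ end) is an assumption, as the source does not define how positions such as −19 are counted, and both sequences in the example are hypothetical rather than taken from Table 1.

```python
def mismatch_offsets(primer, binding_site):
    """List primer/template mismatches as offsets from the primer's 3' end.

    Both sequences are read 5'->3' in the same sense and must be equal in
    length. Offset 0 is the 3'-terminal base, -1 the penultimate base, and
    so on (an assumed convention). Case is ignored, because lower case is
    used in Table 1 only to flag mismatches. Degenerate codes are treated
    as ordinary characters.
    """
    if len(primer) != len(binding_site):
        raise ValueError("primer and binding site must have the same length")
    primer, binding_site = primer.upper(), binding_site.upper()
    last = len(primer) - 1
    return [i - last for i in range(len(primer)) if primer[i] != binding_site[i]]


# Hypothetical 10-base primer and allele binding site (illustration only).
print(mismatch_offsets("ACGTACGTAC", "ACCTACGTAT"))
```

Here the mismatches fall at offsets −7 and 0, that is, one in the 5′ portion of the primer and one at the 3′-terminal base that confers allele specificity.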
eBURST analysis of seven high-D SNPs
In order to assess the correlation between the SNP genotypes and the C. jejuni/C. coli population structure, the relationships of 59 C. jejuni and C. coli STs were depicted using eBURST, and the concordance between high-D SNP types and CCs was determined (Fig. 2). The STs examined by eBURST included the 40 STs from this study, nine previously identified Australian STs (22, 62, 66, 128, 362, 492, 493, 494 and 534) and 10 CC founders not identified in our collection (STs 41, 49, 177, 179, 283, 403, 433, 443, 508 and 573). The seven SNPs defined 33 profiles within the 59 C. jejuni and C. coli STs (Fig. 2). Of these, 26 were present within the 24 clonal complexes. The seven high-D SNPs were able to differentiate the majority of unrelated isolates (i.e. belonging to different CCs), and were able to discriminate most singleton STs. Of the nine singletons, six contained unique SNP profiles, whereas STs 449 and 531 remained indistinguishable at the seven SNPs, and ST-534 shared its polymorphic profile with ST-51 (ST-443 CC). In many cases, the SNP profiles were able to distinguish members within one CC, such as between the single-locus variant (SLV) STs 48 and 492. On the other hand, some CC founders, e.g. STs 41 and 179, were not discriminated by the seven SNPs. Other unrelated STs, such as STs 533 (ST-52 CC), 354 and 528 (ST-354 CC), and STs 538 (ST-45 CC) and 567 (ST-22 CC), were also unresolved.
A previously described, high-D SNP set in S. aureus showed good correlation with population structure defined by MLST, despite providing a lower level of discrimination (D=0.950) (Robertson et al., 2004; Stephens et al., 2006). The generation of novel STs in S. aureus occurs primarily by single mutation events, with recombination occurring infrequently (Feil et al., 2003). Without recombination, the states of the SNPs that differentiate between SLVs of a CC retain a highly biased distribution and therefore provide little resolving power. As expected, the high-D S. aureus SNPs were effective at discriminating the major CCs, but provided essentially no resolving power within the complexes. Based on these observations, it is likely that the high-D S. aureus SNP set reflects ancient SNPs that have disseminated throughout the species by recombination, and are thus powerful markers for delineating CCs (Feil et al., 2003; Stephens et al., 2006).
C. jejuni and C. coli differ from S. aureus in that recombination is much more frequent, with novel STs primarily arising by recombination rather than de novo mutation (Dingle et al., 2001a; Suerbaum et al., 2001; Schouls et al., 2003). Although there was concordance between the SNP profiles and the MLST-defined population structure, there were certain anomalies, with some unrelated STs generating the same SNP profile. Addition of extra SNPs to the profiles was unsuccessful at resolving these irregularities (results not shown). The inability of the high-D SNPs to precisely delineate CCs may be due to the weakly clonal nature of Campylobacter, and suggests that defining CCs in a similar fashion to highly clonal bacteria like S. aureus should be done with caution.
Fig. 2. An eBURST population snapshot of 58 C. jejuni and C. coli STs, spanning 24 clonal complexes. The dotted boxes represent clonal complexes as defined by the C. jejuni MLST database; the solid boxes represent STs grouped according to their corresponding seven-nucleotide high-D SNP profile at aspA174, glyA267, glnA369, gltA12, pgm348, uncA189 and tkt297. SLVs are connected by lines. Clonal complexes are listed alphabetically as follows: (a) ST-21 complex; (b) ST-61 complex; (c) ST-573 complex; (d) ST-403 complex; (e) ST-22 complex; (f) ST-45 complex; (g) ST-283 complex; (h) ST-443 complex; (i) ST-353 complex; (j) ST-508 complex; (k) ST-257 complex; (l) ST-354 complex; (m) ST-52 complex; (n) ST-362 complex; (o) ST-433 complex; (p) ST-177 complex; (q) ST-41 complex; (r) ST-179 complex; (s) ST-49 complex; (t) ST-460 complex; (u) ST-658 complex; (v) ST-42 complex; (w) ST-48 complex; (x) ST-206 complex; and (y) singletons (STs currently unassigned to a clonal complex).
Comparison of MLST, flaA SVR and SNP typing
The discriminatory power of MLST, flaA SVR sequencing and high-D SNP typing was determined singly and in combination, using the 153 isolates from this study. Through use of a single genotyping technique, MLST provided the highest D of 0.939 within the isolates examined in this study.
flaA SVR alone was unable to resolve the 153 isolates to the same degree as MLST (D=0.857), and has been shown here and by others to provide limited correlation with the clonal complexes defined by MLST (Dingle et al., 2001b, 2002). Nevertheless, flaA SVR is useful for distinguishing closely related strains (Dingle et al., 2005). The SNP typing method alone was unable to resolve some unrelated STs in the Australian collection and provided a lower level of discrimination than MLST (D=0.920). However, when the flaA SVR locus was added to the seven-member SNP profiles, the resolution increased to 0.952, comparable to MLST/flaA SVR (D=0.959). SNP typing/flaA SVR of the 153 isolates in this study discriminated unrelated isolates that remained indistinguishable using the SNP method alone, such as the singleton STs 449 and 531, and STs 538 and 567. In only one case was SNP/flaA SVR unable to differentiate between two unrelated STs, STs 533 and 528 (Table 3). Strikingly, there was little difference in resolving power between MLST/flaA SVR and SNP/flaA SVR, which highlights the utility of combinatorial typing approaches for high-resolution genotyping of bacteria such as Campylobacter.
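The D values compared in this section are Simpson's index of diversity as applied to typing systems by Hunter & Gaston (1988), D = 1 − [1/(N(N−1))] Σ n_j(n_j−1), where N is the number of isolates and n_j is the number of isolates of the jth type. The following Python sketch illustrates how such a value could be calculated for a single typing method and for two methods used in combination, where the combined type is taken to be the pair (SNP profile, flaA SVR type). The function name, the SNP profile strings and the flaA labels are hypothetical and chosen for illustration only; they are not data from this study, and this is not the software used here.

```python
from collections import Counter


def simpsons_d(types):
    """Simpson's index of diversity in the Hunter-Gaston form.

    `types` is an iterable with one (hashable) type label per isolate.
    """
    counts = Counter(types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two isolates are required")
    # Probability that two isolates drawn without replacement share a type.
    same_type = sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return 1.0 - same_type


# Hypothetical toy collection: one (SNP profile, flaA SVR type) pair per isolate.
isolates = [
    ("GATCAGT", "fla-1"),
    ("GATCAGT", "fla-2"),
    ("GATCAGT", "fla-2"),
    ("AATCGGT", "fla-1"),
    ("AATCGGC", "fla-3"),
    ("AATCGGC", "fla-3"),
]

d_snp = simpsons_d(snp for snp, _ in isolates)   # SNP typing alone
d_fla = simpsons_d(fla for _, fla in isolates)   # flaA SVR alone
d_combined = simpsons_d(isolates)                # SNP profile plus flaA SVR

print(f"SNP only:   D = {d_snp:.3f}")
print(f"flaA only:  D = {d_fla:.3f}")
print(f"SNP + flaA: D = {d_combined:.3f}")
```

Because two isolates share a combined type only if they match under both methods, the combined D can never be lower than the D of either method alone, which is consistent with the increase reported when flaA SVR was added to SNP typing.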
Conclusion

MLST in combination with flaA SVR sequencing provides high-resolution genetic fingerprints for both longitudinal and short-term investigations of Campylobacter. We have tested the hypothesis that a small SNP set derived from MLST data of C. jejuni and C. coli can underpin a comparative genotyping method with high resolving power, relative to MLST. It was found that the discriminatory powers of MLST- and SNP-based genotyping were not greatly different, and when flaA SVR sequencing was combined with these methods, there was little difference in resolving power between MLST/flaA SVR and SNP typing/flaA SVR. An AS kinetic PCR-based method for interrogating the SNPs was shown to be rapid and robust.

ACKNOWLEDGEMENTS

This work was supported by the Cooperative Research Centres programme of the Australian Federal Government, the Queensland Department of Innovation and Information Economy, and was also funded under the OzFoodNet programme of work, which is an initiative of the Australian Government Department of Health and Ageing. E. P. P. is in receipt of a postgraduate studentship from the Institute for Health and Biomedical Innovation, Queensland University of Technology. We wish to acknowledge the Campylobacter Subtyping Study Group for kindly providing the C. jejuni and C. coli isolates used in this study, and Geoff Hogg for helpful discussion. This publication made use of the C. jejuni and C. coli MLST website (http://pubmlst.org/campylobacter/) developed by K. Jolley and M. S. Chan and sited at the University of Oxford; the development of the site has been funded by the Wellcome Trust.

REFERENCES

Altekruse, S. F., Stern, N. J., Fields, P. I. & Swerdlow, D. L. (1999). Campylobacter jejuni – an emerging foodborne pathogen. Emerg Infect Dis 5, 28–35.
Best, E. L., Fox, A. J., Frost, J. A. & Bolton, F. J. (2004). Identification of Campylobacter jejuni multilocus sequence type ST-21 clonal complex by single-nucleotide polymorphism analysis. J Clin Microbiol 42, 2836–2839.
Best, E. L., Fox, A. J., Frost, J. A. & Bolton, F. J. (2005). Real-time single-nucleotide polymorphism profiling using Taqman technology for rapid recognition of Campylobacter jejuni clonal complexes. J Med Microbiol 54, 919–925.
Clark, C. G., Bryden, L., Cuff, W. R., Johnson, P. L., Jamieson, F., Ciebin, B. & Wang, G. (2005). Use of the Oxford multilocus sequence typing protocol and sequencing of the flagellin short variable region to characterise isolates from a large outbreak of waterborne Campylobacter sp. strains in Walkerton, Ontario, Canada. J Clin Microbiol 43, 2080–2091.
Dingle, K. E., Colles, F. M., Wareing, D. R. & 7 other authors (2001a). Multilocus sequence typing system for Campylobacter jejuni. J Clin Microbiol 39, 14–23.
Dingle, K. E., Van Den Braak, N., Colles, F. M., Price, L. J., Woodward, D. L., Rodgers, F. G., Endtz, H. P., van Belkum, A. & Maiden, M. C. (2001b). Sequence typing confirms that Campylobacter jejuni strains associated with Guillain–Barré and Miller–Fisher syndromes are of diverse genetic lineage, serotype, and flagella type. J Clin Microbiol 39, 3346–3349.
Dingle, K. E., Colles, F. M., Ure, R., Wagenaar, J. A., Duim, B., Bolton, F. J., Fox, A. J., Wareing, D. R. & Maiden, M. C. (2002). Molecular characterization of Campylobacter jejuni clones: a basis for epidemiologic investigation. Emerg Infect Dis 8, 949–955.
Dingle, K. E., Colles, F. M., Falush, D. & Maiden, M. C. (2005). Sequence typing and comparison of population biology of Campylobacter coli and Campylobacter jejuni. J Clin Microbiol 43, 340–347.
Duim, B., Godschalk, P. C., van den Braak, N. & 9 other authors (2003). Molecular evidence for dissemination of unique Campylobacter jejuni clones in Curaçao, Netherlands Antilles. J Clin Microbiol 41, 5593–5597.
Feil, E. J., Cooper, J. E., Grundmann, H. & 9 other authors (2003). How clonal is Staphylococcus aureus? J Bacteriol 185, 3307–3316.
Feil, E. J., Li, B. C., Aanensen, D. M., Hanage, W. P. & Spratt, B. G. (2004). EBURST: inferring patterns of evolutionary descent among clusters of related bacterial genotypes from multilocus sequence typing data. J Bacteriol 186, 1518–1530.
French, N., Barrigas, M., Brown, P. & 7 other authors (2005). Spatial epidemiology and natural population structure of Campylobacter jejuni colonizing a farmland ecosystem. Environ Microbiol 7, 1116–1126.
Germer, S. & Higuchi, R. (1999). Single-tube genotyping without oligonucleotide probes. Genome Res 9, 72–78.
Germer, S. & Higuchi, R. (2003). Homogeneous allele-specific PCR in SNP genotyping. Methods Mol Biol 212, 197–214.
Heid, C. A., Stevens, J., Livak, K. J. & Williams, P. M. (1996). Real time quantitative PCR. Genome Res 6, 986–994.
Hézard, N., Cornillet, P., Droullé, C., Gillot, L., Potron, G. & Nguyen, P. (1997). Factor V Leiden: detection in whole blood by ASA PCR using an additional mismatch in antepenultimate position. Thromb Res 88, 59–66.
Hunter, P. R. & Gaston, M. A. (1988). Numerical index of the discriminatory ability of typing systems: an application of Simpson’s index of diversity. J Clin Microbiol 26, 2465–2466.
Huygens, F., Stephens, A. J., Nimmo, G. R. & Giffard, P. M. (2004). mecA locus diversity in methicillin-resistant Staphylococcus aureus isolates in Brisbane, Australia, and the development of a novel diagnostic procedure for the Western Samoan phage pattern clone. J Clin Microbiol 42, 1947–1955.
Kapperud, G., Espeland, G., Wahl, E. & 7 other authors (2003). Factors associated with increased and decreased risk of Campylobacter infection: a prospective case-control study in Norway. Am J Epidemiol 158, 234–242.
Livak, K. J. (1999). Allelic discrimination using fluorogenic probes and the 5′ nuclease assay. Genet Anal 14, 143–149.
Maiden, M. C., Bygraves, J. A., Feil, E. & 10 other authors (1998). Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A 95, 3140–3145.
Meinersmann, R. J., Helsel, L. O., Fields, P. I. & Hiett, K. L. (1997). Discrimination of Campylobacter jejuni isolates by fla gene sequencing. J Clin Microbiol 35, 2810–2814.
Mellmann, A., Mosters, J., Bartelt, E., Roggentin, P., Ammon, A., Friedrich, A. W., Karch, H. & Harmsen, D. (2004). Sequence-based typing of flaB is a more stable screening tool than typing of flaA for monitoring of Campylobacter populations. J Clin Microbiol 42, 4840–4842.
Mhlanga, M. M. & Malmberg, L. (2001). Using molecular beacons to detect single-nucleotide polymorphisms with real-time PCR. Methods 25, 463–471.
Miller, W. G., On, S. L. W., Wang, G., Fontanoz, S., Lastovica, A. J. & Mandrell, R. E. (2005). Extended multilocus sequence typing system for Campylobacter coli, C. lari, C. upsaliensis, and C. helveticus. J Clin Microbiol 43, 2315–2329.
Nachamkin, I., Bohachick, K. & Patton, C. M. (1993). Flagellin gene typing of Campylobacter jejuni by restriction fragment length polymorphism analysis. J Clin Microbiol 31, 1531–1536.
Newton, C. R., Graham, A., Heptinstall, L. E., Powell, S. J., Summers, C., Kalsheker, N., Smith, J. C. & Markham, A. F. (1989). Analysis of any point mutation in DNA. The amplification refractory mutation system (ARMS). Nucleic Acids Res 17, 2503–2516.
O’Reilly, L. C., Inglis, T. J., Unicomb, L., the Australian Campylobacter Subtyping Study Group (2006). Australian multicentre comparison of subtyping methods for the investigation of Campylobacter infection. Epidemiol Infect (epub ahead of print).
Papp, A. C., Pinsonneault, J. K., Cooke, G. & Sadee, W. (2003). Single nucleotide polymorphism genotyping using allele-specific PCR and fluorescence melting curves. Biotechniques 34, 1068–1072.
Pebody, R. G., Ryan, M. J. & Wall, P. G. (1997). Outbreaks of campylobacter infection: rare events for a common pathogen. Commun Dis Rep CDR Rev 7, R33–R37.
Robertson, G. A., Thiruvenkataswamy, V., Shilling, H., Price, E. P., Huygens, F., Henskens, F. A. & Giffard, P. M. (2004). Identification and interrogation of highly informative single nucleotide polymorphism sets defined by bacterial multilocus sequence typing databases. J Med Microbiol 53, 35–45.
Sails, A. D., Swaminathan, B. & Fields, P. I. (2003). Utility of multilocus sequence typing as an epidemiological tool for investigation of outbreaks of gastroenteritis caused by Campylobacter jejuni. J Clin Microbiol 41, 4733–4739.
Schouls, L. M., Reulen, S., Duim, B., Wagenaar, J. A., Willems, R. J., Dingle, K. E., Colles, F. M. & Van Embden, J. D. (2003). Comparative genotyping of Campylobacter jejuni by amplified fragment length polymorphism, multilocus sequence typing, and short repeat sequencing: strain diversity, host range, and recombination. J Clin Microbiol 41, 15–26.
Shi, M. M. (2001). Enabling large-scale pharmacogenetic studies by high-throughput mutation detection and genotyping technologies. Clin Chem 47, 164–172.
Spratt, B. G., Hanage, W. P., Li, B., Aanensen, D. M. & Feil, E. J. (2004). Displaying the relatedness among isolates of bacterial species – the eBURST approach. FEMS Microbiol Lett 241, 129–134.
Stephens, A. J., Huygens, F., Inman-Bamber, J., Price, E. P., Nimmo, G. R., Schooneveldt, J., Munckhof, W. & Giffard, P. M. (2006). Methicillin resistant Staphylococcus aureus genotyping using a small set of polymorphisms. J Med Microbiol 55, 43–51.
Suerbaum, S., Lohrengel, M., Sonnevend, A., Ruberg, F. & Kist, M. (2001). Allelic diversity and recombination in Campylobacter jejuni. J Bacteriol 183, 2553–2559.
van Belkum, A. (2003). High-throughput epidemiologic typing in clinical microbiology. Clin Microbiol Infect 9, 86–100.
Wassenaar, T. M., Fry, B. N. & van der Zeijst, B. A. (1995). Variation of the flagellin gene locus of Campylobacter jejuni by recombination and horizontal gene transfer. Microbiology 141, 95–101.
Wu, D. Y., Ugozzoli, L., Pal, B. K. & Wallace, R. B. (1989). Allele-specific enzymatic amplification of beta-globin genomic DNA for diagnosis of sickle cell anemia. Proc Natl Acad Sci U S A 86, 2757–2760.