System. Appl. Microbiol. 27, 175–185 (2004) http://www.elsevier-deutschland.de/syapm
Use of the Genomic Signature in Bacterial Classification and Identification
Tom Coenye and Peter Vandamme
Laboratorium voor Microbiologie, Ghent University, Gent, Belgium
Received: July 3, 2003
Summary
In this study we investigated the correlation between dinucleotide relative abundance values (the genomic signature) obtained from bacterial whole-genome sequences and two parameters widely used for bacterial classification, 16S rDNA sequence similarity and DNA-DNA hybridisation values. Twenty-eight completely sequenced bacterial genomes were included in the study. The correlation between the genomic signature and DNA-DNA hybridisation values was high, and taxa that showed less than 30% DNA-DNA binding will in general not have dinucleotide relative abundance dissimilarity (δ*) values below 40. On the other hand, taxa showing more than 50% DNA-DNA binding will not have δ* values higher than 17. Our data indicate that the overall correlation between genomic signature and 16S rDNA sequence similarity is low, except for closely related organisms (16S rDNA similarity > 94%). Statistical analysis of δ* values between different subgroups of the Proteobacteria indicates that the β- and γ-Proteobacteria are more closely related to each other than to the other subgroups of the Proteobacteria and that the α- and ε-Proteobacteria form clearly separate subgroups. Using the genomic signature we have also predicted DNA-DNA binding values for fastidious or unculturable endosymbionts belonging to the genera Rickettsia, Wigglesworthia and Buchnera.
Key words: genomic signature – DNA-DNA hybridisation – 16S rDNA sequencing
Introduction
The percent DNA-DNA hybridisation is considered the “gold standard” for species delineation: prokaryotic species are considered to be groups of strains sharing 50 to 70% DNA reassociation and 5 to 7% difference in thermal stability between the homologous and heterologous duplexes [67]. Within many well-defined species, DNA reassociation values are even above 70% [55, 73]. Lower but significant DNA relatedness (i.e. in the 30 to 50% range) denotes the range of hybridisation values below the species level, while values below 30% can be considered nonsignificant [22, 70]. The direct sequencing of 16S rDNA molecules by PCR technology provides a phylogenetic framework which serves as the backbone of microbial taxonomy. However, the resolution of 16S rDNA analysis between closely related species is generally low and there is no threshold value of 16S rRNA sequence similarity for species recognition [19, 61]. Organisms with less than 97% 16S rRNA sequence similarity will generally not give DNA reassociation values of more than 60% [61], although extensive within-species 16S rDNA sequence diversity has been reported for members of the ε-Proteobacteria [27, 72].
Although sequencing of the 16S rDNA and DNA-DNA hybridisation experiments are the cornerstones of modern microbial taxonomy [58, 63, 73], there are problems associated with both methods. DNA reassociation values do not represent the actual sequence similarity since DNA heteroduplexes will only form between strands that show at least 80% sequence complementarity; therefore a difference of 20% of sequence similarity may be spread out between 0 and 100% DNA reassociation [55]. The comparison of 16S rDNA sequences has been particularly useful for taxa above the rank of species, but, as mentioned above, sequence similarities of 16S rDNA are not sufficient to define bacterial species. Analysis of complete bacterial genome sequences is shedding new light on bacterial taxonomy and phylogeny. However, there is no consensus on how the wealth of information revealed by whole-genome sequencing can be applied to bacterial classification. Alignments and analysis of large numbers of conserved genes give inconsistent results [17, 54]; in addition, this approach suffers from similar drawbacks as the 16S rDNA approach. Whole-genome comparisons based on the presence and absence
of orthologous genes or families of genes [5, 18], presence and absence of conserved insertions and deletions [24–26], differences in gene content [59], or distances between genes in genomes [30] have been proposed as alternatives. In addition, DNA microarray technology has facilitated the comparison of complete genome sequences (see for example references 13 and 41). In the analysis of dinucleotide relative abundance values, the complete sequence of the genome is used, without the prior requirement for alignment [8, 31, 32]. Dinucleotide relative abundance values are constant within a genome. It has been hypothesised that this constancy results from factors acting on the dinucleotides that are themselves constant throughout the genome. It has also been postulated that the set of dinucleotide relative abundance values constitutes a genomic signature that reflects the pressures exerted by these factors [31]. In the present study we investigate the correlation between the genomic signature, 16S rDNA sequence similarity and DNA-DNA hybridisation values.
Material and Methods
Whole-genome sequence data
The complete genome sequences used in this study are shown in Table 1. Species included in this study were selected to represent a variety of major bacterial lineages, including the α-, β-, γ- and ε-Proteobacteria and the Gram-positive organisms. In addition, for all species included, extensive DNA-DNA hybridisation data and/or multiple 16S rDNA sequences were available.
16S rDNA sequence data
16S rDNA sequences for all species (Table 1) were downloaded from the GenBank database. Multiple sequences from the same species were included, whenever possible, to account for intraspecies variability. The number of 16S rDNA sequences included for each species is shown in parentheses: Caulobacter crescentus (1), Rickettsia prowazekii (1), Rickettsia conorii (1), Xanthomonas campestris (7), Xanthomonas axonopodis (3), Pseudomonas putida (10), Pseudomonas aeruginosa (10), Yersinia pestis (14), Yersinia enterocolitica (11), Salmonella enterica sv. Typhi (2), Salmonella enterica sv. Typhimurium (3), Escherichia coli (11), Shigella flexneri (2), Buchnera sp. APS (1), Buchnera aphidicola (1), Wigglesworthia glossinidia (1), Bordetella pertussis (4), Bordetella parapertussis (4), Bordetella bronchiseptica (6), Campylobacter jejuni (12), Helicobacter pylori (16), Staphylococcus aureus (12) and Streptococcus pneumoniae (3). Sequences were included if their length was at least 1300 nucleotides and if they contained less than 0.5% ambiguities. If multiple sequences from the same strain were present, the most recent one was included. Phylogenetic analyses and bootstrap analysis (1000 replicates) were performed using the Kodon software package (Applied Maths, Kortrijk, Belgium); a phylogenetic tree was constructed using the neighbour-joining method [57].
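The inclusion criteria for 16S rDNA sequences stated above (length of at least 1300 nucleotides, less than 0.5% ambiguities) can be sketched as a small filter. This is only an illustration: the function name, the parameter names and the choice to count every character other than A, C, G, T or U as an ambiguity are assumptions, not part of the original workflow.

```python
# Hypothetical helper illustrating the stated inclusion criteria.
UNAMBIGUOUS = set("ACGTU")  # assumption: anything else counts as an ambiguity


def passes_16s_filter(seq, min_length=1300, max_ambiguity_fraction=0.005):
    """Return True if an unaligned 16S rDNA sequence meets both criteria."""
    seq = seq.upper()
    if not seq or len(seq) < min_length:
        return False
    ambiguous = sum(1 for base in seq if base not in UNAMBIGUOUS)
    # "less than 0.5% ambiguities" is read as a strict inequality
    return ambiguous / len(seq) < max_ambiguity_fraction
```

The rule of keeping only the most recent sequence per strain depends on database metadata and is not covered by this sketch.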
Table 1. Whole-genome sequences investigated in this study.
Organism | Strain designation | Abbreviation | GenBank accession no. | Phylogenetic position | Reference
Caulobacter crescentus | CB15 | Cc | AE005673 | α | 41
Rickettsia prowazekii | Madrid E | Rp | AJ235269 | α | 4
Rickettsia conorii | Malish 7 | Rc | AE006914 | α | 42
Xanthomonas campestris | ATCC 33913 | Xc | AE008922 | γ | 15
Xanthomonas axonopodis | 306 | Xa | AE008923 | γ | 15
Pseudomonas putida | KT2440 | Pp | AE015451 | γ | 39
Pseudomonas aeruginosa | PAO1 | Pa | AE004091 | γ | 60
Yersinia pestis | CO92 | Yp | AL590842 | γ | 47
Yersinia pestis | KIM | Yp | AE009952 | γ | 16
Yersinia enterocolitica | 8081 | Ye | | γ | unpublished (a)
Salmonella enterica sv. Typhi | CT18 | St | AL513382 | γ | 45
Salmonella enterica sv. Typhimurium | LT2 | Sty | AE006468 | γ | 36
Escherichia coli | K12 MG1655 | Ec | U00096 | γ | 6
Escherichia coli | O157:H7 Sakai | Ec | BA000007 | γ | 27
Escherichia coli | O157:H7 | Ec | AE005174 | γ | 48
Escherichia coli | CFT073 | Ec | AE014075 | γ | 72
Shigella flexneri sv. 2a | 301 | Sf | AE005674 | γ | 35
Buchnera sp. | APS | Bs | AP000398 | γ | 55
Buchnera aphidicola | Sg | Ba | AE013218 | γ | 61
Wigglesworthia glossinidia | brevipalpis | Wg | BA000021 | γ | 1
Bordetella pertussis | Tohama I | Bpe | | β | unpublished (a)
Bordetella parapertussis | 12822 | Bpa | | β | unpublished (a)
Bordetella bronchiseptica | RB50 | Bb | | β | unpublished (a)
Campylobacter jejuni | NCTC 11168 | Cj | AL111168 | ε | 46
Helicobacter pylori | 26695 | Hp | AE000511 | ε | 63
Helicobacter pylori | J99 | Hp | AE001439 | ε | 3
Staphylococcus aureus | Mu50 | Sa | BA000017 | Gram-positive | 32
Streptococcus pneumoniae | TIGR4 | Sp | AE005672 | Gram-positive | 62
(a) These sequence data were produced by the Sanger Institute and can be obtained from ftp://ftp.sanger.ac.uk/pub/pathogens/ye/, ftp://ftp.sanger.ac.uk/pub/pathogens/bp/, ftp://ftp.sanger.ac.uk/pub/pathogens/bpa/ and ftp://ftp.sanger.ac.uk/pub/pathogens/bb/, respectively.
Table 2. Significantly over- or underrepresented dinucleotides in the genomes investigated.
Rows (genomes): Cc, Rp, Rc, Xc, Xa, Pp, Pa, Yp, Ye, St, Sty, Ec, Sf, Bs, Ba, Wg, Bpe, Bpa, Bb, Cj, Hp, Sa, Sp. Columns (dinucleotide XY): AA, AC, AT, CA, CC, CG, GC, GG, GT, TA, TG, TT. (The individual cell entries could not be recovered from the source layout.)
Overrepresentation is indicated by + (1.23 ≤ ρ* < 1.30), ++ (1.30 ≤ ρ* < 1.50) and +++ (ρ* ≥ 1.50), while underrepresentation is indicated by – (0.70 < ρ* ≤ 0.78), – – (0.50 < ρ* ≤ 0.70) and – – – (ρ* ≤ 0.50).
Genomic Signature in Bacterial Taxonomy
DNA-DNA hybridisation data
DNA-DNA hybridisation data were taken from previously published studies. Data for members of the family Enterobacteriaceae were taken from references 7, 9–11, 14 and 43. DNA-DNA hybridisation data for Pseudomonas and Xanthomonas species were taken from references 46 and 47, and 72, respectively. Data for H. pylori were taken from reference 77. Finally, data for species belonging to the genus Bordetella were taken from reference 34. Previous studies have indicated that DNA-DNA hybridisation values can differ depending on the method used. The DNA-DNA binding values described in this study were obtained with the hydroxyapatite method, the optical renaturation method and/or membrane filter hybridisation methods, and several studies have indicated that the results of these different methods correlate well [23, 29, 33].
Determination of dinucleotide relative abundance values
We determined the dinucleotide relative abundance values for each genome. Sequences were concatenated with their inverted complementary sequence using revseq, yank and union (part of the EMBOSS package, http://www.hgmp.mrc.ac.uk/software/EMBOSS). Mononucleotide and dinucleotide frequencies were calculated using Artemis 4.0 [56] and compseq (EMBOSS), respectively. Dinucleotide relative abundances ρ*XY were calculated using the equation ρ*XY = fXY/(fX fY), where fXY denotes the frequency of dinucleotide XY and fX and fY denote the frequencies of X and Y, respectively [32]. Statistical theory and data from previous studies [31, 32] indicate that the normal range of ρ*XY is between 0.78 and 1.23. In this study we used the refined criteria of discrimination proposed by Karlin et al. [31]. Overrepresentation is indicated by + (1.23 ≤ ρ* < 1.30), ++ (1.30 ≤ ρ* < 1.50) and +++ (ρ* ≥ 1.50), while underrepresentation is indicated by – (0.70 < ρ* ≤ 0.78), – – (0.50 < ρ* ≤ 0.70) and – – – (ρ* ≤ 0.50).
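The discrimination criteria listed above map a single ρ* value onto one of seven classes. A minimal sketch of that mapping follows; the function name is hypothetical, and ASCII hyphens are used in place of the typeset minus signs.

```python
def classify_rho(rho):
    """Map a dinucleotide relative abundance value to the +/- notation.

    Thresholds are the ones quoted in the text; an empty string means the
    value lies in the normal range (0.78 < rho < 1.23).
    """
    if rho >= 1.50:
        return "+++"
    if rho >= 1.30:
        return "++"
    if rho >= 1.23:
        return "+"
    if rho <= 0.50:
        return "---"
    if rho <= 0.70:
        return "--"
    if rho <= 0.78:
        return "-"
    return ""
```

For example, the ρ*AT value of 3.31 reported for W. glossinidia falls in the "+++" class, while a value of 1.0 is in the normal range.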
The dissimilarities in relative abundance of dinucleotides between two sequences (f and g) were calculated using the equation described by Karlin et al. [31]: δ*(f,g) = (1/16) Σ |ρ*XY(f) – ρ*XY(g)| (multiplied by 1000 for convenience), where the sum extends over all 16 dinucleotides XY. For convenience we calculated global δ*(f,g) values based on the global relative abundance values, instead of calculating δ*(f,g) values for every pairwise comparison of 50 kb windows between genomes and averaging these individual comparisons.
Statistical analysis
Statistical analyses were performed using the SPSS 11.0.1 software package (SPSS Inc., Chicago, IL).
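The two formulas above can be illustrated with a short, self-contained sketch. It is not the EMBOSS/Artemis pipeline used in the study: the function names are hypothetical, the two strands are counted separately rather than physically concatenated (which avoids counting an artificial dinucleotide at the junction), and positions containing characters other than A, C, G or T are simply skipped.

```python
from itertools import product

BASES = "ACGT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the inverted complementary sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rho_star(seq):
    """Strand-symmetric dinucleotide relative abundances rho*_XY = f_XY / (f_X f_Y)."""
    seq = seq.upper()
    mono = {b: 0 for b in BASES}
    di = {x + y: 0 for x, y in product(BASES, repeat=2)}
    for strand in (seq, reverse_complement(seq)):
        for base in strand:
            if base in mono:
                mono[base] += 1
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if pair in di:  # skips pairs containing ambiguity codes
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    if n_di == 0 or min(mono.values()) == 0:
        raise ValueError("sequence must contain all four bases")
    return {
        pair: (count / n_di) / ((mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono))
        for pair, count in di.items()
    }


def delta_star(rho_f, rho_g):
    """delta*(f, g) = (1/16) * sum |rho*_XY(f) - rho*_XY(g)|, multiplied by 1000."""
    return 1000 * sum(abs(rho_f[p] - rho_g[p]) for p in rho_f) / 16
```

Because both strands are counted, ρ* values of complementary dinucleotides (for example AA and TT, or AC and GT) coincide, and `delta_star(rho_star(a), rho_star(b))` gives the global δ* value between two sequences `a` and `b`.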
Results
Phylogenetic tree
A phylogenetic tree based on representative 16S rDNA sequences of all taxa included in this study is shown in Fig. 1.
Dinucleotide relative abundances
Significantly over- or underrepresented dinucleotides are shown in Table 2. The values for AG, CT, GA and TC were in the normal range for all genomes investigated. GC is overrepresented in most genomes, while TA is underrepresented in most genomes. The genomes of both Xanthomonas species, the C. jejuni genome and both H. pylori genomes each had seven dinucleotides that were
under- or overrepresented. The genomes of C. crescentus, R. prowazekii, R. conorii, P. aeruginosa, Y. pestis, E. coli, S. flexneri, S. aureus and S. pneumoniae had only two dinucleotides that were under- or overrepresented, while the genomes of both S. enterica serovars had only one dinucleotide that was under- or overrepresented. The
genomes of R. prowazekii, R. conorii, B. aphidicola, Buchnera sp. APS and W. glossinidia had very high ρ*AT values (2.40, 2.02, 2.70, 2.67 and 3.31, respectively) (data not shown).
Overall correlation between 16S rDNA sequence similarities and δ* values
Average 16S rDNA sequence similarities and δ* values for all taxa examined are shown in Table 3. A scatterplot of these data is shown in Fig. 2. The correlation between both parameters is low (r² = 0.410) but still significant (P < 0.001). Logarithmic transformations did not significantly increase the correlation (data not shown). The data can be subdivided into two groups (Fig. 2): a first group with low δ* values and high 16S rDNA sequence similarities (δ* < 86, 16S rDNA similarity > 94%) and a second group with moderate to low 16S rDNA sequence similarities (< 94%) and varying δ* values. There is a strong linear correlation between the two parameters for the data in the first group (r² = 0.931, P < 0.001) (Fig. 2); no significant correlation is found between both parameters for the data in the other group (data not shown).
Overall correlation between DNA-DNA hybridisation values and δ* values
Fig. 1. Phylogenetic tree based on 16S rDNA sequences of the taxa studied. Scale-bar indicates 10% sequence dissimilarity.
DNA-DNA hybridisation values and δ* values for all taxa examined are shown in Table 4. A scatterplot of these data is shown in Fig. 3. The correlation between both parameters is high (r² = 0.829, P < 0.001) and the relationship between them was best described by the equation: DNA-DNA binding value = –0.00007 (δ*)³ + 0.0212 (δ*)² – 2.1553 δ* + 88.779.
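As a worked illustration of the fitted cubic, the sketch below evaluates it for a given δ* value. The function name is hypothetical and the coefficients are the rounded ones printed in the text, so results may differ by a few percentage points from estimates obtained with the unrounded fit; the polynomial is also only supported by the range of δ* values covered by the data and should not be extrapolated far beyond it.

```python
def predict_dna_binding(delta):
    """Estimate the DNA-DNA binding value (%) from a delta* value.

    Uses the rounded coefficients of the cubic regression quoted in the text.
    """
    return (-0.00007 * delta ** 3
            + 0.0212 * delta ** 2
            - 2.1553 * delta
            + 88.779)
```

For instance, `predict_dna_binding(17.24)` evaluates to about 57.6, in line with the estimate of approximately 58% given in the Discussion for the two Buchnera genomes (δ* = 17.24).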
α-Proteobacteria
C. crescentus and both Rickettsia species occupy a distinct position in the phylogenetic tree based on 16S rDNA sequences (Fig. 1). R. prowazekii and R. conorii are closely related to each other (98.47% 16S rDNA sequence similarity and δ* value of 81.36), and more distantly related to C. crescentus (16S rDNA sequence similarity > 85%, δ* value higher than 250). δ* values of α-proteobacterial species with representatives of other major bacterial lineages range from 53.11 (P. aeruginosa) to 300.68 (C. jejuni).
Fig. 2. a) Scatterplot of the relationship between 16S rDNA sequence similarities and δ* values. See text for details. b) Scatterplot of the relationship between 16S rDNA sequence similarities and δ* values for the subset indicated in Fig. 2a. See text for details.
Fig. 3. Scatterplot of the relationship between DNA-DNA binding values and δ* values.
Table 3. Average 16S rDNA sequence similarities (lower triangle) and δ* values (upper triangle) of the taxa examined.
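Table 4. DNA-DNA hybridisation values and δ* values of the taxa examined.
Taxon pair | DNA-DNA hybridisation value | δ* value
X. campestris / X. campestris | 87.0 ± 7.0 (36) | –
X. campestris / X. axonopodis | 24.0 ± 6.0 (20) | 17.20
X. axonopodis / X. axonopodis | 77.0 ± 15.0 (215) | –
P. putida / P. putida | 62.3 ± 11.3 (11) | –
P. putida / P. aeruginosa | 12.3 ± 9.3 (10) | 85.15
P. aeruginosa / P. aeruginosa | 95.1 ± 4.4 (8) | –
Y. pestis / Y. pestis | 98.1 ± 1.8 (11) | 0.46
Y. pestis / Y. enterocolitica | 39.0 ± 13.5 (5) | 20.35 ± 0.14 (2)
Y. enterocolitica / Y. enterocolitica | 84.35 ± 7.9 (218) | –
Y. pestis / S. e. Typhi | – | 79.53 ± 0.41 (2)
Y. enterocolitica / S. e. Typhi | – | 77.51
S. e. Typhi / S. e. Typhi | – | –
Y. pestis / S. e. Typhimurium | 20.7 ± 2.7 (4) | 82.15 ± 0.03 (2)
Y. enterocolitica / S. e. Typhimurium | – | 80.4
S. e. Typhi / S. e. Typhimurium | 86.0 ± 2.8 (2) | 3.28
S. e. Typhimurium / S. e. Typhimurium | 94.0 ± 4.5 (4) | –
Y. pestis / E. coli | 20.5 ± 4.2 (4) | 57.62 ± 3.13 (8)
Y. enterocolitica / E. coli | – | 63.93 ± 2.45 (4)
S. e. Typhi / E. coli | 43.0 ± 4.0 (5) | 35.75 ± 4.42 (4)
S. e. Typhimurium / E. coli | 39.0 ± 6.0 (10) | 38.17 ± 3.87 (4)
E. coli / E. coli | 94.0 (1) | 5.31 ± 3.10 (6)
Y. pestis / S. flexneri | – | 60.36 ± 0.70 (2)
Y. enterocolitica / S. flexneri | – | 66.90
S. e. Typhi / S. flexneri | – | 38.64
S. e. Typhimurium / S. flexneri | 29.8 ± 14.6 (5) | 41.28
E. coli / S. flexneri | 86.1 ± 5.0 (14) | 8.27 ± 1.08 (4)
S. flexneri / S. flexneri | see note | –
B. pertussis / B. parapertussis | see note | 3.98
B. pertussis / B. bronchiseptica | see note | 10.93
B. parapertussis / B. bronchiseptica | see note | 8.8
H. pylori / H. pylori | 82.3 ± 5.0 (3) | 9.41
Note: the assignment of the following DNA-DNA hybridisation values to individual taxon pairs could not be recovered from the source layout. Bordetella (B. pertussis, B. parapertussis, B. bronchiseptica), in source order: 85.8 ± 8.6 (6), 74.7 ± 9.2 (10), 94.5 ± 7.8 (2), 73.5 ± 7.6 (15), 85.4 ± 9.5 (10). Unplaced values: 91.4 ± 4.4 (5) and 83.0 (1).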
Table 5. Average δ* values between major bacterial lineages.
Lineage | 1. | 2. | 3. | 4. | 5.
1. α-Proteobacteria | – | | | |
2. β-Proteobacteria | 195.38 ± 78.78 | – | | |
3. γ-Proteobacteria | 192.51 ± 61.29 | 135.17 ± 18.00 | – | |
4. ε-Proteobacteria | 190.25 ± 73.45 | 282.60 ± 23.66 | 211.04 ± 46.48 | – |
5. Gram-positive organisms | 160.47 ± 37.89 | 132.78 ± 60.29 | 150.97 ± 44.26 | 141.50 ± 89.05 | –
β-Proteobacteria
The genus Bordetella contains three named species, but DNA-DNA hybridisation studies have indicated that these belong to a single genomic species [21, 31, 65]. This is confirmed by their very similar relative dinucleotide abundance values (3.98 ≤ δ* ≤ 10.93).
ε-Proteobacteria
C. jejuni and H. pylori both belong to the ε-Proteobacteria; the δ* value between both species is 124.88. The ε-Proteobacteria are well-separated from the other Proteobacteria based on 16S rDNA sequence data [20] (Fig. 1) and this is confirmed by the high δ* values of both taxa with other Proteobacteria (Table 3). The whole-genome sequences of two H. pylori strains have been determined; their relative dinucleotide abundances are very similar (δ* = 9.41).
γ-Proteobacteria
All taxa belonging to the γ-Proteobacteria group together in the 16S rDNA-based phylogenetic tree (Fig. 1). δ* values between different species within the γ-Proteobacteria range from 17.20 to 419.71 (Table 3). Both representatives of the genus Xanthomonas group closely together in the phylogenetic tree and have similar relative dinucleotide abundance values (δ* = 17.20). DNA-DNA hybridisation values between X. campestris and X. axonopodis are low (Table 4). P. putida and P. aeruginosa also group together in the phylogenetic tree but the δ* value between these two taxa is higher (85.15). Representatives of both taxa showed no significant DNA-DNA hybridisation values (Table 4). Members of the family Enterobacteriaceae form a well-separated subcluster in the phylogenetic tree. This is confirmed by the observed δ* values between the different enterobacterial species, which are in the relatively narrow range of 20.35 to 82.15. Both S. enterica serovars clearly belong to the same species based on DNA-DNA hybridisation studies [14] (Table 4) and this is confirmed by their very similar dinucleotide relative abundance values. The same is true for E. coli and S. flexneri [10, 33, 49] (Table 4). Average δ* values between different E. coli and Y. pestis strains are 5.31 and 0.46, respectively (Table 4). The Buchnera species investigated group together in the 16S rDNA-based phylogenetic tree and have similar relative dinucleotide abundance values (δ* = 17.24). They are most closely related to the Enterobacteriaceae, albeit at a relatively low level of 16S rDNA sequence similarity (88.15–89.78%). This is reflected in high δ* values, ranging from 177.61 to 228.79 (Table 3). W. glossinidia is only distantly related to the other γ-Proteobacteria (16S rDNA sequence similarity < 87%). This is reflected by high δ* values (Table 3). The relatively low δ* values between W. glossinidia and both Buchnera species can be attributed to the overrepresentation of AT in these genomes.
Gram-positive organisms
The Gram-positive organisms S. aureus and S. pneumoniae can be clearly separated from the other organisms included in this study by their 16S rDNA sequences. Both organisms also differ from all other organisms in their relative dinucleotide abundances (δ* values higher than 100 and 83, respectively).
Discussion
Correlation between δ* values, 16S rDNA sequence similarity and DNA-DNA hybridisation experiments
Our data clearly show that, overall, there is only a low (albeit significant) correlation between δ* values and 16S rDNA sequence similarity. Nevertheless, there is an almost perfect linear correlation between both parameters for taxa that share more than 94% 16S rDNA sequence similarity (Fig. 2). A 16S rDNA sequence similarity of 97% roughly corresponds to a δ* value of 40. The overall correlation between δ* values and DNA-DNA hybridisation values is very high (Fig. 3). Data from the present study indicate that taxa which show more than 50% DNA-DNA binding do not have δ* values above 17, while taxa which show more than 70% DNA-DNA binding do not have δ* values above 11. In addition, taxa that show less than 30% DNA-DNA binding do not have δ* values below 40.
Application of relative dinucleotide abundance values in bacterial taxonomy
Based on 16S and 23S rDNA sequences, five major lines of descent are recognised within the Proteobacteria (designated α, β, γ, δ and ε, respectively) [20, 40, 62, 76]. The δ- and ε-Proteobacteria are clearly separated from
each other and from the other subgroups [20, 37, 71]. The separation of the β and γ subgroups is less clear and it has been suggested that the β-Proteobacteria are actually a subgroup of the γ-Proteobacteria [13, 20, 76]. The α-Proteobacteria are more closely related to the β- and γ-Proteobacteria than to the two other subgroups but clearly form a separate lineage [20, 71]. Based on the sequences of conserved macromolecules and protein signatures, the Proteobacteria are clearly different from other major prokaryotic lineages, including Gram-positive organisms [24, 76]. From the comparison of the average δ* values between representatives of the major groups (Table 5), it is apparent that the β- and γ-Proteobacteria are more similar to each other than to the α- and ε-Proteobacteria (P < 0.001) and that the α-Proteobacteria are more similar to the ε-Proteobacteria than are the β- (P < 0.01) or γ-Proteobacteria (P < 0.05). These data indicate that the α-Proteobacteria are not more similar to the β- or γ-Proteobacteria than to the ε-Proteobacteria, contradicting data previously obtained from the comparison of conserved macromolecules. Interestingly, these data also indicate that the β- and γ-Proteobacteria appear more closely related to the Gram-positive organisms than to the ε-Proteobacteria (P < 0.001); this is not the case for the α-Proteobacteria. However, it should be noted that the number of taxa investigated for the α- and β-Proteobacteria is relatively low and that this may influence the statistical analysis. Based on partial genome sequences, Karlin et al. [31, 32] observed some general trends; of these trends, the general underrepresentation of TA and the overrepresentation of GC in β-, γ- and ε-Proteobacteria were confirmed in the present study. The overrepresentation of AT previously observed in α-proteobacterial genomes is obvious in both Buchnera genomes and the W. glossinidia genome but is not seen in the C.
crescentus genome, and, in contrast to previous reports, dinucleotide extremes can be observed in the S. aureus genome (Table 2). At present there are organisms that have not been included in DNA-DNA hybridisation experiments because they cannot be cultured in a convenient way. These include members of the genus Rickettsia (obligately intracellular microorganisms belonging to the α-Proteobacteria [73]) and of the genera Buchnera and Wigglesworthia (obligate endosymbionts of aphids and tsetse flies, respectively, belonging to the γ-Proteobacteria) [2]. In the absence of DNA-DNA hybridisation values and biochemical characterisation, these organisms are classified according to the phylogenetic relationships deduced from the sequence analysis of conserved macromolecules. The whole-genome sequences of several of these intracellular parasites have become available over the past few years [1, 4, 45, 58, 64], allowing comparison of their dinucleotide relative abundance values (Table 2). Using the previously determined equation describing the relationship between DNA-DNA binding value and δ* value, and using the δ* values shown in Table 3, we estimate (i) a DNA-DNA binding value of approximately 58% between Buchnera sp. APS and B. aphidicola, (ii) a DNA-DNA binding value of approximately 13% between both Rickettsia species, and (iii) DNA-DNA binding values of approximately 25% and 30% between W. glossinidia and Buchnera sp. APS and B. aphidicola, respectively. This suggests that Buchnera species isolated as endosymbionts from different aphid species belong to the same genomic species [67]. In addition, our data confirm the results from a previous study based on 16S rDNA sequences in which it was shown that Wigglesworthia is different from Buchnera [2]. The taxonomy of the genus Rickettsia is at present mainly based on 16S rDNA sequence data, but the limited DNA-DNA hybridisation data available suggest that there are many synonymous species within this genus [53]. Our data indicate that at least the two major phylogenetic groups within the genus Rickettsia (the typhus group [including R. prowazekii] and the spotted fever group [including R. conorii]) belong to different genomic species as defined by DNA-DNA hybridisation criteria [67, 73]. Our data also clearly indicate that AT is highly overrepresented in the genomes of R. prowazekii, R. conorii, B. aphidicola, Buchnera sp. APS and W. glossinidia (Table 2), which may reflect an adaptation to life as endosymbionts of eukaryotic organisms.
Conclusions
Our data indicate that the overall correlation between genomic signature and 16S rDNA sequence similarity is low but significant, except for closely related organisms (16S rDNA sequence similarity > 94%), where the correlation is high. The correlation between genomic signature and DNA-DNA hybridisation values was high, and taxa that show less than 30% DNA-DNA binding will in general not have δ* values below 40. On the other hand, taxa showing more than 50% DNA-DNA binding will not have δ* values higher than 17. Statistical analysis of δ* values between different subgroups of the Proteobacteria indicates that the β- and γ-Proteobacteria are more closely related to each other than to the other subgroups of the Proteobacteria and that the α- and ε-Proteobacteria form clearly separate subgroups. We have also shown that the genomic signature can be used to predict DNA-DNA binding values when DNA-DNA hybridisation data cannot be obtained.
Acknowledgements
T.C. and P.V. are indebted to the Fund for Scientific Research – Flanders (Belgium) for a position as postdoctoral fellow and research grants, respectively. T.C. also acknowledges the support from the Belgian Federal Government (Federal Office for Scientific, Technical and Cultural Affairs).
References 1. Akman, L., Yamashita, A., Watanabe, H., Oshima, K., Shiba, T., Hattori, M., Aksoy, S.: Genome sequence of the endocellular obligate symbiont of tsetse flies, Wigglesworthia glossinidia. Nat. Genet. 32, 402–407 (2002). 2. Aksoy, S.: Wigglesworthia gen. nov. and Wigglesworthia glossinidia sp. nov., taxa consistsing of the mycetocyte-as-
Genomic Signature in Bacterial Taxonomy
3.
4.
5. 6.
7.
8. 9.
10. 11.
12. 13.
14. 15.
sociated, primary endosymbionts of tsetse flies. Int. J. Syst. Bacteriol. 45, 848–851 (1995). Alm R.A., Ling, L.S., Moir, D.T., King, B.L., Brown, E.D., Doig, P.C., Smith, D.R., Noonan, B., Guild, B.C., de Jonge, B.L., Carmel, G., Tummino, P.J., Caruso, A., Uria-Nickelsen, M., Mills, D.M., Ives, C., Gibson, R., Merberg, D., Mills, S.D., Jiang, Q., Taylor, D.E., Vovis, G.F., Trust, T.J.: Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397, 176–180 (1999). Andersson, S.G.E., Zomorodipour, A., Andersson, J.O., Sicheritz-Ponten, T., Alsmark, U.C.M., Podowski, R.M., Näslund, A.K., Eriksson, A.S., Winkler, H.H., Kurland, C.G.: The genome sequence of Rickettsia prowazekii and the origin of mitochondria. Nature 396, 133–140 (1998). Bansal, A.K., Meyer, T.E.: Evolutionary analysis by wholegenome comparisons. J. Bacteriol. 184, 2260–2272 (2002). Blattner F.R., Plunkett, G., Bloch, C.A., Perna, N.T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J.D., Rode, C.K., Mayhew, G.F., Gregor, J., Davis, N.W., Kirkpatrick, H.A., Goeden, M.A., Rose, D.J., Mau, B., Shao, Y.: The complete genome sequence of Escherichia coli K-12. Science 277, 1453–1474 (1997). Bercovier, H., Mollaret, H.H., Alonso, J.M., Brault, J., Fanning, G.R., Steigerwalt, A.G., Brenner, D.J.: Intra- and interspecies relatedness of Yersinia pestis by DNA hybridization and its relationship to Yersinia pseudotuberculosis. Curr. Microbiol. 4, 225–229 (1980). Burge, C., Campbell, A.M., Karlin, S.A.: Over- and underrepresentation of short oligonucleotides in DNA sequences. Proc. Natl. Acad. Sci. USA 89, 1358–1362 (1992). Brenner, D.J., Fanning, G.R., Johnson, K.E., Citarella, R.V., Falkow, S.: Polynucleotide sequence relationships among members of the Enterobacteriaceae. J. Bacteriol. 98, 637–650 (1969). Brenner, D.J., Fanning, G.R., Miklos, G.V., Steigerwalt, A.G.: Polynucleotide sequence relatedness among Shigella species. Int. J. Syst. Bacteriol. 
23, 1–7 (1973). Brenner, D.J., Ursing, J., Bercovier, H., Steigerwalt, A.G., Fanning, G., Alonso, J.M., Mollaret, H.H.: Deoxyribonucleic acid relatedness in Yersinia enterocolitica and Yersinia enterocolitica-like organisms. Curr. Microbiol. 4, 195–200 (1980). Campbell, A., Mrazek, J., Karlin, S.: Genome signature comparisons among prokaryote, plasmid, and mitochondrial DNA. Proc. Natl. Acad. Sci. USA 96, 9184–9189 (1999). Cho, J.C., Tiedje, J.M.: Bacterial species determination from DNA-DNA hybridization by using genome fragments and DNA microarrays. Appl. Environ. Microbiol. 67, 3677–3682 (2001). Crosa, J.H., Brenner, D.J., Ewing, W.H., Falkow, S.: Molecular relationships among the Salmonelleae. J. Bacteriol. 115, 307–315 (1973). da Silva, A.C., Ferro, J.A., Reinach, F.C., Farah, C.S., Furlan, L.R., Quaggio, R.B., Monteiro-Vitorello, C.B., Van Sluys, M.A., Almeida, N.F., Alves, L.M., do Amaral, A.M., Bertolini, M.C., Camargo, L.E., Camarotte, G., Cannavan, F., Cardozo, J., Chambergo, F., Ciapina, L.P., Cicarelli, R.M., Coutinho, L.L., Cursino-Santos, J.R., El-Dorry, H., Faria, J.B., Ferreira, A.J., Ferreira, R.C., Ferro, M.I., Formighieri, E.F., Franco, M.C., Greggio, C.C., Gruber, A., Katsuyama, A.M., Kishi, L.T., Leite, R.P., Lemos, E.G., Lemos, M.V., Locali, E.C., Machado, M.A., Madeira, A.M., Martinez-Rossi, N.M., Martins, E.C., Meidanis, J., Menck, C.F., Miyaki, C.Y., Moon, D.H., Moreira, L.M., Novo, M.T., Okura, V.K., Oliveira, M.C., Oliveira, V.R., Pereira, H.A., Rossi, A., Sena, J.A., Silva, C., de Souza, R.F., Spinola, L.A., Takita, M.A., Tamura, R.E., Teixeira, E.C., Tezza, R.I., Trindade dos Santos, M., Truffi, D., Tsai, S.M., White, F.F., Setubal, J.C., Kitajima, J.P.: Comparison of the genomes of two Xanthomonas pathogens with differing host specificities. Nature 417, 459–463 (2002). 16. Deng, W., Burland, V., Plunkett, G., Boutin, A., Mayhew, G.F., Liss, P., Perna, N.T., Rose, D.J., Mau, B., Zhou, S., Schwartz, D.C., Fetherston, J.D., Lindler, L.E., Brubaker, R.R., Plano, G.V., Straley, S.C., McDonough, K.A., Nilles, M.L., Matson, J.S., Blattner, F.R., Perry, R.D.: Genome sequence of Yersinia pestis KIM. J. Bacteriol. 184, 4601–4611 (2002). 17. Feng, D.F., Cho, G., Doolittle, R.F.: Determining divergence times with a protein clock: update and reevaluation. Proc. Natl. Acad. Sci. USA 94, 13028–13033 (1997). 18. Fitz-Gibbon, S., House, C.H.: Whole genome-based phylogenetic analysis of free-living microorganisms. Nucl. Acids Res. 27, 4218–4222 (1999). 19. Fox, G.E., Wisotzkey, J.D., Jurtshuk, P.: How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity. Int. J. Syst. Bacteriol. 42, 166–170 (1992). 20. Garrity, G.M., Holt, J.G.: The road map to the Manual, pp. 119–141. In: D.R. Boone and R.W. Castenholz (ed.), Bergey’s manual of systematic bacteriology 2nd edition, vol 1. Springer Verlag, New York, NY 2001. 21. Gerlach, G., von Wintzingerode, F., Middendorf, B., Gross, R.: Evolutionary trends in the genus Bordetella. Microb. Infect. 3, 61–72 (2001). 22. Goodfellow, M., O’Donnell, A.G.: Roots of bacterial systematics, pp. 3–54. In: M. Goodfellow and A.G. O’Donnell (ed.), Handbook of new bacterial systematics. Academic Press, London, UK 1993. 23. Goris, J., Suzuki, K., De Vos, P., Nakase, T., Kersters, K.: Evaluation of a microplate DNA-DNA hybridisation method compared with the initial renaturation method. Can. J. Microbiol. 44, 1148–1153 (1998).
24. Gupta, R.S.: Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among Archaebacteria, Eubacteria and Eukaryotes. Microbiol. Mol. Biol. Rev. 62, 1435–1491 (1998). 25. Gupta, R.S.: The branching order and phylogenetic placement of species from completed bacterial genomes, based on conserved indels found in various proteins. Int. Microbiol. 4, 187–202 (2001). 26. Gupta, R.S., Griffiths, E.: Critical issues in bacterial phylogeny. Theor. Popul. Biol. 61, 423–434 (2002). 27. Harrington, C.S., On, S.L.W.: Extensive 16S ribosomal RNA gene sequence diversity in Campylobacter hyointestinalis strains: taxonomic and applied implications. Int. J. Syst. Bacteriol. 49, 1171–1175 (1999). 28. Hayashi, T., Makino, K., Ohnishi, M., Kurokawa, K., Ishii, K., Yokoyama, K., Han, C.G., Ohtsubo, E., Nakayama, K., Murata, T., Tanaka, M., Tobe, T., Iida, T., Takami, H., Honda, T., Sasakawa, C., Ogasawara, N., Yasunaga, T., Kuhara, S., Shiba, T., Hattori, M., Shinagawa, H.: Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res. 8, 11–22 (2001). 29. Hüss, V.A.R., Festl, H., Schleifer, K.H.: Studies on the spectrophotometric determination of DNA hybridisation from renaturation rates. System. Appl. Microbiol. 4, 184–192 (1983). 30. Huynen, M.A., Bork, P.: Measuring genome evolution. Proc. Natl. Acad. Sci. USA 95, 5849–5856 (1998). 31. Karlin, S., Mrazek, J., Campbell, A.M.: Compositional biases of bacterial genomes and evolutionary implications. J. Bacteriol. 179, 3899–3913 (1997).
32. Karlin, S., Campbell, A.M., Mrazek, J.: Comparative DNA analysis across diverse genomes. Annu. Rev. Genet. 32, 185–225 (1998). 33. Keswani, J., Whitman, W.B.: Relationship of 16S rRNA sequence similarity to DNA hybridisation in prokaryotes. Int. J. Syst. Evol. Microbiol. 51, 667–678 (2001). 34. Kloos, W.E., Mohapatra, N., Dobrogosz, W.J., Ezell, J.W., Manclark, C.R.: Deoxyribonucleotide sequence relationships among Bordetella species. Int. J. Syst. Bacteriol. 31, 173–176 (1981). 35. Kuroda, M., Ohta, T., Uchiyama, I., Baba, T., Yuzawa, H., Kobayashi, I., Cui, L., Oguchi, A., Aoki, K., Nagai, Y., Lian, J., Ito, T., Kanamori, M., Matsumaru, H., Maruyama, A., Murakami, H., Hosoyama, A., Mizutani-Ui, Y., Takahashi, N.K., Sawano, T., Inoue, R., Kaito, C., Sekimizu, K., Hirakawa, H., Kuhara, S., Goto, S., Yabuzaki, J., Kanehisa, M., Yamashita, A., Oshima, K., Furuya, K., Oshino, C., Shiba, T., Hattori, M., Ogasawara, N., Hayashi, H., Hiramatsu, K.: Whole genome sequencing of meticillin-resistant Staphylococcus aureus. Lancet. 357, 1225–1240 (2001). 36. Lan, R., Reeves, P.R.: Escherichia coli in disguise : molecular origins of Shigella. Microb. Infect. 4, 1125–1132 (2002). 37. Ludwig, W., Klenk, H.P.: Overview: a phylogenetic backbone and taxonomic framework for procaryotic systematics, pp. 49–65. In: D.R. Boone and R.W. Castenholz (ed.), Bergey’s manual of systematic bacteriology 2nd edition, vol 1. Springer Verlag, New York, NY, 2001. 38. Jin, Q., Yuan, Z., Xu, J., Wang, Y., Shen, Y., Lu, W., Wang, J., Liu, H., Yang, J., Yang, F., Zhang, X., Zhang, J., Yang, G., Wu, H., Qu, D., Dong, J., Sun, L., Xue, Y., Zhao, A., Gao, Y., Zhu, J., Kan, B., Ding, K., Chen, S., Cheng, H., Yao, Z., He, B., Chen, R., Ma, D., Qiang, B., Wen, Y., Hou, Y., Yu, J.: Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157. Nucl. Acids Res. 30, 4432–4441 (2002). 39. 
McClelland, M., Sanderson, K.E., Spieth, J., Clifton, S.W., Latreille, P., Courtney, L., Porwollik, S., Ali, J., Dante, M., Du, F., Hou, S., Layman, D., Leonard, S., Nguyen, C., Scott, K., Holmes, A., Grewal, N., Mulvaney, E., Ryan, E., Sun, H., Florea, L., Miller, W., Stoneking, T., Nhan, M., Waterston, R., Wilson, R.K.: Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature 413, 852–856 (2001). 40. Murray, R.G.E., Brenner, D.J., Colwell, R.R., De Vos, P., Goodfellow, M., Grimont, P.A.D., Pfennig, N., Stackebrandt, E., Zavarzin, G.A.: Report of the ad hoc committee on approaches to taxonomy within the Proteobacteria. Int. J. Syst. Bacteriol. 40, 213–215 (1990). 41. Murray, A.E., Lies, D., Li, G., Nealson, K., Zhou, J., Tiedje, J.M.: DNA/DNA hybridization to microarrays reveals gene-specific differences between closely related microbial genomes. Proc. Natl. Acad. Sci. USA 98, 9853–9858 (2001). 42. Nelson, K.E., Weinel, C., Paulsen, L.T., Dodson, R.J., Hilbert, H., Martins Dos Santos, V.A., Fouts, D.E., Gill, S.R., Pop, M., Holmes, M., Brinkac, L., Beanan, M., DeBoy, R.T., Daugherty, S., Kolonay, J., Madupu, R., Nelson, W., White, O., Peterson, J., Khouri, H., Hance, I., Lee, P.C., Holtzapple, E., Scanlan, D., Tran, K., Moazzez, A., Utterback, T., Rizzo, M., Lee, K., Kosack, D., Moestl, D., Wedler, H., Lauber, J., Stjepandic, D., Hoheisel, J., Straetz, M., Heim, S., Kiewitz, C., Eisen, J., Timmis, K.N., Dusterhoft, A., Tümmler, B., Fraser, C.M.: Complete genome sequence and comparative analysis of the metabolically versatile Pseudomonas putida KT2440. Environ. Microbiol. 4, 799–808 (2002). 43. Neubauer, H., Aleksic, S., Hensel, A., Finke, E.J., Meyer, H.: Yersinia enterocolitica 16S rRNA gene types belong to the same genospecies but form three homology groups. Int. J. Med. Microbiol. 290, 61–64 (2000). 44. Nierman, W.C., Feldblyum, T.V., Laub, M.T., Paulsen, I.T., Nelson, K.E., Eisen, J.A., Heidelberg, J.F., Alley, M.R., Ohta, N., Maddock, J.R., Potocka, I., Nelson, W.C., Newton, A., Stephens, C., Phadke, N.D., Ely, B., DeBoy, R.T., Dodson, R.J., Durkin, A.S., Gwinn, M.L., Haft, D.H., Kolonay, J.F., Smit, J., Craven, M.B., Khouri, H., Shetty, J., Berry, K., Utterback, T., Tran, K., Wolf, A., Vamathevan, J., Ermolaeva, M., White, O., Salzberg, S.L., Venter, J.C., Shapiro, L., Fraser, C.M.: Complete genome sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98, 4136–4141 (2001). 45. Ogata, H., Audic, S., Renesto-Audiffren, P., Fournier, P.E., Barbe, V., Samson, D., Roux, V., Cossart, P., Weissenbach, J., Claverie, J.M., Raoult, D.: Mechanisms of evolution in Rickettsia conorii and R. prowazekii. Science 293, 2093–2098 (2001). 46. Palleroni, N.J., Ballard, R.W., Ralston, E., Doudoroff, M.: Deoxyribonucleic acid homologies among some Pseudomonas species. J. Bacteriol. 110, 1–11 (1972). 47. Palleroni, N.J., Kunisawa, R., Contopoulou, R., Doudoroff, M.: Nucleic acid homologies in the genus Pseudomonas. Int. J. Syst. Bacteriol. 23, 333–339 (1973).
48. Parkhill, J., Dougan, G., James, K.D., Thomson, N.R., Pickard, D., Wain, J., Churcher, C., Mungall, K.L., Bentley, S.D., Holden, M.T., Sebaihia, M., Baker, S., Basham, D., Brooks, K., Chillingworth, T., Connerton, P., Cronin, A., Davis, P., Davies, R.M., Dowd, L., White, N., Farrar, J., Feltwell, T., Hamlin, N., Haque, A., Hien, T.T., Holroyd, S., Jagels, K., Krogh, A., Larsen, T.S., Leather, S., Moule, S., O’Gaora, P., Parry, C., Quail, M., Rutherford, K., Simmonds, M., Skelton, J., Stevens, K., Whitehead, S., Barrell, B.G.: Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413, 848–852 (2001). 49. Parkhill, J., Wren, B.W., Mungall, K., Ketley, J.M., Churcher, C., Basham, D., Chillingworth, T., Davies, R.M., Feltwell, T., Holroyd, S., Jagels, K., Karlyshev, A.V., Moule, S., Pallen, M.J., Penn, C.W., Quail, M.A., Rajandream, M.A., Rutherford, K.M., van Vliet, A.H., Whitehead, S., Barrell, B.G.: The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences. Nature 403, 665–668 (2000). 50. Parkhill, J., Wren, B.W., Thomson, N.R., Titball, R.W., Holden, M.T., Prentice, M.B., Sebaihia, M., James, K.D., Churcher, C., Mungall, K.L., Baker, S., Basham, D., Bentley, S.D., Brooks, K., Cerdeno-Tarraga, A.M., Chillingworth, T., Cronin, A., Davies, R.M., Davis, P., Dougan, G., Feltwell, T., Hamlin, N., Holroyd, S., Jagels, K., Karlyshev, A.V., Leather, S., Moule, S., Oyston, P.C., Quail, M., Rutherford, K., Simmonds, M., Skelton, J., Stevens, K., Whitehead, S., Barrell, B.G.: Genome sequence of Yersinia pestis, the causative agent of plague. Nature 413, 523–527 (2001).
51. Perna, N.T., Plunkett, G., Burland, V., Mau, B., Glasner, J.D., Rose, D.J., Mayhew, G.F., Evans, P.S., Gregor, J., Kirkpatrick, H.A., Posfai, G., Hackett, J., Klink, S., Boutin, A., Shao, Y., Miller, L., Grotbeck, E.J., Davis, N.W., Lim, A., Dimalanta, E.T., Potamousis, K.D., Apodaca, J., Anantharaman, T.S., Lin, J., Yen, G., Schwartz, D.C., Welch, R.A., Blattner, F.R.: Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409, 529–533 (2001).
52. Pupo, G.M., Lan, R., Reeves, P.R.: Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics. Proc. Natl. Acad. Sci. USA 97, 10567–10572 (2000). 53. Rivera, M.C., Jain, R., Moore, J.E., Lake, J.A.: Genomic evidence for two functionally distinct gene classes. Proc. Natl. Acad. Sci. USA 95, 6239–6244 (1998). 54. Roux, V., Rydkina, E., Eremeeva, M., Raoult, D.: Citrate synthase gene comparison, a new tool for phylogenetic analysis, and its application for the Rickettsiae. Int. J. Syst. Bacteriol. 47, 252–261 (1997). 55. Rossello-Mora, R., Amann, R.: The species concept for prokaryotes. FEMS Microbiol. Rev. 25, 39–67 (2001). 56. Rutherford, K., Parkhill, J., Crook, J., Horsnell, T., Rice, P., Rajandream, M.A., Barrell, B.G.: Artemis: sequence visualisation and annotation. Bioinformatics 16, 944–945 (2000). 57. Saitou, N., Nei, M.: The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987). 58. Shigenobu, S., Watanabe, H., Hattori, M., Sakaki, Y., Ishikawa, H.: Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407, 81–86 (2000). 59. Snel, B., Bork, P., Huynen, M.A.: Genome phylogeny based on gene content. Nat. Genet. 21, 108–110 (1999). 60. Stackebrandt, E., Frederiksen, W., Garrity, G.M., Grimont, P.A., Kampfer, P., Maiden, M.C., Nesme, X., Rossello-Mora, R., Swings, J., Trüper, H.G., Vauterin, L., Ward, A.C., Whitman, W.B.: Report of the ad hoc committee for the re-evaluation of the species definition in bacteriology. Int. J. Syst. Evol. Microbiol. 52, 1043–1047 (2002). 61. Stackebrandt, E., Goebel, B.M.: Taxonomic note: a place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int. J. Syst. Bacteriol. 44, 846–849 (1994). 62. Stackebrandt, E., Murray, R.G.E., Trüper, H.G.: Proteobacteria classis nov., a name for the phylogenetic taxon that includes the “purple bacteria and their relatives”. Int. J. Syst. Bacteriol. 38, 321–325 (1988). 63. Stover, C.K., Pham, X.Q., Erwin, A.L., Mizoguchi, S.D., Warrener, P., Hickey, M.J., Brinkman, F.S., Hufnagle, W.O., Kowalik, D.J., Lagrou, M., Garber, R.L., Goltry, L., Tolentino, E., Westbrock-Wadman, S., Yuan, Y., Brody, L.L., Coulter, S.N., Folger, K.R., Kas, A., Larbig, K., Lim, R., Smith, K., Spencer, D., Wong, G.K., Wu, Z., Paulsen, I.T., Reizer, J., Saier, M.H., Hancock, R.E., Lory, S., Olson, M.V.: Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 406, 959–964 (2000). 64. Tamas, I., Klasson, L., Canback, B., Naslund, A.K., Eriksson, A.S., Wernegreen, J.J., Sandstrom, J.P., Moran, N.A., Andersson, S.G.: 50 million years of genomic stasis in endosymbiotic bacteria. Science 296, 2376–2379 (2002). 65. Tettelin, H., Nelson, K.E., Paulsen, I.T., Eisen, J.A., Read, T.D., Peterson, S., Heidelberg, J., DeBoy, R.T., Haft, D.H., Dodson, R.J., Durkin, A.S., Gwinn, M., Kolonay, J.F., Nelson, W.C., Peterson, J.D., Umayam, L.A., White, O., Salzberg, S.L., Lewis, M.R., Radune, D., Holtzapple, E., Khouri, H., Wolf, A.M., Utterback, T.R., Hansen, C.L., McDonald, L.A., Feldblyum, T.V., Angiuoli, S., Dickinson, T., Hickey, E.K., Holt, I.E., Loftus, B.J., Yang, F., Smith, H.O., Venter, J.C., Dougherty, B.A., Morrison, D.A., Hollingshead, S.K., Fraser, C.M.: Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293, 498–506 (2001).
66. Tomb, J.F., White, O., Kerlavage, A.R., Clayton, R.A., Sutton, G.G., Fleischmann, R.D., Ketchum, K.A., Klenk, H.P., Gill, S., Dougherty, B.A., Nelson, K., Quackenbush, J., Zhou, L., Kirkness, E.F., Peterson, S., Loftus, B., Richardson, D., Dodson, R., Khalak, H.G., Glodek, A., McKenney, K., Fitzegerald, L.M., Lee, N., Adams, M.D., Hickey, E.K., Berg, D.E., Gocayne, J.D., Utterback, T.R., Peterson, J.D., Kelley, J.M., Cotton, M.D., Weidman, J.M., Fuji, C., Bowman, C., Watthey, L., Walin, E., Hayes, W.S., Borodovsky, M., Karp, P.D., Smith, H.O., Fraser, C.M., Venter, J.C.: The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388, 539–547 (1997). 67. Ursing, J.B., Rossello-Mora, R.A., Garcia-Valdes, E., Lalucat, J.: Taxonomic note: a pragmatic approach to the nomenclature of phenotypically similar genomic groups. Int. J. Syst. Bacteriol. 45, 604 (1995). 68. Wayne, L.G., Brenner, D.J., Colwell, R.R., Grimont, P.A.D., Kandler, O., Krichevsky, M.I., Moore, L.H., Moore, W.E.C., Murray, R.G.E., Stackebrandt, E., Starr, M.P., Trüper, H.G.: Report of the ad hoc committee on reconciliation of approaches to bacterial systematics. Int. J. Syst. Bacteriol. 37, 463–464 (1987). 69. Weiss, E., Moulder, J.W.: Order I. Rickettsiales Gieszczkiewickz 1939, 25AL, pp. 687–701. In: N.R. Krieg and J.G. Holt (ed.), Bergey’s manual of systematic bacteriology, vol. 1. The Williams and Wilkins Co., Baltimore, Md. 1984. 70. Welch, R.A., Burland, V., Plunkett, G., Redford, P., Roesch, P., Rasko, D., Buckles, E.L., Liou, S.R., Boutin, A., Hackett, J., Stroud, D., Mayhew, G.F., Rose, D.J., Zhou, S., Schwartz, D.C., Perna, N.T., Mobley, H.L., Donnenberg, M.S., Blattner, F.R.: Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc. Natl. Acad. Sci. USA 99, 17020–17024 (2002). 71. Vancanneyt, M., Vandamme, P., Kersters, K.: Differentiation of Bordetella pertussis, B. parapertussis and B. 
bronchiseptica by whole-cell protein electrophoresis and fatty acid analysis. Int. J. Syst. Bacteriol. 45, 843–847 (1995). 72. Vandamme, P., Harrington, C.S., Jalava, K., On, S.L.W.: Misidentifying helicobacters: the Helicobacter cinaedi example. J. Clin. Microbiol. 38, 2261–2266 (2000). 73. Vandamme, P., Pot, B., Gillis, M., De Vos, P., Kersters, K., Swings, J.: Polyphasic taxonomy, a consensus approach to bacterial systematics. Microbiol. Rev. 60, 407–438 (1996). 74. Van de Peer, Y., Neefs, J.M., De Rijk, P., De Vos, P., De Wachter, R.: About the order of divergence of the major bacterial taxa during evolution. Syst. Appl. Microbiol. 17, 32–38 (1994). 75. Vauterin, L., Hoste, B., Kersters, K., Swings, J.: Reclassification of Xanthomonas. Int. J. Syst. Bacteriol. 45, 472–489 (1995). 76. Woese, C.R.: Bacterial evolution. Microbiol. Rev. 51, 221–271 (1987). 77. Yoshimura, H.H., Evans, D.G., Graham, D.Y.: DNA-DNA hybridization demonstrates apparent genetic differences between Helicobacter pylori from patients with duodenal ulcer and asymptomatic gastritis. Dig. Dis. Sci. 38, 1128–1131 (1993).
Corresponding author: Tom Coenye, Laboratorium voor Microbiologie, Ghent University, K.L. Ledeganckstraat 35, B-9000 Gent, Belgium. Tel.: ++32 9 264 5114; Fax: ++32 9 264 5092