JOURNAL OF BACTERIOLOGY, Jan. 1997, p. 345–357 0021-9193/97/$04.00+0 Copyright © 1997, American Society for Microbiology
Vol. 179, No. 2
Sequencing of Heat Shock Protein 70 (DnaK) Homologs from Deinococcus proteolyticus and Thermomicrobium roseum and Their Integration in a Protein-Based Phylogeny of Prokaryotes
RADHEY S. GUPTA,* KEVIN BUSTARD, MIZIED FALAH, AND DAVINDRA SINGH
Department of Biochemistry, McMaster University, Hamilton, Ontario, Canada L8N 3Z5
Received 23 May 1996/Accepted 1 November 1996
The 70-kDa heat shock protein (hsp70) sequences define one of the most conserved proteins known to date. The hsp70 genes from Deinococcus proteolyticus and Thermomicrobium roseum, which were chosen as representatives of two of the most deeply branching divisions in the 16S rRNA trees, were cloned and sequenced. hsp70 from both these species as well as Thermus aquaticus contained a large insert in the N-terminal quadrant, which has been observed before as a unique characteristic of gram-negative eubacteria and eukaryotes and is not found in any gram-positive bacteria or archaebacteria. Phylogenetic analysis of hsp70 sequences shows that all of the gram-negative eubacterial species examined to date (which include members of the genera Deinococcus and Thermus, green nonsulfur bacteria, cyanobacteria, chlamydiae, spirochetes, and α-, β-, and γ-subdivisions of proteobacteria) form a monophyletic group (excluding eukaryotic homologs, which are derived from this group via endosymbiotic means) strongly supported by the bootstrap scores. A closer affinity of the Deinococcus and Thermus species to the cyanobacteria than to the other available gram-negative sequences is also observed in the present work. In the hsp70 trees, D. proteolyticus and T. aquaticus were found to be the most deeply branching species within the gram-negative eubacteria. The hsp70 homologs from gram-positive bacteria branched separately from gram-negative bacteria and exhibited a closer relationship to and shared sequence signatures with the archaebacteria. A polyphyletic branching of archaebacteria within gram-positive bacteria is strongly favored by different phylogenetic methods. These observations differ from the rRNA-based phylogenies where both gram-negative and gram-positive species are indicated to be polyphyletic. 
While it remains unclear whether parts of the genome may have variant evolutionary histories, these results call into question the general validity of the currently favored three-domain dogma.
The classification of prokaryotes has posed a major challenge for biologists for centuries (see references 71 and 79). Given their enormous diversity, which dates back to the beginning of life, the classical taxonomical criteria based on morphology and physiology have generally proven inadequate to satisfactorily define their evolutionary relationships (86). However, in the past few decades, the ability to clone and sequence nucleic acids as well as their products has led to the field of molecular phylogeny, in which the evolutionary relationships among organisms are derived from the extent of similarity in the primary sequences of either protein or nucleic acid homologs (55, 84, 85, 89). These techniques have now become the preferred criteria for bacterial classification (1, 3, 5, 7, 8, 16–18, 22, 40, 44, 55, 56, 75, 77, 78, 80, 81, 84, 85). Most extensive studies with regard to the molecular classification of bacterial species have been based on the small-subunit rRNA, or 16S rRNA (22, 56, 84). These studies have led to the concept of division of prokaryotes into two distinct monophyletic groups or domains, designated archaebacteria (or Archaea) and eubacteria (or Bacteria) (see reference 87). Within the eubacterial domain, phylogeny based on 16S rRNA has recognized several main divisions. These include members of the Thermotogales, green nonsulfur bacteria, deinococci, cyanobacteria, low-G+C and high-G+C gram-positive bacteria, flavobacteria and bacteroids, green sulfur bacteria, spirochetes, chlamydiae and planctomycetes, and purple bacteria or proteobacteria (22, 56, 84, 85). However, it should be noted that the current bacterial phylogenetic classification at the genus level and above is based solely or mainly on 16S rRNA sequence analysis (only a small part of the genome), and unless a similar evolutionary relationship is supported by other gene sequences, the inferences derived from it may not necessarily represent the true species phylogeny. We are using the sequence data for the 70-kDa heat shock protein (hsp70) to examine the evolutionary relationships between various representative taxa because it is an essential protein and its sequences are highly conserved among various species (8, 30, 37, 49). In fact, it is the most conserved protein known to date that is found in all biota (30); thus, this molecule seems ideally suited for examination of the deep evolutionary relationships among species that go back several billion years to the beginning of life. Our recent studies based on hsp70 have indicated important differences from the phylogenetic inferences based on rRNA phylogenies. The sequence features and phylogenies based on hsp70 sequences suggest a close evolutionary relationship between gram-positive bacteria and the archaebacteria (29–34). A similar unexpected relationship derives from a number of other gene and protein sequences (5, 11, 25, 70, 75, 83). To examine further the evolutionary relationship within the prokaryotes, we now report the cloning and sequencing of the hsp70 genes from two species, viz., Deinococcus proteolyticus and Thermomicrobium roseum, which are members of the Deinococcus and green nonsulfur groups of bacteria, respectively. In the phylogenetic trees based on rRNA, these groups are among the earliest-branching lineages within the eubacterial domain, branching much earlier than various gram-positive bacteria (10, 38, 50, 56, 60, 84, 85). Sequence features and phylogenetic analyses of the hsp70 sequences from these species, as well as from Thermus aquaticus,
* Corresponding author.
GUPTA ET AL.
J. BACTERIOL.
FIG. 1. Southern blot analysis of D. proteolyticus and T. roseum DNAs hybridized to the cloned hsp70 probes specific for these species. (a) D. proteolyticus genomic DNA digested with ApaI (lanes 1 and 1′), ClaI (lanes 2 and 2′), and KpnI (lanes 3 and 3′). The 7.0-kb KpnI fragment (arrowhead) was cloned. (b) T. roseum genomic DNA digested with PstI (lanes 1 and 1′) and NcoI (lanes 2 and 2′). Lanes 1 to 3 show the ethidium bromide-stained gels, whereas lanes 1′, 2′, and 3′ are from the autoradiographs of the Southern blots. Lanes L, DNA size ladders.
which is related to the Deinococcus group, presented here confirm that all of these species belong to the gram-negative group of bacteria. Furthermore, unlike rRNA-based phylogenies, which indicate that both gram-negative and gram-positive eubacteria are polyphyletic, in the hsp70 tree all of the gram-negative bacteria formed a phylogenetically coherent group that is completely distinct from the gram-positive bacteria. Some implications of these results regarding the evolutionary relationships among prokaryotic species are discussed.
MATERIALS AND METHODS
Bacterial strains. D. proteolyticus (ATCC 35074; type strain) and T. roseum (ATCC 27502; type strain) were purchased from the American Type Culture Collection (ATCC), Rockville, Md. The D. proteolyticus cells were grown at 30°C in tryptone-glucose-yeast extract-methionine (medium 679; ATCC), and the T. roseum cells were grown at 70°C in the thermobacterium medium (medium 655; ATCC). High-molecular-weight DNA from these species was prepared by standard procedures (82).
PCR amplification. Degenerate oligonucleotide primers in opposite orientations were synthesized for two conserved regions in the hsp70 family of proteins (QATKDAG and NPDEAVA). These primers have been used successfully in the past to clone hsp70 genes from a number of different archaebacterial, eubacterial, and eukaryotic species (18, 24, 32–34). PCR amplification using the above set of primers and either D. proteolyticus or T. roseum genomic DNA was carried out as previously described (24, 32). The amplified DNA fragments of the expected size were purified from agarose gels and subcloned in the plasmid vector pCR1000 by using a TA cloning kit (Invitrogen Corp., San Diego, Calif.). The DNA inserts from a number of colonies were sequenced to determine their relatedness to a consensus hsp70 sequence. A BLAST search on the sequence was performed to verify that the amplified hsp70 gene fragments were indeed from a novel source. 
These cloned fragments were used to make [32P]dATP-radiolabeled probes for use in Southern blot and colony screening experiments.
hsp70 cloning. D. proteolyticus and T. roseum genomic DNAs were digested with several restriction enzymes and run on separate 0.8% agarose gels. After the DNA was transferred to nylon membranes, the blots were hybridized to the [32P]dATP-labeled hsp70-specific probe from the same species. Based on the results of the Southern blots, regions containing the appropriate size range of DNA fragments were excised, and a library of the fragments was made in a plasmid vector. After transformation of Escherichia coli cells, the colonies were screened by hybridization to the hsp70-specific probe and positive clones were identified. After nested deletions were made, both strands of the cloned fragments were sequenced with a T7 sequencing kit (Pharmacia).
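As an illustration of why the primers for the two conserved regions (QATKDAG and NPDEAVA) must be degenerate, the sketch below counts how many distinct DNA sequences could encode each peptide under the standard genetic code. The primer sequences actually synthesized are not given in this section, so full back-translation of all seven residues is an assumption made only for illustration; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_for(residue):
    """Return every codon that encodes the given one-letter amino acid."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == residue]

def primer_degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    total = 1
    for residue in peptide:
        total *= len(codons_for(residue))
    return total

# Assumption: the whole 7-residue motif is back-translated.
for motif in ("QATKDAG", "NPDEAVA"):
    print(motif, primer_degeneracy(motif))
```

In practice, degeneracy is usually reduced by shortening the primer or by using codon preferences of the target organism; neither choice is documented here.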
Phylogenetic analysis. The nucleotide sequences for the D. proteolyticus and T. roseum hsp70 genes were translated and added to a multiple alignment of hsp70 protein sequences representing different groups of species or homologs. The species names, accession numbers, and references for various hsp70 sequences used in this work are given in Table 1. The overall alignment of sequences was similar to that described earlier (18, 32, 34). The phylogenetic analysis was carried out with the sequence region corresponding to that between amino acid 4 (beginning with VIGID) and amino acid 532 (i.e., ending with RNQAE) in the T. roseum sequence, which could be aligned with minimum ambiguity in all hsp70 homologs (see references 30 and 33). The alignment contained a consensus length of 531 amino acids. This sequence alignment was analyzed by two phylogenetic methods, neighbor joining (NEIGHBOR) and maximum parsimony (PROTPARS), which are part of the PHYLIP 3.5 program package (20, 65). For neighbor-joining analysis, the alignment was bootstrapped 1,000 times, using the BOOT program to determine the confidence level of different branching orders (19). The CONSENSE program was used to generate the consensus phylogenetic tree from the bootstrapped results. The maximum parsimony analysis was initially carried out to determine the most parsimonious tree(s) possible for the sequences. This alignment was also subjected to 100 bootstraps using the SEQBOOT program, and a consensus parsimony tree was constructed.
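The bootstrapping step described above can be sketched as follows. This is a minimal illustration of column resampling with replacement, not the PHYLIP implementation; the function names, the seed handling, and the toy alignment are assumptions introduced here.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping all rows in step."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][i] for i in columns) for name in names}

def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Return a list of pseudoreplicate alignments."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]

# Toy alignment (made-up sequences, not taken from Table 1).
toy = {
    "taxon_a": "VIGIDLGTTN",
    "taxon_b": "VIGIDLGTTY",
    "taxon_c": "AIGIDLGTTN",
}
replicates = bootstrap_replicates(toy, n_replicates=5, seed=1)
for rep in replicates:
    print(rep)
```

Each pseudoreplicate would then be passed to a tree-building method, and the resulting trees summarized in a consensus tree.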
RESULTS
Cloning of the hsp70 homologs. When PCR was carried out with T. roseum or D. proteolyticus DNA and the set of degenerate oligonucleotide primers for two highly conserved regions of hsp70, strong amplification of an expected 0.65-kb fragment was observed (results not shown). The amplified fragments were subcloned and sequenced, and after it was confirmed that they were hsp70 related and novel, they were employed as probes for isolating the genomic clones. In Southern blots of D. proteolyticus genomic DNA digested with different restriction enzymes, the radiolabeled probe hybridized to single fragments in the size range of 3.5 to 12 kb (Fig. 1a). Specific hybridization to a fragment of about 7.0 kb was observed for KpnI-digested DNA (Fig. 1a, lanes 3 and 3′), and this fragment was subcloned. Restriction digest analysis and sequencing of the ends of the fragment indicated that the sequence corresponding to hsp70 was present at the 5′ end of the insert. This region of the insert was completely sequenced. It contained an open reading frame of 1,854 bp that corre-
VOL. 179, 1997
hsp70-BASED PHYLOGENY OF PROKARYOTES
TABLE 1. Species names and accession numbers of hsp70 sequences a
Group and species
Archaebacteria Halobacterium marismortui Halobacterium cutirubrum Thermoplasma acidophilum Methanosarcina mazei Gram positive High-G+C group Mycobacterium leprae Mycobacterium tuberculosis Mycobacterium paratuberculosis Streptomyces coelicolor Streptomyces griseus Low-G+C group Bacillus subtilis Bacillus megaterium Clostridium acetobutylicum Clostridium perfringens Erysipelothrix rhusiopathiae Lactococcus lactis Mycoplasma genitalium Staphylococcus aureus
Accession no.b
Type strain
SP/Q01100 GB/L35530 GB/L35529 SP/P27094
32 33 33 46
T T
48 GenBank, unpublished 73 12 35
SP/P17820 SP/P05646 SP/P30721 SP/P26823 GB/M98865 GB/X75428 GB/U39711 GB/D30690
T T T T T T T T
36 74 52 24 59 41 23 54
Proteobacteria (type) Escherichia coli (γ) Haemophilus ducreyi (γ) Haemophilus influenzae (γ) Haemophilus actinomycetemcomitans (γ) Francisella tularensis (γ) Pseudomonas cepacia (β) Rhizobium meliloti (α) Caulobacter crescentus (α) Brucella ovis (α) Agrobacterium tumefaciens (α)
SP/P04475 GB/U25996 1169375 1542849 1352284 GB/L36603 GB/L36602 SP/P20442 GB/M95799 1027504
T T T T T T T T T T
4 GenBank, unpublished 21 GenBank, unpublished 88 18 18 26 14 67
Chlamydiae and spirochetes Borrelia burgdorferi Chlamydia trachomatis Chlamydia pneumoniae
SP/P28608 SP/P17821 SP/P27542
T
2 6 45
Cyanobacteria Synechococcus sp. strain PCC 7942
SP/P19993 GB/X58406 GB/Q00488 GB/L08201 GB/O14499
T T T
Reference
T
GB/D29668 GB/D28551
52 52
Green nonsulfur bacteria Thermomicrobium roseum
GB/80216
T
This work
Deinococcus-Thermus group Deinococcus proteolyticus Thermus aquaticus subsp. thermophilus
GB/80215 GB/Y07826
T T
This work GenBank, unpublished
Eukaryotic organelles (mitochondria and chloroplasts) Saccharomyces cerevisiae (m) Drosophila melanogaster (m) Porphyra umbilicalis (chl) Pea (chl)
SP/P12398 SP/P29845 SP/P30723 GB/L03299
15 64 62 47
Eukaryotic cytosolic Saccharomyces cerevisiae (er) Human (er) Giardia lamblia (er) Giardia lamblia Maize Saccharomyces cerevisiae Human
SP/P16474 SP/P11021 GB/V04875 GB/V04874 SP/P11143 SP/P10591 SP/P11142
53 76 34 34 63 68 39
a m, chl, and er, mitochondrial, chloroplast, and endoplasmic reticulum resident homologs of hsp70, respectively.
b GB and SP, GenBank and Swissprot databases, respectively.
FIG. 2. Nucleotide and deduced amino acid sequences of hsp70 genes from D. proteolyticus. The sequence is complete except for about 10 or 11 amino acids from the N-terminal end.
sponded to hsp70 (Fig. 2). Based upon comparison with other hsp70 sequences, all of the hsp70 sequence was present except for about 10 or 11 amino acids from the N-terminal end. In Southern blots of T. roseum genomic DNA digested with various restriction enzymes, the T. roseum probe hybridized to a single fragment of about 3.0 kb in the XhoI-digested DNA (not shown). Cloning and sequencing of this fragment revealed an open reading frame containing 1,650 bp from the 5′ end of hsp70. To clone the missing region, a probe from the 3′ end of the above insert (NcoI-XhoI) was utilized and hybridized to NcoI-digested genomic DNA. A 1.4-kb fragment which hybridized to this probe (Fig. 1b, lanes 2 and 2′) was cloned, and it contained all of the missing region. The complete sequence of the T. roseum hsp70 gene and its deduced amino acid sequence are shown in Fig. 3.
Codon usage. The hsp70 genes from D. proteolyticus and T. roseum contained 63 and 66% G+C, respectively, which is in agreement with the reported G+C contents of these species (10, 50, 60). In accordance with their high G+C contents, the codon usage in both these species showed a strong preference for either G or C in the third position, with values of 80.3 and 85.2% for D. proteolyticus and T. roseum, respectively. However, even though these two species had similar overall G+C contents and G+C usage in the third position, a number of interesting differences in the codon usage were observed (Table 2). For instance, for codons encoding serine, proline, threonine, alanine, and glycine, D. proteolyticus showed a definite preference for C in the third codon positions, whereas a G was found to be the preferred base for T. roseum. Other notable differences in codon usage between these two species were
FIG. 3. Nucleotide and deduced amino acid sequences of hsp70 gene from T. roseum.
observed for the codons encoding glutamic acid, isoleucine, phenylalanine, and threonine. For some of these amino acids, D. proteolyticus, but not T. roseum, showed a preference for either T (ATT, isoleucine) or A (GAA, glutamic acid) in the third position, which is unexpected for an organism with a high G+C content.
Sequence alignment and signature sequences. The hsp70 sequences from D. proteolyticus and T. roseum were aligned with various known archaebacterial and eubacterial and some representative eukaryotic sequences. The overall alignment was very similar to that published previously for a limited number of species (18, 32, 34). hsp70 homologs from various species exhibited extensive identity or similarity throughout their length, except in the C-terminal 50 to 100 amino acids (see references 33 and
34). The global alignment of hsp70 sequences has led to the identification of a number of signature sequences that are characteristic of various groups of species (30–34). A portion of the hsp70 sequence alignment depicting one striking sequence feature of this protein family is presented in Fig. 4. We have previously shown that all of the hsp70 homologs from gram-negative bacteria and eukaryotes contained a 23- to 27-amino-acid insert in their N-terminal quadrant, which is not found in any of the gram-positive bacteria or archaebacterial homologs (30–33). As seen in Fig. 4, the hsp70 proteins from both D. proteolyticus and T. roseum, as well as the recently reported sequence from T. aquaticus (Table 1), contained a related insert in precisely the same position as found in hsp70 sequences from all other gram-negative eubacterial species.
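The presence or absence of such an insert can be read directly off an aligned sequence as a long run of gap characters. The sketch below is one hedged way to do this; the gap symbol, the function names, and the toy sequences are assumptions for illustration and are not taken from the alignment in Fig. 4.

```python
def gap_runs(aligned_seq, gap="-"):
    """Return (start, length) for every run of gap characters (0-based start)."""
    runs = []
    start = None
    for i, ch in enumerate(aligned_seq):
        if ch == gap:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, len(aligned_seq) - start))
    return runs

def lacks_insert(aligned_seq, region_start, region_end, min_len):
    """True if a gap run of at least min_len lies inside [region_start, region_end)."""
    return any(
        start >= region_start and start + length <= region_end and length >= min_len
        for start, length in gap_runs(aligned_seq)
    )

# Toy aligned sequences: one carries a 23-residue insert, the other has a gap there.
with_insert = "GIDLG" + "ACDEFGHIKLMNPQRSTVWYACD" + "TTNSC"
without_insert = "GIDLG" + "-" * 23 + "TTNSC"

print(gap_runs(without_insert))
print(lacks_insert(without_insert, 0, len(without_insert), 23))
print(lacks_insert(with_insert, 0, len(with_insert), 23))
```

Restricting the search to a region (here the whole toy sequence) mirrors the fact that the signature is defined at a specific position in the N-terminal quadrant rather than anywhere in the protein.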
TABLE 2. Codon usage in hsp70 sequences from D. proteolyticus and T. roseum a

                    No. of codons                                      No. of codons
Amino acid  Codon   D. proteolyticus  T. roseum    Amino acid  Codon   D. proteolyticus  T. roseum
Phe         TTT     7*                0            Tyr         TAT     0                 2
Phe         TTC     10                14           Tyr         TAC     5                 9
Leu         TTA     0                 0            Stop        TAA     1                 0
Leu         TTG     0                 1            Stop        TAG     0                 0
Leu         CTT     0                 0            His         CAT     0                 2
Leu         CTC     5                 20           His         CAC     2                 7
Leu         CTA     0                 1            Gln         CAA     1                 3
Leu         CTG     39                30           Gln         CAG     30                27
Ile         ATT     15*               0            Asn         AAT     1                 1
Ile         ATC     20                46*          Asn         AAC     29                17
Ile         ATA     0                 1            Lys         AAA     7                 0
Met         ATG     9                 12           Lys         AAG     30                30
Val         GTT     0                 4            Asp         GAT     1                 5
Val         GTC     13                23           Asp         GAC     41                32
Val         GTA     0                 2            Glu         GAA     39*               12
Val         GTG     38                21           Glu         GAG     18                44*
Ser         TCT     2                 0            Cys         TGT     0                 0
Ser         TCC     10*               3            Cys         TGC     0                 0
Ser         TCA     1                 1            Stop        TGA     0                 1
Ser         TCG     4                 11*          Trp         TGG     2                 0
Pro         CCT     2                 2            Arg         CGT     12                3
Pro         CCC     19*               7            Arg         CGC     26                29
Pro         CCA     2                 3            Arg         CGA     6                 9
Pro         CCG     5                 26*          Arg         CGG     2                 11
Thr         ACT     22*               1            Ser         AGT     0                 0
Thr         ACC     36*               20           Ser         AGC     13                5
Thr         ACA     0                 6            Arg         AGA     0                 0
Thr         ACG     2                 21*          Arg         AGG     2                 0
Ala         GCT     10                7            Gly         GGT     6                 18
Ala         GCC     44*               22           Gly         GGC     39                21
Ala         GCA     5                 5            Gly         GGA     3                 6
Ala         GCG     7                 28*          Gly         GGG     3                 12

a Although the G+C contents of D. proteolyticus and T. roseum are very similar (63 and 66%, respectively), strong preferences for certain codons (asterisks) are seen in the two species.
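The quantities summarized in Table 2 (overall G+C content, G+C at the third codon position, and per-codon counts) can be computed from an in-frame coding sequence as sketched below. The function names and the short toy sequence are assumptions for illustration; the real hsp70 sequences are those shown in Fig. 2 and 3.

```python
from collections import Counter

def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "GC") / len(seq)

def gc3_fraction(cds):
    """G+C fraction at third codon positions of an in-frame coding sequence."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return gc_fraction(cds[2::3])

def codon_counts(cds):
    """Count each codon in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))

# Toy coding sequence (made up): ATG GCC CTG ACC GAA ATT TAA
toy_cds = "ATGGCCCTGACCGAAATTTAA"
print(round(gc_fraction(toy_cds), 3))
print(round(gc3_fraction(toy_cds), 3))
print(codon_counts(toy_cds).most_common(3))
```

Comparing `gc3_fraction` with `gc_fraction` for the same gene gives the kind of contrast reported in the text, where third-position G+C (80.3 and 85.2%) exceeds the overall G+C content (63 and 66%).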
Phylogenetic analysis. The phylogenetic analysis was carried out on a consensus length of 531 positions which could be aligned without ambiguity in all species and homologs listed in Table 1. Most available eubacterial and archaebacterial sequences were included in this analysis. However, since detailed phylogenetic analyses of eukaryotic cytoplasmic and organellar hsp70 sequences have been reported previously (18, 34), only a few representatives from these groups were included. A consensus neighbor-joining tree obtained after 1,000 bootstraps based on these sequences is shown in Fig. 5A. The overall branching pattern in this unrooted tree is very similar to that seen in earlier work (30, 33). A clear distinction between the eukaryotic and prokaryotic species was seen in 100% of the bootstraps. Furthermore (in 87.3% of the bootstraps), the eukaryotes grouped with the gram-negative bacteria, whereas the archaebacteria were observed to branch with the gram-positive bacteria (Fig. 5A). For the gram-positive group of bacteria, the low- and high-G+C-content species formed separate clades. The four archaebacterial species examined showed a polyphyletic branching within (or with) the gram-positive group. The halobacterial hsp70 homologs branched with the high-G+C subgroup (73.8% of the time), whereas Methanosarcina mazei showed an affinity for the low-G+C gram-positive species (80.6% of the time). Thermoplasma acidophilum hsp70 branched separately from the other archaebacterial homologs. The polyphyletic branching of the archaebacteria within the gram-positive bacteria has previously been shown to be statistically strongly favored (30, 33).
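The bootstrap percentages quoted above are simply the fraction of bootstrap trees in which a given grouping appears (for example, 873 of 1,000 trees corresponds to 87.3%). A minimal sketch of that tally is given below. Representing each tree as a list of clades (sets of taxon names) is a simplifying assumption made here, and the taxon labels are placeholders rather than species from Table 1.

```python
from collections import Counter

def clade_support(bootstrap_trees):
    """Fraction of bootstrap trees that contain each clade.

    Each tree is an iterable of clades; each clade is a set of taxon names.
    """
    counts = Counter()
    for tree in bootstrap_trees:
        # Deduplicate clades within one tree before counting.
        for clade in {frozenset(c) for c in tree}:
            counts[clade] += 1
    total = len(bootstrap_trees)
    return {clade: n / total for clade, n in counts.items()}

# Four toy bootstrap trees over placeholder taxa A to D.
trees = [
    [{"A", "B"}, {"A", "B", "C"}],
    [{"A", "B"}, {"A", "B", "C"}],
    [{"A", "B"}, {"A", "B", "D"}],
    [{"A", "B"}, {"A", "B", "C"}],
]
support = clade_support(trees)
print(support[frozenset({"A", "B"})])       # 1.0
print(support[frozenset({"A", "B", "C"})])  # 0.75
```

A consensus program additionally has to choose a mutually compatible set of well-supported groups to draw as a tree; that step is not shown here.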
Of the two species whose hsp70 homologs have been cloned in the present work, D. proteolyticus showed greater affinity to the T. aquaticus species, which on the basis of 16S rRNA sequence analysis has been placed in the Deinococcus group (10, 50, 55, 56, 70, 84, 85). Further, the Deinococcus-Thermus group of species, which constituted the most deeply branching lineage within the gram-negative bacteria, showed a strong affinity (998 of 1,000 bootstraps) for the cyanobacterium-chloroplast group (Fig. 5A). The other species, T. roseum, formed the out-group (66.6% of the time) of the branch leading to the proteobacterium-chlamydia and spirochete clade, and it showed no significant affiliation to the Deinococcus-Thermus group. Parsimony analysis of the sequence alignment yielded two equally parsimonious trees, requiring a total of 5,230 amino acid substitutions. One of these trees is shown in Fig. 5B. The other tree differed from it in the position of T. roseum, which now formed the out-group of the clade consisting of eukaryotic cytosolic homologs, chlamydiae-spirochetes, and proteobacteria. The overall relationship between various species in the parsimony tree is very similar to that seen in the neighbor-joining tree. The parsimony tree was also bootstrapped (100 times), and the bootstrap scores for the various nodes are indicated on the tree. As seen from Fig. 5B, the parsimony tree, similar to the neighbor-joining tree, supported (100% of the time) grouping of archaebacteria with the gram-positive bacteria and grouping of eukaryotes with the gram-negative bacteria. The archaebacteria again showed polyphyletic
FIG. 4. Excerpt from hsp70 sequence alignment showing one of the prominent sequence signatures shared by archaebacteria and gram-positive bacteria on the one hand and gram-negative bacteria and eukaryotes on the other. A, E, G−, G+, and O, sequences from archaebacteria, eukaryotes (cytosolic), gram-negative bacteria, gram-positive bacteria, and eukaryotic organellar sequences (viz., mitochondria and chloroplasts), respectively. The overall sequence alignment was very similar to that reported previously (18, 34). The numbers at the top refer to the position in the Halobacterium marismortui sequence. Identity with the amino acids in the top line (dashes) and the inserts in the N-terminal quadrant (boxed regions) are indicated. The eukaryotic cytosolic homologs are readily distinguished from organellar homologs by other sequence signatures that are not shown here (33, 34). The species names are listed in full in Table 1. m, chl, and er, mitochondrial, chloroplast, and endoplasmic reticulum resident forms of hsp70, respectively.
branching which is statistically significant (30, 33). Similar to the results obtained with the neighbor-joining method, D. proteolyticus was closely related to T. aquaticus (grouped together in 100% of the bootstraps), and both these species showed a strong affinity (96% of the time) for the cyanobacterium-chloroplast group. However, the placement of T. roseum in the parsimony tree was not resolved. In about equal numbers of bootstraps (between 20 and 25%), it branched either as the out-group of the chlamydia-proteobacterium group (as shown in Fig. 5B) or as the out-group of the eukaryotic clade or the out-group of the eukaryote-chlamydia-proteobacterium clade.
DISCUSSION
The phylogenetic classification of the prokaryotes, particularly at the genus level and above, is currently based largely on 16S rRNA sequences (55, 56, 84, 85). While 16S rRNA has played a key role in evolutionary studies and continues to do so, it is often forgotten that the phylogeny based on a single gene sequence does not necessarily represent species phylogeny unless supported by various other independent molecules.
It is in this context that we are examining hsp70 sequences to determine what they can tell us about the phylogenetic relationships between species. Similar to 16S rRNA, hsp70 is present in all species, the amino acid sequence is highly conserved, and it performs an essential function related to protein folding and transport and in protecting the organisms from heat- or stress-induced damage (8, 37, 49). Therefore, accumulation of comparative data for this molecule is important. All prokaryotic species studied to date, with a few exceptions where gene duplication has occurred (52, 66), contain only a single hsp70 homolog (see reference 33). In eukaryotic species, distinct hsp70 homologs are found in various compartments, but they are readily distinguished as orthologous, paralogous, or xenologous sequences based on sequence signatures (8, 29–34). The hsp70 molecules are of similar sizes in all species (between 605 and 650 amino acids), and the high degree of sequence conservation allows alignment of nearly the entire length of the molecule (i.e., >500 amino acids) with minimal ambiguity. Thus, the useful information content of the hsp70 molecule is very large. The above characteristics of the hsp70
FIG. 5. Phylogenetic analysis based on hsp70 sequences. (A) A consensus neighbor-joining tree obtained after 1,000 bootstraps. The numbers at the forks indicate the number of times the species to the right of the fork grouped together in the bootstrap trees. (B) Maximum parsimony tree for the same species. A single most parsimonious tree as shown was obtained. The bootstrap scores of various nodes that gave >50% value (out of 100 bootstraps) are shown. The trees shown here are unrooted and have been arbitrarily divided at the approximate midpoint. The archaebacterial species are indicated (asterisks). The species names are listed in full in Table 1.
sequences indicate that it should be an ideal molecule for investigating deep phylogenetic relationships, which is the reason that we have determined the sequences of hsp70 from D. proteolyticus and T. roseum. Recently, while the present paper was being reviewed, the hsp70 sequence from T. aquaticus became available, and it was also included in our analysis. The phylogenetic analysis of hsp70 sequences reported in this study and earlier studies has revealed a number of important differences from 16S rRNA-based phylogeny (8, 18, 29–34). The most important of these differences concerns the central dogma of the rRNA-based phylogeny, i.e., that archaebacteria constitute a separate monophyletic domain, or third form of life, that is completely distinct from other eubacteria (55, 56, 84–87). Neither of these views (i.e., phylogenetic distinctness of archaebacteria or their monophyletic nature) is
supported by hsp70-based phylogenies or the sequence signatures of hsp70. These studies instead strongly indicate that archaebacteria bear a close evolutionary relationship to the gram-positive bacteria and show polyphyletic branching within them. The halobacterial species consistently branch with the high-G+C-content gram-positive species, whereas methanogenic and thermoacidophilic archaebacteria were affiliated with the low-G+C-content gram-positive species. Both of the above inferences, i.e., the branching of archaebacteria with gram-positive species and their polyphyly, are statistically supported by the various phylogenetic methods, e.g., protein similarity scores, neighbor joining, maximum parsimony, and maximum likelihood (25, 30, 33, 43). Further, all archaebacteria and gram-positive eubacteria that have been examined thus far have been found to lack a large, relatively conserved insert of
23 to 27 amino acids that is present in all other eubacterial as well as eukaryotic homologs, supporting a specific relationship between these two groups. The polyphyletic branching of archaebacteria with gram-positive bacteria is not unique to or an idiosyncrasy of hsp70 sequences, since a number of other conserved protein sequences, viz., those of glutamate dehydrogenase and glutamine synthetase, point to a similar relationship (5, 11, 25, 61, 69, 75). Another important aspect in which the hsp70 phylogeny differs from that based on rRNA concerns the placement of gram-positive species within the eubacterial group. The trees based on rRNA generally place gram-positive species between different groups of gram-negative species (17, 55, 56, 78, 84, 85). The eubacterial subdivisions consisting of the order Thermotogales, green nonsulfur bacteria, deinococci, and cyanobacteria generally show deeper branching than gram-positive species, whereas other groups consisting of purple bacteria, planctomycetes, spirochetes, chlamydiae, and flavobacteria, etc., branch either lower than or similarly to gram-positive
species (17, 55, 56, 85). However, most published eubacterial phylogenies based on rRNA do not give any bootstrap scores or other measures by which the confidence of these branching orders could be assessed (55, 56, 84, 85). In a few cases, where bootstrap scores are indicated, the values for many of the critical nodes (e.g., those leading to gram-positive bacteria) are in the range of 25 to 50%, indicating that many of the proposed eubacterial branching orders lack statistical support (17, 78). Similar low bootstrap scores have been obtained in phylogenies based on 5S or large-subunit rRNA (16, 77). In the latter case, gram-positive bacteria were also indicated to be polyphyletic (16). The phylogenies based on several protein sequences that have been previously examined, e.g., EF-Tu, Rho, RecA, and DNA-dependent RNA polymerase, have not proven useful in resolving the branching order between various eubacterial subdivisions, particularly between gram-positive and gram-negative bacteria (7, 17, 40, 57, 58, 61). This is partly due to the facts that these proteins are not highly conserved and that for many of them only a limited representation of eubacterial
species is available. Due to these factors, the observed bootstrap scores in these phylogenies are again not significant enough to either support or refute a particular branching order. In contrast to these studies, the sequence features of hsp70 sequences and the phylogenies based on them strongly indicate that gram-negative eubacteria form a phylogenetically coherent group. The branching of eukaryotic homologs within this group is not an anomaly but is very likely a reflection of their endosymbiotic origin from specific members of this lineage (30, 31, 33). For the hsp70 family of proteins, a good representation of sequences from nearly all of the major groups of prokaryotic and eukaryotic biota is now available. Within eubacteria, hsp70 sequences from α-, β-, and γ-proteobacteria, chlamydiae, spirochetes, cyanobacteria, T. aquaticus, numerous low-G+C-content as well as high-G+C-content gram-positive bacteria, and chloroplast and mitochondrial homologs are known, as shown in Table 1. The present investigation adds to these data from the cloning and sequencing of hsp70 homologs from two additional species, D. proteolyticus and T. roseum, which, based on 16S rRNA, represent some of the deepest subdivisions (viz., Deinococcus and green nonsulfur bacteria) in eubacteria (56, 85), and they are structurally and biochemically aligned with gram-negative bacteria. The phylogenetic analysis of available hsp70 sequences by the neighbor-joining method showed that the gram-negative bacteria formed a monophyletic cluster (excluding the eukaryotic homologs) 873 times out of 1,000. In a parsimony tree based on hsp70 sequences, 100 of 100 times the gram-negative eubacteria formed a coherent group. 
These scores are highly significant, and the inference from them concerning the monophyly of gram-negative eubacteria is strongly reinforced by the presence of a prominent sequence signature (a relatively conserved insert of 23 to 25 amino acids found in the N-terminal quadrant of the protein) that is shared by all of the gram-negative species but not found in any of the gram-positive bacteria or archaebacteria. It is important to mention, however, that the phylogenetic coherence of the gram-negative eubacteria is not dependent on or affected by the presence or absence of this insert (30, 33). Besides hsp70, a monophyletic grouping of gram-negative eubacteria is also observed for a number of other conserved protein sequences. For the glutamine synthetase I protein family, a monophyletic grouping of gram-negative eubacteria was observed in 97 and 60% of the bootstraps in maximum-parsimony and neighbor-joining trees, respectively (11, 61). The GroEL or hsp60 family of protein sequences (unrelated to hsp70) likewise strongly indicates a monophyletic grouping of gram-negative eubacteria (27, 28, 80, 81). For the hsp60 family, this inference is further strengthened by the observation that the homologs from the various gram-negative species that have been studied contain an extra amino acid in the sequence (at position 153 of the E. coli sequence) in comparison to those of the gram-positive eubacteria (Fig. 6). Thus, both the sequence features and the phylogenetic analyses of a number of highly conserved protein sequences support the concept that the gram-negative eubacteria constitute a monophyletic domain. Except for the above differences, the phylogenies based on these protein sequences are conventional in showing a clustering of different subdivisions of proteobacteria, close linkage of D. proteolyticus and T. 
aquaticus species, branching of mycoplasmas with low-G+C gram-positive bacteria, and a specific relationship of mitochondrial homologs to the α-proteobacteria and of chloroplast homologs to the cyanobacteria (18, 27–31, 33, 80, 81). The present investigation provides confirmation regarding
the phylogenetic placement of the Deinococcus group of species. Deinococcus species are orange to red bacteria that show gram-positive staining (10, 50, 70). However, a number of characteristics of these species, including the presence of an outer membrane and a fatty acid profile that is rich in palmitoleate and lacks any branched-chain members, indicate that they are more similar to the gram-negative bacteria and that the gram-positive staining is probably due to the thickness of the peptidoglycan component (10, 50, 70). Thus, the taxonomical classification of the genus Deinococcus has remained uncertain, and this fact has been recognized in Bergey’s Manual of Systematic Bacteriology (70). Based on 16S rRNA phylogeny, Deinococcus has been placed in a separate group with the genus Thermus (e.g., T. aquaticus), which branches between the order Thermotogales, green nonsulfur bacteria, and cyanobacteria (10, 38, 56, 84, 85). The results presented here confirm a close affinity between the Deinococcus and Thermus genera (they group together 100% of the time), and they further indicate that this group of species is evolutionarily closely related to the cyanobacterial group. The high bootstrap scores of the latter affiliation in both neighbor-joining (998 times out of 1,000) and parsimony (96 times out of 100) trees indicate that this placement is strongly preferred and reliable. In both neighbor-joining and parsimony trees based on hsp70 sequences, the clade consisting of Deinococcus-Thermus and cyanobacterial species showed the deepest branching within gram-negative eubacteria and thus showed the closest relationship to the gram-positive bacteria. A close evolutionary relationship between cyanobacteria and gram-positive bacteria has also been observed for a number of other gene sequences, including those of GroEL, ATPases, RecA, and 16S and 23S rRNAs (1, 16, 17, 27, 44, 81). 
In the case of 16S rRNA, certain sequence signatures are found to be uniquely shared by cyanobacteria and gram-positive bacteria (84). Recently, strong evidence for a close relationship of the cyanobacterial species to the gram-positive bacteria and archaebacteria has been provided by the observation that for a number of conserved protein sequences (viz., glutamate dehydrogenase, FtszA, and FGARAT), homologs from all three of these groups shared sequence signatures that were not present in other gram-negative bacteria (29). It remains to be determined which of these phylogenies is nearest to the true bacterial phylogeny. In the rRNA tree, both gram-negative and gram-positive species are indicated as polyphyletic, whereas archaebacteria form a monophyletic cluster very distantly related to all other bacteria. In contrast, the phylogenies based on a number of protein sequences, viz., hsp70, glutamine synthetase I, and hsp60, strongly point to a monophyletic and coherent nature of gram-negative species. However, as indicated earlier, in the rRNA-based phylogenies, the branching order of gram-positive species and most of the deep-seated divisions of eubacteria is not reliable. In contrast, the sequence characteristics and phylogenies based on the above protein sequences make a persuasive case that these two groups are clearly distinct. The phylogenetic distinctness of gram-positive and gram-negative species also lends some credence to the taxonomical classification of bacteria based on their Gram-staining characteristics (9, 71, 72, 79). In this context, it should be pointed out that the taxonomical classification based on Gram-staining characteristics is not unambiguous. In addition to the species which can be clearly classified as gram positive or gram negative, a number of species show gram-variable staining or other unusual characteristics (e.g., Deinococcus and Mycoplasma spp., etc.) 
and, based on this criterion alone, could not be correctly classified (9, 84). However, the gram-positive and gram-negative groups referred to here are phylogenetically defined clades based on sequence characteristics (e.g., presence or absence of an insert in the hsp70 sequence), and the species that show gram-variable phenotypes could thus be appropriately characterized by using these criteria.
FIG. 6. Excerpt from hsp60 sequence alignment showing a single amino acid insert common to various gram-negative eubacteria and eukaryotic organelle sequences but absent in the gram-positive bacteria. The complete alignment of hsp60 sequences from representative species has been published previously (28). The numbers at the top indicate the position in the E. coli sequence. Identity with the amino acid in the top line (dashes) and the gaps in the gram-positive species (■) are indicated. Abbreviations: A., Arabidopsis; Ac., Acyrthosiphon; Ag., Agrobacterium; Am., Amoeba (endosymbiont); Bac., Bacillus; Bar., Bartonella; Br., Brassica; Ch., Chromatium; Chi., Chinese; Cla., Chlamydia; Clo., Clostridium; Co., Coxiella; Cy., Cyanidium; Eh., Ehrlichia; He., Heliothis; Hel., Heliobacteria; L., Lactococcus; Le., Legionella; Lep., Leptospira; Myc., Mycobacterium; N., Neisseria; Pl., Plasmodium; Po., Porphyromonas; Pse., Pseudomonas; R., Rhizobium; Ri., Rickettsia; S., Saccharomyces; Sa., Salmonella; Sta., Staphylococcus; Str., Streptomyces; Syn. or Sy., Synechocystis; Th., Thermophilic; Ther., Thermus; Tr., Treponema or Trypanosoma; Y., Yersinia; Z., Zymomonas; c, chloroplast; m, mitochondria; e, endosymbiont. Other numbers or letters after the species name refer to particular genes. The accession numbers of the sequences are also indicated.
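The signature-based assignment described above amounts to asking whether an aligned sequence carries residues or only gaps in the columns spanned by the insert. The following Python sketch illustrates that check under stated assumptions: the alignment is a mapping from names to equal-length strings with `-` as the gap character, the insert occupies a known half-open column range, and the toy sequences, the column range, and the function names `has_insert` and `classify_by_insert` are invented for illustration and are not taken from the hsp70 or hsp60 alignments.

```python
from typing import Dict


def has_insert(aligned_seq: str, start: int, end: int, gap: str = "-") -> bool:
    """True if the sequence has at least one residue in alignment columns [start, end)."""
    return any(ch != gap for ch in aligned_seq[start:end])


def classify_by_insert(alignment: Dict[str, str], start: int, end: int) -> Dict[str, str]:
    """Label every aligned sequence by presence or absence of the signature insert."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("aligned sequences must all have the same length")
    return {
        name: "insert present" if has_insert(seq, start, end) else "insert absent"
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment; columns 3 to 5 stand in for the signature insert.
toy_alignment = {
    "taxon_A": "ACDEFGHIKL",
    "taxon_B": "ACD---HIKL",
    "taxon_C": "ACDQFGHIKL",
}
for name, label in classify_by_insert(toy_alignment, 3, 6).items():
    print(name, label)
```

A check of this kind records only presence or absence. It does not replace the phylogenetic analysis, which, as noted earlier, supports the same grouping whether or not the insert region is included.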
If eubacteria consist of two distinct groups, as suggested here, then the question arises as to how these two groups are related to the archaebacteria. As discussed previously, since the phylogenies based on a number of different protein sequences, including hsp70, point to a closer relationship between archaebacteria and gram-positive bacteria than between archaebacteria and gram-negative bacteria, and since several archaebacterial species also show gram-positive staining (9, 42, 72), it seems logical to infer that within the prokaryotic domain, gram-positive bacteria are the closest relatives of archaebacteria. The deepest branching in the rRNA trees of members of the Thermotogales, which are now indicated to be gram positive (11), also lends support to this inference. A close evolutionary relationship between archaebacteria and gram-positive bacteria, and in fact the derivation of the former group of species from the latter, has previously been suggested by Cavalier-Smith (13).
ACKNOWLEDGMENTS
This work was supported by a research grant from the Medical Research Council of Canada. We thank two anonymous reviewers for many helpful comments towards improving the manuscript.
REFERENCES
1. Amann, R., W. Ludwig, and K. H. Schleifer. 1988. β-Subunit of ATP synthase: a useful marker for studying the phylogenetic relationship of eubacteria. J. Gen. Microbiol. 134:2815–2821.
2. Anzola, J., B. J. Luft, G. Gorgone, R. J. Dattwyler, C. Soderberg, R. Lahesmaa, and G. Peltz. 1992. Borrelia burgdorferi HSP70 homolog: characterization of an immunoreactive stress protein. Infect. Immun. 60:3704–3713.
3. Balows, A., H. G. Trüper, D. Martin, W. Harder, and K.-H. Schleifer (ed.). 1992. The prokaryotes, 2nd ed., p. 1–4126. Springer-Verlag, New York, N.Y.
4. Bardwell, J., and E. A. Craig. 1984. Major heat shock gene of Drosophila and the Escherichia coli heat inducible dnaK gene are homologous. Proc. Natl. Acad. Sci. USA 81:848–852.
5. Benachenhou-Lahfa, N., P. Forterre, and B. Labedan. 1993. Evolution of glutamate dehydrogenase genes: evidence for two paralogous protein families and unusual branching pattern of the archaebacteria in the universal tree of life. J. Mol. Evol. 36:335–346.
6. Birkelund, S., A. G. Lundemose, and G. Christiansen. 1990. The 75-kilodalton cytoplasmic Chlamydia trachomatis L2 polypeptide is a DnaK-like protein. Infect. Immun. 58:2098–2104.
7. Bocchetta, M., E. Ceccarelli, R. Creti, A. M. Sanangelantoni, O. Tiboni, and P. Cammarano. 1995. Arrangement and nucleotide sequence of the gene (fus) encoding elongation factor G (EF-G) from the hyperthermophilic bacterium Aquifex pyrophilus: phylogenetic depth of hyperthermophilic bacteria inferred from analysis of the EF-G/fus sequences. J. Mol. Evol. 41:803–812.
8. Boorstein, W. R., T. Ziegelhoffer, and E. A. Craig. 1994. Molecular evolution of the HSP70 multigene family. J. Mol. Evol. 38:1–17.
9. Breed, R. S., E. G. D. Murray, and N. R. Smith. 1957. Bergey’s manual of determinative bacteriology, p. 1–1094. The Williams & Wilkins Company, Baltimore, Md.
10. Brooks, B. W., R. G. E. Murray, J. L. Johnson, E. Stackebrandt, C. R. Woese, and G. E. Fox. 1980. Red-pigmented micrococci: a basis for taxonomy. Int. J. Syst. Bacteriol. 30:627–646.
11. Brown, J. R., Y. Masuchi, F. T. Robb, and W. F. Doolittle. 1994. Evolutionary relationships of bacterial and archaeal glutamine synthetase genes. J. Mol. Evol. 38:566–576.
12. Bucca, G., C. P. Smith, M. Alberti, G. Seidita, R. Passantino, and A. M. Pugila. 1993. Cloning and sequencing of the dnaK region of Streptomyces coelicolor A3. Gene 130:141–144.
13. Cavalier-Smith, T. 1987. The origin of eukaryote and archaebacterial cells. Ann. N. Y. Acad. Sci. 503:17–54.
14. Cellier, M. F., J. Teyssier, M. Nicolas, J. P. Liantard, J. Marti, and J. Sriwidada. 1992. Cloning and characterization of the Brucella ovis heat shock protein DnaK functionally expressed in Escherichia coli. J. Bacteriol. 174:8036–8042.
15. Craig, E. A., J. Karmer, J. Shilling, M. Werner-Washburne, S. Holmes, J. Kosic-Smithers, and C. M. Nicolet. 1989. SSC1, an essential member of the yeast HSP70 multigene family, encodes a mitochondrial protein. Mol. Cell. Biol. 9:3000–3008.
16. De Rijk, P., Y. Van de Peer, I. Van den Broeck, and R. deWachter. 1995. Evolution according to large ribosomal subunit RNA. J. Mol. Evol. 41:366–375.
17. Eisen, J. A. 1995. The RecA protein as a model molecule for molecular systematics studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species. J. Mol. Evol. 41:1105–1123.
18. Falah, M., and R. S. Gupta. 1994. Cloning of the hsp70 (dnaK) genes from Rhizobium meliloti and Pseudomonas cepacia: phylogenetic analyses of mitochondrial origin based on a highly conserved protein sequence. J. Bacteriol. 176:7748–7753.
19. Felsenstein, J. 1985. Confidence limits in phylogenies: an approach using the bootstrap. Evolution 39:783–791.
20. Felsenstein, J. 1994. PHYLIP, version 3.5. University of Washington, Seattle.
21. Fleischmann, R. D., M. D. Adams, O. White, R. A. Clayton, E. F. Kirkness, A. R. Kerlavage, C. J. Bult, J.-F. Tomb, B. A. Dougherty, J. M. Merrick, K. McKenney, G. Sutton, W. Fitzhugh, C. A. Fields, J. D. Scott, R. Shirley, L.-I. Liu, A. Glodek, J. M. Kelley, J. F. Weidman, C. A. Phillips, T. Spriggs, E. Hedblom, M. D. Cotton, T. R. Utterback, M. C. Hanna, D. T. Nguyen, D. M. Saudek, R. C. Brandon, L. D. Fine, J. L. Fritchman, J. L. Fuhrmann, N. S. M. Geoghagen, C. L. Gnehm, L. A. McDonald, K. V. Small, C. M. Fraser, H. O. Smith, and J. C. Venter. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496–512.
22. Fox, G. E., E. Stackebrandt, R. B. Hepsell, J. Gibson, J. Maniloff, T. A. Dyer, R. S. Wolfe, W. E. Balch, R. S. Tanner, L. J. Magrum, L. B. Zablen, R. Blakemore, R. Gupta, L. Bonen, B. J. Lewis, D. A. Stahl, K. R. Luehrsen, K. N. Chen, and C. R. Woese. 1980. The phylogeny of prokaryotes. Science 209:457–463.
23. Fraser, C. M., J. D. Gocayne, O. White, M. D. Adams, R. A. Clayton, R. D. Fleischmann, C. J. Bult, A. R. Kerlavage, G. Sutton, J. M. Kelley, J. L. Fritchman, J. F. Weidman, K. V. Small, M. Sandusky, J. L. Fuhrmann, D. T. Nguyen, T. R. Utterback, D. M. Saudek, C. A. Phillips, J. M. Merrick, J.-F. Tomb, B. A. Dougherty, K. F. Bott, P.-C. Hu, T. S. Lucier, S. N. Peterson, H. O. Smith, C. A. Hutchison III, and J. C. Venter. 1995. The minimal gene complement of Mycoplasma genitalium. Science 270:397–403.
24. Galley, K. A., B. Singh, and R. S. Gupta. 1992. Cloning of hsp70 gene from Clostridium perfringens using a general polymerase chain reaction based approach. Biochim. Biophys. Acta 1130:203–208.
25. Golding, G. B., and R. S. Gupta. 1995. Protein based phylogenies support a chimeric origin for the eukaryotic genome. Mol. Biol. Evol. 12:1–6.
26. Gomes, S. L., J. W. Gober, and L. Shapiro. 1990. Expression of the Caulobacter heat shock gene dnaK is developmentally controlled during growth at normal temperatures. J. Bacteriol. 172:3051–3059.
27. Gupta, R. S. 1995. Evolution of the chaperonin families (hsp60, hsp10 and tcp-1) of proteins and the origin of eukaryotic cells. Mol. Microbiol. 15:1–11.
28. Gupta, R. S. 1996. Evolutionary relationships of chaperonins, p. 27–64. In R. J. Ellis (ed.), The chaperonins. Academic Press, New York, N.Y.
29. Gupta, R. S. Protein phylogenies and signature sequences: evolutionary relationships within prokaryotes and between prokaryotes and eukaryotes. Antonie Leeuwenhoek, in press.
30. Gupta, R. S., and G. B. Golding. 1993. Evolution of hsp70 gene and its implications regarding relationships between archaebacteria, eubacteria and eukaryotes. J. Mol. Evol. 37:573–582.
31. Gupta, R. S., and G. B. Golding. 1996. The origin of the eukaryotic cell. Trends Biochem. Sci. 21:166–171.
32. Gupta, R. S., and B. Singh. 1992. Cloning of hsp70 gene from Halobacterium marismortui: relatedness of archaebacterial hsp70 to its eubacterial homolog and a model for the evolution of the hsp70 gene. J. Bacteriol. 174:4594–4604.
33. Gupta, R. S., and B. Singh. 1994. Phylogenetic analysis of 70 kD heat shock protein sequences suggests a chimeric origin for the eukaryotic cell nucleus. Curr. Biol. 4:1104–1114.
34. Gupta, R. S., K. Aitken, M. Falah, and B. Singh. 1994. Cloning of Giardia lamblia heat shock protein hsp70 homologs: implications regarding origin of eukaryotic cells and of endoplasmic reticulum. Proc. Natl. Acad. Sci. USA 91:2895–2899.
35. Hatada, Y. 1994. Cloning and nucleotide sequence of a hsp70 gene from Streptomyces griseus. J. Ferment. Bioeng. 77:461–477.
36. Hearne, C. M., and D. J. Ellar. 1989. Nucleotide sequence of a Bacillus subtilis gene homologous to the dnaK gene of Escherichia coli. Nucleic Acids Res. 17:8373.
37. Hendrick, J. P., and F.-U. Hartl. 1993. Molecular chaperone functions of heat shock proteins. Annu. Rev. Biochem. 62:349–384.
38. Hensel, B., W. Demharter, O. Kandler, R. M. Kroppenstedt, and E. Stackebrandt. 1986. Chemotaxonomic and molecular-genetic studies of the genus Thermus: evidence for a phylogenetic relationship of Thermus aquaticus and Thermus ruber to the genus Deinococcus. Int. J. Syst. Bacteriol. 36:444–453.
39. Hunt, C., and R. I. Morimoto. 1985. Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70. Proc. Natl. Acad. Sci. USA 82:6455–6459.
40. Iwabe, N., K. Kuma, M. Hasegawa, S. Osawa, and T. Miyata. 1989. Evolutionary relationship of archaebacteria, eubacteria and eukaryotes inferred from phylogenetic trees of duplicated genes. Proc. Natl. Acad. Sci. USA 86:9355–9359.
41. Jennifer, M., S. Barril, S. G. Kim, and C. A. Batt. 1994. Cloning and sequencing of the Lactococcus lactis subsp. lactis dnaK gene using a PCR-based approach. Gene 142:91–96.
42. Kandler, O., and H. König. 1985. Cell envelopes of archaebacteria, p. 413–457. In C. R. Woese and R. S. Wolfe (ed.), The bacteria: a treatise on structure and function, vol. 8. Archaebacteria. Academic Press, London, United Kingdom.
43. Karlin, S. 1995. Statistical significance of sequence patterns in proteins. Curr. Opin. Struct. Biol. 5:360–371.
44. Karlin, S., G. M. Weinstock, and V. Brendel. 1995. Bacterial classifications derived from RecA protein sequence comparisons. J. Bacteriol. 177:6881–6893.
45. Kornak, J. M., C. C. Kuo, and L. A. Campbell. 1991. Sequence analysis of the gene encoding the Chlamydia pneumoniae DnaK protein homolog. Infect. Immun. 59:721–725.
46. Macario, A. J. L., C. B. Dugan, and E. C. D. Macario. 1991. A dnaK homolog in the archaebacterium Methanosarcina mazei S6. Gene 108:133–137.
47. Marshall, J. S., and K. Keegstra. 1992. Isolation and characterization of a cDNA clone encoding the major HSP70 of the pea chloroplast stroma. Plant Physiol. (Rockville) 100:1048–1054.
48. McKenzie, K. R., E. Adamas, W. J. Britton, R. J. Garsia, and A. Basten. 1991. Sequence and immunogenicity of the 70-kDa heat shock protein of Mycobacterium leprae. J. Immunol. 147:312–319.
49. Morimoto, R. I., A. Tissieres, and C. Georgopoulos (ed.). 1994. The biology of heat shock proteins and molecular chaperones. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
50. Murray, R. G. E. 1992. The family Deinococcaceae, p. 3732–3744. In A. Balows, H. G. Trüper, D. Martin, W. Harder, and K.-H. Schleifer (ed.), The prokaryotes, 2nd ed. Springer-Verlag, New York, N.Y.
51. Naberhaus, F., K. Giebesler, and H. Bahl. 1992. Molecular characterization of the dnaK region of Clostridium acetobutylicum, including grpE, dnaJ, and a new heat shock gene. J. Bacteriol. 174:3290–3299.
52. Nimura, K., H. Yoshikawa, and H. Takahashi. 1994. Identification of dnaK multigene family in Synechococcus sp. PCC7942. Biochem. Biophys. Res. Commun. 201:466–471.
53. Normington, K., K. Kohno, Y. Kozutsumi, M. J. Gething, and J. Sambrook. 1989. Saccharomyces cerevisiae encodes an essential protein homologous in sequence and function to mammalian BIP. Cell 57:1223–1236.
54. Ohta, T., K. Saito, M. Kuroda, K. Honda, H. Hirata, and H. Hayashi. 1994. Molecular cloning of two new heat shock genes related to the hsp70 genes in Staphylococcus aureus. J. Bacteriol. 176:4779–4783.
55. Olsen, G. J., and C. R. Woese. 1993. Ribosomal RNA: a key to phylogeny. FASEB J. 7:113–123.
56. Olsen, G. J., C. R. Woese, and R. Overbeek. 1994. The winds of (evolutionary) change: breathing new life into microbiology. J. Bacteriol. 176:1–6.
57. Opperman, T., and J. P. Richardson. 1994. Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene. J. Bacteriol. 176:5033–5044.
58. Palm, P., C. Schleper, I. Arnold-Ammer, I. Holz, T. Meier, F. Lottspeich, and W. Zillig. 1993. The DNA-dependent RNA polymerase of Thermotoga maritima: characterization of the enzyme and the DNA sequence of the genes for the large subunits. Nucleic Acids Res. 21:4904–4908.
59. Partridge, J., J. King, J. Krska, D. Rockabrand, and P. Blum. 1993. Cloning, heterologous expression, and characterization of the Erysipelothrix rhusiopathiae DnaK protein. Infect. Immun. 61:411–417.
60. Perry, J. J. 1992. The genus Thermomicrobium, p. 3775–3779. In A. Balows, H. G. Trüper, D. Martin, W. Harder, and K.-H. Schleifer (ed.), The prokaryotes, 2nd ed. Springer-Verlag, New York, N.Y.
61. Pesole, G., C. Gissi, C. Lanave, and C. Saccone. 1995. Glutamine synthetase gene evolution in bacteria. Mol. Biol. Evol. 12:189–197.
62. Reith, M., and J. Munholland. 1991. An HSP70 homolog is encoded on the plastid genome of the red alga, Porphyra umbilicalis. FEBS Lett. 294:116–120.
63. Rochester, D. E., J. A. Winer, and D. M. Shah. 1986. The structure and expression of maize genes encoding the major heat shock protein, hsp70. EMBO J. 5:451–458.
64. Rubin, D. M., A. Mehta, J. Zhu, S. Shoham, X. J. Chen, Q. Well, and K. B. Palter. 1993. Genomic structure and sequence analysis of Drosophila melanogaster HSC70 genes. Gene 128:155–163.
65. Saitou, N., and M. Nei. 1987. The neighbor joining method: a new method of reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
66. Seaton, B. L., and L. E. Vickery. 1994. A gene encoding a new DnaK/hsp70 homolog in Escherichia coli. Proc. Natl. Acad. Sci. USA 91:2066–2070.
67. Segal, G., and E. Z. Ron. 1995. The dnaKJ operon of Agrobacterium tumefaciens: transcriptional analysis and evidence for a new heat shock promoter. J. Bacteriol. 177:5952–5958.
68. Slater, M. R., and E. A. Craig. 1989. The SSA1 and SSA2 genes of the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 17:805–806.
69. Smith, M. W., D. F. Feng, and R. F. Doolittle. 1992. Evolution by acquisition: the case for horizontal gene transfer. Trends Biochem. Sci. 17:489–493.
70. Sneath, P. H. A., N. S. Mair, M. E. Sharpe, and J. G. Holt (ed.). 1986. Bergey’s manual of systematic bacteriology, vol. 2, p. 1035–1043. Williams & Wilkins, Baltimore, Md.
71. Stanier, R. Y., and C. B. Van Niel. 1941. The main outlines of bacterial classification. J. Bacteriol. 42:437–466.
72. Stanier, R. Y., J. L. Ingraham, M. L. Wheelis, and P. R. Painter. 1987. The archaebacteria, p. 330–343. General microbiology, 5th ed. Macmillan Education Ltd., London, United Kingdom.
73. Stevenson, K., N. F. Inglis, B. Rae, W. Donachie, and J. M. Sharp. 1991. Complete nucleotide sequence of a gene encoding the 70 Kd heat shock protein of Mycobacterium paratuberculosis. Nucleic Acids Res. 19:4552.
74. Sussman, M. D., and P. Setlow. 1987. Nucleotide sequence of a Bacillus megaterium gene homologous to the dnaK gene of Escherichia coli. Nucleic Acids Res. 15:3923.
75. Tiboni, O., P. Cammarano, and A. M. Sanangelantoni. 1993. Cloning and sequencing of the gene coding glutamine synthetase I from the archaeum Pyrococcus woesei: anomalous phylogenies inferred from analysis of archaeal and bacterial glutamine synthetase I sequence. J. Bacteriol. 175:2961–2969.
76. Ting, J., and A. S. Lee. 1988. Human gene encoding the 78,000-dalton glucose-regulated protein and its pseudogene: structure, conservation and regulation. DNA 1:275–286.
77. Van den Hynde, H., Y. Van de Peer, J. Perry, and R. deWachter. 1990. 5S rRNA sequences of representatives of the genera Chlorobium, Prosthecochloris, Thermomicrobium, Cytophaga, Flavobacterium, Flexibacter and Saprospira and a discussion of the evolution of bacteria in general. J. Gen. Microbiol. 136:11–18.
78. Van de Peer, Y., J.-M. Neefs, P. DeRijk, P. DeVos, and R. DeWachter. 1994. About the order of divergence of the major bacterial taxa during evolution. Syst. Appl. Microbiol. 17:32–38.
79. Van Niel, C. B. 1946. The classification and natural relationship of bacteria. Cold Spring Harbor Symp. Quant. Biol. 11:285–301.
80. Viale, A. M., and A. K. Arakaki. 1994. The chaperone connection to the origins of the eukaryotic organelles. FEBS Lett. 341:146–151.
81. Viale, A. M., A. K. Arakaki, F. C. Soncini, and R. G. Ferreyra. 1994. Evolutionary relationships among eubacterial groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44:527–533.
82. Wilson, K. 1994. Preparation of genomic DNA from bacteria, p. 2.4.1–2.4.5. In F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, K. Struhl, L. M. Albright, D. M. Coen, A. Varki, and K. Janssen (ed.), Current protocols in molecular biology, 2nd ed., vol. 1. John Wiley & Sons, Inc., New York, N.Y.
83. Winefield, C. S., K. J. F. Farnden, P. H. S. Reynolds, and C. J. Marshall. 1995. Evolutionary analysis of aspartate aminotransferases. J. Mol. Evol. 40:455–463.
84. Woese, C. R. 1987. Bacterial evolution. Microbiol. Rev. 51:221–271.
85. Woese, C. R. 1991. The use of ribosomal RNA in reconstructing evolutionary relationships among bacteria, p. 1–24. In R. K. Selander, A. G. Clark, and T. S. Whittam (ed.), Evolution at the molecular level. Sinauer Associates Inc., Publishers, Sunderland, Mass.
86. Woese, C. R. 1994. There must be a prokaryote somewhere: microbiology’s search for itself. Microbiol. Rev. 58:1–9.
87. Woese, C. R., O. Kandler, and M. L. Wheelis. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria and Eucarya. Proc. Natl. Acad. Sci. USA 87:4576–4579.
88. Zuber, M., T. A. Hoover, M. T. Dertzbaugh, and D. L. Court. 1995. Analysis of the DnaK molecular chaperone system of Francisella tularensis. Gene 164:149–152.
89. Zuckerkandl, E., and L. Pauling. 1965. Molecules as documents of evolutionary history. J. Theor. Biol. 8:357–366.