Glyceraldehyde-3-Phosphate Dehydrogenase Gene Diversity in Eubacteria and Eukaryotes: Evidence for Intra- and Inter-Kingdom Gene Transfer

Rainer Martin Figge, Milada Schubert, Henner Brinkmann, and Rüdiger Cerff
Institut für Genetik, Technische Universität Braunschweig, Germany

Cyanobacteria contain up to three highly divergent glyceraldehyde-3-phosphate dehydrogenase (GAPDH) genes: gap1, gap2, and gap3. Genes gap1 and gap2 are closely related at the sequence level to the nuclear genes encoding cytosolic and chloroplast GAPDH of higher plants and have recently been shown to play distinct key roles in catabolic and anabolic carbon flow, respectively, of the unicellular cyanobacterium Synechocystis sp. PCC6803. In the present study, sequences of 10 GAPDH genes distributed across the cyanobacteria Prochloron didemni, Gloeobacter violaceus PCC7421, and Synechococcus PCC7942, the α-proteobacterium Paracoccus denitrificans, and the β-proteobacterium Ralstonia solanacearum were determined. Prochloron didemni possesses homologs to the gap2 and gap3 genes from Anabaena, Gloeobacter harbors gap1 and gap2 homologs, and Synechococcus possesses gap1, gap2, and gap3. Paracoccus harbors two highly divergent gap genes that are related to gap3, and Ralstonia possesses a homolog of the gap1 gene. Phylogenetic analyses of these sequences in the context of other eubacterial and eukaryotic GAPDH genes reveal that divergence across eubacterial gap1, gap2, and gap3 genes is greater than that between eubacterial gap1 and eukaryotic glycolytic GapC or between eubacterial gap2 and eukaryotic Calvin cycle GapAB. These data strongly support previous analyses which suggested that eukaryotes acquired their nuclear genes for GapC and GapAB via endosymbiotic gene transfer from the antecedents of mitochondria and chloroplasts, and extend the known range of sequence diversity of the antecedent eubacterial genes.
Analyses of available GAPDH sequences from other eubacterial sources indicate that the glycosomal gap gene from trypanosomes (cytosolic in Euglena) and the gap gene from the spirochete Treponema pallidum are each other’s closest relatives. This specific relationship can therefore not reflect organismal evolution but must be the result of an interkingdom gene transfer, the direction of which cannot be determined with certainty at present. Contrary to this, the origin of the cytosolic Gap gene from trypanosomes can now be clearly defined as γ-proteobacterial, since the newly established Ralstonia sequence (β-proteobacteria) branches basally to the γ-proteobacterial/trypanosomal assemblage.

Key words: cyanobacteria, proteobacteria, spirochetes, euglenozoa, glyceraldehyde-3-phosphate dehydrogenase, organelle evolution, gene transfer, long-branch attraction.

Address for correspondence and reprints: Rüdiger Cerff, Institut für Genetik, Technische Universität Braunschweig, Spielmannstr. 7, D-38106 Braunschweig, Germany. E-mail: [email protected].

Mol. Biol. Evol. 16(4):429–440. 1999 © 1999 by the Society for Molecular Biology and Evolution. ISSN: 0737-4038

Introduction

Glyceraldehyde-3-phosphate dehydrogenase (GAPDH, phosphorylating: EC 1.2.1.12 and 1.2.1.13) is an ancient and ubiquitously distributed enzyme of sugar phosphate metabolism that catalyzes the reversible interconversion of glyceraldehyde-3-phosphate and 3-phosphoglycerate (Fothergill-Gilmore and Michels 1993). Two very different types of phosphorylating GAPDH enzymes with only 10% to 15% sequence similarity are distributed across eubacteria and archaebacteria; these were designated as class I and class II GAPDH (Cerff 1995). Whereas class II GAPDH has so far only been found in archaebacteria (Hensel et al. 1989), class I GAPDH is the classic enzyme of glycolysis and the Calvin cycle in eubacteria and eukaryotes and has been characterized in one halophilic archaebacterium (Prüß, Meyer, and Holldorf 1993; Brinkmann and Martin 1996).

Across eubacterial genomes, class I GAPDH genes possess an exceptional degree of sequence diversity in the form of gene families. For example, three highly divergent gap genes each are found in the genomes of Anabaena variabilis (Martin et al. 1993), Escherichia coli (Blattner et al. 1997), and P. aeruginosa (see Pseudomonas genome project: http://www.pseudomonas.com/), whereas two highly divergent gap genes are found in the genomes of Synechocystis sp. PCC6803 (Kaneko et al. 1996; Koksharova et al. 1998) and Bacillus subtilis (Kunst et al. 1997). However, these genomes do not possess the same two or three gap genes. Rather, they possess divergent samples of a larger GAPDH gene family, members of which are distributed in a skewed manner across contemporary eubacterial genomes. This view is substantiated by the finding that other gap genes are present in eubacterial genomes (e.g., Clostridium, Corynebacterium, Thermotoga, Thermus, and Deinococcus) that apparently are not orthologous to the gap genes in the aforementioned sequenced genomes (Martin et al. 1993; Brown and Doolittle 1997; Viscogliosi and Müller 1998).

All eukaryotic GAPDH genes surveyed to date are nuclear encoded and were apparently acquired from eubacteria. The vast majority of eukaryotic GAPDH genes descend from only two distinct members of the eubacterial class I GAPDH family, genes that are designated in cyanobacteria as gap1 and gap2 (Martin et al. 1993; Cerff 1995; Brown and Doolittle 1997; Martin and Schnarrenberger 1997). Eubacterial gene gap1 is closely related to eukaryotic gene GapC, encoding, as a rule, cytosolic GAPDH of glycolytic-gluconeogenetic function. This nuclear gene was probably acquired from the proteobacterial antecedents of mitochondria (Smith 1989; Martin et al. 1993; Liaud et al. 1994), although orthologous gap1 genes are present in other eubacterial phyla, including cyanobacteria. A notable exception to this rule is cytosolic GapC of Euglena and its glycosomal equivalent in trypanosomes, which are probably not of mitochondrial origin because they specify a branch on the GapC tree rooting basally to the proteo-cyanobacterial divergence. Another exception is the cytosolic GapC, only found in certain trypanosomes, which is surprisingly close to the corresponding homolog in γ-proteobacteria (Michels et al. 1991; Henze et al. 1995).

The eubacterial gene gap2, on the other hand, shows only 40% to 50% sequence similarity to the gene gap1. It has so far been found only in cyanobacteria and is highly similar to the nuclear gene GapAB encoding chloroplast Calvin cycle GAPDH in plants, green and red algae, as well as in Euglena gracilis (Brinkmann et al. 1989; Martin et al. 1993; Kersanach et al. 1994; Liaud et al. 1994; Henze et al. 1995). Hence, cyanobacteria and photosynthetic eukaryotes with green and red plastids possess two divergent GAPDH enzymes that differ with respect to their cosubstrate specificity and their role in sugar phosphate metabolism. The glycolytic (catabolic) enzyme both in plants (GapC) and cyanobacteria (Gap1) is NAD-specific, whereas the photosynthetic enzyme in plants (GapAB) is NADP-dependent but active with both NADP and NAD in cyanobacteria (Gap2) (Cerff 1978; Koksharova et al. 1998).

Some cyanobacterial genomes harbor a third member of the eubacterial gap gene family that has not been found in eukaryotic genomes to date: gap3. The function of cyanobacterial gap3 is not known. Previous analyses suggested that gap3 has phylogenetic affinity to a gap gene of E. coli (gapB; Alefounder and Perham 1989), which has recently been shown to encode a highly specific erythrose-4-phosphate dehydrogenase (E4PDH) rather than glyceraldehyde-3-phosphate dehydrogenase (Zhao et al. 1995). Curiously, gap genes specific to the Calvin cycle in Rhodobacter sphaeroides (Chen et al. 1991), Ralstonia eutropha (previously called Alcaligenes eutrophus) (Schaferjohann, Yoo, and Bowien 1995), and Xanthobacter flavus (Meijer, van den Bergh, and Smith 1996) are more similar to the E4PDH (gapB) of E. coli than to the Calvin cycle–specific gap2 genes of cyanobacteria (Henze et al. 1995), indicating that recruitment of GAPDH genes for autotrophic function has occurred at least twice independently in eubacterial evolution.

In order to better understand the functional and evolutionary relationships between eubacterial and eukaryotic GAPDH genes, we have used primers that will amplify various members of the eubacterial gap gene family to isolate and sequence gap genes from the cyanobacteria Prochloron didemni, Gloeobacter violaceus PCC7421, Synechococcus PCC7942, and the proteobacteria Paracoccus denitrificans and Ralstonia solanacearum (previously called Pseudomonas solanacearum). We report 10 new eubacterial GAPDH sequences that can either be assigned to the gap1 and gap2 subfamilies or to a cluster that contains gap3 and several related proteobacterial sequences. We examine the phylogeny of eubacterial and eukaryotic GAPDH genes in the context of endosymbiotic gene transfer during mitochondrial and plastid symbiosis. Furthermore, we have investigated the influence of long-branch attraction and outgroup sampling on the position of the unstably branching, rapidly evolving GAPDH sequence for GapA from the photosynthetic protist E. gracilis.

Materials and Methods

Nomenclature

The designations for individual GAPDH genes and proteins follow the guidelines of the Plant Gene Nomenclature Commission (Mnemonic/Numeric system of the Mendel Database). All GAPDH genes and proteins start with the mnemonic term “gap” followed by a number (“gap1”, “gap2”, etc.) or a letter (“GapA”, “GapB”, etc.) or both (“GapA1”, “GapA2”, etc.). All gene names are written in italics, with the first letter in lowercase (gap1) or uppercase (GapA) for bacterial/organellar and nuclear-encoded genes, respectively. All GAPDH proteins, whether encoded by bacterial/organellar or nuclear genes, have the same names as their corresponding genes but are always written in standard style and with the first letter in uppercase, e.g., proteins/enzymes Gap1, GapA, GapA1, etc.

Bacterial Strains, Plasmids, and Culture Conditions

Wild-type Synechococcus PCC7942 was grown photoautotrophically in a modified BG11 medium at 25°C under high-intensity fluorescent light (125 μE m⁻² s⁻¹) as previously described (Koksharova et al. 1998). G. violaceus (SAG 7.82, Sammlung von Algenkulturen, Göttingen) was grown as indicated above, but at low light intensities (5 μE m⁻² s⁻¹). DNA of R. solanacearum (formerly Burkholderia solanacearum; DSM9544) was purchased from the Deutsche Sammlung für Mikroorganismen. Escherichia coli strains XL1-blue and POP13 were grown with appropriate antibiotics according to standard procedures (Sambrook et al. 1989). Cloning vectors used were phage NM1149 (Murray 1983) and pBluescript-SK(+) (Stratagene). Plasmid transformation, selection, and testing for recombinant clones were performed as described (Sambrook, Fritsch, and Maniatis 1989).

Cloning and Sequencing of Eubacterial gap Genes

Degenerate primers for two highly conserved regions at the N-terminal and C-terminal ends of GAPDH proteins (INGFGRI, WYDNE) were used for the amplification of gap genes (95% of the coding sequence) from genomic DNA samples.
For the amplification of gap sequences from Gloeobacter and P. didemni, the following primers were used: G/AINGFG: 5′-GSNATHAAYGGNTTYGG-3′ and WYDNEW: 5′-CCAYTCRTTRTCRTACCA-3′ (noncoding strand). In the case of Synechococcus and Ralstonia, slightly different primers were used: INGFGR: 5′-ATHAAYGGNTTYGGNMG-3′ and WYDNE: 5′-AYTCRTTRTCRTACCA-3′ (noncoding strand). Polymerase chain reaction (PCR) conditions were as follows: cycle 1: 93°C for 5 min; cycles 2 to 35: 93°C for 1 min, 50°C for 1 min, and 72°C for 2 min; and cycle 36: 72°C for 5 min. All reactions were performed in a Perkin Elmer thermocycler at a Mg²⁺ concentration of 1.5 mM. Chromosomal DNA was prepared as described in Koksharova et al. (1998). Amplification products of corresponding size were treated with polynucleotide kinase and Klenow DNA polymerase using protocols from the supplier (New England Biolabs).
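The degenerate primers above are written in IUPAC nucleotide codes, so each primer stands for a pool of oligonucleotides. The following is a minimal sketch, not part of the original protocol, of how the pool size (degeneracy) can be counted and how one can check that the reverse complement of the noncoding-strand primer encodes the WYDNEW motif. The function names and the reduced codon table are illustrative assumptions; only the primer sequences are taken from the text.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement of each IUPAC code (e.g. R = A/G pairs with Y = T/C).
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
}

# Reduced codon table (assumption: only the codons needed for W, Y, D, N, E).
CODON = {
    "TGG": "W", "TAC": "Y", "TAT": "Y", "GAC": "D", "GAT": "D",
    "AAC": "N", "AAT": "N", "GAA": "E", "GAG": "E",
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for code in primer:
        count *= len(IUPAC[code])
    return count


def reverse_complement(primer):
    """Reverse complement of a (possibly degenerate) primer."""
    return "".join(COMPLEMENT[code] for code in reversed(primer))


def expand(primer):
    """All concrete sequences represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]


def encoded_peptides(coding_strand):
    """Set of peptides encoded in frame 1 by every sequence in the pool."""
    peptides = set()
    for seq in expand(coding_strand):
        codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
        peptides.add("".join(CODON[codon] for codon in codons))
    return peptides


forward = "GSNATHAAYGGNTTYGG"    # G/AINGFG primer from the text
reverse = "CCAYTCRTTRTCRTACCA"   # WYDNEW primer (noncoding strand) from the text

print(degeneracy(forward), degeneracy(reverse))
print(encoded_peptides(reverse_complement(reverse)))
```

Counting degeneracy this way shows why the forward primer, which covers several four-fold degenerate codon positions, represents a much larger pool than the reverse primer.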

Subsequently, the DNA fragments were cloned into pBluescript-SK+ and both strands of at least three independent PCR-generated clones for each gene were sequenced with appropriate oligonucleotides using the dideoxy chain termination method (Pharmacia protocol). Synechococcus and Paracoccus gap genes were cloned from genomic libraries. For the generation of the Synechococcus library, chromosomal DNA from an autotrophically grown, axenic culture was digested to completion with HindIII and cloned into the HindIII site of phage lambda NM1149. In the case of Paracoccus, chromosomal DNA was completely digested either with HindIII or with EcoRI and cloned into the HindIII or EcoRI site of phage lambda NM1149, respectively. Packaging of phage particles was performed as described (Hohn and Murray 1977; Kaiser and Murray 1988). Positively hybridizing clones were isolated by plaque hybridization in 5× SSC, 0.02% PVP, 0.02% Ficoll 400, 0.1% SDS, and 50 μg/ml denatured salmon sperm DNA. For Synechococcus, screens were performed with homologous probes at 68°C; the Paracoccus library was probed with the end-labeled (PNK) oligomer WYDNE at 30°C. Probes for Synechococcus were generated by PCR as described above and labeled with α-³²P dCTP by random priming. Filters were washed twice for 30 min in 2× SSC, 0.1% SDS at the hybridization temperatures. Hybridizing clones were isolated (Synechococcus gap2 and gap3; Paracoccus two gap clones), and genomic fragments were subcloned into the HindIII or EcoRI site of pBluescript-SK+ and sequenced as described above. The gap1 gene from Synechococcus was cloned from a Lambda FIX library (Stratagene). For the generation of the library, genomic DNA from Synechococcus was partially digested with Sau3AI, and partially refilled fragments (see protocol of manufacturer) of sizes between 7 and 23 kb were cloned into the XhoI site of Lambda FIX.
DNA from hybridizing clones was digested with SalI, subcloned into pBluescript-SK+, and sequenced as described above. The nucleotide sequence data reported will appear in the DDBJ/EMBL/GenBank International Nucleotide Sequence Database under the accession numbers X91235, X91236, X91237 (gap1, gap2, and gap3 of Synechococcus PCC7942); X92514 (gap1 of Ralstonia solanacearum); AJ007271, AJ007272 (gap2 and gap3 of Prochloron didemni); AJ007273, AJ007274 (gap1 and gap2 of Gloeobacter violaceus); AJ012158 and AJ012157 (gap1 and gap2 of Paracoccus denitrificans, both belonging to subfamily Gap-III).

Phylogenetic Analysis

Except for P. aeruginosa gap2, which was obtained from TIGR directly (see: http://www.tigr.org/tdb/mdb/mdb.html), all GAPDH sequences used were retrieved from GenBank. Database accession numbers of the sequences included in the dataset are the following: A. variabilis gap1, L07497; A. variabilis gap2, L07498; A. variabilis gap3, L07499; Arabidopsis thaliana GapA, M64114; A. thaliana GapB, M64115; Bacteroides fragilis gap1, U27541; Chlamydia trachomatis gap1, U83198; Chlamydomonas reinhardtii GapA, L27668; Chondrus crispus GapA, X73033; C. crispus GapC, X73036; Gallus gallus GapC, K01458; Entamoeba histolytica GapC, M89790; E. coli gap1 (A), X02662; E. coli gap, X14436; E. gracilis GapA, L21904; E. gracilis GapC, L21903; Giardia lamblia GapC, U31911; Gracilaria verrucosa GapA, Z15102; Haemophilus influenzae gap1, U32848; Leishmania mexicana GapC, X65220; Pinus sylvestris GapA, L26923; Pisum sativum GapA, X15190; P. sativum GapB, X15188; P. sativum GapC, L07500; Pseudomonas aeruginosa gap1, M74256; R. eutropha gap1, U12422; Rhodobacter capsulatus gap3, M68914; R. sphaeroides gap, M68914; Synechocystis PCC6803 gap1, X86375; Synechocystis PCC6803 gap2, X86376; T. pallidum gap1, AE001255; Trypanosoma brucei GapC, X53472; T. brucei GapCg, X59955; Ustilago maydis GapC, X07879; Xanthobacter flavus gap, U33064; Zea mays GapA, X07157; Zymomonas mobilis gap, M18802.

Deduced amino acid sequences were aligned with CLUSTAL W (Thompson, Higgins, and Gibson 1994) and refined by hand using the SEQED option of the MUST package (Philippe 1993). After the exclusion of gaps and the elimination of the N- and C-terminal regions in front of GINGFG and after WYDNE, the resulting alignment contains 49 sequences with 307 positions. A phylogenetic tree was constructed using Kimura distance estimation and the NJ algorithm (Saitou and Nei 1987), and subsequently the statistical support for all nodes was estimated by 1,000 bootstrap replicates using the NJBOOT option of the MUST package. The same alignment was used in a maximum-parsimony (MP) (Swofford 1993) bootstrap analysis (100 replicates). In MP bootstrap analyses, the default settings were used, except that (1) bootstrap values lower than 50% were retained, and (2) sequences were randomly added for each of the 100 bootstrap replicates.
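The distance and bootstrap steps described above can be illustrated with a short sketch. It assumes that "Kimura distance estimation" refers to Kimura's empirical protein correction, d = -ln(1 - p - 0.2p²), where p is the observed fraction of differing residues; the exact formula implemented in the MUST package is not stated in the text. The function names and the toy sequences in the usage lines are hypothetical, and the tree-building step itself (neighbor joining) is not shown.

```python
import math
import random


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over aligned positions without gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def kimura_distance(p):
    """Kimura's empirical correction for multiple substitutions (assumed form)."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0:
        raise ValueError("sequences too divergent for this correction")
    return -math.log(arg)


def bootstrap_alignment(alignment, rng):
    """One bootstrap replicate: resample alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


# Toy alignment (hypothetical sequences, for illustration only).
toy = {"seqA": "INGFGRIGRL", "seqB": "INGFGRIGRN", "seqC": "INGYGRMGRL"}
rng = random.Random(0)
replicate = bootstrap_alignment(toy, rng)
d = kimura_distance(p_distance(replicate["seqA"], replicate["seqC"]))
print(replicate, round(d, 3))
```

In a full analysis, a distance matrix would be computed for each of the 1,000 replicates, a tree built from each matrix, and the bootstrap proportion of a node taken as the fraction of replicate trees that recover it.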
The same dataset of 49 sequences was also subjected to maximum-likelihood analyses (Adachi and Hasegawa 1996) with the NJDIST and local rearrangement options of ProtML.

Results

Cloning and Sequence Analysis of Bacterial gap Genes

We determined five partial and five complete GAPDH sequences from three cyanobacteria and from one α- and one β-proteobacterium (shown in fig. 1). The prochlorophyte P. didemni (Lewin and Withers 1975) harbors homologs to the gap2 and gap3 genes from A. variabilis (ATCC 29413); the sequences from G. violaceus PCC7421 (Rippka, Waterbury, and Cohen-Bazire 1974) could be identified as gap1 and gap2 homologs. As in Anabaena, three GAPDH genes are present in Synechococcus PCC7942. Two highly divergent gap genes were found in the α-proteobacterium P. denitrificans, and in the β-proteobacterium R. (Pseudomonas) solanacearum, only a single gap1 gene was amplified.

A comparison of the 10 deduced protein sequences for Gap1, Gap2, and Gap3 determined in this study with some of their respective homologs in other eubacteria and eukaryotes (fig. 1) revealed unexpectedly high similarities between the euglenozoan GapC sequences from Euglena and Trypanosoma brucei (75% similarity) on the one hand and the Gap1 sequence of the spirochete T. pallidum (Fraser et al. 1998) on the other (67% and 66% similarity, respectively). This euglenozoan/spirochete subgroup has a quite low overall similarity relative to all other members of the Gap1/GapC family (48% to 61% based on the Gap1/GapC dataset used for tree construction in fig. 2), which coincides with the presence of three unique conserved insertions at positions 22A–22C (-K/GLL-), 164A (-G-), and 302A–302E (-LPNEK/R-), respectively (see fig. 1). Trypanosoma and Euglena share another unique insertion at position 60A–60H (-KSDANLAE- and -KSKPSVAK-, respectively) which is not present in Treponema. All insertions, except G-164A, contain charged residues, suggesting that they are solvent exposed and located at the surface of the native subunit.

Phylogenetic Analysis of GAPDH Sequences

We selected 32 different organisms harboring 49 GAPDH proteins for phylogenetic analyses. Three hundred and seven positions were used to construct phylogenetic trees by distance matrix (Kimura correction) and maximum-parsimony methods. Since the resulting topologies were almost identical, only the phylogenetic tree resulting from the distance matrix analysis is shown in figure 2. In order to estimate the robustness of internal branches, bootstrap proportions for both NJ and MP trees (1,000 and 100 replicates, respectively) were calculated and are indicated at the corresponding nodes. The resulting tree shows three major subgroups: two contain cyanobacterial gap genes related to the two distinct genes gap1 and gap2, and a third cluster contains, in addition to the gap3 sequences, several gap3-related proteobacterial genes.
For the three different categories of Gap genes/proteins defined by bootstrap values higher than 90%, we chose the operational designations type Gap-I, type Gap-II, and type Gap-III. These definitions are based on phylogenetic criteria and are independent of the individual, species-specific gene designations, which usually have no phylogenetic meaning and, hence, lead to confusion in intertaxonic and evolutionary comparisons (e.g., genes gapB and GapB of E. coli and higher plants, respectively; see fig. 2).

In the Gap1/GapC (Gap-I) subtree, a group of proteobacterial sequences is evident that includes the Gap1 sequences from E. coli and R. solanacearum and the sequences for GapC of the eukaryotic cytosol that are arguably of mitochondrial origin (Martin et al. 1993; Henze et al. 1995; Martin and Schnarrenberger 1997). A Gap sequence from B. fragilis, which belongs to the bacteroides/flavobacterium group, branches robustly with the Gap1 sequence from the β-proteobacterium Ralstonia. The cyanobacterial Gap1 sequences of Synechocystis, Synechococcus, Anabaena, and Gloeobacter are basal to sequences from the proteobacterial subdivisions. The position of C. trachomatis Gap1 could not be resolved in these analyses. Whereas the NJ and MP methods place it with cyanobacteria, though with low bootstrap support, in the ProtML analysis (using NJDIST and the local-rearrangement option), it is more closely related to proteobacteria. The GapC sequences of the two Euglenozoa and the spirochete Treponema branch robustly together at the base of the Gap1/GapC (Gap-I) subtree in figure 2.

The Gap2/GapAB (Gap-II) subtree contains Gap2 sequences from cyanobacteria and the nuclear-encoded chloroplast enzymes of plants (GapA and GapB). The Gap2 sequence of Gloeobacter branches basally among the Gap2 sequences analyzed (fig. 2). Branching orders within the cyanobacterial Gap2 cluster vary considerably and do not receive high bootstrap support in any analysis. The position of the GapA sequence from E. gracilis is very unstable and assumes different branching positions depending on species sampling and the phylogenetic analysis performed (see below and Discussion).

The Gap-III subtree can be subdivided into three separate groups. First, it carries a cyanobacterial Gap3 branch which also includes a closely related sequence from the α-proteobacterium R. capsulatus that was recently established in the course of the Rhodobacter genome project. Sequence identity between the R. capsulatus Gap sequence and cyanobacterial Gap3 sequences is on the order of 60% to 64%, comparable to the identity between Synechococcus and Anabaena Gap3 (67%). The branch bearing cyanobacterial Gap3 sequences is quite long, suggesting an elevated evolutionary rate for these sequences. Second, Gap-III carries another fast-evolving group with sequences related to E. coli GapB, which was shown to have erythrose-4-phosphate dehydrogenase activity (Zhao et al. 1995). Third, a group of quite slowly evolving sequences from α-, β-, and γ-proteobacteria is part of this cluster. The topology of these latter sequences is in agreement with the generally accepted rRNA phylogeny. This group contains a Gap sequence from R. sphaeroides that shares only 45% identity with the Gap3 sequence from R. capsulatus. The same degree of divergence is observed when the two newly established Paracoccus sequences are compared, demonstrating that the Gap-III cluster is much more divergent than the Gap-I or Gap-II subtrees.

FIG. 1. Alignment of derived amino acid sequences Gap1, Gap2, and Gap3 from Synechococcus with their corresponding homologs in eubacteria and eukaryotes in three separate blocks representing subfamilies Gap-I, Gap-II, and Gap-III (see fig. 2). Only the reference sequences (Synechococcus Gap1, Gap2, Gap3) are written in full. Amino acids not identical to the reference are shown. Identical residues are replaced by dots. The numbering system is in accordance with the one based on the three-dimensional structure of Bacillus stearothermophilus GAPDH. P188 denotes the presence (Gap1) or absence (Gap2) of Pro-188 indicative of NAD specificity and dual cosubstrate specificity with NAD and NADP, respectively (Clermont et al. 1993). The only known exception to this rule is glycosomal GapC of trypanosomes (sequence 7), which has a valine in position 188 and which has been shown to be strictly NAD-specific (Misset et al. 1987). Val-188 is also present in Treponema but not in Euglena GapC (sequences 5 and 6). Insertions at positions 22A–22C, 164A, and 302A–302E are specific to T. pallidum, E. gracilis, and T. brucei and are indicated by asterisks above the alignment. The arrowhead at the beginning of the sequence indicates the presence of a transit peptide. Missing residues in partial sequences or indels are indicated by dashes; double points at the C-terminal end denote termination codons. The strict-consensus sequence is shown below the alignment. Percentage values at the bottom of the sequence were obtained by pairwise comparison. Sequences from this study are as follows: Synechococcus Gap1, Gap2, Gap3; R. solanacearum Gap1; G. violaceus Gap1 and Gap2; P. didemni Gap2 and Gap3; and P. denitrificans Gap1 and Gap2 sequences, which belong to subfamily Gap-III. Sources for all other sequences are indicated in Materials and Methods.

FIG. 2. Phylogenetic tree constructed by the neighbor-joining method with the Kimura formula for the estimation of distances. Bootstrap values indicate the number of times that a given node was detected out of 1,000 neighbor-joining replicates (upper value) or out of 100 maximum-parsimony replicates (lower value). Only bootstrap values above 50% are indicated. Dashes at internodes designate topologies that differed in the MP bootstrap analysis from the NJ analysis shown and which, in addition, were supported by MP bootstrap values higher than 50%. The scale bar indicates 0.1 amino acid substitutions per nonsynonymous site. Closed diamonds and arrowheads indicate gene duplications and gene transfer events, respectively. Sequences established in this study are shown in bold.

FIG. 3. Influence of outgroup species sampling and LBA on the position of Euglena. A, Schematic demonstration of the LBA influence: distant outgroup sequences pull fast-evolving Euglena GapA from its “correct” position on the green algal branch to the most basal position of the tree. B, C, and D represent Gap2/GapAB (Gap-II) NJ topologies which are rooted with (B) 33 outgroup sequences, (C) 27 outgroup sequences, or (D) unrooted, leading to a decreased attraction of Euglena GapA toward the base of the tree. Outgroup sequences in B are as in figure 2. In topology C, the following six Gap1/GapC (Gap-I) outgroup sequences have been omitted: E. histolytica, G. lamblia, Leishmania mexicana, H. influenzae, E. coli, and cytosolic GapC of T. brucei. Only NJ bootstrap values above 40% are shown.

Influence of Outgroup Sampling on the Topology of Subtree Gap-II

Subtree Gap-II carries only genes from cyanobacteria and photosynthetic eukaryotes with green and red plastids, such as chlorophytes (land plants and green algae), red algae, and Euglena. The position of Euglena is surprising, since it does not fall into the eukaryotic clade but occupies a basal position below all cyanobacteria except the supposedly primitive cyanobacterium Gloeobacter (see Discussion). This unusual basal position of Euglena may be due to a treeing artifact known as “long-branch attraction” (LBA, Felsenstein 1978). In brief, LBA causes rapidly evolving sequences to assume an artifactually basal position in phylogenetic analysis in the presence of distantly related outgroup sequences. In order to examine the influence that LBA might have on the position of Euglena GapA, we constructed data sets consisting of fewer or no outgroup sequences to the Gap-II subtree and repeated the phylogenetic analysis. These results are shown in figure 3. Figure 3A schematically summarizes the suspected effect that LBA might have on these data: inclusion of distantly related outgroups (long branches by definition) attracts long-branch (fast-evolving) ingroup sequences to an artifactual basal position. Figure 3B shows the basal position of Euglena GapA in the presence of 33 outgroup sequences (as in fig. 2). Figure 3C shows the result with the same method using six fewer distantly related outgroup (GapC) sequences, whereby Euglena GapA moves up in the topology, branching above all cyanobacterial Gap2 sequences sampled. In the absence of outgroups to the Gap-II subtree (fig. 3D), the position of Euglena GapA moves even further up in the tree, above the rhodophyte GapA branch, forming a moderately supported branch (bootstrap value 58%) together with the chlorophyte sequences. Interestingly, Gloeobacter also moves upward in the unrooted subtree Gap-II and now branches together with Anabaena and Prochloron, although this grouping is only weakly supported by bootstrap analyses (56%).

Discussion

We have isolated and characterized GAPDH sequences from the cyanobacteria Synechococcus PCC7942 (genes gap1, gap2, and gap3), P. didemni (gap2 and gap3), and G. violaceus (gap1 and gap2), from the β-proteobacterium R. solanacearum (gap1), and from the α-proteobacterium P. denitrificans (two gap genes belonging to the subfamily Gap-III). Previous studies of GAPDH gene evolution have indicated that the nuclear genes for higher-plant GAPDH were acquired from eubacteria in the context of the endosymbiotic origins of organelles, the gene for GapC of the cytosol having been obtained from the antecedents of mitochondria and the gene for GapAB of chloroplasts having been acquired from the antecedents of plastids (Liaud, Zhang, and Cerff 1990; Martin et al. 1993; Liaud et al. 1994; Henze et al. 1995). However, in those studies, lineage-sampling among those members of the eubacterial GAPDH gene family that ultimately gave rise to the eukaryotic nuclear genes GapC and GapAB, respectively (eubacterial gap1 and gap2), was rather limited. In this study, we have shown that gap1, gap2, and gap3 genes are found not only in Anabaena (Martin et al. 1993) but are quite broadly distributed across cyanobacteria, including supposedly primitive forms such as Gloeobacter and phycobilisome-lacking forms such as Prochloron (Giovannoni et al. 1988; Turner et al. 1989; Nelissen et al. 1995). Furthermore, we have shown that the phylogenetic distribution of gap1 genes extends beyond cyanobacteria and γ-proteobacteria, since the β-proteobacterium R. solanacearum was shown here to possess a gene of the gap1 type. In the course of phylogenetic analysis of these data, we searched available databases and identified two additional gap1 genes from the genomes of C. trachomatis and T. pallidum and a member of the gap3 family in the genome of the α-proteobacterium R. capsulatus.

Our neighbor-joining GAPDH tree shown in figure 2 is in agreement with the maximum-likelihood GAPDH tree recently published by Viscogliosi and Müller (1998), which carries a number of additional GAPDH sequences from Gram-positive and thermophilic bacteria and parabasalid protists which were not included in the present study. The following discussion will concentrate on the phylogenetic implications inferred from subtrees Gap-I, Gap-II, and Gap-III, whose topologies are fairly stable and not dependent on the inclusion of the complete set of class I GAPDH sequences presently available.

Eubacterial gap1 Genes as Antecedents of Nuclear Genes Encoding Cytosolic GapC Enzymes in Eukaryotes (Subtree Gap-I)

On the basis of comparative biochemistry and phylogenetic analyses (Gray 1992), it is now well established that mitochondria derive from α-proteobacteria. GapC sequences from eukaryotes cluster firmly with gap1 genes from β- and γ-proteobacteria (fig. 2), to the exclusion of the basally branching cyanobacterial gap1 homologs.
This position of eukaryotic GapC can be most straightforwardly interpreted as reflecting a mitochondrial (α-proteobacterial) origin of the eukaryotic nuclear genes. No gap1 homologs have yet been found in α-proteobacteria, but this interpretation clearly predicts that, in time, α-proteobacterial gap1 sequences will be found. The two amitochondriate protozoa G. lamblia and E. histolytica also possess GAPDH enzymes of suspected mitochondrial origin, suggesting that these organisms indeed once harbored mitochondria (Henze et al. 1995; Liaud et al. 1997). This is in agreement with evidence from numerous recent studies that suggest a secondary loss of mitochondria in these lineages (Clark and Roger 1995; Keeling and McFadden 1998; Philippe and Adoutte 1998; Roger et al. 1998) and is consistent with the view that the common ancestor of all extant eukaryotes possessed a mitochondrion-like organelle (Martin and Müller 1998). In our analyses, the position of C. trachomatis cannot be inferred with confidence, and different methods place it either with proteobacteria or with cyanobacteria. In general, the position of Chlamydia in eubacterial phylogeny is rather controversial and depends on the molecule of choice (Viale et al. 1994; Eisen 1995). It should be noted, however, that in a recent analysis of group 1 sigma factors, the chlamydiae share a very robust common branch with proteobacteria (Gruber and Bryant 1997). Treponema Gap1 and euglenozoan GapC share a robustly supported common branch rooting at the base of the Gap-I subtree (fig. 2). In addition, the sequences from these three organisms (Treponema, Euglena, and Trypanosoma) share three unique insertions (see Results, fig. 1, and below), suggesting that they have a common evolutionary origin which is distinct from that of all other GapC/Gap1 sequences and which apparently antedates the emergence of mitochondria. While the majority of eukaryote lineages, including G. lamblia and other amitochondriate diplomonads, have glycolytic GAPDH enzymes belonging to the Gap-I subfamily (GapC enzymes of mainly mitochondrial origin), Viscogliosi and Müller (1998) have recently shown that parabasalid flagellates are exceptional in that they adopted a glycolytic GAPDH descending from an independent (eubacterial?) lineage which may be loosely attached to subtree Gap-III. Parabasalid flagellates are primitive protists with a fermentative metabolism. They are characterized by the absence of mitochondria and the presence of an unusual metabolic organelle, the hydrogenosome. It remains to be established whether this enigmatic, novel GAPDH is also present in other, as yet unstudied, amitochondriate or even mitochondriate protists.

Intra-Kingdom Gene Transfer Between the Phyla Proteobacteria and Bacteroides/Flavobacteria (Subtree Gap-I)

A surprising finding is the extremely close relationship between the Gap-I sequences from B. fragilis and the β-proteobacterium R. solanacearum (80% identity and 100% NJ-bootstrap support).
This close relationship of two Gap sequences from organisms that do not belong to the same eubacterial phylum clearly cannot represent organismal evolution but is most likely the result of a horizontal gene transfer from a proteobacterium to an ancestral member of the bacteroides/flavobacterium phylum. This transfer must have occurred after the separation of β- and γ-proteobacteria, since the β-proteobacterial Gap1 sequence from Ralstonia is significantly more closely related to that of Bacteroides (80%) than it is to the γ-proteobacterial E. coli sequence (70%).

Euglenozoan Cytosolic and Glycosomal GapC Genes Are Unusually Close to gap1 Genes from γ-Proteobacteria and the Spirochete Treponema, Respectively (Subtree Gap-I)

Euglenozoan GAPDH sequences sampled here represent the most deeply branching lineage on the Gap-I subtree and, hence, are difficult to interpret as mitochondrial descendants. The two classes of Euglenozoa represented by Euglena (Euglenoidea) and Trypanosoma (Kinetoplastidea) differ with respect to the localization of their glycolytic enzymes. Whereas Euglena has a complete enzymatic machinery for glucose degradation in the cytosol, Trypanosoma performs glycolysis mainly in the glycosome, an organelle unique to kinetoplastids (Michels and Hannaert 1994). Notably, glycosomal GAPDH of kinetoplastids branches robustly with cytosolic GAPDH in Euglena, indicating recompartmentalization of this enzyme during euglenozoan evolution (Henze et al. 1995; Wiemer et al. 1995). In addition, some kinetoplastids possess a second, distinct isoenzyme in the cytosol (Michels et al. 1991; Hannaert et al. 1992), which is very closely related to enterobacterial Gap1 and might have been acquired by kinetoplastids through a prokaryote-to-eukaryote gene transfer event that did not entail the establishment of organelles (Henze et al. 1995). This scenario is further substantiated by the present finding that the gap sequence from the β-proteobacterium Ralstonia represents an outgroup to the γ-proteobacterial/trypanosomal group (fig. 2). This basal position of Ralstonia, together with the absence of the cytosolic equivalent in the deeper-branching kinetoplastids (Trypanoplasma [Wiemer et al. 1995]), indicates that the lateral transfer occurred fairly recently in euglenozoan evolution. While cytosolic GapC of Trypanosoma seems to be derived from a γ-proteobacterial Gap1, glycosomal GapC of Euglenozoa is surprisingly close to a homolog present in the spirochete Treponema (100% bootstrap support; see fig. 2 and alignment in fig. 1). This raises the question of the direction and context in which this horizontal gene transfer may have occurred. If the gap1 gene was transferred from a spirochete to the ancestor of the Euglenozoa (scenario I), other spirochetes would be expected to harbor a Treponema gap1 ortholog. So far, only one other spirochetal gap gene, from Borrelia burgdorferi, has been sequenced (see the Borrelia genome database and Fraser et al. [1997]).
This gap gene, however, is clearly not part of the Gap-I subtree (Viscogliosi and Müller 1998). Therefore, under scenario I, the common ancestor of Treponema and Borrelia would have possessed both divergent gap genes, and the present state would be the result of selective loss and retention during the streamlining process that led to the extremely small genomes of these parasites (1.1 Mb and 0.8 Mb). The alternative hypothesis of a gene transfer in the opposite, eukaryote-to-prokaryote, direction (scenario II) raises other, even more puzzling questions concerning the origin of the glycosomal GapC gene and the biological context for its transfer to spirochetes. Few prokaryotic genes have been claimed to be of eukaryotic origin, and mechanisms of their transfer remain highly speculative (for review, see Smith, Feng, and Doolittle 1992). With the present data, a clear-cut decision concerning the direction of the transfer cannot be made and will have to await the sequencing of further spirochetal gap genes. Scenario I may also be viewed in the context of earlier suggestions that glycosomes are of endosymbiotic origin. Support for a xenogenous origin of microbodies, including glycosomes, glyoxysomes, and peroxisomes, is so far based on the presence of some enzymes specific to all microbodies, their mode of multiplication and biogenesis, and, most importantly, the presence of similar import signals in all proteins imported (Opperdoes et al. 1998). In the case of a cryptic endosymbiosis of an ancestral spirochete, we would expect some other glycosomal proteins to be closely related to their homologs in extant spirochetes. Several analyses performed with glycolytic enzymes other than GAPDH demonstrated that this is not the case (data not shown). Therefore, an endosymbiotic scenario in which a spirochete is seen as the ancestor of the glycosome is not supported by the present data.

Rhodobacter capsulatus Harbors a Cyanobacterial gap3 Homolog (Subtree Gap-III)

To date, gap3 homologs have only been described in cyanobacteria. Recently, the R. capsulatus genome project uncovered the presence of a gap gene in this α-proteobacterium which branches robustly with cyanobacterial gap3 sequences just below the Synechococcus homolog in all analyses performed (fig. 2). This observation may be explained in two alternative ways: (1) Rhodobacter could have obtained the gap3 gene through a lateral gene transfer from an ancestral cyanobacterium, or (2) the position of Rhodobacter on the Gap-III subtree reflects organismal evolution. The latter interpretation suggests that the common ancestor of (α-)proteobacteria and cyanobacteria possessed gap3 genes and that these have been lost during the process of genome evolution in the lineages leading to the sequenced genomes of the γ-proteobacteria H. influenzae, which possesses only gap1, and E. coli, which possesses gap1 and two other gap genes that, however, do not fall into the classes designated as gap1–3 here (Henze et al. 1995). As more proteobacterial genomes are sequenced, other proteobacteria may become known that also possess gap3 genes.

Is Gloeobacter an Ancestral or a Highly Derived Cyanobacterium (Subtrees Gap-I and Gap-II)?
It is generally accepted that eukaryotes received their photosynthetic apparatus through the engulfment of an ancient cyanobacterium that was subsequently transformed into plastids (Gray and Doolittle 1982). Furthermore, the majority of the data are compatible with a monophyletic origin of all plastids (primary endosymbionts; Gray 1992; Martin, Somerville, and Loiseaux-DeGoër 1992; Loiseaux-DeGoër 1994; Delwiche and Palmer 1997). The general picture emerging from 16S rRNA, tufA, atpB, and psbA phylogenies shows that all cyanobacteria, including plastids, are part of a massive radiation that happened rather late in eubacterial evolution (Morden et al. 1992; Douglas 1994; Nelissen et al. 1995). The phylogenetic analysis of Gap2 sequences also supports a monophyletic origin of primary plastids and, in addition, an early divergence of Gloeobacter (fig. 2). Similarly, Gloeobacter appears as one of the early-branching organisms of the cyanobacterial radiation in 16S rRNA (Wilmotte 1994; Nelissen et al. 1995) and tufA phylogenies (Koehler et al. 1997). This position is in agreement with the unique morphology of Gloeobacter, the only cyanobacterial species known to lack thylakoid membranes (Rippka, Waterbury, and Cohen-Bazire 1974). Additional peculiar features include the lack of sulfoquinovosyl diacylglycerol lipids (Selstam and Campbell 1996) and an unusual structure and localization of phycobilisomes (Guglielmi, Cohen-Bazire, and Bryant 1981). However, in the present study, the early divergence of Gloeobacter is only supported if an outgroup to the Gap-II tree is present (fig. 3B and C). In the absence of an outgroup, Gloeobacter and Anabaena form a weakly supported group (fig. 3D). On the other hand, a very close relationship between these two organisms is suggested in the Gap-I subtree. Under the assumption that Gloeobacter is indeed the most primitive cyanobacterium known, the Gap1 phylogeny can only be explained by a horizontal gene transfer from Anabaena to Gloeobacter. Conversely, Gloeobacter could also be a highly derived organism, assuming that its position in the Gap1 subtree correctly reflects the evolution of the organism and that its position in the Gap2 subtree is the result of a treeing artifact. This, in turn, contradicts several other phylogenetic studies (see above) that instead support the ancestral state of this organism. More analyses with other molecular markers will be necessary to clarify this issue.

Influence of Species Sampling on the Position of Euglena in Subtree Gap-II

The position of Euglena GapA in fig. 2 is at odds with the hypothesis of Gibbs (1978), who argued that Euglena’s plastids arose through engulfment of a (eukaryotic) chlorophyte by a phagocytosing, nonphotosynthetic kinetoplastid host. Data from completely sequenced chloroplast genomes firmly attest to the chlorophytic ancestry of Euglena’s plastids (Martin et al. 1998). It is therefore surprising that Euglena GapA branches below all cyanobacterial gap genes sampled, with the exception of Gloeobacter (fig. 2).
This, in addition to the conspicuously long branch observed for Euglena GapA in previous analyses (Henze et al. 1995), suggested that the position of Euglena GapA might be due to an artifact of phylogenetic reconstruction known as long branch attraction (LBA) (Felsenstein 1978). As shown in figure 3B–D, reducing the number of outgroup sequences leads to a progressive movement of Euglena GapA towards the top of the Gap-II subtree until it branches above the rhodophyte GapA branch in the unrooted Gap-II topology. This latter position is much more compatible with chloroplast-encoded protein data that confirm the chlorophytic ancestry of Euglena’s plastids (Martin et al. 1998). It is thus congruent with the view that the GapA gene of Euglena descends from a GAPDH gene of plastid origin that was donated from the secondary endosymbiont (chlorophyte) to the host (kinetoplastid) nucleus, where the gene product was successfully rerouted to the organelle of its genetic origin. Notably, nuclear-encoded chloroplast GAPDH genes from other secondary endosymbionts do not belong to the GapAB group but are, rather, duplicated genes of the Gap1/GapC (Gap-I) type (Liaud et al. 1997).
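
One commonly cited contributor to long branch attraction is that multiple substitutions at the same site go unobserved, so raw sequence differences underestimate the true divergence of fast-evolving lineages more strongly than that of slow ones. The following Python sketch illustrates this with the uncorrected p-distance and the standard Poisson correction, d = -ln(1 - p). It is an illustration only and is not the distance method used in the present analyses; the function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing residues over aligned columns without gaps."""
    # Keep only columns in which neither sequence has a gap character.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    diffs = sum(1 for a, b in pairs if a != b)
    return diffs / len(pairs)


def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p) for an observed p-distance."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return -math.log(1.0 - p)


def hidden_fraction(p):
    """Share of the corrected distance that the raw p-distance fails to show."""
    d = poisson_distance(p)
    return (d - p) / d if d > 0 else 0.0


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not taken from the GAPDH alignment).
    print(p_distance("ACDEF", "ACDQF"))
    # A short branch loses little to hidden changes, a long branch much more.
    for p in (0.1, 0.6):
        print(p, poisson_distance(p), hidden_fraction(p))
```

Under this simple model, the proportion of change hidden from the raw comparison grows with the observed difference, which is one reason why the placement of a long branch such as Euglena GapA can be sensitive to which other sequences are included.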

In summary, the complex architecture of the universal class I GAPDH tree is due to multiple successive gene duplications, intra- and interkingdom gene transfer (endosymbiotic and nonendosymbiotic), selective loss/retention of genes, and lineage-specific differences in evolutionary rate. In spite of this complexity, a large fraction of GAPDH gene diversity falls into three major categories, represented by subtrees Gap-I, Gap-II, and Gap-III (fig. 2), carrying genes related to cyanobacterial genes gap1, gap2, and gap3, first described for A. variabilis. All genes present in extant eukaryotes, except those from parabasalid flagellates, are derived from eubacterial genes gap1 and gap2 and were transferred to the nucleus, probably in connection with the acquisition of mitochondria (gap1 → GapC) and chloroplasts (gap2 → GapA), respectively. Glycosomal GapC of Euglenozoa is a notable exception. It seems to antedate the symbiotic event leading to mitochondria and is specifically related to Gap1 of the spirochete T. pallidum, suggesting that it spread by interkingdom gene transfer, the direction of which remains elusive at the present time.

Acknowledgments

We thank Roger Hiller and Pamela Wrench (Sydney) for DNA samples of P. didemni, Frank Hören (GBF Braunschweig) for genomic DNA from P. denitrificans, and Petra Brandt (GBF Braunschweig) for her help with sequencing. William Martin is acknowledged for valuable discussions and critical comments on the text. We also thank the three referees for their competent and constructive comments on the manuscript. Major financial support, including a Ph.D. stipend for R.M.F., has been received from the DFG priority program “The molecular basis of plant evolution”. M.S. received a stipend from the HSP-II-Frauenförderung der TU Braunschweig. H.B. gratefully acknowledges additional financial support from the CNRS.

LITERATURE CITED

ADACHI, J., and M. HASEGAWA. 1996. Model of amino acid substitution in proteins encoded by mitochondrial DNA. J. Mol. Evol. 42:459–468.
ALEFOUNDER, P. R., and R. N. PERHAM. 1989. Identification, molecular cloning and sequence analysis of a gene cluster encoding the class II fructose 1,6-bisphosphate aldolase, 3-phosphoglycerate kinase and a putative second glyceraldehyde 3-phosphate dehydrogenase of Escherichia coli. Mol. Microbiol. 3:723–732.
BLATTNER, F. R., G. R. PLUNKETT, C. A. BLOCH et al. (17 co-authors). 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453–1474.
BRINKMANN, H., R. CERFF, M. SALOMON, and J. SOLL. 1989. Cloning and sequence analysis of cDNAs encoding the cytosolic precursors of subunits GapA and GapB of chloroplast glyceraldehyde-3-phosphate dehydrogenase from pea and spinach. Plant Mol. Biol. 13:81–94.
BRINKMANN, H., and W. MARTIN. 1996. Higher-plant chloroplast and cytosolic 3-phosphoglycerate kinases: a case of endosymbiotic gene replacement. Plant Mol. Biol. 30:65–75.
BROWN, J. R., and W. F. DOOLITTLE. 1997. Archaea and the prokaryote-to-eukaryote transition. Microbiol. Mol. Biol. Rev. 61:456–502.
CERFF, R. 1978. Glyceraldehyde-3-phosphate dehydrogenase (NADP) from Sinapis alba: steady state kinetics. Phytochemistry 17:2061–2067.
CERFF, R. 1995. The chimeric nature of nuclear genomes and the antiquity of introns as demonstrated by the GAPDH gene system. Pp. 205–227 in M. GŌ and P. SCHIMMEL, eds. Tracing biological evolution in protein and gene structures. Elsevier, Amsterdam.
CHEN, J. H., J. L. GIBSON, L. A. MCCUE, and F. R. TABITA. 1991. Identification, expression, and deduced primary structure of transketolase and other enzymes encoded within the form II CO2 fixation operon of Rhodobacter sphaeroides. J. Biol. Chem. 266:20447–20452.
CLARK, C. G., and A. J. ROGER. 1995. Direct evidence for secondary loss of mitochondria in Entamoeba histolytica. Proc. Natl. Acad. Sci. USA 92:6518–6521.
CLERMONT, S., C. CORBIER, Y. MELY, D. GERARD, A. WONACOTT, and G. BRANLANT. 1993. Determinants of coenzyme specificity in glyceraldehyde-3-phosphate dehydrogenase: role of the acidic residue in the fingerprint region of the nucleotide binding fold. Biochemistry 32:10178–10184.
DELWICHE, C. F., and J. D. PALMER. 1997. The origin of plastids and their spread via secondary endosymbiosis. Pp. 53–86 in D. BHATTACHARYA, ed. Origins of algae and their plastids. Springer Verlag, Wien.
DOUGLAS, S. E. 1994. Chloroplast origins and evolution. Pp. 91–118 in D. A. BRYANT, ed. The molecular biology of Cyanobacteria. Kluwer Academic Publishers, The Netherlands.
EISEN, J. A. 1995. The RecA protein as a model molecule for molecular systematic studies of bacteria: comparison of trees of RecAs and 16S rRNAs from the same species. J. Mol. Evol. 41:1105–1123.
FELSENSTEIN, J. 1978. Cases in which parsimony and compatibility methods will be positively misleading. System. Zool. 27:401–410.
FOTHERGILL-GILMORE, L. A., and P. A. M. MICHELS. 1993. Evolution of glycolysis. Prog. Biophys. Molec. Biol. 59:105–235.
FRASER, C. M., S. CASJENS, W. M. HUANG et al. (38 co-authors). 1997. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390:580–586.
FRASER, C. M., S. J. NORRIS, G. M. WEINSTOCK et al. (33 co-authors). 1998. Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science 281:375–388.
GIBBS, S. 1978. The chloroplasts of Euglena may have evolved from symbiotic green algae. Can. J. Bot. 56:2883–2889.
GIOVANNONI, S. J., S. TURNER, G. J. OLSEN, S. BARNS, D. J. LANE, and N. R. PACE. 1988. Evolutionary relationships among cyanobacteria and green chloroplasts. J. Bacteriol. 170:3584–3592.
GRAY, M. W. 1992. The endosymbiont hypothesis revisited. Int. Rev. Cytol. 141:233–357.
GRAY, M. W., and W. F. DOOLITTLE. 1982. Has the endosymbiont hypothesis been proven? Microbiol. Rev. 46:1–42.
GRUBER, T. M., and D. A. BRYANT. 1997. Molecular systematic studies of eubacteria, using sigma70-type sigma factors of group 1 and group 2. J. Bacteriol. 179:1734–1747.
GUGLIELMI, G., G. COHEN-BAZIRE, and D. A. BRYANT. 1981. The structure of Gloeobacter violaceus and its phycobilisomes. Arch. Microbiol. 129:181–189.
HANNAERT, V., M. BLAAUW, L. KOHL, S. ALLERT, F. R. OPPERDOES, and P. A. MICHELS. 1992. Molecular analysis of the cytosolic and glycosomal glyceraldehyde-3-phosphate dehydrogenase in Leishmania mexicana. Mol. Biochem. Parasitol. 55:115–126.
HENSEL, R., P. ZWICKL, S. FABRY, J. LANG, and P. PALM. 1989. Sequence comparison of glyceraldehyde-3-phosphate dehydrogenases from the three urkingdoms: evolutionary implication. Can. J. Microbiol. 35:81–85.
HENZE, K., A. BADR, M. WETTERN, R. CERFF, and W. MARTIN. 1995. A nuclear gene of eubacterial origin in Euglena gracilis reflects cryptic endosymbioses during protist evolution. Proc. Natl. Acad. Sci. USA 92:9122–9126.
HOHN, B., and K. MURRAY. 1977. Packaging recombinant DNA molecules into bacteriophage particles in vitro. Proc. Natl. Acad. Sci. USA 74:3259.
KAISER, K., and N. E. MURRAY. 1988. The use of phage lambda replacement vectors in the construction of representative genomic DNA libraries. Pp. 1–48 in D. M. GLOVER, ed. DNA cloning: a practical approach. IRL Press, Oxford.
KANEKO, T., S. SATO, H. KOTANI et al. (24 co-authors). 1996. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 3:109–136.
KEELING, P. J., and G. I. MCFADDEN. 1998. Origins of microsporidia. Trends Microbiol. 6:19–23.
KERSANACH, R., H. BRINKMANN, M. F. LIAUD, D. X. ZHANG, W. MARTIN, and R. CERFF. 1994. Five identical intron positions in ancient duplicated genes of eubacterial origin. Nature 367:387–389.
KOEHLER, S., C. F. DELWICHE, P. W. DENNY, L. G. TILNEY, P. WEBSTER, R. J. M. WILSON, J. D. PALMER, and D. S. ROOS. 1997. A plastid of probable green algal origin in apicomplexan parasites. Science 275:1485–1489.
KOKSHAROVA, O., M. SCHUBERT, S. SHESTAKOV, and R. CERFF. 1998. Genetic and biochemical evidence for distinct key functions of two highly divergent GAPDH genes in catabolic and anabolic carbon flow of the cyanobacterium Synechocystis sp. PCC 6803. Plant Mol. Biol. 36:183–194.
KUNST, F., N. OGASAWARA, I. MOSZER et al. (151 co-authors). 1997. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390:249–256.
LEWIN, R. A., and N. W. WITHERS. 1975. Extraordinary pigment composition of a prokaryotic alga. Nature 256:735–737.
LIAUD, M. F., U. BRANDT, M. SCHERZINGER, and R. CERFF. 1997. Evolutionary origin of cryptomonad microalgae: two novel chloroplast/cytosol-specific GAPDH genes as potential markers of ancestral endosymbiont and host cell components. J. Mol. Evol. 44:S28–S37.
LIAUD, M. F., C. VALENTIN, W. MARTIN, F. Y. BOUGET, B. KLOAREG, and R. CERFF. 1994. The evolutionary origin of red algae as deduced from the nuclear genes encoding cytosolic and chloroplast glyceraldehyde-3-phosphate dehydrogenases from Chondrus crispus. J. Mol. Evol. 38:319–327.
LIAUD, M. F., D. X. ZHANG, and R. CERFF. 1990. Differential intron loss and endosymbiotic transfer of chloroplast glyceraldehyde-3-phosphate dehydrogenase genes to the nucleus. Proc. Natl. Acad. Sci. USA 87:8918–8922.
LOISEAUX-DEGOËR, S. 1994. Plastid lineages. Pp. 138–177 in F. E. ROUND and D. J. CHAPMAN, eds. Progress in phycological research. Biopress Ltd., Bristol, U.K.
MARTIN, W., H. BRINKMANN, C. SAVONA, and R. CERFF. 1993. Evidence for a chimeric nature of nuclear genomes: eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl. Acad. Sci. USA 90:8692–8696.
MARTIN, W., and M. MÜLLER. 1998. The hydrogen hypothesis for the first eukaryote. Nature 392:37–41.
MARTIN, W., and C. SCHNARRENBERGER. 1997. The evolution of the Calvin cycle from prokaryotic to eukaryotic chromosomes: a case study of functional redundancy in ancient pathways through endosymbiosis. Curr. Genet. 32:1–18.
MARTIN, W., C. C. SOMERVILLE, and S. LOISEAUX-DEGOËR. 1992. Molecular phylogenies of plastid origins and algal evolution. J. Mol. Evol. 35:385–404.
MARTIN, W., B. STÖBE, V. GOREMYKIN, S. HANSMANN, M. HASEGAWA, and K. V. KOWALLIK. 1998. Gene transfer to the nucleus and the evolution of chloroplasts. Nature 393:162–165.
MEIJER, W. G., E. R. VAN DEN BERGH, and L. M. SMITH. 1996. Induction of the gap-pgk operon encoding glyceraldehyde-3-phosphate dehydrogenase and 3-phosphoglycerate kinase of Xanthobacter flavus requires the LysR-type transcriptional activator CbbR. J. Bacteriol. 178:881–887.
MICHELS, P. A., and V. HANNAERT. 1994. The evolution of kinetoplastid glycosomes. J. Bioenerg. Biomembr. 26:213–219.
MICHELS, P. A., M. MARCHAND, L. KOHL, S. ALLERT, R. K. WIERENGA, and F. R. OPPERDOES. 1991. The cytosolic and glycosomal isoenzymes of glyceraldehyde-3-phosphate dehydrogenase in Trypanosoma brucei have a distant evolutionary relationship. Eur. J. Biochem. 198:421–428.
MISSET, O., J. VAN BEEUMEN, A. M. LAMBEIR, R. VAN DER MEER, and F. R. OPPERDOES. 1987. Glyceraldehyde phosphate dehydrogenase from Trypanosoma brucei: comparison of the glycosomal and cytosolic isoenzymes. Eur. J. Biochem. 162:501–508.
MORDEN, C. W., C. F. DELWICHE, M. KUHSEL, and J. D. PALMER. 1992. Gene phylogenies and the endosymbiotic origin of plastids. BioSystems 28:75–90.
MURRAY, N. E. 1983. Phage lambda and molecular cloning. Pp. 395–432 in R. W. HENDRIX, ed. Lambda II Monograph 13. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
NELISSEN, B., Y. VAN DE PEER, A. WILMOTTE, and R. DE WACHTER. 1995. An early origin of plastids within the cyanobacterial divergence is suggested by evolutionary trees based on complete 16S rRNA sequences. Mol. Biol. Evol. 12:1166–1173.
OPPERDOES, F. R., P. A. M. MICHELS, C. ADJE, and V. HANNAERT. 1998. Organelle and enzyme evolution in trypanosomatids. Pp. 229–253 in G. H. COOMBS, K. VICKERMANN, M. A. SLEIGH, and A. WARREN, eds. Evolutionary relationships among protozoa. Chapman & Hall, London.
PHILIPPE, H. 1993. MUST, a computer package of management utilities for sequences and trees. Nucleic Acids Res. 21:5264–5272.
PHILIPPE, H., and A. ADOUTTE. 1998. The molecular phylogeny of eukaryota: solid facts and uncertainties. Pp. 25–56 in G. H. COOMBS, K. VICKERMANN, M. A. SLEIGH, and A. WARREN, eds. Evolutionary relationships among protozoa. Chapman & Hall, London.
PRÜß, B., H. E. MEYER, and A. W. HOLLDORF. 1993. Characterization of the glyceraldehyde 3-phosphate dehydrogenase from the extremely halophilic archaebacterium Haloarcula vallismortis. Arch. Microbiol. 160:5–11.
RIPPKA, R., J. B. WATERBURY, and G. COHEN-BAZIRE. 1974. A cyanobacterium which lacks thylakoids. Arch. Microbiol. 100:419–436.
ROGER, A. J., S. G. SVARD, J. TOVAR, C. G. CLARK, M. W. SMITH, F. D. GILLIN, and M. L. SOGIN. 1998. A mitochondrial-like chaperonin 60 gene in Giardia lamblia: evidence that diplomonads once harbored an endosymbiont related to the progenitor of mitochondria. Proc. Natl. Acad. Sci. USA 95:229–234.
SAITOU, N., and M. NEI. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS. 1989. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
SCHAFERJOHANN, J., J. G. YOO, and B. BOWIEN. 1995. Analysis of the genes forming the distal parts of the two cbb CO2 fixation operons from Alcaligenes eutrophus. Arch. Microbiol. 163:291–299.
SELSTAM, E., and D. CAMPBELL. 1996. Membrane lipid composition of the unusual cyanobacterium Gloeobacter violaceus sp. PCC 7421, which lacks sulfoquinovosyl diacylglycerol. Arch. Microbiol. 166:132–135.
SMITH, M. W., D. F. FENG, and R. F. DOOLITTLE. 1992. Evolution by acquisition: the case for horizontal gene transfers. Trends Biochem. Sci. 17:489–493.
SMITH, T. L. 1989. Disparate evolution of yeasts and filamentous fungi indicated by phylogenetic analysis of glyceraldehyde-3-phosphate dehydrogenase genes. Proc. Natl. Acad. Sci. USA 86:7063–7066.
SWOFFORD, D. L. 1993. PAUP: phylogenetic analysis using parsimony. Version 3.1.1. Laboratory of Molecular Systematics, Smithsonian Institution, Washington, D.C.
THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
TURNER, S., T. BURGER-WIERSMA, S. J. GIOVANNONI, L. R. MUR, and N. R. PACE. 1989. The relationship of a prochlorophyte Prochlorothrix hollandica to green chloroplasts. Nature 337:380–382.
VIALE, A. M., A. K. ARAKAKI, F. C. SONCINI, and R. G. FERREYRA. 1994. Evolutionary relationships among eubacterial groups as inferred from GroEL (chaperonin) sequence comparisons. Int. J. Syst. Bacteriol. 44:527–533.
VISCOGLIOSI, E., and M. MÜLLER. 1998. Phylogenetic relationships of the glycolytic enzyme, glyceraldehyde-3-phosphate dehydrogenase, from parabasalid flagellates. J. Mol. Evol. 47:190–199.
WIEMER, E. A., V. HANNAERT, P. R. L. A. VAN DEN IJSSEL, J. VAN ROY, F. R. OPPERDOES, and P. A. MICHELS. 1995. Molecular analysis of glyceraldehyde-3-phosphate dehydrogenase in Trypanoplasma borelli: an evolutionary scenario of subcellular compartmentation in kinetoplastida. J. Mol. Evol. 40:443–454.
WILMOTTE, A. 1994. Molecular evolution and taxonomy of cyanobacteria. Pp. 1–25 in D. A. BRYANT, ed. The molecular biology of Cyanobacteria. Kluwer Academic Publishers, The Netherlands.
ZHAO, G., A. J. PEASE, N. BHARANI, and M. E. WINKLER. 1995. Biochemical characterization of gapB-encoded erythrose 4-phosphate dehydrogenase of Escherichia coli K-12 and its possible role in pyridoxal 5′-phosphate biosynthesis. J. Bacteriol. 177:2804–2812.

AXEL MEYER, reviewing editor
Accepted December 7, 1998