Reference: Biol. Bull. 196: 373-377. (June 1999)
The Last Universal Common Ancestor (LUCA), Simple or Complex?
PATRICK FORTERRE¹ AND HERVÉ PHILIPPE²
¹ Institut de Génétique et Microbiologie, Bât 409; and ² Laboratoire de Biologie cellulaire, Bât 444, Université Paris XI, 91405 Orsay cedex, France
The concept of Archaea (formerly Archaebacteria), introduced by Carl Woese at the end of the seventies, raised the hope that studying this third form of life on Earth would help to reconstruct the Last Universal Common Ancestor (LUCA) of all living organisms. In the years that followed, a consensus emerged within the community of archaeobiologists about early cellular evolution and the nature of LUCA. In the new paradigm, Archaea are the sister group of Eukarya (formerly eukaryotes), and LUCA was a simple prokaryote with features between those of Archaea and Bacteria (1, 2). In this scenario, the complex and specific molecular traits of the Eukarya, such as nuclear pores, mRNA splicing, or an elaborate cytoskeleton, are late inventions in the eukaryal lineage. This paradigm is usually justified by the rooting of the universal tree of life in the bacterial branch and by the finding of eukaryal features in Archaea at the molecular level. In our opinion, however, it is also based on the prejudice that “pro”karyotes predated “eu”karyotes in the course of early cellular evolution. Prokaryotes of course predate “modern” eukaryotes in the history of life, since all present-day eukaryotes have probably descended from an ancestor that had already engulfed the bacterial endosymbiont which gave rise to mitochondria (e.g., 3). We think, however, that the relationships between present-day prokaryotes and the ancestral eukaryotic lineage (those which existed before the endosymbiotic event) are far from being definitively settled.
We have previously challenged several of the ideas behind the current paradigm. For example, we have shown that cladistic analysis (sensu Hennig) of the elongation factor protein data set does not support the rooting of the tree of life in the bacterial branch (4). In fact, many protein sequence data sets are saturated by mutations, suggesting that classical methods of molecular phylogeny are very poor at resolving ancient divergences (5). (See Fig. 1 for the case of Ile tRNA synthetases from the three domains of life.) In particular, this saturation will always lead to an early emergence of the fast-evolving lineages, whatever their true position, and this artifactual result can be statistically robust.
The rapid accumulation of new information about the molecular biology of Archaea, as well as the recent availability of several completely sequenced archaeal and bacterial genomes, has renewed the interest of the scientific community in the problem of LUCA (1, 6). These new data have usually been interpreted in terms of the above paradigm, implying a “simple” LUCA. It has been proposed, in particular, that LUCA had a primitive replication apparatus (2) and even an RNA genome (7).
In this communication, we use this new genomic information differently: i.e., to systematically reconsider all data supporting the bacterial rooting of the tree of life. Our results show that all the phylogenies that have been used to root the universal tree of life in the bacterial branch are unreliable. Besides the general problem of saturation, most of them are “confused”; that is, one or several domains are paraphyletic or polyphyletic in trees inferred from these data sets (8). As an example, Figure 2 shows a simplified version of the Val/Ile tRNA synthetase tree. These paralo-
This paper was originally presented at a workshop titled Evolution: A Molecular Point of View. The workshop, which was held at the Marine Biological Laboratory, Woods Hole, Massachusetts, from 24-26 October 1997, was sponsored by the Center for Advanced Studies in the Space Life Sciences at MBL and funded by the National Aeronautics and Space Administration under Cooperative Agreement NCC 2-896.
Figure 1. Saturation analysis of the Ile tRNA synthetases. The method of Philippe et al. (14) was applied to the set of sequences used in Figure 2. The number of substitutions between two species is estimated as the sum of the lengths of the branches linking these two species in the most parsimonious tree. Each point in the diagram corresponds to a pair of species, for which the abscissa is the number of inferred substitutions and the ordinate is the number of observed differences. Near the origin, the curve is linear, indicating that the Ile tRNA synthetases are not mutationally saturated at small evolutionary distances (within methanogens, gram-positive, Fungi), but they are highly saturated at large evolutionary distances (between the three domains but also within each domain). For some pairs of species, one can observe only 250 differences although, during the course of evolution, at least 1000 substitutions have occurred.
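The comparison described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation of the method of Philippe et al. (14): it assumes that a tree with branch lengths (in inferred substitutions) is already available, and the function names, the toy tree, and the toy alignment are all hypothetical and chosen only for illustration. For each pair of species it returns the inferred number of substitutions (the sum of branch lengths on the path linking the two species) together with the observed number of differences, which are the two coordinates plotted in a saturation diagram.

```python
from itertools import combinations


def observed_differences(seq_a, seq_b):
    """Count aligned positions at which two sequences differ (gap positions are skipped)."""
    return sum(
        1 for a, b in zip(seq_a, seq_b)
        if a != "-" and b != "-" and a != b
    )


def patristic_distance(parent, branch_length, leaf_a, leaf_b):
    """Sum the branch lengths on the path linking two nodes of a rooted tree.

    parent maps each non-root node to its parent; branch_length maps each
    non-root node to the length of the branch leading to it.
    """
    # Cumulative distance from leaf_a to each of its ancestors (and itself).
    dist_from_a = {leaf_a: 0.0}
    node, total = leaf_a, 0.0
    while node in parent:
        total += branch_length[node]
        node = parent[node]
        dist_from_a[node] = total

    # Climb from leaf_b until the path of leaf_a is reached (common ancestor).
    node, total = leaf_b, 0.0
    while node not in dist_from_a:
        total += branch_length[node]
        node = parent[node]
    return dist_from_a[node] + total


def saturation_points(alignment, parent, branch_length):
    """Return (inferred, observed, species_a, species_b) for every pair of species."""
    points = []
    for a, b in combinations(sorted(alignment), 2):
        inferred = patristic_distance(parent, branch_length, a, b)
        observed = observed_differences(alignment[a], alignment[b])
        points.append((inferred, observed, a, b))
    return points


# Hypothetical toy data, invented for illustration only:
# tree ((A:1,B:1)X:2,(C:3,D:1)Y:1)R with branch lengths in substitutions.
TOY_PARENT = {"A": "X", "B": "X", "C": "Y", "D": "Y", "X": "R", "Y": "R"}
TOY_BRANCH_LENGTH = {"A": 1, "B": 1, "C": 3, "D": 1, "X": 2, "Y": 1}
TOY_ALIGNMENT = {
    "A": "ACDEFG",
    "B": "ACDEFH",
    "C": "AKLMNG",
    "D": "AKDEFG",
}

for inferred, observed, a, b in saturation_points(
    TOY_ALIGNMENT, TOY_PARENT, TOY_BRANCH_LENGTH
):
    print(f"{a}-{b}: inferred={inferred:g} observed={observed}")
```

In such a plot, points lying on the diagonal correspond to pairs for which every inferred substitution is still visible as a difference; points falling well below it correspond to pairs for which multiple substitutions at the same site have erased part of the signal, which is the pattern the caption describes for large evolutionary distances.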
because of a high level of homoplasy, cannot easily be interpreted. The difficulty of recovering the phylogenetic relationships between the three domains is not surprising when one considers the many problems encountered by molecular phylogenetic analysis when dealing with more recent divergences. For example, completely different topologies have been obtained recently for many groups of protists when different genes were used to construct eukaryotic phylogenetic trees (5). Organisms appeared either in the crown of the eukaryotic tree when the reporter gene was a slowly evolving one, or as an early-branching lineage when the gene was a fast-evolving one. This was due to the saturation of the data set used and to the long branch of the outgroup (Archaea or Bacteria). A spectacular example of the problems that occur in the eukaryotic tree is the misplacement of microsporidia. Microsporidia are most likely fungi, as indicated by α- and β-tubulin, Hsp70, and RNA polymerase trees, as well as by several phenotypic characters (12). However, they are early branching in rRNA, elongation factor, and tRNA synthetase trees! This again emphasizes that elongation factors and tRNA synthetases are not good phylogenetic markers and cannot be used to root the tree of life. Finally, even if the rooting problem is solved, the information gained will be insufficient to reveal the nature of LUCA. Indeed, all rooting is compatible with opposite