Proc. Natl. Acad. Sci. USA Vol. 90, pp. 9916-9920, November 1993 Evolution

Coding coenzyme handles: A hypothesis for the origin of the genetic code (ribozymes/RNA world/origin of life)

EÖRS SZATHMÁRY* Institute for Advanced Study Berlin, Wallotstrasse 19, D-1000 Berlin 33, Germany

Communicated by John Maynard Smith, June 21, 1993

ABSTRACT The coding coenzyme handle hypothesis suggests that useful coding preceded translation. Early adapters, the ancestors of present-day anticodons, were charged with amino acids acting as coenzymes of ribozymes in a metabolically complex RNA world. The ancestral aminoacyl-adapter synthetases could have been similar to present-day self-splicing tRNA introns. A codon-anticodon-discriminator base complex embedded in these synthetases could have played an important role in amino acid recognition. Extension of the genetic code proceeded through the takeover of nonsense codons by novel amino acids, related to already coded ones either through precursor-product relationship or physicochemical similarity. The hypothesis is open for experimental tests.

Abbreviation: CCH, coding coenzyme handle.

*Present address: Department of Plant Taxonomy and Ecology, Eötvös University, Ludovika tér 2, H-1083 Budapest, Hungary.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

This paper presents the coding coenzyme handle (CCH) hypothesis for the origin of the genetic code. The basic idea is that useful coding arose before translation. Protocells containing more efficient ribozymes were selected for, including those that used a wide variety of amino acids as coenzymes, equipped with unambiguous oligonucleotide handles, in an RNA world (1). The hypothesis is synthetic: it accommodates several experimental and statistical investigations as well as hypothetical suggestions. First, I describe how such amino acid coenzymes could have functioned and been synthesized by ribozymes. Then I explain how the suggested scheme, in which stereochemical fit between a ribozyme binding site (consisting of the anticodon and the discriminator base) and the cognate amino acid plays an important role, is compatible with earlier ideas about vocabulary extension of the code. Finally, some means of experimental testing are suggested. Emphatically, this paper does not address the origin of protein synthesis by translation.

USE OF CCHs

The Anticodon: The Early Adapter and Coenzyme Handle. I propose that the primordial adapter was the anticodon itself. The specifically charged anticodons were utilized by ribozymes as coenzymes. The genetic code established a correspondence between amino acids and trinucleotide handles (1). The selective advantage of such a coding is obvious, in that it enables the cell to use these novel types of coenzymes repeatedly and reliably. This idea is supported by the following arguments.

I assume that the genetic code originated in an RNA world (2, 3). This RNA world is assumed to have been metabolically complex (4, 5): ribozymes catalyzed oxidation/reduction reactions, aldol condensations, Claisen condensations, transmethylations, and transesterifications. Presumably, DNA was invented before translation. Tetrapyrroles allowed for photosynthesis. Terpenes were presumably used in membranes (5).

The functional groups on the RNA building blocks themselves are hydrogen bonding sites, hydrophobic groups, phosphates, and sugars. Proteins lack phosphates and sugars, but they have aliphatic amines, carboxylates, sulfur, hydroxyl groups, and imidazole (4). Thus, it would have been highly advantageous for ribozymes to apply coenzymes that could have provided the missing functional groups. The currently known cofactors (used by proteins) provide the following groups: nicotinamide, R-SH, flavin, sulfonium ions, acyl ions, and metals (4). Aliphatic amines, carboxylates, sulfur, and hydroxyl groups are sometimes added to tRNAs by means of posttranscriptional modification (4). If, however, it was advantageous for protocells to use these groups repeatedly and modularly, then they had to come in coenzyme form. Posttranscriptional modification at several specific sites would have needed countless specific modifying enzymes.

It has repeatedly been suggested that nucleotide coenzymes are fossils of an RNA world (6-8). Their nucleotide parts, not taking part in the catalyzed reactions, can be thought of as relict handles that ribozymes could have grabbed by specific hydrogen bonds (9). I suggest that an evolutionary bifurcation took place: dinucleotide coenzymes not carrying amino acids, like FAD or NAD, did not have 3'-5' phosphodiester bonds; hence, conventional Watson-Crick pairing with ribozymes was excluded (1). In contrast, amino acids were charged to conventional trinucleotides, which bound to ribozymes through Watson-Crick pairs.

Reversible Binding to Ribozymes. The association between the ribozymes and the CCHs was specific and reversible. Some relevant data can be cited in support. The association constants between trinucleotides and complementary anticodons in tRNA are on the order of 10³ M⁻¹, whereas those between tRNA loops and tetranucleotides are on the order of 10⁴ M⁻¹ (10-12). The relative strength of these interactions is due to the rather rigid structure of the tRNA loop. As was noted, "Dissociation constants of this magnitude [i.e., 150 μM] are also observed in many enzyme-substrate interactions" (ref. 10, p. 531). Recently it was found that UUU can catalyze, in the presence of Mn²⁺, the specific cleavage of GAAAC between G and A, as well as that of other, longer oligonucleotides having complementary sequences. Note that this happens despite the lack of a hairpin structure in either molecule. It has been suggested, therefore, that trinucleotides might have played an important role in the RNA world (13).

Two mechanisms could have contributed to an increased (up to 10⁴ M⁻¹) stability of the ribozyme:CCH-amino acid complex. Base-backbone interactions via certain 2'-OH groups could have stabilized the binding, as occurs, for example, during substrate binding in the catalytic core of the Tetrahymena ribozyme (14). Another (unfavored) alternative could have been the use of a tetranucleotide handle with a U at the 5' end, without a coding function, binding to the ribozyme by four-way wobble (cf. refs. 15-17). A survey of 350 tRNA sequences indicates that a U immediately upstream of the anticodon is an archetypal feature (18).

Stability of the Ester Bond. Experiments on aminoacyl-tRNAs are relevant here, although they necessarily work with the fixed CCA 3' end. There are three classes, less stable than, about as stable as, and more stable than Gly-tRNA; these are the basic, the more or less polar, and the aliphatic amino acids, respectively. Under low hydrolysis conditions (at neutral or acidic pH and at low ionic strength) the half-lives range from 20 to 140 hr (19). An intracellular environment similar to this could have ensured an adequate half-life of the CCH-amino acid molecules. If this is not satisfactory, I can offer three means of stabilization: (i) N-blocked amino acids (such as present-day formylmethionine, which is about an order of magnitude more stable; refs. 19 and 20); (ii) stabilization of the ester bond by a hydrogen bond on the ribozyme; (iii) use of the thiolester version of the CCH-amino acid link (cf. ref. 21). The last suggestion is especially appealing if one accepts that the RNA world was preceded by an iron-sulfur world, where the -OH groups were widely replaced by -SH groups (e.g., forming thiopentoses) or COO⁻ groups were replaced by COS⁻ groups (22).
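The two quantitative points of this section, the binding constants and the ester half-lives, can be made concrete with a short calculation. The sketch below is only illustrative: the function names are hypothetical, and the half-life calculation assumes simple first-order hydrolysis, which the text does not state explicitly.

```python
# Illustrative arithmetic only; function names are hypothetical and
# first-order decay of the ester bond is an assumption of this sketch.

def dissociation_constant_uM(k_assoc_per_molar):
    """Convert an association constant (M^-1) into a dissociation constant in micromolar."""
    return 1e6 / k_assoc_per_molar


def fraction_remaining(hours, half_life_hours):
    """Fraction of charged CCHs still intact after `hours`, assuming first-order hydrolysis."""
    return 0.5 ** (hours / half_life_hours)


# Association constants of the order quoted in the text (10^3 and 10^4 M^-1).
for k_assoc in (1e3, 1e4):
    print(f"Ka = {k_assoc:.0e} M^-1  ->  Kd = {dissociation_constant_uM(k_assoc):.0f} uM")

# Half-lives at the two ends of the range quoted in the text (20 and 140 hr),
# evaluated after an arbitrary, hypothetical interval of 24 hr.
for half_life in (20.0, 140.0):
    print(f"t1/2 = {half_life:.0f} hr  ->  fraction intact after 24 hr = "
          f"{fraction_remaining(24.0, half_life):.2f}")
```

An association constant of 10⁴ M⁻¹ corresponds to a dissociation constant of 100 μM, which is the same order of magnitude as the 150 μM figure in the quotation from ref. 10.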

RIBOZYMES AS AMINOACYL-ADAPTER SYNTHETASES

It was first suggested in the context of the genomic tag model that ribozymes similar to the group I introns could have functioned as the first aminoacyl-tRNA synthetases (23). This view received experimental support from the demonstration that the hexanucleotide CAACCA, esterified with N-formylmethionine, was a substrate for an esterase reaction by an engineered Tetrahymena ribozyme (24). The internal guide sequence plays a crucial role in reversibly binding the substrate by base pairing. Note that this ribozyme cannot be considered as optimized for such a function.

The Stereochemical Hypothesis. This hypothesis has a long history (25). Here I concentrate on the version that postulated a direct interaction between the amino acid and the anticodon loop of the tRNA. It was possible to show, using explicit Corey-Pauling-Koltun molecular models, that a noncovalent complex of the anticodon and the discriminator base (immediately upstream of the 3' CCA end) of tRNA (C4N model: complex of four nucleotides) has lock and key relations to the cognate amino acid (26). A calculation using semiempirical potential functions showed that the energies of formation of the C4Ns for Gly, Glu, and Val are -11.5, -13.8, and -8.7 kcal/mol, respectively (1 kcal = 4.18 kJ). The binding energies to the C4Ns of the cognate amino acids are -28.8, -29.7, and -35.1 kcal/mol, respectively (27).

There is experimental evidence supporting this hypothesis. The association constants between mononucleosides and some amino acids are about 1 M⁻¹ (28). The interactions of some dinucleotide monophosphates with their cognate amino acids (CpC:Gly, GpC:Ala, ApC:Val, CpG:Arg, and GpA:Ser) tend to be not exclusive but specific; the affinity coefficients are again on the order of 1 M⁻¹ (29). Stereochemistry of the listed cases shows that the C2Ns (without the discriminator and the first anticodon base) can in fact interact with the cognate amino acids except for Val; this is borne out by UV difference absorbance photometry. Proton magnetic resonance indicated a 2 M⁻¹ association constant of Lys to UpU (30). Even more interesting, tRNA^Asp and tRNA^Glu specifically discriminate their amino acids, with affinity coefficients of 200 M⁻¹ (31), in accordance with the expectation that the discriminator base is not needed for recognition in this case (29).

Somewhat unexpectedly, it has been found that many group I self-splicing introns can specifically bind Arg instead of G. The dissociation constant of Arg is 1.7-4.7 × 10⁻³ M (depending on Arg and Mg²⁺ concentration), ca. three orders of magnitude worse than for G. The association between the intron and Lys is an order of magnitude less tight than this (32). A survey of the Arg binding sites in 92 group I sequences revealed that they assume the identity of AGA and CG& (i.e., four of the six Arg codons) (33). Binding seems to be at least two times stronger than that between the tRNAs and amino acids cited previously.

To conclude, I shall assume that a C4N-like complex was able to specifically discriminate between amino acids, but only as part of a ribozyme similar to the known group I introns. I assume that recognition was predominantly anticodonic in most cases, except for Arg, where a codonic dominance is assumed. The situation might be similar to the problem of replication: although fidelity is ultimately determined by the base-pairing differences, high-fidelity replication becomes possible only with the aid of enzymes that rely on these differences.

Stereochemistry and Synthetases. I will assume that ribozymes were able to act as aminoacyl-adapter synthetases, specific for adapters as well as for amino acids. Matching of the amino acid against the adapter was primarily achieved through the formation of a C4N-like complex on the ribozyme. It was said that the data on the possible role of amino acid recognition by the anticodon were not compelling (23). Three remarks are in order: (i) although the genomic tag model postulates amino acid recognition by the synthetase ribozymes, no suggestion about the stereochemistry of this interaction was made; (ii) the only well-formulated hypothesis about amino acid recognition by tRNA is the C4N model; (iii) the evidence for the C4N model is weak, but significant, and the proposed interactions could turn out to be compelling in the context of ribozymes. One should note that here we are dealing with a direct interaction of tRNA with the cognate amino acid; the bases of tRNA that determine identity and are recognized by the aminoacyl-tRNA synthetases are not known to interact directly with the amino acid to be charged.

The Primordial Synthetases: Ancestors of Self-Splicing tRNA Introns. I show my version of the primordial synthetase in Fig. 1. Clearly, other variants are also conceivable. For example, microscopic reversibility could suggest a reversal of the esterase activity (24). Nevertheless, the suggested structure, with the 3' half of known tRNA and the ribozyme, at whose 5' end a formal codon-anticodon pair is apparent, is meant to be very similar to self-splicing tRNA introns (35). Such introns seem to be the most ancient ones in the phylogenetic record (35). They can be found in cyanobacteria, chloroplasts, and purple nonsulfur bacteria. Usually they are inserted immediately downstream of the anticodon. This or a neighboring location has been maintained by many non-self-splicing archaebacterial and eukaryotic tRNA introns (35, 36).

The suggestion that these introns are remnants of primordial aminoacyl-tRNA synthetases has been made before (37), without assigning any significance to their insertion point, however. In my model the components of the C4N complex are allowed to come together. An ancient tRNA form, composed essentially of the 3' half of the present-day molecule, and allowing for an amino acid-anticodon contact, was suggested some time ago (38), but my version is now complete with a capable built-in synthetase.

[Figure 1 (reaction scheme; image not reproduced). Panel labels: synthetase charged with A and adapter; activation of bound amino acid; adapter esterification; charged adapter: uncharged synthetase; coenzyme with handle.]

FIG. 1. A primordial aminoacyl-anticodon synthetase. Note the resemblance of the reaction mechanism of aminoacylation to that in the genomic tag model (34). Heavy line, part homologous to present-day self-splicing tRNA intron; light line, part homologous to the 3' half of present-day tRNA; shadowed style characters, members of the C4N complex (anticodon and discriminator base); filled circles, reversible base-pairing interaction. The "internal guide sequence," complementary to the "anticodon," is formally a "codon." The case shown is for valine. After the reaction, an AMP is released. The codon-anticodon region allows the local chemistry to decide if the codon or the anticodon dominates the interaction (which would be the former in the case of Arg). Upon formation of the C4N-amino acid complex, the anticodon-codon binding is relieved. If handles of four nucleotides were needed, the internal guide sequence could be longer. The problem of the "wobble" position (N) could have been dealt with in two ways: (i) by the maintenance of strict Watson-Crick pairing, and having four types of synthetases, or (ii) by the application of four-way wobble (U in the third "codon" position).

Consistent with this is knowledge about the elements of identity of tRNAs as recognized by synthetases. A computer survey indicated that most identity elements should reside in the acceptor and T stems and anticodon loop, but not in the D stem and loop of tRNAs (34). This has been experimentally confirmed for tRNAs of Ala (39, 40), Gln (41), Tyr and Ser (42), Gly (40, 43), Ser (44), Met (45, 46), Val (47), and His (40). Single-stranded RNA derived from the 3' half of the acceptor stem of tRNA^Ala can be aminoacylated by the synthetase in the presence of complementary RNA cofactors (48). It is also remarkable that tRNA^Ser (for AGY codons) lacks the D stem and loop in animal mitochondria (16). I suggest that protein synthetases evolved when the tRNAs were still incomplete. The transition from anticodon to tRNA adapters was catalyzed by the fortuitous self-splicing activity of the synthetase ribozymes. Group I tRNA introns are descendants of these ribozymes.
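The energies and binding constants quoted in this section are given in mixed units (kcal/mol, dissociation constants in M, affinities in M⁻¹). The following sketch puts them on a common footing. It is plain arithmetic on the figures quoted in the text, using the conversion factor the text itself states (1 kcal = 4.18 kJ); the variable and function names are hypothetical.

```python
# Unit conversions for figures quoted in the text; names are hypothetical.

KJ_PER_KCAL = 4.18  # conversion factor as stated in the text


def kcal_to_kj(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return energy_kcal_per_mol * KJ_PER_KCAL


def association_constant(k_dissoc_molar):
    """Convert a dissociation constant (M) into an association constant (M^-1)."""
    return 1.0 / k_dissoc_molar


# C4N formation and cognate amino acid binding energies (kcal/mol), from ref. 27.
C4N_FORMATION = {"Gly": -11.5, "Glu": -13.8, "Val": -8.7}
C4N_BINDING = {"Gly": -28.8, "Glu": -29.7, "Val": -35.1}

for amino_acid in C4N_FORMATION:
    print(f"{amino_acid}: formation {kcal_to_kj(C4N_FORMATION[amino_acid]):.1f} kJ/mol, "
          f"binding {kcal_to_kj(C4N_BINDING[amino_acid]):.1f} kJ/mol")

# Arg binding to group I introns: Kd = 1.7-4.7 x 10^-3 M (ref. 32).
ka_weakest = association_constant(4.7e-3)
ka_strongest = association_constant(1.7e-3)
print(f"Arg:intron association constant: {ka_weakest:.0f} to {ka_strongest:.0f} M^-1")
```

Expressed as association constants, the Arg figures can be read directly against the 200 M⁻¹ affinity coefficients quoted for tRNA^Asp and tRNA^Glu.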

EARLY EXTENSIONS OF THE CODE

Nonsense Codon Takeover. The code is not universal; some minor dialects do exist. Their origins are explicable in terms of the codon capture (reassignment) mechanism (15-17), consisting of the following component processes: (i) multiple rounds of reversing AT and GC mutation pressures to modify codon usage; (ii) the role of the same pressure in altering anticodons; (iii) the appearance of tRNAs with restricted pairing capabilities, complementary to the reduced codon repertoire. A few changes independent of mutation pressure are allowed. It is possible, by this mechanism, for an amino acid to capture a stop codon, which may or may not have been a sense codon of a different amino acid before. In the former case one amino acid ultimately captures the codon of another. It has been argued that this was a means of vocabulary extension of early genetic codes: several stop codons have been captured by old or novel amino acids (49). A problem with this hypothesis is the high mutational load due to the common nonsense mutations in an early code used for translation.

Ambiguity Reduction. The genetic code underwent an ambiguity reduction process (50-52). Initially, groups of related codons were assigned to groups of chemically related amino acids, which was not only tolerated by protocells but welcome as an improvement on a system without coding. Unambiguity is the product of subsequent evolution. Some statistical evidence based on the analysis of many tRNAs of eight amino acids was presented in favor of this idea (53). Note that some ambiguity is inherent in the stereochemical hypothesis as well, due to the finite interaction energy of the C4N with the amino acid. Even contemporary translation is orders of magnitude more error-prone than replication.

The Physicochemical Hypothesis. It has been suggested that the primary force to have shaped the pattern of amino acid-codon assignments was the reduction of chemical distances, particularly of polarity differences, between amino acids coded for by neighboring codons (25, 54). This obviously reduces the mutational load (54) as well as the adverse effects of translational errors (25, 55). Detailed statistical analyses revealed that there is a lot of truth in this idea (56-58). Moreover, it has been shown that the natural code is better at conserving chemical similarity of amino acids under single base changes than all but 2 of 10,000 randomly generated codes having the same overall redundancy as the natural code (59). The two exceptions were only marginally better.

A second question is whether there is any chemical similarity between the amino acids and the doublets/triplets coding for them. A statistical analysis of 14 amino acid and 3 dinucleoside monophosphate properties revealed the following very significant relationship (60):

dinucleotide RF = 0.037 × polarity requirement - 0.0074 × bulkiness + 0.0320,

where RF is the hydrophilicity in the experimental system of Weber and Lacey (61), the polarity requirement of amino acids was measured by Woese et al. (25), and bulkiness is the ratio of amino acid side chain volume to length (62). The Pearson product-moment correlation between observed and expected values is 0.95. An extension based on physicochemical similarity will be called a Woese pathway.

The Coevolution Hypothesis. It has also been suggested that amino acids with closely related codons have a precursor-product relationship in biosynthesis (63) and that this pattern was statistically dominant over the reduction of chemical distances between neighboring amino acids (64). There is an apparent sequence homology between aminoacyl-tRNA synthetases of the biosynthetically related Met and Ile, and (separately) of Glu and Gln in Escherichia coli. tRNA^Tyr and tRNA^Phe are also closely related in Scenedesmus (65). An extension of the code based on biosynthetic kinship will be called a Wong pathway.

Compromise Between the Physicochemical and the Coevolution Hypotheses. Has the code been shaped by the requirement that similar codons should specify chemically similar amino acids or amino acids close to one another in biosynthesis? Recent statistical analyses revealed that both mechanisms must have contributed to shaping the genetic code, but the first is likely to have been the dominant one (66, 67). The latter investigation rests on Mantel tests on the following distance matrices: (i) Hamming distances of anticodons; (ii) Hamming distances of the hypervariable parts of tRNA ancestral sequences (68) less their anticodons; (iii) distances of the amino acid polarity requirements; and (iv) distances of the amino acids on the metabolic map measured by the number of enzymatic reactions on the pathway. It turns out that there are significant pairwise correlations between i and iii, i and iv, ii and iii, and ii and iv. There is no significant correlation between i and ii or iii and iv. This suggests that the effects of physicochemistry and coevolution must have been independent (67).

A nice mechanistic idea is compatible with this conclusion. It is striking that the first (5') position of the codon seems to be associated with biosynthetic kinship: codons of the amino acids belonging to the shikimate, pyruvate, aspartate, and glutamate families tend to have U, G, A, and C in this position, respectively. In contrast, polarity seems to be associated with the middle position: five of the six most hydrophobic amino acids (Val, Ile, Leu, Met, Phe) have U in the middle position, and nearly all of the most hydrophilic (polar or charged) amino acids (His, Lys, Asp, Glu, Asn, Gln, Tyr) have A in the same position. Amino acids with intermediate hydrophilicity have either G or C as the mid-base. This regularity was referred to as the "code within the codons" (69). The authors suggested that this originated before the tRNA/ribosome system.

Compromise Between Stereochemistry, Physicochemistry, and Coevolution. It should be noted that the C4N model is essentially in no contradiction to the physicochemical and coevolutionary hypotheses. The problem boils down to the question of what makes the code within the codons (first and second codon bases coding for biosynthetic kinship and physicochemical relatedness, respectively; ref. 69) compatible with the C4N model. The answer is that the third anticodon position is preferentially used for binding the nonspecific parts (NH₃⁺ and COO⁻) of the amino acids: in table 2 of ref. 26, there are altogether 46 entries for the binding of these groups, out of which 23 are provided by the third anticodon base, although it is only one out of four bases comprising the complex. In several cases this base does not bind strongly to the amino acid at all. Amino acid side chains are never recognized by it. It is noteworthy in this regard that one of the oxygens in the COO⁻ group of the amino acid is always kept free without binding, to combine with A during activation. Although the ultimate determinant of code evolution must have been the stereochemistry of synthetase ribozymes (23), the C4N model suggests that this was to a large extent compatible with the Wong and Woese mechanisms.

Proteins have more Ala and less His, Cys, Pro, Ser, and Leu than would be expected from codon multiplicities alone (70). This fact is understandable if one assumes that assignment multiplicities were primarily determined by stereochemical effects, which had nothing to do with amino acid functionality in proteins later in evolution. "Selection against the code" (70) would thus have been necessary.

Three final conclusions are worth drawing: (i) the code evolved through the takeover of nonsense codons (49); (ii) some ambiguity in the assignments must have been present (50-52); (iii) in many cases the identity of the third letter of the "codon" was insignificant (family boxes; cf. ref. 17). In this scenario "nonsense codon takeover" differs from "stop codon takeover" (49), since in the absence of translation there is nothing to be stopped. Therefore, even a limited initial vocabulary would not have resulted in a high mutational load.
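Two of the regularities discussed in this section lend themselves to a quick check: the "code within the codons" statement about the middle codon base, and the regression relating dinucleotide RF to amino acid properties. The sketch below is a minimal illustration, assuming the standard (universal) genetic code and one-letter amino acid symbols; the function names are hypothetical, and the regression function merely evaluates the coefficients quoted in the text (60) for whatever property values the reader supplies.

```python
import itertools

# Standard genetic code, built in the conventional U, C, A, G order
# (third base varying fastest). "*" marks stop codons.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def middle_bases(amino_acid):
    """Set of middle (second-position) bases used by all codons of one amino acid."""
    return {codon[1] for codon, aa in GENETIC_CODE.items() if aa == amino_acid}


# Amino acids named in the text, in one-letter code.
HYDROPHOBIC = "VILMF"      # Val, Ile, Leu, Met, Phe
HYDROPHILIC = "HKDENQY"    # His, Lys, Asp, Glu, Asn, Gln, Tyr

for aa in HYDROPHOBIC + HYDROPHILIC:
    print(aa, sorted(middle_bases(aa)))


def predicted_rf(polarity_requirement, bulkiness):
    """Dinucleotide RF predicted by the regression quoted in the text (ref. 60)."""
    return 0.037 * polarity_requirement - 0.0074 * bulkiness + 0.0320
```

Under the standard code, every codon of the five listed hydrophobic amino acids has U in the middle position, and every codon of the seven listed hydrophilic ones has A there.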


DISCUSSION

Avoiding the Error Catastrophe. The possibility of an error catastrophe of translation, due to the harmful feedback of erroneous synthetases, was pointed out a long time ago (71). A difficult inverse problem of the error catastrophe is the origin of translation (72, 73): how could translation be started at all, if synthetases (made of protein) were essential in the process, but the performance of the latter must have initially been error-prone? The CCH hypothesis solves this problem (1) by suggesting that the amino acid-codon assignments had evolved before translation, aided by ribozymes rather than protein enzymes. If amino acid coenzymes did not play a crucial role in these ribozymes, there was no error loop associated with the assignment problem.

Another Hypothesis. Another hypothesis (74), advocating a pretranslational use for coding, was published at about the same time as, but independently of, the CCH hypothesis (1). It suggests that an ancient genetic code for metabolism linked substrates (including various compounds other than amino acids) to adapters, and the predecessors of ribosomes functioned as "metabolosomes" anchoring adapter-mounted reactants. Thus, the focus is on substrates rather than coenzymes, and the hypothesis does not consider either the evolution of adapters from forms simpler than present-day tRNAs or how patterns in the genetic code should be related to metabolism. An obvious overlap between the two hypotheses is the application of Wong pathways in the CCH hypothesis: in this case adapter-bound amino acids are substrates rather than catalysts.

Tests. Experimental tests of many aspects of the CCH hypothesis are possible. When I first suggested the in vitro evolution of ribozymes using transition-state analogs, I also pointed out that one could select for RNAs binding amino acids specifically. Analysis of the binding sites should reveal any anticodonic or codonic propensity for amino acid recognition (75). More direct attempts at making synthetases, similar to the proposed form (Fig. 1), are also possible. Finally, one could select for ribozymes binding and utilizing CCH-amino acid molecules.

Codon swapping was previously suggested (76) to account for the finding (reviewed above) that although there is no correlation between polarity distances and biosynthetic kinship of the amino acids, both seem to be correlated with anticodonic and tRNA ancestral sequence similarity (67). The hypothesized mechanism is based on successive rounds of Osawa-Jukes-type codon reassignments (15-17) and would leave amino acid sequences of proteins intact. It was conjectured that amino acids were first assigned by the Wong mechanism (producing a strongly suboptimal code) and later reassigned because natural selection favored a code closer to optimality. Although theoretically possible, this idea does not fit the scenario developed here: if the codon/anticodon is an essential part of the charging enzyme, amino acid to codon assignments cannot be swapped. The code within the codons idea is now my favored resolution of the conflict between polarity and biosynthetic similarity: the first codon base is correlated with the latter, while the second one is correlated with the former (69).

Toward Translation. It is not a goal of this paper to present a hypothesis for the origin of protein synthesis by translation. I just briefly point out that an evolutionary bifurcation of RNAs is likely to have occurred: some RNAs became better and better messengers, others served as shrinking ribozymic cores of evolving protein enzymes (1). "Microgenes" are likely to have encoded small polypeptides (77); this is not to say, however, that known exons are direct descendants of these. It should be obvious that protein synthesis by translation is much easier to evolve using the genetic code as a preadaptation.
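The correlations between distance matrices referred to in the discussion of codon swapping were obtained with Mantel tests (67). For readers unfamiliar with the method, the sketch below shows the general idea of a permutation-based Mantel test on a Hamming-distance matrix and a polarity-distance matrix. It is a generic textbook version, not the procedure of ref. 67, and the anticodons and polarity values are made-up placeholders chosen only so that the code runs.

```python
import itertools
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(items, dist):
    """Square matrix of pairwise distances between items."""
    return [[dist(a, b) for b in items] for a in items]


def upper_triangle(matrix, order):
    """Off-diagonal entries (i < j) after relabeling rows and columns by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mantel(m1, m2, n_perm=999, seed=0):
    """One-sided Mantel test: observed correlation and permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(m1)))
    x = upper_triangle(m1, identity)
    r_obs = pearson(x, upper_triangle(m2, identity))
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # relabel rows and columns of m2 together
        if pearson(x, upper_triangle(m2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six anticodons and made-up polarity values.
anticodons = ["GAA", "AAA", "GAC", "CAU", "GUC", "UUU"]
polarity = [5.0, 5.0, 5.5, 5.5, 13.0, 10.0]

anticodon_dist = distance_matrix(anticodons, hamming)
polarity_dist = distance_matrix(polarity, lambda p, q: abs(p - q))

r_obs, p_value = mantel(anticodon_dist, polarity_dist)
print(f"Mantel r = {r_obs:.3f}, one-sided permutation p = {p_value:.3f}")
```

Permuting rows and columns together, rather than shuffling individual entries, is what distinguishes the Mantel test from an ordinary correlation test: the entries of a distance matrix are not independent, so the null distribution has to be built from relabelings of the objects.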


I thank John Maynard Smith, Christof Biebricher, Maria Ujhelyi, James Griesemer, and two anonymous referees for helpful comments. This work was supported in part by Grants I/3 2046 and T44-62 of the Hungarian National Scientific Research Fund (OTKA).

1. Szathmáry, E. (1990) in Evolution: From Cosmogenesis to Biogenesis, eds. Lukács, B., Bérczi, Sz. & Molnár, I. (Central Research Institute for Physics, Budapest), Vol. KFKI-1990-50/C, pp. 77-83.
2. Gilbert, W. (1986) Nature (London) 319, 818.
3. Darnell, J. E. & Doolittle, W. F. (1986) Proc. Natl. Acad. Sci. USA 83, 1271-1275.
4. Benner, S. A., Allemann, R. K., Ellington, A. D., Ge, L., Glasfeld, A., Leanz, G. F., Krauch, T., MacPherson, L. J., Moroney, S., Piccirilli, J. A. & Weinhold, E. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 53-63.
5. Benner, S. A., Ellington, A. D. & Tauer, A. (1989) Proc. Natl. Acad. Sci. USA 86, 7054-7058.
6. Woese, C. R. (1967) The Genetic Code (Harper & Row, New York).
7. Orgel, L. E. (1968) J. Mol. Biol. 38, 381-393.
8. White, H. B., III (1976) J. Mol. Evol. 7, 101-104.
9. Orgel, L. E. (1989) J. Mol. Evol. 29, 465-474.
10. Högenauer, G. (1970) Eur. J. Biochem. 12, 527-532.
11. Uhlenbeck, O. C., Baller, J. & Doty, P. (1970) Nature (London) 225, 508-510.
12. Pongs, O., Bald, R. & Reinwald, E. (1973) Eur. J. Biochem. 32, 117-125.
13. Kazakov, S. & Altman, S. (1992) Proc. Natl. Acad. Sci. USA 89, 7939-7943.
14. Pyle, A. M., Murphy, F. L. & Cech, T. R. (1992) Nature (London) 358, 123-128.
15. Osawa, S. & Jukes, T. H. (1988) Trends Genet. 4, 191-198.
16. Osawa, S. & Jukes, T. H. (1989) J. Mol. Evol. 28, 271-278.
17. Osawa, S., Jukes, T. H., Watanabe, K. & Muto, A. (1992) Microbiol. Rev. 56, 229-264.
18. Nicoghosian, K., Bigras, M., Sankoff, D. & Cedergren, R. (1987) J. Mol. Evol. 26, 341-346.
19. Hentzen, D., Mandel, P. & Garel, J.-P. (1972) Biochim. Biophys. Acta 281, 228-232.
20. Schuber, F. & Pinck, M. (1974) Biochimie 56, 383-390.
21. Weber, A. L. & Orgel, L. E. (1979) J. Mol. Evol. 13, 193-198.
22. Wächtershäuser, G. (1992) Prog. Biophys. Mol. Biol. 58, 85-201.
23. Weiner, A. M. & Maizels, N. (1987) Proc. Natl. Acad. Sci. USA 84, 7383-7387.
24. Piccirilli, J. A., McConnell, T. S., Zaug, A. J., Noller, H. F. & Cech, T. R. (1992) Science 256, 1420-1424.
25. Woese, C. R., Dugre, S. A., Kando, M. & Saxinger, W. C. (1966) Cold Spring Harbor Symp. Quant. Biol. 31, 720-736.
26. Shimizu, M. (1982) J. Mol. Evol. 18, 297-303.
27. Yoneda, S., Shimizu, M., Fujii, S. & Go, N. (1985) Biophys. Chem. 23, 91-104.
28. Shimizu, M. (1987) Biophys. Chem. 28, 169-174.
29. Shimizu, M. (1987) J. Phys. Soc. Jpn. 56, 43-45.
30. Shimizu, M. (1988) J. Phys. Soc. Jpn. 57, 54-56.
31. Watanabe, K. & Miura, K. (1985) Biochem. Biophys. Res. Commun. 129, 679-685.
32. Yarus, M. (1988) Science 240, 1751-1758.
33. Yarus, M. (1991) New Biol. 3, 183-189.
34. McClain, W. H. & Nicholas, H. B., Jr. (1987) J. Mol. Biol. 194, 635-642.
35. Reinhold-Hurek, B. & Shub, D. A. (1992) Nature (London) 357, 173-176.
36. Cavalier-Smith, T. (1991) Trends Genet. 7, 145-148.
37. Ho, C. K. (1988) Nature (London) 333, 24.
38. Hopfield, J. J. (1978) Proc. Natl. Acad. Sci. USA 75, 4334-4338.
39. Francklyn, C. & Schimmel, P. (1989) Nature (London) 337, 478-481.
40. Francklyn, C., Shi, J.-P. & Schimmel, P. (1992) Science 255, 1121-1125.
41. Jahn, M., Rogers, M. J. & Söll, D. (1991) Nature (London) 352, 258-260.
42. Himeno, H., Hasegawa, T., Ueda, T., Watanabe, K. & Shimizu, M. (1991) Nucleic Acids Res. 18, 6815-6819.
43. McClain, W. H., Foss, K., Jenkins, R. A. & Schneider, J. (1991) Proc. Natl. Acad. Sci. USA 88, 6147-6151.
44. Schatz, D., Leberman, R. & Eckstein, F. (1991) Proc. Natl. Acad. Sci. USA 88, 6132-6136.
45. Lee, C.-P., Dyson, M. R., Mandal, N., Varshney, U., Bahramian, B. & RajBhandary, U. (1992) Proc. Natl. Acad. Sci. USA 89, 9262-9266.
46. Martinis, S. A. & Schimmel, P. (1992) Proc. Natl. Acad. Sci. USA 89, 65-69.
47. Frugier, M., Florentz, C. & Giegé, R. (1992) Proc. Natl. Acad. Sci. USA 89, 3990-3994.
48. Musier-Forsyth, K., Scaringe, S., Usman, N. & Schimmel, P. (1991) Proc. Natl. Acad. Sci. USA 88, 209-213.
49. Lehman, N. & Jukes, T. H. (1988) J. Theoret. Biol. 135, 203-214.
50. Woese, C. R. (1965) Proc. Natl. Acad. Sci. USA 82, 1160-1164.
51. Fitch, W. M. (1966) J. Mol. Biol. 16, 1-12.
52. Woese, C. R. (1967) The Genetic Code (Harper & Row, New York).
53. Fitch, W. M. & Upper, K. (1987) Cold Spring Harbor Symp. Quant. Biol. 52, 759-767.
54. Sonneborn, T. M. (1965) in Evolving Genes and Proteins, eds. Bryson, V. & Vogel, J. H. (Academic, New York), pp. 377-397.
55. Swanson, R. (1984) Bull. Math. Biol. 46, 187-203.
56. Sjöström, M. & Wold, S. (1985) J. Mol. Evol. 22, 272-277.
57. Di Giulio, M. (1989) J. Mol. Evol. 29, 191-201.
58. Di Giulio, M. (1989) J. Mol. Evol. 29, 288-293.
59. Haig, D. & Hurst, L. (1991) J. Mol. Evol. 33, 412-417.
60. Jungck, J. R. (1978) J. Mol. Evol. 11, 211-224.
61. Weber, A. L. & Lacey, J. C., Jr. (1978) J. Mol. Evol. 11, 199-210.
62. Zimmermann, J. M., Eliezer, N. & Simha, R. (1968) J. Theoret. Biol. 21, 170-201.
63. Wong, J. T.-F. (1975) Proc. Natl. Acad. Sci. USA 72, 1909-1912.
64. Wong, J. T.-F. (1980) Proc. Natl. Acad. Sci. USA 77, 1083-1086.
65. Wong, J. T.-F. (1988) Microbiol. Sci. 5, 174-181.
66. Di Giulio, M. (1991) Z. Naturforsch. C, 305-312.
67. Szathmáry, E. & Zintzaras, E. (1992) J. Mol. Evol. 35, 185-189.
68. Eigen, M., Lindemann, B. F., Tietze, M., Winkler-Oswatitsch, R., Dress, A. & von Haeseler, A. (1989) Science 244, 673-679.
69. Taylor, F. J. R. & Coates, D. (1989) BioSystems 22, 177-187.
70. Jukes, T. H., Holmquist, R. & Moise, H. (1975) Science 189, 50-51.
71. Orgel, L. E. (1963) Proc. Natl. Acad. Sci. USA 49, 517-521.
72. Hoffmann, G. W. (1974) J. Mol. Biol. 86, 349-362.
73. Bedian, V. (1982) Origins Life 12, 181-204.
74. Gibson, T. J. & Lamond, A. I. (1990) J. Mol. Evol. 31, 7-15.
75. Szathmáry, E. (1989) Oxford Surv. Evol. Biol. 6, 169-205.
76. Szathmáry, E. (1991) J. Mol. Evol. 32, 178-182.
77. Seidel, H. M., Pompliano, D. L. & Knowles, J. R. (1992) Science 257, 1489-1490.