J. theor. Biol. (1982) 94, 885-894

Amino Acid Pairing

ROBERT SCOTT ROOT-BERNSTEIN

The Salk Institute for Biological Studies, San Diego, California 92138, U.S.A.

(Received 12 March 1979, and in revised form 21 August 1981)

A set of amino acid pairings is presented which may allow protein-to-protein information transfers. Amino acid pairing is only possible on a parallel β ribbon and involves both the polypeptide backbones and the side chains. Model building revealed that of the 210 possible amino acid pairs of the standard 20 amino acids, no more than 26 could be built to meet standard criteria for bonding. Of these 26, 14 were found to be genetically encoded when the codons are read as if they paired in a parallel manner (i.e. in a manner reflecting the structural parallelism of the amino acid pairings); the other 12 pairings were derivatives of the coded pairings in which a single base of the codon triplet had been varied in accordance with Crick’s (1966) “wobble hypothesis.” Evidence for the pairings is presented from colligative studies of polyamino acids. Ways of testing the hypothesis further are suggested. Its implications for the Central Dogma and theories of the origin of life are discussed.

0022-5193/82/040885+10$02.00/0 © 1982 Academic Press Inc. (London) Ltd.

Introduction

Crick, in formulating his Central Dogma of Molecular Biology, rejected the possibility of protein-to-protein information transfer because, “it was most unlikely, for stereochemical reasons, that [such] transfer could be done in the simple way that DNA → DNA transfer was envisaged” (Crick, 1970). I believe, nonetheless, that I have found a set of amino acid pairings with stereochemical properties appropriate to such simple information transfer. These pairings are encoded in the genetic code. Their existence is supported by evidence from studies of the colligative properties of various poly-α-amino acids.

Criteria for Amino Acid Pairing

Amino acid pairing involves interactions between both the peptide backbones and the side chains of two polypeptides. The backbones must therefore be directionally parallel; the distance between α-carbons must be c. 5.0 Å; the pairing side chains must extend from the backbone on the same side of the paired structure and along the same spatial axis. The only stable structure that presently meets these specifications is a parallel β ribbon (Schulz & Schirmer, 1979). Thus, amino acid pairing is limited to instances in which a parallel β ribbon can be formed. The structure must align the side chains and provide structural stability. Side chain pairings are therefore limited to those that do not interfere with the integrity of the β structure. Pairings are also limited to those that can interact across the distance between the backbones without steric hindrance. Other limitations on possible interactions are set by accepted theories of chemical bonding (Pauling, 1960b): e.g. hydrophobic residues should be hidden; hydrophilic residues should be free; residues may interact by hydrogen bonding, hydrophobic interaction (van der Waals forces), salt bridges, or the stabilization of ionic charge by interaction with an unsaturated carbon ring. Charge transfer complexes may also be formed (Szent-Györgyi, 1960). Pauling has also stated some specific criteria for the interaction of amino acid residues with a complementary template (Pauling, 1960a).
These criteria were stated with regard to possible amino acid-nucleotide interactions, but the principles are stated in so general a manner as to be clearly applicable to amino acid pairings as well: “It may be predicted that it is highly unlikely that any amino acid residue other than a residue of glycine would occupy a glycine locus in a polypeptide chain; the selection of glycine by the template must involve the fitting of the hydrogen atom that serves as the side chain of glycine into a cavity in the template that is just large enough to accommodate a hydrogen atom, and is accordingly too small to accommodate the methyl group of alanine or any other side chain, and the van der Waals repulsion energy becomes so great when atoms are brought into contact at a distance of even 0.5 Å less than the normal van der Waals contact distance that the selectivity of this template for glycine can be expected to be essentially perfect. On the other hand, a part of the template that is complementary to alanine would have a cavity for the methyl group that would be small enough to reject all amino acids except alanine and the smaller one, glycine, and the selection of alanine rather than glycine would have to be made through the operation of the greater van der Waals attraction (London electronic dispersion energy) of the template for the methyl group than that for the hydrogen atom” (Pauling, 1960a). It should be added that proline also presents a unique case, like glycine, upon which to test complementary interaction because its structure creates a bend in any peptide of which it is a component. It is reasonable to assume that only one complement will exist to fit the unique stereochemistry of proline.


Models

Models of all 210 possible amino acid pairs of the standard 20 amino acids were built on a parallel β ribbon according to the rules outlined above. No more than 26 (there is some value judgment involved in determining acceptable pairings) of the 210 can actually be formed (Table 1). As predicted, both the glycine and the proline “templates” are absolutely specific: only glycine is small enough and has the proper stereochemistry to fit proline stereochemistry, and only the presence of a glycine paired to a proline affords a peptide sufficient flexibility to complement a proline bend in a parallel β ribbon. Similarly, only arginine is long enough to interact across a β ribbon with alanine. The van der Waals forces involved in protecting the hydrophobic methyl group from interaction with solute account for arginine’s specificity for alanine over glycine. Similar reasoning was used to evaluate the other 24 possible pairings, emphasis being put on the formation of pairings involving strong bonding (hydrogen bonding or charge transfer complexing) or precise stereochemical fit involving van der Waals forces.

Proposed Pairings Genetically Encoded

Perhaps the most important aspect of the proposed pairings was the discovery that the 14 most likely pairings are genetically encoded. The pairings appear in the code when it is read as if the codon triplets are paired in a parallel rather than antiparallel direction (Table 2). Thus, the structural parallelism of the paired β ribbon is mirrored in the informational parallelism of the coding of the amino acid pairs in the genetic code. The specificity of the bindings of (e.g.) proline to glycine is also reflected in the specificity of the codings. The remaining 12 pairs (Table 1) are simple derivatives of the coded pairs in which a single base of the codon triplet has been varied in accordance with Crick’s “wobble hypothesis” (Crick, 1966). The code is arranged so that amino acids that have similar structures often have similar codes (Pelc & Welton, 1966). Thus, those amino acids that may pair with one another have similar complementary codes. This highly ordered arrangement means that the genetic code has more structure and greater information content than has hitherto been realized.

Evidence

Preliminary evidence exists for several of the pairings. Poland & Scheraga (1967) report interactions between 1:1 copolymers of glutamic acid and leucine (Fig. 1). They observed a rare “triple transition” upon heating the


TABLE 1

The 26 most probable amino acid pairings, listed according to whether they are encoded in the genetic code (see Table 2) or not. The uncoded pairings are all derivatives of the coded pairings in which a single base in the triplet has been varied in accordance with the “wobble hypothesis” (Crick, 1966)

Coded pairings: pro-gly, phe-lys, arg-ala, arg-ser, ser-ser, leu-asn, leu-glu, leu-asp, his-val, gln-val, cys-thr, trp-thr, ile-tyr, met-tyr

Uncoded pairings: trp-ser, thr-ser, thr-arg, thr-thr, phe-ile, trp-val, trp-met, trp-ile, tyr-lys, tyr-arg, phe-arg, his-ile

Columns: (1) Allows backbone binding. (2) Nonpolar residues hidden. (3) Hydrophilic residues free. (4) Hydrogen bonding of radicals. (5) Hydrophobic radical interaction. (6) Stereochemical fit. (7) Charge transfer complexing (Szent-Györgyi, 1960).

Symbols: + Positive interaction. - Negative interaction. ? Questionable interaction. blank No interaction.

[Table body: the +, - and ? entries for each pairing under columns (1) to (7) are not recoverable from this copy.]
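The figure of 210 candidate pairs follows from counting unordered pairs of the 20 standard amino acids with like-with-like pairs (such as ser-ser) included: 190 unlike pairs plus 20 like pairs. The following minimal Python sketch checks that count and the 14 + 12 split of Table 1. The variable and function names are illustrative only and are not part of the original paper.

```python
from itertools import combinations_with_replacement

# Three-letter abbreviations of the 20 standard amino acids.
AMINO_ACIDS = ("ala arg asn asp cys gln glu gly his ile "
               "leu lys met phe pro ser thr trp tyr val").split()

# Pairings as listed in Table 1.
CODED = ("pro-gly phe-lys arg-ala arg-ser ser-ser leu-asn leu-glu "
         "leu-asp his-val gln-val cys-thr trp-thr ile-tyr met-tyr").split()
UNCODED = ("trp-ser thr-ser thr-arg thr-thr phe-ile trp-val trp-met "
           "trp-ile tyr-lys tyr-arg phe-arg his-ile").split()


def as_pair(label):
    """Turn 'pro-gly' into an order-independent pair of residues."""
    return frozenset(label.split("-"))


# Unordered pairs with self-pairs included: C(20, 2) + 20 = 210.
all_pairs = {frozenset(p)
             for p in combinations_with_replacement(AMINO_ACIDS, 2)}
coded = {as_pair(p) for p in CODED}
uncoded = {as_pair(p) for p in UNCODED}

print(len(all_pairs))                  # 210
print(len(coded), len(uncoded))        # 14 12
print(len(coded | uncoded))            # 26
print((coded | uncoded) <= all_pairs)  # True
```

Representing each pairing as a `frozenset` makes `ile-tyr` and `tyr-ile` the same object, which matches the unordered sense in which the text counts pairs.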

TABLE 2

Amino acid pairings as represented in a parallel reading of the codons†

          5' 3'     5' 3'
pro       CCU       GGA       gly
pro       CCC       GGG       gly
pro       CCA       GGU       gly
pro       CCG       GGC       gly

gln       CAA       GUU       val
gln       CAG       GUC       val
his       CAU       GUA       val
his       CAC       GUG       val

arg       CGU       GCA       ala
arg       CGC       GCG       ala
arg       CGG       GCC       ala
arg       CGA       GCU       ala

ser       UCU       AGA       arg
ser       UCC       AGG       arg
ser       UCA       AGU       ser
ser       UCG       AGC       ser

cys       UGU       ACA       thr
cys       UGC       ACG       thr
(trp)†    UGA       ACU       thr
trp       UGG       ACC       thr

tyr       UAU       AUA       ile
tyr       UAC       AUG       met
(tyr)†    UAA       AUU       ile
(tyr)†    UAG       AUC       ile

phe       UUU       AAA       lys
phe       UUC       AAG       lys

leu       UUA       AAU       asn
leu       UUG       AAC       asn
leu       CUU       GAA       glu
leu       CUC       GAG       glu
leu       CUA       GAU       asp
leu       CUG       GAC       asp

† Note that the three codons that normally code for “termination” signals during translation must code for amino acids during pairing. Macino et al. (1979) have discovered that UGA codes for trp in yeast mitochondria. I predict that UAA and UAG will be found to code for tyr or some close analogue in similar systems.
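The parallel reading behind Table 2 can be sketched in a few lines of Python: each base of a codon is replaced by its Watson-Crick complement without reversing the strand, and both triplets are then translated. This is an illustrative sketch rather than the author's own procedure. The function names are hypothetical, and the three termination codons are reassigned as the table footnote proposes (UGA to trp, UAA and UAG to tyr).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code with codons ordered UUU, UUC, UUA, UUG, UCU, ...
# '*' marks the three termination codons.
AA_STRING = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
             "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
ONE_TO_THREE = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
    "*": "ter",
}

CODE = {"".join(codon): ONE_TO_THREE[aa]
        for codon, aa in zip(product(BASES, repeat=3), AA_STRING)}
# Assumption taken from the Table 2 footnote: during pairing the
# termination codons are read as amino acids.
CODE.update({"UGA": "trp", "UAA": "tyr", "UAG": "tyr"})

COMPLEMENT = str.maketrans("AUGC", "UACG")


def parallel_complement(codon):
    """Complement each base in place, keeping the 5'-to-3' order."""
    return codon.translate(COMPLEMENT)


def coded_pairings():
    """Unordered amino acid pairs implied by the parallel reading."""
    return {frozenset((CODE[c], CODE[parallel_complement(c)]))
            for c in CODE}


def partners(residue):
    """Residues encoded by the parallel complements of a residue's codons."""
    return sorted({CODE[parallel_complement(c)]
                   for c, aa in CODE.items() if aa == residue})


pairs = coded_pairings()
print(len(pairs))                                   # 14
print(parallel_complement("CCU"), partners("pro"))  # GGA ['gly']
print(partners("leu"))                              # ['asn', 'asp', 'glu']
```

Under these assumptions the 64 codons collapse to the 14 coded pairings of Table 1. A lookup such as `[partners(r) for r in ["pro", "ala", "leu"]]` gives `[['gly'], ['arg'], ['asn', 'asp', 'glu']]`, a list of candidate residues at each position of a complementary strand; where several partners exist the choice is left open.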


FIG. 1. Top: serine-serine, serine-threonine, or threonine-threonine pairing by hydrogen bonding. Bottom: leucine-glutamic acid pairing by steric fit and hydrophobic interaction leaving the polar residue in a hydrophilic position. Leucine may also pair with asparagine and aspartic acid in a similar manner, though the fit is then better when their relative positions are reversed.

FIG. 2. Top: Proline-glycine pairing by steric fit and hydrophobic interaction. No other amino acid residue can interact with glycine and only glycine is small enough to pair with proline. Bottom: phenylalanine-lysine pairing by means of a charge transfer complex (Szent-Györgyi, 1960). Tyrosine may also pair with lysine in this configuration, although an alternative configuration is presented in Fig. 3.


copolymer. The sample began to melt as the temperature was raised from 0 to 40°C; the helical content simultaneously dropped; it then increased from 40 to 60°C as melting ceased; and finally the helical content decreased as melting began again above 60°C and disappeared as the temperature reached 120°C. I suggest that this “triple transition” is the result of a transition from β ribbon to α helix to random coil and may be characteristic of all amino acid pairings. It is interesting to note that such a transition has not been reported for other glutamic acid or leucine copolymers.

Ryser (1974) reports that a 1:1 copolymer of lysine and tyrosine (Fig. 2) forms “supramolecular aggregates” that can be dissociated by ultrafiltration through micropore membranes. Such aggregates did not form between D-lysine or L-lysine polymers, poly-ornithine, 1:1 copoly lysine-alanine, or 1:1 copoly tyrosine-glutamic acid. Thus, the lys-tyr pairing appears to be specific.

There is also evidence for serine-serine pairing (Fig. 1). Before pure poly-L-serine was synthesized, it was expected that it would be water

FIG. 3. Top: tyrosine-lysine pairing by means of a hydrogen bond. Bottom: arginine-alanine pairing by hydrophobic interaction leaving the charged residue free. A somewhat strained hydrogen bond may also be formed between the arginine residue and the carboxyl oxygen of the alanine peptide backbone. Only the alanine residue allows arginine to assume the conformation necessary to make this bond.


soluble like DL-serine copolymers because of its hydrophilic side chain (Blout, 1962). Pure poly-L-serine is, however, insoluble in water (Fasman, 1967), indicating that its hydroxyl groups are unavailable for hydration. Fasman has already suggested that the side chains of serine, threonine, and cysteine may hydrogen bond to one another (Fasman, 1967), and all three polyamino acids have been reported to exist in cross and parallel β forms (Fasman, 1967; Poland & Scheraga, 1967). Thus, serine, threonine, and possibly cysteine appear to fit the characteristics of pairing amino acids as predicted.

Finally, I have carried out experiments indicating that arginine and alanine pair, as do proline and glycine. Equimolar, aqueous solutions of the following poly-α-amino acids (supplied by Sigma) were employed in the experiments: L-arginine, L-lysine, DL-alanine, penta-glycine, L-proline. DL-alanine was used instead of L-alanine for reasons of solubility, as was penta-glycine rather than a larger poly-glycine. The osmotic pressure of each solution was determined by freezing point depression. All of the possible pairs of solutions were then mixed 1:1 by volume and the osmotic pressure of the mixtures determined over time. The osmotic pressures of the mixtures arg + lys, arg + gly, arg + pro, lys + ala, lys + gly, lys + pro, ala + gly, and ala + pro were each the average of the osmotic pressures of the constituent solutions; i.e. there was no evidence of interaction, nor, according to theory, should there have been. The osmotic pressure of the mixture ala + arg, however, was consistently about 8% below the expected value, and the osmotic pressure of pro + gly was approximately 14% below the expected value. I interpret these decreases in osmotic pressure as evidence of the pairings predicted by my theory. The interactions are side chain specific.
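The arithmetic behind this interpretation can be made explicit. The sketch below assumes ideal, dilute solutions in which osmotic pressure is proportional to the number of dissolved particles, and equal particle concentrations in the two stock solutions. Neither assumption is stated in the original, and counter-ions are ignored. The cryoscopic constant of water (1.86 K kg/mol) is the standard value; the function names are illustrative.

```python
K_F_WATER = 1.86  # cryoscopic constant of water, K kg / mol


def osmolality_from_freezing_point(delta_t):
    """Osmolality (mol/kg) from a freezing point depression in kelvin."""
    return delta_t / K_F_WATER


def expected_mixture(value_a, value_b):
    """Value expected for a 1:1 (v/v) mixture of non-interacting solutions."""
    return (value_a + value_b) / 2.0


def fractional_deficit(observed, value_a, value_b):
    """Shortfall of the observed value relative to the no-interaction mean."""
    expected = expected_mixture(value_a, value_b)
    return (expected - observed) / expected


def paired_fraction(deficit):
    """Fraction of each species' chains held in 1:1 pairs.

    Assumption: after mixing, each species supplies half of the particles.
    Every pair formed replaces two particles by one, so a fractional loss d
    of all particles means a fraction 2 * d of each species is paired.
    """
    return 2.0 * deficit


for label, deficit in (("ala + arg", 0.08), ("pro + gly", 0.14)):
    print(label, round(paired_fraction(deficit), 2))
# ala + arg 0.16
# pro + gly 0.28
```

On this simplified reading, the reported deficits would correspond to roughly 16% and 28% of the chains being paired. Released counter-ions or unequal particle concentrations in the stock solutions would change these figures, so they are an order-of-magnitude illustration only.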
It must also be noted that there is a possibility that the gly + pro interaction does not form a β structure, but rather one of the class of structures resembling the collagen triple helix (Fraser & MacRae, 1973). But even in the collagen helix, as in a β ribbon, at each spot that a proline creates a bend in the structure, only a glycine can fit in the adjacent polypeptide strand (Dickerson & Geis, 1969) (Fig. 2). Thus, proline-glycine pairing may have significance even beyond β structures.

Possible Tests

Much work remains to be done to verify the proposed pairings. Synthesis of complementary sequences of amino acids by the Merrifield process will have to be undertaken to obviate the difficulties of working with homo- and copolymers of low solubility. Since a large number of short peptides


are readily available, synthesis can be limited to complementary strands, thereby reducing the amount of work necessary for testing. The presence of pairing might then be determined by a number of methods: measurement of colligative properties such as osmotic pressure; affinity chromatography; the methods employed by Poland & Scheraga (1967) or Ryser (1974); or layering onto a sucrose gradient a mixture of labeled, complementary peptides, centrifuging them, and analyzing to see whether two (no interaction) or three (interaction) bands of peptides result. Two caveats are in order: peptides often have acetate or other “salts” bound to their polar residues and some of these may be released into solution as a result of pairing, thus interfering with the interpretation of colligative measurements; also, care must be taken to design complements whose amino and carboxyl end groups do not correspond, since the charges on these groups will repel one another due to the parallel nature of the pairing structure.

Some Implications

Several considerations make it essential that such research be undertaken. First, amino acid pairing presents a possible challenge to the Central Dogma (Crick, 1970) by postulating a mechanism for protein-to-protein information transfers. Indeed, while there is already some evidence for such pairings, there is no evidence, nor can there ever be, for the Central Dogma. As Crick himself has stated, the “dogma was an idea for which there was no reasonable evidence . . . it’s a negative hypothesis [which] says certain transfers can’t take place . . . you might have to say it could never be proved” (Crick, 1979). It could, however, be disproved, and amino acid pairing presents an hypothesis that could do so; for it, unlike the Central Dogma, is testable. That is an important point, for the history of science demonstrates that while testable hypotheses are often proven incorrect, untestable ones have invariably proven incorrect.
Amino acid pairing also has important implications for understanding the origin of the genetic code and the evolution of life. The fact that amino acid pairings are genetically encoded means that there is too much structure in the code for it to have developed by a series of “frozen accidents” as Crick (1968) has suggested. It appears, on the contrary, that the specific amino acid-amino acid interactions and amino acid-codon interactions are responsible for the informational structure of the code. In consequence, theories of the origin of the code will have to take into account possible protein-to-protein information transfers and the parallel coding of amino acid pairings. Protein-to-protein information transfers also make possible evolutionary relationships between proteins that have not been considered


before. These too will have to be taken into account in future theories of the origin of life. One hypothesis for explaining some of these features of the code and its genesis is discussed in the second part of this paper, “On the Origin of the Genetic Code.”

I thank Dr Arthur Yuwiler, Chief, Neurobiochemistry, Veterans Administration Hospital, Brentwood, California, for donating the polyamino acids and lab space for my experiments; Drs James Bonner of the California Institute of Technology and Jerry Donohue of the University of Pennsylvania for their encouragement and comments on an earlier version of this paper; and Morton I. Bernstein for research and travel expenses.

REFERENCES

BLOUT, E. R. (1962). In: Polyamino Acids, Polypeptides, and Proteins (M. A. Stahmann ed.), pp. 9-10. Madison: University of Wisconsin Press.
CRICK, F. H. C. (1966). J. mol. Biol. 19, 548.
CRICK, F. H. C. (1970). Nature, Lond. 227, 561.
CRICK, F. H. C. (1979). In: The Eighth Day of Creation (H. F. Judson) p. 337. New York: Simon and Schuster.
DICKERSON, R. E. & GEIS, I. (1969). The Structure and Action of Proteins. p. 42. Menlo Park, California: W. A. Benjamin.
FASMAN, G. (1967). In: Poly-α-amino Acids (G. Fasman, ed.), pp. 574-575. New York: Marcel Dekker.
FRASER, R. D. B. & MACRAE, T. P. (1973). Conformation in Fibrous Proteins. Ch. 11, pp. 247-290. New York and London: Academic Press.
MACINO, G., CORRUZZI, G., NOBREGA, F. G., LI, M. & TZAGOLOFF, A. (1979). Proc. natn. Acad. Sci. U.S.A. 76, 3784.
PAULING, L. (1960a). In: Aspects of the Origin of Life (M. Florkin, ed.), p. 136. Oxford: Pergamon Press.
PAULING, L. (1960b). The Nature of the Chemical Bond. Ithaca, New York: Cornell University Press.
PELC, S. R. & WELTON, M. G. E. (1966). Nature, Lond. 209, 868.
POLAND, D. & SCHERAGA, H. A. (1967). In: Poly-α-amino Acids (G. Fasman, ed.), pp. 481-483. New York: Marcel Dekker.
RYSER, H. J.-P. (1974). In: Peptides, Polypeptides, and Proteins (E. R. Blout et al. eds), p. 621. New York: Wiley-Interscience.
SCHULZ, G. E. & SCHIRMER, R. H. (1979). Principles of Protein Structure. Ch. 5, pp. 66-107. New York: Springer-Verlag.
SZENT-GYÖRGYI, A. (1960). Introduction to Submolecular Biology. New York: Academic Press.