Molecular Cell, Vol. 1, 443–447, February, 1998, Copyright 1998 by Cell Press
Structural Basis for Sequence-Nonspecific Recognition of 5′-Capped mRNA by a Cap-Modifying Enzyme
Alec E. Hodel,*§ Paul D. Gershon,† and Florante A. Quiocho*‡
* Howard Hughes Medical Institute, Department of Biochemistry, Baylor College of Medicine, Houston, Texas 77030
† Department of Biochemistry and Biophysics, Institute of Biosciences and Technology, Texas A&M University, Houston, Texas 77030
Summary
Sequence-nonspecific binding of RNA, recognition of a 7-methylguanosine 5′ mRNA cap, and methylation of a nucleic acid backbone are three crucial and ubiquitous events in eukaryotic nucleic acid processing and function. These three events occur concurrently in the modification of vaccinia transcripts by the methyltransferase VP39. We report the crystal structure of a ternary complex comprising VP39, coenzyme product S-adenosylhomocysteine, and a 5′ m7G–capped, single-stranded RNA hexamer. This structure reveals a novel and general mechanism for sequence-nonspecific recognition of the mRNA transcript in which the protein interacts solely with the sugar-phosphate backbone of a short, single-stranded RNA helix. This report represents the first direct and detailed view of a protein complexed with single-stranded RNA or 5′-capped mRNA.
‡ To whom correspondence should be addressed.
§ Present address: Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322.

Introduction
Protein-mediated sequence-nonspecific recognition of single-stranded nucleic acids is crucial in a variety of biological processes including replication (Chase and Williams, 1986), DNA repair (Lohman and Ferrari, 1994), RNA processing (Kozak, 1992), and translation (Spirin, 1994). The vaccinia protein VP39 is an excellent model system for the study of RNA processing in a single domain protein. In addition to serving as a processivity factor for the vaccinia poly(A) polymerase (Gershon and Moss, 1993), VP39 acts at the mRNA 5′ end as a cap 0 (m7G(5′)pppN...)–specific (nucleoside-2′-O-)-methyltransferase (Barbosa and Moss, 1978). To perform this catalytic function, VP39 must specifically recognize the m7G cap while also binding to the mRNA transcript in a sequence-nonspecific manner. Although the target for methylation is the ribose O2′ of the first transcribed nucleotide, VP39 apparently requires some length of the mRNA transcript to anchor the target nucleotide in the active site for proper methylation (Barbosa and Moss, 1978). The structure of VP39 is known to high resolution
in complex with the product S-adenosylhomocysteine, the cofactor S-adenosylmethionine, and with mono- and dinucleotide mRNA cap analogs (Hodel et al., 1996, 1997). These structures showed the location of the active site and the mechanism of m7G recognition; however, they failed to elucidate the mode of binding of the mRNA transcript. Here, we report the crystal structure of a ternary complex between VP39, S-adenosylhomocysteine, and a 5′ m7G–capped, single-stranded RNA hexamer.

Results and Discussion
To explore the structural details of the recognition of single-stranded RNA in the methyltransferase reaction, we prepared a short mRNA transcript of six nucleotides capped at the 5′ end with N7-methylguanosine. The oligomer, with the sequence m7G(5′)pppGpApApApApA, was transcribed such that the bridging phosphates in the transcript, but not the triphosphate bridge, were phosphorothioate derivatives. This modification was necessary for the stability of the RNA product against nuclease contaminants in the protein preparation. This oligomer was then cocrystallized with the fully active VP39 truncation variant ΔC26 in the presence of the coenzyme product S-adenosylhomocysteine (AdoHcy), and the structure of the ternary complex was determined by molecular replacement. Density for AdoHcy and the RNA, especially the backbone (Figure 1), was clearly visible in a 2Fo − Fc map using only the refined protein model (Hodel et al., 1996) for phasing.
The structure of the bound capped RNA can be divided into three parts: (1) m7G and the triphosphate bridge; (2) a trimer of the first three transcribed ribonucleotides (numbered G1-A2-A3); and (3) a trimer of the last three nucleotides (A4-A5-A6). The cap adopts an extended conformation with m7G at one end of the active site cleft and the triphosphate bridge directed down the cleft toward AdoHcy and the active site (Figures 2A and 3A).
The binding of m7G and its interactions with the protein are identical to those observed in the high-resolution complexes with mono- or dinucleotide cap analogs (Hodel et al., 1997). The recognition of the methylated base is achieved by stacking between two aromatic residues (Y22 and F180) and hydrogen bonding to two acidic side chains (E233 and D182).
The six transcribed RNA residues form two groups of stacked trimers, each with a roughly A-form helical geometry. The first trimer binds to the active site of VP39 with its bases protruding toward the solvent. The sugar-phosphate backbone binds to the protein surface by way of numerous intermolecular contacts and hydrogen bonds (Figure 3B). The RNA backbone takes a dramatic turn between the third and fourth nucleotides, resulting in an approximate right angle between the two trimer axes (Figures 2A and 3B). The second trimer is directed away from the protein surface, where it contacts a symmetry-related protein molecule at a location distant from the active site (Figure 2B).
Figure 1. Electron Density for the First Trimer of RNA Bound in the Methyltransferase Active Site
The 2.7 Å density is calculated as a 2Fo − Fc map using only the protein model to calculate the phases. Contoured at 1σ, only the density near the RNA is displayed for clarity. The surrounding atoms of the protein are shown without density for context. This figure was rendered using RIBBONS (Carson, 1991).
Figure 2. mRNA–VP39 Complex
(A) Stick model showing RNA bound to the active site cleft of VP39. The protein is rendered as a solvent-accessible surface. The sulfur atom of AdoHcy (colored green and labeled with an arrow) defines the methyltransferase active site.
(B) The two RNA contacts mapped on a single VP39 molecule resulting from the interaction of the RNA with two symmetry-related protein molecules. The protein is rendered as a transparent solvent-accessible surface. Shown in front of the protein is the second trimer (A4-A5-A6) binding site based on the interactions with a symmetry-related molecule. Behind the protein are the 5′ cap and the first trimer (G1-A2-A3) of the transcript bound in the methyltransferase active site.
(C) The proximity of the second RNA contact site to the VP55 dimerization interface. The major red patch denotes surface residues defined in Shi et al. (1997) as the VP55 dimerization ‘hot spot.’ The smaller red patch (R107 to the right) is another part of the dimerization interface (see Shi et al., 1997). This figure was produced using GRASP (Nicholls et al., 1991).

Figure 3. Details of the Interaction between the RNA and the Protein
(A)–(D) show stereo stick models of the structure with the following color scheme: carbon is white for the first protein molecule, green for the symmetry mate; oxygen is red; nitrogen is blue; phosphorus is pink; and sulfur is yellow.
(A) The 5′ cap triphosphate bridge and the methyltransferase active site.
(B) The first trimer of RNA.
(C) The second trimer of RNA binding to a symmetrically related protein molecule.
(D) The residues at the methyltransferase active site. S-adenosylmethionine (in green) is fit to AdoHcy in the structure. The donor methyl group is connected to the target hydroxyl by a dashed yellow line. The hydrogen bond between K175 and the target hydroxyl is also shown.
(E) A schematic representation of the RNA showing hydrogen bonds with the protein as dashed lines. Residues from one protein molecule are shown in black lettering, whereas residues from the symmetry-related protein molecule are shown in outlined lettering. (A)–(D) were rendered using MIDAS (Huang et al., 1991).

The conformation of the first RNA trimer in the active site of VP39 implies a novel mechanism for sequence-nonspecific recognition of single-stranded RNA. The protein interacts primarily with the sugar-phosphate backbone through several hydrogen bonds and salt bridges (Figures 3B and 3E). None of the bases of the first trimer form similar interactions with the protein. The RNA bases interact exclusively with each other in a three-base stack. Thus, the protein appears to recognize the backbone conformation of a helical trimer of stacked bases. Although nucleic acid recognition through the sugar-phosphate backbone has been observed in numerous instances in double-stranded DNA (Steitz, 1990), this mechanism contrasts with that observed in the sequence-nonspecific recognition of single-stranded DNA (Shamoo et al., 1995; Bochkarev et al., 1997). In the cocrystal of the single-stranded DNA-binding domain of replication protein A with a strand of DNA, the bases of
the DNA form several intramolecular pair-wise stacking interactions, but these stacks are interrupted by intercalation with protein aromatic side chains (Bochkarev et al., 1997). There are also several contacts and hydrogen bonds between the protein and the bases of the DNA, suggesting that these interactions must be plastic enough to accommodate other base sequences. In contrast, the more general mechanism of recognition suggested by the RNA–VP39 complex would not require such plasticity of the protein–RNA interface. The only
sequence-specific effects would be manifested through the differing potential of various trimer sequences to form a stack with the correct geometry. The stacking of single-stranded oligonucleotides into short helical segments has been observed in solution with stabilities that vary depending on the purine/pyrimidine mix (Rinkel et al., 1987; Davis, 1995). This, along with the observation of a similarly stacked trimer formed by the last three bases (A4-A5-A6) of the transcript, leads to the conclusion that such short-range stacks are energetically favorable, if not common, and may frequently form the primary basis of sequence-nonspecific recognition by proteins.
Of the methyltransferases with known three-dimensional structures, VP39 most resembles the catechol O-methyltransferase (COMT) both in substrate chemistry and, to some extent, in the nature of the active site (Borchardt, 1980; Valegård et al., 1994). COMT is a single domain protein that catalyzes the methylation of a hydroxyl group on catechols, whereas VP39 catalyzes the methylation of a ribose hydroxyl. One primary difference between VP39 and COMT is that the latter requires a Mg2+ ion for activity (Borchardt, 1980). This Mg2+ ion is bound at the active site, coordinating to the substrate target hydroxyl (Valegård et al., 1994). This cation, combined with two basic protein side chains in the vicinity, is proposed to facilitate catalysis by inducing the deprotonation of the target hydroxyl through the depression of its pKa. Methylation would then proceed via direct nucleophilic attack on the AdoMet donor methyl. At the methyltransferase active site of VP39, the O2′ of G1 (the target of methylation) is hydrogen-bonded to the Nζ of Lys-175 and is positioned 3.9 Å from the sulfur atom of AdoHcy and 4.7 Å from Nζ of Lys-41 (Figure 3D).
With AdoMet bound, the proximity of the positively charged sulfur atom along with these lysines could promote a decrease in the pKa of the ribose O2′ in a manner similar to the mechanism proposed for COMT.
The second RNA trimer contacts a symmetry-related protein on a face opposite that of the methyltransferase active site (Figure 2B). Although there are only two hydrogen bonds between the protein and the A4-A5 dinucleotide backbone, the final nucleotide, A6, makes a greater number of interactions with the protein at this site (Figures 3C and 3E). Most notably, the adenosine base is bound in a hydrophobic pocket lined by F251, F261, and F265 and is surrounded by protein polar groups that hydrogen bond to N1, N6, and N7 of the base. The complementarity of this pocket in shape and functionality suggests that it is a specific adenosine binding site (a similarly placed guanosine would only satisfy one of the four protein–base hydrogen bonds). These observations suggest that the adenosine binding pocket might be part of a second RNA binding site. This second site may have a role in the mRNA O2′-methylation reaction or may function in the polyadenylation activity of the poly(A) polymerase. Supporting the latter possibility, this second RNA contact site is provocatively juxtaposed to the dimerization interface for VP55 (Figure 2C), which has been identified by three complementary approaches: mutagenesis combined with a direct VP55-binding assay, protein footprinting, and site-specific photo-cross-linking (Shi et al., 1997).
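The pKa argument made above for the target ribose hydroxyl can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of a hydroxyl that is deprotonated at a given pH. The following is a minimal sketch only: the function name is illustrative, and the pH and both pKa values are hypothetical placeholders chosen to show the trend, not measurements for VP39 or COMT.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an acid present as its conjugate base (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


# All three numbers below are hypothetical, for illustration only.
ASSUMED_PH = 7.5
ASSUMED_PKA_FREE = 12.5      # hydroxyl with an unperturbed pKa
ASSUMED_PKA_DEPRESSED = 9.5  # same hydroxyl next to positive charges

unperturbed = fraction_deprotonated(ASSUMED_PH, ASSUMED_PKA_FREE)
depressed = fraction_deprotonated(ASSUMED_PH, ASSUMED_PKA_DEPRESSED)

print(f"deprotonated fraction, unperturbed pKa: {unperturbed:.2e}")
print(f"deprotonated fraction, depressed pKa:   {depressed:.2e}")
print(f"enrichment of the nucleophilic form:    {depressed / unperturbed:.0f}-fold")
```

Under these assumed values, each unit decrease in pKa raises the population of the deprotonated (nucleophilic) hydroxyl roughly tenfold while the pH remains well below the pKa.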
In conclusion, we present the three-dimensional structure of a 5′-capped RNA oligonucleotide in complex with the RNA methyltransferase VP39 and AdoHcy. This structure reveals a novel mode for the sequence-nonspecific recognition of nucleic acids and suggests both a mechanism for catalysis and a putative second site for RNA binding. As is the case with numerous other vaccinia systems (Moss, 1990), this protein provides an illuminating model for the general mRNA processing machinery found in eukaryotic cells.
Experimental Procedures

Sample Preparation
The VP39 truncation mutant ΔC26 was prepared as previously reported (Hodel et al., 1996). Details of the preparation and purification of the capped RNA oligo will be reported elsewhere. In summary, a T7 transcription reaction was performed using DNA oligo templates coding for the transcript GAAAAA. This reaction was incubated in the presence of ATP-αS (DuPont, NEN) and m7G(5′)pppG (New England Biolabs). The products were separated by HPLC using a C18 reverse-phase column (Vydac), and the identities of the resulting peaks were determined by electrospray mass spectrometry and methyltransferase assays. The peak used in cocrystallization contained a mixture of the desired product, m7GpppGpApApApApA, and a shorter hydrolysis product, ppGpApApApA, at a ratio of ~3:1.
Crystallization and Data Collection
Crystals were grown by vapor diffusion in hanging drops over a well of precipitant containing 15%–20% PEG 8000, 0.1 M cacodylate (pH 7.0), and 0.125 M NH4SO4. The drop initially comprised 1 μl each of the well solution, a solution of ΔC26 at 0.26 mM, and a solution containing a mixture of RNA (~0.9 mM) and AdoHcy (~1 mM). Diffraction quality crystals were generated by macro-seeding in the above conditions. The crystals grew in the space group P2₁2₁2₁ (a = 61.8 Å; b = 64.6 Å; c = 99.5 Å) with one ternary complex per asymmetric unit. Crystals were flash-frozen in a cryoprotectant solution of 20% PEG 8000, 0.1 M Tris–HCl (pH 8.5), 0.125 M NH4SO4, and 25% glycerol at −160°C with an MSC X-stream cryogenic cooler. Diffraction data to 2.7 Å were collected on a Siemens SMART 2K CCD detector with Göbel mirrors mounted on a Rigaku rotating anode running at 50 kV and 90 mA. The data (22,144 observations, 9,755 unique, 92% complete to 2.7 Å) were reduced and scaled using the Siemens SAINT software (Rsym = 9.5% to 2.7 Å).
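As a quick consistency check on the crystal and data statistics quoted above, the following minimal sketch computes the orthorhombic unit-cell volume, the volume per asymmetric unit, and the average redundancy of the data set. The function names are illustrative and do not belong to any crystallography package; the inputs are the cell edges and reflection counts given in the text, and the only added fact is that P2₁2₁2₁ has four asymmetric units per cell.

```python
def orthorhombic_cell_volume(a: float, b: float, c: float) -> float:
    """Unit-cell volume in cubic angstroms; all angles are 90 degrees, so V = a*b*c."""
    return a * b * c


def average_redundancy(n_observations: int, n_unique: int) -> float:
    """Mean number of measurements per unique reflection."""
    return n_observations / n_unique


# Values quoted in the text.
A, B, C = 61.8, 64.6, 99.5          # cell edges in angstroms
N_OBSERVATIONS, N_UNIQUE = 22144, 9755
ASYMMETRIC_UNITS_PER_CELL = 4       # general positions in P2(1)2(1)2(1)

volume = orthorhombic_cell_volume(A, B, C)
volume_per_asu = volume / ASYMMETRIC_UNITS_PER_CELL
redundancy = average_redundancy(N_OBSERVATIONS, N_UNIQUE)

print(f"unit-cell volume:           {volume:.0f} cubic angstroms")
print(f"volume per asymmetric unit: {volume_per_asu:.0f} cubic angstroms")
print(f"average redundancy:         {redundancy:.2f}")
```

The redundancy of roughly two measurements per unique reflection is a derived figure, not one stated by the authors.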
Molecular Replacement and Refinement
The structure was determined by molecular replacement using the model of the protein (Hodel et al., 1996) and the program AMoRe (Navaza, 1994). The oriented protein model was then refined by simulated annealing in the program XPLOR (Brünger, 1992). Models for the AdoHcy and the capped RNA were then built into a 2Fo − Fc map using the phases from the protein alone and the program XtalView (McCree, 1993). A bulk solvent correction was calculated and the entire model was then refined through positional and group B-factor refinement to an R factor of 21.4% (Rfree = 28.5%) with good geometry (RMS bond length deviation = 0.01 Å, angle = 1.7°).
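The R factor and Rfree quoted above follow the conventional crystallographic definition, R = Σ||Fo| − |Fc|| / Σ|Fo|, with Rfree evaluated over a test set of reflections excluded from refinement. A minimal sketch of that calculation is given below. The amplitudes are made-up toy values, not data from this structure, the function name is illustrative, and the scale factor between observed and calculated amplitudes is assumed to be 1.

```python
from typing import Sequence


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Conventional R factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical toy amplitudes, for illustration only.
toy_f_obs = [100.0, 50.0, 25.0, 75.0]
toy_f_calc = [90.0, 55.0, 20.0, 80.0]

# The first three reflections stand in for the working set, the last one
# for the test set that is held out of refinement.
r_work = r_factor(toy_f_obs[:3], toy_f_calc[:3])
r_free = r_factor(toy_f_obs[3:], toy_f_calc[3:])

print(f"R (working set): {r_work:.3f}")
print(f"R (test set):    {r_free:.3f}")
```

In a real refinement the test set is a random subset of a few percent of the reflections, and Rfree is normally somewhat higher than R, as in the values reported here.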
Acknowledgments
We thank K. E. Schmid for assistance in growing crystals. This work was supported in part by National Institutes of Health grant 1 RO1 GM51953-01A1 (P. D. G.), National Science Foundation grants BIR9413229 and STI-9512521 (F. A. Q.), and the Baylor College of Medicine Research Office (F. A. Q.). F. A. Q. is a Howard Hughes Medical Institute Investigator.
Received October 10, 1997; revised November 14, 1997.
References
Barbosa, E., and Moss, B. (1978). mRNA(nucleoside-2′-)-methyltransferase from vaccinia virus. J. Biol. Chem. 253, 7692–7702.
Bochkarev, A., Pfuetzner, R.A., Edwards, A.M., and Frappier, L. (1997). Structure of the single-stranded-DNA-binding domain of replication protein A bound to DNA. Nature 385, 176–181.
Borchardt, R. (1980). Pharmacogenics and Detoxification. In Enzymatic Basis of Detoxification, W.B. Jakoby, ed. (New York, NY: Academic Press), pp. 43–62.
Brünger, A.T. (1992). XPLOR Version 3.1 (New Haven, CT: Yale University Press).
Carson, M. (1991). Ribbons 2.0. J. Appl. Cryst. 24, 958–961.
Chase, J.W., and Williams, K.R. (1986). Single-stranded DNA binding proteins required for DNA replication. Annu. Rev. Biochem. 55, 103–136.
Davis, D.R. (1995). Stabilization of RNA stacking by pseudouridine. Nucleic Acids Res. 23, 5020–5026.
Gershon, P.D., and Moss, B. (1993). Stimulation of poly(A) tail elongation by the VP39 subunit of the vaccinia virus–encoded poly(A) polymerase. J. Biol. Chem. 268, 2203–2210.
Hodel, A.E., Gershon, P.D., Shi, X., and Quiocho, F.A. (1996). The 1.85 Å structure of vaccinia protein VP39: a bifunctional enzyme that participates in the modification of both mRNA ends. Cell 85, 247–256.
Hodel, A.E., Gershon, P.D., Shi, X., Wang, S.-M., and Quiocho, F.A. (1997). Specific protein recognition of an mRNA cap through its alkylated base. Nat. Struct. Biol. 4, 350–353.
Huang, C.C., Pettersen, E.F., Klein, T.E., Ferrin, T.E., and Langridge, R. (1991). Midas 2.0. J. Mol. Graphics 9, 230–235.
Kozak, M.A. (1992). Consideration of alternative models for the initiation of translation in eukaryotes. Crit. Rev. Biochem. Mol. Biol. 27, 385–402.
Lohman, T.M., and Ferrari, M.E. (1994). Escherichia coli single-stranded DNA-binding protein: multiple DNA-binding modes and cooperativities. Annu. Rev. Biochem. 63, 527–570.
McCree, D.E. (1993). Practical Protein Crystallography (San Diego, CA: Academic Press).
Moss, B. (1990). Regulation of vaccinia virus transcription. Annu. Rev. Biochem. 59, 661–688.
Navaza, J. (1994). AMoRe: an automated package for molecular replacement. Acta Cryst. A50, 157–163.
Nicholls, A., Sharp, K.A., and Honig, B.A. (1991). Protein folding and association: insights from the interfacial and the thermodynamic properties of hydrocarbons. Proteins 11, 281–282.
Rinkel, L.J., van der Marel, G.A., van Boom, J.H., and Altona, C. (1987). Influence of base sequence on the conformational behaviour of DNA polynucleotides in solution. Eur. J. Biochem. 166, 87–101.
Shamoo, Y., Friedman, A.M., Parsons, M.R., Konigsberg, W.H., and Steitz, T.A. (1995). Crystal structure of a replication fork single-stranded DNA binding protein (T4 gp32) complexed to DNA. Nature 376, 362–366.
Shi, X., Bernhardt, T.G., Wang, S.-M., and Gershon, P.D. (1997). The surface region of the bifunctional vaccinia RNA modifying protein VP39 which interfaces with poly(A) polymerase is remote from the RNA binding cleft used for its mRNA 5′ cap methylation function. J. Biol. Chem. 272, 23292–23302.
Spirin, A.S. (1994). Storage of messenger RNA in eukaryotes: envelopment with protein, translational barrier at 5′ side, or conformational masking by 3′ side? Mol. Reprod. Dev. 38, 107–117.
Steitz, T.A. (1990). Structural studies of protein–nucleic acid interaction. Q. Rev. Biophys. 23, 205–280.
Valegård, K., Murray, J.B., Stockley, P.G., Stonehouse, N.J., and Liljas, L. (1994). Crystal structure of catechol O-methyltransferase. Nature 371, 623–626.

Protein Data Bank Accession Numbers
Atomic coordinates and structure factors have been deposited in the Protein Data Bank under entry 1AV6.