
© 1999 Nature America Inc. • http://structbio.nature.com

articles

Co-translational domain folding as the structural basis for the rapid de novo folding of firefly luciferase


Judith Frydman1, Hediye Erdjument-Bromage2, Paul Tempst2 and F. Ulrich Hartl3

The 62 kDa protein firefly luciferase folds very rapidly upon translation on eukaryotic ribosomes. In contrast, the chaperone-mediated refolding of chemically denatured luciferase occurs with significantly slower kinetics. Here we investigate the structural basis for this difference in folding kinetics. We find that an N-terminal domain of luciferase (residues 1–190) folds co-translationally, followed by rapid formation of native protein upon release of the full-length polypeptide from the ribosome. In contrast, sequential domain formation is not observed during in vitro refolding. Discrete unfolding steps, corresponding to domain unfolding, are, however, observed when the native protein is exposed to increasing concentrations of denaturant. Thus, the co-translational folding reaction bears more similarities to the unfolding reaction than to refolding from denaturant. We propose that co-translational domain formation avoids intramolecular misfolding and may be critical in the folding of multidomain proteins.

It is generally accepted that the information necessary to specify the native three-dimensional structure of a protein is inherent in its complete amino acid sequence1. How the native state is reached is far from clear, but recent studies indicate that folding may take place through multiple pathways in which the conformational space available to the polypeptide becomes increasingly restricted2–5. The assumption that the native state is usually the thermodynamically most stable leads to the suggestion that it is possible to gain information on the folding pathway by studying both folding and unfolding processes2–5. However, efficient, reversible folding and unfolding is generally observed only for small proteins. This has hindered our understanding of the folding of larger proteins, containing more than one domain. Chemical denaturation of these proteins followed by dilution into an aqueous solution often leads to inefficient folding and the formation of intermolecular aggregates. Aggregation is usually more pronounced when folding is performed under the conditions prevailing in the cell, that is, high protein concentrations and a restricted set of physical parameters, including pH, salt concentration and temperature. In recent years, it has become clear that in the cell the folding of many newly synthesized polypeptides is assisted by molecular chaperones6–8, which act by preventing off-pathway reactions such as aggregation7,9. However, another important difference between in vitro refolding and folding in vivo is that during translation the N-terminus of the polypeptide is available for folding before the C-terminus. Although the N-terminal portion of the polypeptide could fold as it emerges from the ribosome, the cooperative nature of the interactions that stabilize folded structures makes it necessary that a complete folding domain (~50–300 amino acids) is available for productive folding into a native tertiary structure2,3. Consequently, a mechanism involving co-translational folding may be of particular significance for the biogenesis of multidomain proteins, raising the question of what folding pathway is followed by such a polypeptide during translation. While sequential domain folding during translation has recently been demonstrated for artificial fusion proteins10, evidence in support of the hypothesis that this mechanism may be critical for the folding of authentic multidomain proteins is still missing.

Here we used firefly luciferase (Mr 62,000) to test this hypothesis. Its structure has been solved to 2 Å resolution, showing a large N-terminal domain and a smaller C-terminal domain11. Firefly luciferase is a good example of the ‘large’ (that is, multidomain) class of proteins, and it has been widely used as a model substrate for chaperone studies. It should be noted that although luciferase is normally located in the specialized peroxisomes of the firefly, it is thought that many peroxisomal proteins fold in the cytosol prior to their uptake into the organelles12. Indeed, when expressed in various eukaryotic cells, import of luciferase into peroxisomes is inefficient and most of the enzyme is localized in the cytosol, where it is fully active13.

In the absence of chaperones, in vitro refolding of luciferase upon dilution from denaturant is usually inefficient and leads to aggregation14–16. Unassisted refolding is only observed at very low (nM) protein concentrations but is exceedingly slow, reaching equilibrium only after ~72 h17. In contrast, denatured luciferase is efficiently refolded by several different chaperone systems. For example, chaperone-mediated refolding in rabbit reticulocyte lysate occurs with a half-time of ~8 min14,15,18. Even more striking, when luciferase is translated in reticulocyte lysate or wheat germ lysate, it folds within ≤1 min18,19. Here we examined the folding intermediates generated by firefly luciferase during de novo folding in a eukaryotic translation

1Department of Biological Sciences, Stanford University, Stanford, California 94305, USA. 2Molecular Biology Program, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, New York 10021, USA. 3Department of Cellular Biochemistry, Max-Planck-Institut für Biochemie, Am Klopferspitz 18a, D-82152 Martinsried, Germany.

Correspondence should be addressed to J.F. e-mail: [email protected]

nature structural biology • volume 6 number 7 • july 1999


reaction. Radiosequencing and mass spectrometry have permitted the identification of a previously detected protease-resistant fragment of luciferase of 22 kDa as an N-terminal subdomain of luciferase (residues 1–190) that folds during translation. The same 22 kDa domain is present in an unfolding intermediate of luciferase at low concentrations of denaturant. Strikingly, this intermediate is not observed during the chaperone-mediated refolding of denatured luciferase. Our results highlight the parallels between luciferase folding during translation and the unfolding reaction and suggest that co-translational domain folding provides the structural basis for the rapid de novo folding of luciferase.

Co-translational formation of a protease-resistant luciferase domain
Luciferase was translated in a rabbit reticulocyte lysate in the presence of [35S]methionine18. To gain insight into the folding pathway during translation, we measured the appearance of protease-resistant fragments of luciferase as the polypeptide progresses toward its native structure. This approach takes advantage of the fact that compact, folded protein structures are resistant to mild concentrations of protease. To achieve a homogeneous population of growing chains, the translation reaction was synchronized by addition of an inhibitor of translation initiation, aurintricarboxylic acid (ATCA), 3 min after the addition of mRNA. Formation of full-length luciferase was observed after 12 min (Fig. 1a, lanes 1–6), and synthesis was complete after 15–18 min. However, full luciferase enzyme activity was only reached ~3 min later (data not shown), indicating that folding is completed post-translationally. Treatment of these reactions with proteinase K at different times during translation revealed the formation of a 22 kDa protease-resistant fragment of luciferase that could be observed after 8 min of translation, before the synthesis of the complete polypeptide (Fig. 1a). This 22 kDa fragment was the major product of proteinase K treatment between 10 and 15 min of translation. By 18 min the major protease-resistant product was the full-length folded luciferase, while the 22 kDa fragment diminished in intensity. These results, essentially reproducing earlier observations18, are most consistent with co-translational folding of a luciferase domain, followed by the post-translational incorporation of this domain into the structure of native luciferase. Since the amount of the 22 kDa fragment at early time-points is comparable to that of the folded full-length luciferase at later time-points, our results also suggest that the majority of the nascent (that is, ribosome-bound) chains contain this folded domain.

As expected, when nascent chains of luciferase were isolated from a nonsynchronized translation by centrifugation and subjected to proteinase K treatment, the 22 kDa fragment was readily detected (Fig. 1b). Notably, most of the 35S label present in the range of nascent chains was converted to the 22 kDa band (Fig. 1b, lanes 1 and 2), confirming the notion that the majority of ribosome-bound polypeptides contain this protease-resistant structure. A similar fragment was also detected upon incubation of nascent chains with intermediate concentrations of trypsin but was digested by higher trypsin concentrations (Fig. 1b, lanes 5–8). The generation of the 22 kDa fragment by trypsin validates the use of proteinase K as a tool to detect compactly folded protein structures during translation. Since the molecular mass of [35S]Met-tRNA is ~23 kDa, we included high concentrations of RNase A (0.5 mg ml⁻¹) in the protease assay to eliminate the possibility that the 22 kDa fragment was a peptidyl-tRNA. The 22 kDa fragment was not sensitive to RNase treatment, and neither its size nor its yield was affected by the presence of RNase A (Fig. 1b, lanes 3 and 4).

Luciferase unfolds in a domainwise fashion upon chemical denaturation
It was important to determine whether the 22 kDa protease-resistant fragment formed during translation corresponded to a defined, independently folding domain of native luciferase. Only if this were the case could the conclusion be reached that luciferase folds co-translationally, involving a folding intermediate in which this domain is already present. Our analysis was facilitated by the availability of the crystal structure of luciferase11 (see Fig. 5). The large N-terminal


Fig. 1 A protease-resistant luciferase domain forms co-translationally. a, Formation of a 22 kDa protease-resistant fragment during a synchronized luciferase translation. Firefly luciferase was translated in reticulocyte lysate at 30 °C, as described in Methods. Translations were synchronized 3 min after initiation of translation. At the indicated times, translation was stopped by threefold dilution of aliquots into ice-cold buffer B. One-half of each reaction was treated with proteinase K to determine the formation of protease-resistant species (right panel). The samples were analyzed by 15% SDS-PAGE and fluorography. b, The protease-resistant domain is formed on nascent chains. Protease treatment of ribosome-bound luciferase nascent chains was used to assess the presence of the 22 kDa folded domain. A nonsynchronized luciferase translation reaction (50 µl) was stopped by threefold dilution in buffer C, and the ribosome–nascent chain complexes were isolated by ultracentrifugation. Equal aliquots of the resuspended ribosomal fraction were incubated for 10 min at 4 °C in the presence or absence of the indicated concentrations of proteinase K or trypsin. Where indicated, RNase A (0.5 mg ml⁻¹) was also included in the incubation.


domain (amino acids 20–435, white in Fig. 5b) contains a distorted antiparallel β-barrel and two topologically symmetrical subdomains composed of an eight-strand β-sheet flanked on either side by α-helices. The smaller C-terminal α+β domain (amino acids 441–544, magenta in Fig. 5b) lies on top of the β-barrel with the interface of the two domains forming the active site of the enzyme. The region connecting the N- and C-terminal domains is highly flexible and thus not defined in the structure. The last amino acid resolved in the structure is Lys 544, which points away from the protein into the solution, indicating that the very C-terminus of the protein containing the peroxisomal import sequence Ser-Lys-Leu (548–551) is highly unstructured (see Fig. 5b). Since proteins appear to be imported into peroxisomes in a folded state, it would seem reasonable that the peroxisomal import sequence is exposed to the solution12. To gain insight into the structural units of luciferase, we analyzed its unfolding at increasing concentrations of the chemical


Fig. 2 Firefly luciferase unfolds in a domainwise manner upon chemical denaturation. a, Luciferase unfolding measured by Trp fluorescence. Firefly luciferase (35 µg) was incubated in 0.6 ml buffer A containing increasing concentrations of guanidinium chloride (GdmCl) for 1 h at 25 °C. The loss of enzymatic activity was assessed using 1 µl of sample, and the remainder was divided into two. One half was used for tryptophan fluorescence while the other half was subjected to proteinase K treatment (b). Tryptophan fluorescence was excited at 295 nm and was recorded between 300 and 380 nm (at a rate of 1 nm per 0.5 s) on a Spex Fluorimeter. Unfolding is expressed as the decrease in emission intensity at 336 nm (the Emax for the native protein) and is shown as a function of GdmCl concentration. b, Protease-resistant species present in the unfolding intermediates. Firefly luciferase was denatured as described in (a), and the presence of protease-resistant species was examined following treatment with proteinase K (10 µg ml⁻¹) for 10 min at 4 °C. The proteolysis products were analyzed by 15% SDS-PAGE and silver staining. The higher staining intensity of the 13 kDa protease-resistant species over the 22 kDa fragment and the full-length protein was due to its increased sensitivity to the silver staining protocol. c, Identification of the 22 kDa fragment: HPLC analysis of tryptic digest. The 22 kDa fragment of luciferase was excised from a nitrocellulose blot, digested with trypsin and the resulting peptide mixture was separated by reversed-phase HPLC. Peptides A–H were collected by hand and analyzed by a combination of automated Edman degradation and MALDI-TOF mass spectrometry (see Table 1). d, Amino acid sequence of firefly luciferase indicating the position of the sequenced peptides and the location of Trp residues. The sequence information obtained for peptides in the 22 kDa fragment (Table 1) and the 13 kDa fragment (see text) is indicated on the amino acid sequence of full-length luciferase. As shown in Table 1, several peptides originated from overlapping regions in the luciferase sequence.
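The two-step loss of Trp fluorescence described for Fig. 2a can be pictured with a simple phenomenological curve: one sigmoidal transition from the native signal to an intermediate plateau, and a second from the plateau to the fully denatured level. The sketch below is only an illustration. The midpoints, steepness values and plateau height are assumed numbers chosen to resemble the qualitative description, not fitted parameters from this study, and the function names are hypothetical.

```python
import math

def sigmoid_fraction(conc_m: float, midpoint_m: float, steepness: float) -> float:
    """Fraction of molecules past a transition at denaturant concentration conc_m (M)."""
    return 1.0 / (1.0 + math.exp(-steepness * (conc_m - midpoint_m)))

def relative_fluorescence(
    conc_m: float,
    mid1_m: float = 0.3,              # assumed midpoint of the first unfolding step
    mid2_m: float = 1.75,             # assumed midpoint of the second unfolding step
    steep1: float = 25.0,             # assumed steepness values (per M)
    steep2: float = 10.0,
    intermediate_level: float = 0.4,  # assumed plateau height of the intermediate
) -> float:
    """Relative emission at 336 nm: 1.0 for native, 0.0 for fully denatured."""
    step1 = sigmoid_fraction(conc_m, mid1_m, steep1)
    step2 = sigmoid_fraction(conc_m, mid2_m, steep2)
    # The signal drops from 1.0 to the plateau in step 1, then to 0.0 in step 2.
    return 1.0 - (1.0 - intermediate_level) * step1 - intermediate_level * step2

for conc in (0.0, 0.1, 0.5, 1.0, 1.5, 2.5, 6.0):
    print(f"{conc:4.1f} M GdmCl -> relative fluorescence {relative_fluorescence(conc):.2f}")
```

With these assumed parameters the curve stays near the native level at 0.1 M, sits on a plateau between roughly 0.5 and 1.5 M, and approaches zero at high denaturant, which is the shape expected when an intermediate is populated between two unfolding steps.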

denaturant guanidinium chloride (GdmCl) by monitoring Trp fluorescence. The fluorescence emission maximum of the native protein is 336 nm, whereas the emission maximum in the presence of 6 M GdmCl is shifted to 350 nm, consistent with complete denaturation (data not shown). Progression of denaturant-induced unfolding was measured by following the decrease of fluorescence intensity at 336 nm (Fig. 2a). Incubation of native luciferase with up to 100 mM GdmCl did not cause either an appreciable loss of enzymatic activity or a change of the fluorescence spectrum. Increasing the denaturant concentration above 250 mM resulted in the complete loss of enzymatic activity and in a dramatic decrease in fluorescence intensity, suggesting that the protein had undergone a major unfolding transition. A further drop in fluorescence intensity was observed when the GdmCl concentration was raised beyond 2 M, reaching the same level measured in 6 M GdmCl (Fig. 2a). Thus, between GdmCl concentrations of 500 mM and


Fig. 3 Identification of the co-translationally folded domain as the 22 kDa N-terminal fragment of luciferase. a, Chain-length dependent formation of the N-terminal luciferase domain. C-terminal deletions of luciferase were generated by transcription and translation of pGEM-luc linearized with EcoNI (generating a polypeptide comprising amino acids 1–541; lanes 3, 4), with EcoRI (amino acids 1–197; lanes 5, 6) or with BbvI (amino acids 1–122; lanes 7, 8). Full-length luciferase (FL) was also included as a control (lanes 1, 2). After 40 min translation was stopped by addition of puromycin and RNase A, and half of the reaction was analyzed by protease treatment. A sample containing [35S]methionyl-tRNA was included in the gel for comparison. Note the small difference in migration between the 1–197 (lane 5) luciferase truncation and the protease-resistant 22 kDa species (lane 6). b, Radiosequencing of tryptic peptides derived from the co-translationally formed 22 kDa luciferase domain. To determine the identity of the 22 kDa protease-resistant structure generated co-translationally, luciferase was translated in the presence of [3H]leucine (and a small amount of [35S]methionine). Isolated ribosome–[3H,35S]-nascent chain complexes were subjected to proteinase K treatment to generate the 22 kDa fragment, which was analyzed as described for Fig. 2c. The radioactive tryptic peptides of peaks A, C and E were subjected to several chemical sequencing cycles. The output of each sequencing cycle was analyzed by scintillation counting. For each peptide, the predicted amino acid sequence and the expected [3H]leucine (L) are indicated. The results were in complete agreement with the expected sequence of the corresponding luciferase fragment. Note the weak radioactive signal in peptide E at a position where [35S]methionine was expected (M).
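The logic of the radiosequencing comparison in Fig. 3b can be sketched in a few lines: the labeled residue should be released in exactly those Edman cycles that match its positions in the predicted peptide. The example below is a minimal illustration under stated assumptions. The peptide sequence and the per-cycle counts are hypothetical and are not taken from Table 1 or Fig. 3b, and the threshold of three times the median count is an arbitrary choice for the sketch.

```python
from statistics import median

def expected_label_cycles(peptide: str, labeled_residue: str = "L") -> list[int]:
    """1-based Edman cycles in which the labeled residue should be released."""
    return [cycle for cycle, aa in enumerate(peptide.upper(), start=1)
            if aa == labeled_residue]

def cycles_above_background(counts: list[float], factor: float = 3.0) -> list[int]:
    """1-based cycles whose counts exceed `factor` times the median count."""
    background = median(counts)
    return [cycle for cycle, value in enumerate(counts, start=1)
            if value > factor * background]

# Hypothetical tryptic peptide and hypothetical scintillation counts per cycle.
peptide = "ALSEGLTK"
counts = [30, 410, 55, 40, 35, 380, 60, 30]

predicted = expected_label_cycles(peptide)
observed = cycles_above_background(counts)
print("predicted Leu cycles:", predicted)
print("observed peak cycles:", observed)
print("agreement:", predicted == observed)
```

A peptide with two leucines gives two predicted cycles, which parallels the observation that such peptides carried about twice the radioactivity of single-leucine peptides.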

1,500 mM, an unfolding intermediate(s) of luciferase exists. To correlate the fluorescence measurements with the domain structure of the protein, we incubated luciferase with proteinase K at different concentrations of GdmCl (Fig. 2b). Native luciferase is relatively resistant to treatment with proteinase K. However, protease treatment at low concentrations of GdmCl resulted in the disappearance of full-length luciferase (and of enzymatic activity) and the production of two protease-resistant fragments of ~22 kDa and ~13 kDa. Both fragments were observed at GdmCl concentrations of up to 1 M, although the 22 kDa fragment was significantly more labile than the 13 kDa fragment (in some experiments the 13 kDa fragment was still observed at concentrations above 1 M GdmCl). However, at GdmCl concentrations above 1.5 M both fragments were degraded, indicating their unfolding (Fig. 2a,b). Thus, our data suggest that luciferase undergoes two unfolding steps; the first gives rise to an unfolding intermediate that retains two compact, relatively protease-resistant structures of 22 kDa and 13 kDa that unfold at higher concentrations of denaturant. Notably, when the protein folds during translation, an intermediate containing a 22 kDa protease-resistant fragment is also observed, raising the possibility that this structure is shared by both the unfolding and the de novo folding intermediates.

Identification of the 22 kDa and the 13 kDa protease-resistant luciferase fragments
To characterize the unfolding intermediate generated at 0.75 M GdmCl, the 22 kDa and 13 kDa proteolytic fragments of luciferase produced under these conditions were subjected to N-terminal sequencing. The 13 kDa fragment started at position 422 (Asp-Ile-Ala-Tyr-Trp-Asp-Glu-Asp-Glu-His-Phe-Phe…). Notably, the estimated molecular weight of a segment spanning amino acids 422–544 (at the end of the compact C-terminal domain) is 13.4 kDa. Moreover, the presence in this compact fragment of one of the two Trp residues in luciferase (Trp 426, Fig. 2d) is consistent with the observation that the unfolding intermediate retained a degree of native Trp fluorescence. Thus, the 13 kDa fragment appears to correspond to an independently folded domain at the C-terminus of luciferase (amino acids 422–544)11. In contrast, the 22 kDa fragment yielded no N-terminal sequence during Edman degradation, suggesting that its N-terminus was blocked.


Fig. 4 The 22 kDa domain is not observed during the chaperone-mediated refolding of denatured luciferase. a, Kinetics of luciferase renaturation. Denatured luciferase was diluted into reticulocyte lysate and incubated at 30 °C in the presence or absence of Mg-ATP. Aliquots were removed at the times indicated and luciferase activity (solid lines) was assayed as described. Luciferase folding was also monitored by the appearance of protease-resistant luciferase (dashed lines, see b). b, Appearance of protease-resistant luciferase during refolding. The time course described in (a) was analyzed for the presence of protease-resistant species as described in Fig. 1. The proteolysis products were separated on 15% SDS-PAGE and visualized by western blot using a rabbit polyclonal antiserum raised against luciferase. The right panel demonstrates that the antibodies can detect the 22 kDa and 13 kDa luciferase subdomains as these protease-resistant fragments, generated from the unfolding intermediates of luciferase (Fig. 2a), are detected by western blot. The unfolding intermediates containing the 22 kDa and the 13 kDa protease-resistant domains are observed irrespective of whether luciferase is denatured in reticulocyte lysate or in buffer. FL, Full-length luciferase; T, total. c, Same as (b) except that 35S-labeled luciferase was used and the proteolysis products were visualized by fluorography. The right panel is an overexposed image of the same gel to uncover the heterogeneity of the protease fragments produced.
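To give a feel for what a refolding half-time of about 8 min means for a time course like the one in Fig. 4a, the sketch below computes the fraction of active enzyme expected at several time points. It assumes simple single-exponential (first-order) kinetics, which is a simplification introduced here and not a model fitted in this study. The 1 min half-time used for comparison is a hypothetical stand-in for a much faster reaction, and the function name is illustrative.

```python
import math

def fraction_folded(t_min: float, half_time_min: float) -> float:
    """Fraction folded at time t, assuming single-exponential kinetics."""
    rate_per_min = math.log(2) / half_time_min
    return 1.0 - math.exp(-rate_per_min * t_min)

REFOLDING_HALF_TIME_MIN = 8.0          # approximate half-time cited for refolding in lysate
HYPOTHETICAL_FAST_HALF_TIME_MIN = 1.0  # assumption, for comparison only

for t in (1, 4, 8, 16, 32):
    slow = fraction_folded(t, REFOLDING_HALF_TIME_MIN)
    fast = fraction_folded(t, HYPOTHETICAL_FAST_HALF_TIME_MIN)
    print(f"t = {t:2d} min: 8-min half-time {slow:.2f}, 1-min half-time {fast:.2f}")
```

Under this assumption, half of the protein is active after one half-time and seven-eighths after three half-times, so a reaction with an 8 min half-time is still visibly incomplete at time points where a much faster reaction would have finished.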


Identification of the 22 kDa fragment was therefore attempted using internal peptides obtained by digestion with trypsin followed by separation by reverse-phase HPLC (Fig. 2c). Peptides in selected peaks (peptides A–H in Fig. 2c) were subjected to delayed-extraction matrix-assisted laser-desorption/ionization reflectron time-of-flight mass spectrometry (MALDI-TOF MS; Table 1). The monoisotopic experimental mass, m/z (mass/charge ratio), permitted the sequence assignment of all the peptides, which corresponded closely to the calculated monoisotopic mass (MH+) of fragments in the N-terminal region of luciferase (Table 1). The sequence assignments made by mass spectrometry were confirmed by automated Edman degradation sequence analysis. The most N-terminal fragment (peptide C) corresponded to amino acids 9–28, and the most C-terminal sequence that was retrieved ended with amino acid 190 (peptide G, see Fig. 2d). Since the N-terminus of the 22 kDa fragment was apparently blocked, it is likely that the fragment begins with the N-terminal amino acid of luciferase. The C-terminal boundary of the fragment is probably very close to amino acid 190, as the expected molecular mass of a fragment comprising residues 1–190 is 21.4 kDa, in close agreement with the mobility of the fragment on SDS-PAGE. Thus, the unfolding intermediate of luciferase characterized in this study contains two compact domains, one at the N-terminus, comprising amino acids 1–190, and one at the C-terminus of ~125 amino acids. These independently stable units represent subdomains of the N-terminal and C-terminal structural domains of luciferase, respectively.
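As a rough plausibility check on these boundary assignments, the mass of a segment can be approximated from its residue count alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb and not a value taken from this study; the function name is illustrative. Exact values such as the 21.4 kDa quoted for residues 1–190 require the actual amino acid sequence.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass

def approx_mass_kda(first: int, last: int) -> float:
    """Approximate mass in kDa of an inclusive residue range."""
    n_residues = last - first + 1
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

segments = {
    "N-terminal subdomain (1-190)": (1, 190),
    "truncation product (1-197)": (1, 197),
    "C-terminal fragment (422-544)": (422, 544),
}

for label, (first, last) in segments.items():
    print(f"{label}: ~{approx_mass_kda(first, last):.1f} kDa")
```

The estimates come out within about 0.5 kDa of the sequence-based masses quoted in the text (21.4, 22.2 and 13.4 kDa), which is the level of agreement one would expect from a residue-count approximation.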

The 22 kDa subdomain is formed in a co-translational manner
We next investigated whether the 22 kDa N-terminal structure present in the luciferase unfolding intermediate is also formed on the ribosome-bound luciferase chains during de novo folding. First, we determined the minimal chain length of luciferase that is required for the folding of the 22 kDa fragment. Truncated mRNAs encoding luciferase polypeptides of increasing length were translated and released from the ribosome with puromycin. The formation of a folded structure was investigated by treatment of the translation reaction with proteinase K. C-terminal deletions of up to 355 amino acids, resulting in a luciferase chain comprising amino acids 1–197, still yielded the 22 kDa fragment (Fig. 3a), whereas further deletions did not give rise to protease-resistant species. Notably, protease treatment of the translation product of residues 1–197 (22.2 kDa) gave rise to the 22 kDa fragment but resulted in a very small decrease in size, thereby supporting the notion that the C-terminal boundary of this folded region is close to position 190 in the luciferase sequence.

The identity of the 22 kDa protease-resistant structure generated co-translationally with the N-terminal subdomain present in the unfolding intermediate was confirmed by direct sequencing. The problem presented by the low protein amounts produced upon translation in vitro was overcome by using a radiosequencing approach. The use of [3H]leucine as the labeled amino acid during translation was particularly convenient, since several peptides generated by tryptic digest contained leucine residues very close to the N-terminus (see Fig. 2c,d). To facilitate the detection of nascent chains during the sample preparation, we also included a low concentration of [35S]methionine in the translation. Ribosome-nascent chain complexes were isolated by centrifugation through a sucrose cushion to remove full-length luciferase and prematurely released chains, as well as most


Fig. 5 Position of different domains in the luciferase crystal structure. a, Ribbon diagram of luciferase highlighting the N-terminal subdomain. The subdomain comprising amino acids 1–189 (shown in red) folds co-translationally and is connected by amino acids 190–191 (in cyan) to the rest of the protein. b, Ribbon diagram of luciferase highlighting the C-terminal domain. Amino acids 422–544 are shown in magenta. The Trp residues are shown in space-fill. The amino acids at the boundary of the unstructured interface between the N- and C-terminal domains (435, 441, 523, 529) are shown in yellow as balls and sticks. Residue Lys 544 (in red) is the last amino acid resolved in the structure.

cytosolic proteins. The ribosome-[3H]-nascent chain complexes were then subjected to proteinase K treatment to generate the 22 kDa fragment, which was separated by SDS-PAGE and transferred to a nitrocellulose filter. The 22 kDa fragment was detected by fluorography, excised from the filter and mixed with a nitrocellulose piece containing unlabeled 22 kDa proteolytic fragment obtained from purified denatured luciferase, which served as an internal control for the subsequent analysis. The mixture of radiolabeled and unlabeled fragments was subjected to tryptic digest and separated by reverse-phase HPLC. The unlabeled 22 kDa fragment produced the absorbance pattern we had previously characterized (Fig. 2c). When the eluted fractions were monitored for the presence of radioactivity, [3H]Leu counts were recovered only in the peptide peaks for which the presence of Leu residues was predicted by the sequence obtained for the unlabeled fragment (not shown). Moreover, in peptides in which two leucines were expected we detected twice the amount of radioactivity. The radioactive tryptic peptides of three selected peaks were then subjected to several chemical sequencing cycles, and the presence of [3H]leucine in the output of each cycle was measured (Fig. 3b). The results were in complete agreement with the expected sequence of the corresponding luciferase fragment. Moreover, we obtained a weak signal for 35S at a position where a Met was expected. This arose from the trace amounts of [35S]Met that had been added to visualize the protein fluorographically (Fig. 3b, peptide E). These results demonstrate that an N-terminal subdomain of luciferase, corresponding to residues 1–190, folds during translation. The same domain is preserved in an unfolding intermediate of luciferase.

The 22 kDa subdomain is not observed during chaperone-mediated refolding
Luciferase denatured by urea or GdmCl refolds efficiently in a chaperone-dependent manner14,20. Upon dilution into reticulocyte lysate, denatured luciferase is bound by endogenous chaperones and refolds in an ATP-dependent process. Since luciferase folding during translation occurs in a domainwise manner, we examined whether chemically denatured luciferase refolds following a similar pathway. Refolding was monitored by the

appearance of both protease-resistant domains and enzymatic activity. As previously reported, refolding in this system occurred with slower kinetics than luciferase folding during translation18. The analysis of the renaturation reaction by proteinase K treatment yielded resistant full-length luciferase with kinetics that corresponded to the formation of enzymatically active enzyme (Fig. 4a). Notably, although the protein refolded very efficiently to the native state (Fig. 4a), we failed to detect any folding intermediate with the protease resistance pattern observed during unfolding or de novo folding, whether measured by western blot analysis (Fig. 4b; see right panel for western blot of unfolding intermediates) or using radiolabeled luciferase (Fig. 4c). The absence of a detectable 22 kDa intermediate was not due to competition by the lysate proteins for proteinase K digestion, since luciferase is completely digested at the early time-points of the refolding reaction or in the absence of added ATP (Fig. 4). Overexposure of the refolding kinetics using radiolabeled luciferase provided insight into the folding intermediates generated during the in vitro reaction. As shown in the right panel in Fig. 4c, this revealed a complex pattern of protease-resistant products, including one at 22 kDa, that decreased as luciferase folded. The small proportion of luciferase chains that generate protease-resistant species ( […]

[…] folding intermediates during biosynthetic folding and thus for the rapid speed of luciferase folding during translation. Our observations stress the significance of co-translational domainwise folding as a mechanism to control the error rate of the folding process.

The recent crystal structure of luciferase (Fig. 5) allows us to correlate the intermediates present in the de novo folding and unfolding pathways with the final structure. Luciferase contains a large N-terminal domain (residues 1–435) and a small C-terminal domain (residues 441–544). Upon chemical denaturation, an unfolding intermediate forms at a GdmCl concentration of 250 mM. Enzymatic activity is rapidly lost at this point, accompanied by a decrease in the fluorescence intensity of the two tryptophan residues of luciferase (at positions 417 and 426) to an intermediate level, in good agreement with the findings of Seckler and colleagues17. This unfolding intermediate retains two compact structures: the 22 kDa N-terminal portion of the large N-terminal domain and the 13 kDa C-terminal domain. Analysis of the luciferase crystal structure indicates that, within the N-terminus, amino acids 1–190 are folded into a compact subdomain (indicated in red in Fig. 5a)11. Although residues 4–20 are unstructured in the crystal, our data suggest that this segment is part of the N-terminal subdomain (Fig. 2c,d and Table 1). Residues 189–191 (cyan in Fig. 5a) connect this subdomain to the rest of the protein and form a loop that would be easily accessible to proteases. Residues 192–435 of the N-domain apparently form a less stable structure that is unfolded at lower concentrations of denaturant. We did not observe co-translational folding of this part of the domain, although in some experiments the kinetic analysis of translations revealed a ~50 kDa protease-resistant band (see Fig. 1a). Since the amounts of this fragment varied between translations and did not decrease as the full-length protein folded, it is most likely derived from prematurely released chains.

Our data suggest that the C-terminal domain comprises amino acids 422–544 (magenta in Fig. 5b), in contrast to the crystal structure where residues 422–435 are assigned to the N-terminal aspect of the interface between N- and C-terminal domains (Fig. 5b). However, this region is not well defined in the structure and there is no structural information for amino acids 435–441 and 523–529 (indicated as a black ribbon in Fig. 5b). Importantly, the assignment of amino acid 422 as the boundary of the C-terminal domain is consistent with the Trp fluorescence data. There are two Trp residues in luciferase separated by a stretch of 10 amino acids. One of them (Trp 426, magenta in Fig. 5b) is located within the observed 13 kDa protease-resistant C-terminal domain. The other one (Trp 417, cyan in Fig. 5b) would be located in the linker region. Interestingly, the unfolding intermediate that retains the 13 kDa domain also retains a considerable degree of native-like Trp fluorescence at 336 nm, and the decrease in native fluorescence between 1.5 and 2 M GdmCl correlates with the unfolding of the 13 kDa domain.

[…] 1,000 times slower than folding during translation17. The complete polypeptide apparently forms an unproductive intermediate(s) that can be destabilized by detergents or, more efficiently, by molecular chaperones. While this may explain the efficient refolding of luciferase in the presence of chaperones, this reaction is still 8–10 times slower than co-translational folding. Importantly, the difference in luciferase folding intermediates observed during translation and during refolding may explain previously observed differences in the chaperone requirements of these two processes20.

The possibility of co-translational folding may have influenced the folding pathways of many multidomain proteins by establishing a hierarchy in the folding of the individual domains of a protein. Indeed, it is striking that during translation, the majority of the luciferase chains transit through an intermediate containing the folded N-terminal subdomain (Fig. 1)18. In contrast, the chaperone-mediated refolding of the chemically denatured full-length protein appears to proceed through an ensemble of different conformations, most of which may be kinetically trapped. The co-translational formation of the N-terminal domain may provide a scaffold for further folding, thereby preventing the formation of kinetic traps and facilitating rapid folding. It has been observed that when expressed individually, domains often refold spontaneously and with high efficiency21. However, the same domain integrated within the complete polypeptide may be unable to refold, ultimately leading to aggregation. For example, the 89 kDa protein aspartokinase-homoserine dehydrogenase possesses an N-terminal domain of 45 kDa that contains the kinase activity. If analyzed individually, this N-terminal domain refolds spontaneously with good kinetics21. However, renaturation of the complete protein under the same conditions is considerably slower and more inefficient, and does not yield kinase activity21, suggesting that unfavorable interactions with the rest of the protein can occur during folding of a domain.

De novo folding of luciferase in reticulocyte lysate has been described to involve the chaperones Hsp70/Hsp40 and the chaperonin TRiC (also known as CCT)18. The participation of TRiC may have to do with the fact that the large N-terminal domain of luciferase, containing the co-translationally folding subdomain 1–190, does not fold before completion of synthesis. Nascent luciferase may thus expose a significant number of hydrophobic residues that may be recognized by TRiC. More generally, domainwise folding of modular polypeptides may be independent of the chaperonin10. In contrast, stabilization of nascent chains by Hsp70/Hsp40 may be important for the de novo folding of many multidomain proteins at early times during translation when the structural information to fold a domain is not yet available. Our results suggest that for multidomain proteins, the folding
intermediates observed during in vitro refolding may not reflect Our results demonstrate that the compact, protease-resistant how the protein folds during translation. The concept that effiN-terminal subdomain of luciferase (amino acids 1–190) is not cient folding of some multidomain proteins may critically nature structural biology • volume 6 number 7 • july 1999


Table 1 MALDI-TOF MS analysis of the peptides obtained from the 22 kDa fragment

Peptide   Measured (m/z)   Calculated (MH+)     Δ Da (p.p.m.)   Residue position in sequence   3H-Leu
A         974.454          974.480              -0.026 (27)     70–77                          Yes
B         839.505          839.572              0.067 (79)      142–148                        Yes
C         2,155.041        2,155.082            -0.041 (20)     9–28                           Yes
D         2,027.070        2,026.987            0.083 (40)      10–28                          Yes
E         2,022.005        2,022.058 (Metox)    -0.053 (26)     113–130                        Yes
E'        2,006.062        2,006.063            -0.001 (1)      113–130                        Yes
F         3,937.537        3,937.738 (Metox)    0.201 (51)      156–188                        ND
G         4,165.087        4,164.860            0.227 (53)      156–190                        ND
H         3,921.642        3,921.738            -0.096 (24)     156–188                        Yes


The tryptic peptides A–H were purified as shown in Fig. 2c and analyzed using a REFLEX III instrument as described in Methods. The mass/charge ratios (m/z) were used to assign the sequence for all the peptides and corresponded closely to the calculated monoisotopic mass (MH+). Edman degradation sequence analysis of the N-termini of these peptides confirmed the amino acid sequence deduced from the MALDI-TOF MS analysis.
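The mass errors listed in Table 1 can be recomputed from the measured and calculated values. The following is a minimal sketch, not the authors' procedure: the sign convention (measured minus calculated) and the use of the calculated mass as the p.p.m. denominator are assumptions, and the names `TABLE_1` and `mass_error` are hypothetical.

```python
# Values transcribed from Table 1: (peptide, measured m/z, calculated MH+).
TABLE_1 = [
    ("A", 974.454, 974.480),
    ("B", 839.505, 839.572),
    ("C", 2155.041, 2155.082),
    ("D", 2027.070, 2026.987),
    ("E", 2022.005, 2022.058),
    ("E'", 2006.062, 2006.063),
    ("F", 3937.537, 3937.738),
    ("G", 4165.087, 4164.860),
    ("H", 3921.642, 3921.738),
]


def mass_error(measured: float, calculated: float) -> tuple[float, float]:
    """Return (difference in Da, absolute error in p.p.m.).

    Assumed convention: difference = measured - calculated, and the
    p.p.m. value is the absolute difference relative to the calculated mass.
    """
    delta = measured - calculated
    ppm = abs(delta) / calculated * 1e6
    return delta, ppm


if __name__ == "__main__":
    for name, measured, calculated in TABLE_1:
        delta, ppm = mass_error(measured, calculated)
        print(f"{name:<3} {delta:+.3f} Da  {ppm:5.1f} p.p.m.")
```

Under these assumptions the recomputed p.p.m. values agree with the tabulated ones to within about 2 p.p.m. With this sign convention the differences for peptides B and F come out negative, whereas the table lists them without a sign.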

depend on a co-translational mechanism may help in optimizing the large-scale production of proteins of medical interest. In addition, a number of diseases have been shown to be associated with defects in protein folding22. Importantly, some mutations may specifically destabilize the formation of productive folding intermediates, rather than the native state itself23,24. Our finding that different folding pathways may be followed by a protein during translation or during in vitro refolding underscores the importance of characterizing the pathways of protein folding in vivo and of understanding their relationship to the folding pathways observed in vitro.

Methods

Translations, ribosome isolation and luciferase assay. Firefly luciferase RNA was transcribed from plasmid pGEM-luc (Promega, Madison, Wisconsin) and translated in 50% nuclease-treated reticulocyte lysate (Promega) in the presence of [35S]methionine (0.8 mCi ml⁻¹) as recommended by the manufacturer. The reaction volume was adjusted with buffer A (20 mM HEPES-KOH, pH 7.4, 100 mM KAcO, 5 mM Mg(AcO)2, 0.5 mM EDTA, 1 mM dithiothreitol (DTT)). Where indicated, translations were synchronized by addition of 75 mM aurintricarboxylic acid (ATCA) 3 min after initiation of translation by the addition of luciferase mRNA. Unless otherwise indicated, translation was stopped at the indicated times by threefold dilution of aliquots into ice-cold buffer B (25 mM Tris-phosphate pH 7.8, 2 mM trans-1,2-cyclohexanediaminetetraacetate (CDTA) and 1 mg ml⁻¹ BSA) containing 2 mM Met and either 2 mM cycloheximide or 2 mM puromycin. Luciferase activity was determined at 25 °C. Amounts of total translation products and of full-length luciferase were determined by Phosphorimager quantitation. The first 270 residues of luciferase contain 11 out of the 14 methionine residues of the protein.
Specific enzyme activities (SEA) were calculated from the amounts of full-length protein based on the specific radioactivity of [35S]Met in the reaction mixture. The SEA of translated luciferase was equivalent to that of a purified protein standard. To obtain C-terminal deletions of luciferase, pGEM-luc was linearized with EcoNI (generating a polypeptide comprising amino acids 1–541), with EcoRI (1–196) or with BbvI (1–122). After 40 min, translation was stopped by addition of puromycin and RNase A and the reaction was analyzed by protease treatment. Protease treatment (proteinase K or trypsin) was performed essentially as described18. Briefly, samples were incubated for 10 min at 4 °C with the indicated concentration of protease, and the reaction was stopped by the addition of PMSF to 1 mM.

Isolation of ribosome–nascent chain complexes. A 50 µl translation reaction (20 min at 30 °C) was diluted threefold in buffer C (buffer A, 5 mM Mg(AcO)2, 0.5 mM cycloheximide, 1 mM DTT, 1 U ml⁻¹ RNAsin, 1 mM PMSF, 2 mg ml⁻¹ aprotinin, 0.5 mg ml⁻¹ leupeptin). The reaction mixture was spun through a 350 µl cushion of 35% sucrose in buffer C in a TL120 rotor for 10 min (120,000 r.p.m., 4 °C). The ribosomal pellet was resuspended in buffer A containing puromycin, separated into equal aliquots and treated with proteinase K or trypsin as described.

In vitro unfolding and refolding. Firefly luciferase was purchased from Sigma or expressed in E. coli as described14,20. The E. coli expression system was used to obtain purified, 35S-labeled luciferase. The pure protein (57 mg ml⁻¹) was denatured by incubation in buffer A-5 mM DTT containing varying concentrations of GdmCl, for 1 h at 25 °C. Similar results were obtained with urea as the denaturant. An aliquot of 1 ml was used to determine enzymatic activity, one-half of the reaction was treated with proteinase K (10 mg ml⁻¹) and the intrinsic Trp fluorescence was determined in the other half. The pattern of proteinase K-resistant fragments was visualized by SDS-PAGE followed by silver staining. Trp fluorescence was excited at 295 nm and was recorded between 300 and 380 nm (at a rate of 1 nm (0.5 s)⁻¹), using quartz microcuvettes and a Spex Fluorimeter. Background fluorescence, measured for each GdmCl concentration, was subtracted from the recorded spectra. To analyze refolding, purified luciferase was denatured in 6 M GdmCl, 5 mM DTT, 50 mM HEPES-KOH, pH 7.4, for 30 min at 30 °C and diluted 100-fold into reticulocyte lysate (Promega) that had been desalted in buffer A to remove ATP14,20. After removing protein aggregates by centrifugation (20,000 g for 15 min at 4 °C), refolding was initiated by addition of 1 mM ATP and an ATP-regenerating system (50 mg ml⁻¹ creatine kinase, 8 mM phosphocreatine), as described14,20. At different time points the reactions were stopped on ice by threefold dilutions of aliquots into ice-cold buffer B. Activity was measured as described, and one-half of each reaction was treated with proteinase K (10 mg ml⁻¹) at 4 °C.
Proteolysis products were visualized by either fluorography (in the case of the 35S-labeled luciferase) or by western blot analysis using a rabbit polyclonal antiserum raised against luciferase.

Peptide analysis and sequence determination. To identify the 22 kDa fragment of luciferase, the protein band was excised from a nitrocellulose blot and processed for internal amino acid sequence analysis as described25. Briefly, in-situ proteolytic cleavage was done with 0.2 µg trypsin (Promega) in 25 µl 100 mM NH4HCO3 (supplemented with 1% Zwittergent 3-16) at 37 °C for 2 h. The resulting peptide mixture was reduced and alkylated with 0.1% β-mercaptoethanol (BioRad) and 0.3% 4-vinyl pyridine (Aldrich), respectively, and fractionated by reversed-phase HPLC. An enzyme blank was analyzed on an equally sized strip of nitrocellulose. HPLC solvents and system configuration were as described26 except that a 2.1 mm column (214 TP54 Vydac C4; Separations Group) was used with gradient elution at a flow rate of 100 µl min⁻¹. Trp-containing peptides were identified by ratio analysis of absorbances at 297 and 277 nm, monitored in real time using an Applied Biosystems model 1000S diode-array detector27. Fractions were collected by hand, kept on ice, and stored at -70 °C before analysis. Purified peptides were analyzed by a combination of automated Edman degradation and MALDI-TOF MS. After storage, column fractions were supplemented with neat TFA (to give a final concentration of 10%) before loading onto the sequencer disks. Mass analysis (on 0.1 ml aliquots) was carried out using a REFLEX III instrument (Bruker-Franzen) with α-cyano-4-hydroxycinnamic acid (Hewlett-Packard) as the matrix in the presence and absence of three peptide calibrants (6.25 fmol each of APID, P8930 and Ho1017; with respective monoisotopic masses of 2,108.155 Da, 1,307.762 Da and 969.539 Da in the protonated form) as described28. Peptide monoisotopic masses were calculated using ProComp version 1.2 software (P.C. Andrews, University of Michigan, Ann Arbor). Voltages used were 25 kV ion acceleration, 26.5 kV reflector and 1.4 kV multiplier. For chemical sequencing (on 95% of the sample) we used a model 477A instrument from AB. Stepwise liberated PTH-amino acids were identified using an ‘on-line’ 120A HPLC system (AB) equipped with a PTH C18 (2.1 × 220 mm, 5 µm particle size) column (AB). Instruments and procedures were optimized for femtomole-level phenyl thiohydantoin amino acid analysis as described27. For radiosequencing, the 477A instrument was disconnected from the LC system, and the amino acid derivatives after each cycle collected for scintillation counting.
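As a rough plausibility check on the fragment assignments in the Discussion (a 22 kDa fragment assigned to residues 1–190 and a 13 kDa fragment assigned to residues 422–544), the expected mass of a segment can be estimated from its length. The sketch below is illustrative only: the average residue mass of 110 Da is a common rule of thumb rather than a value from this paper, and the function name `estimated_mass_kda` is hypothetical.

```python
# Assumption: ~110 Da per residue, a rule-of-thumb average (not from the paper).
AVG_RESIDUE_MASS_DA = 110.0

# Segment boundaries and apparent fragment sizes as stated in the text.
SEGMENTS = {
    "N-terminal subdomain": (1, 190, 22.0),
    "C-terminal domain": (422, 544, 13.0),
}


def estimated_mass_kda(first: int, last: int,
                       avg_residue_mass: float = AVG_RESIDUE_MASS_DA) -> float:
    """Estimate the mass (kDa) of residues first..last, inclusive."""
    n_residues = last - first + 1
    return n_residues * avg_residue_mass / 1000.0


if __name__ == "__main__":
    for name, (first, last, observed_kda) in SEGMENTS.items():
        estimate = estimated_mass_kda(first, last)
        print(f"{name}: residues {first}-{last}, "
              f"estimated {estimate:.1f} kDa, observed ~{observed_kda:.0f} kDa")
```

With these assumptions the estimates are about 20.9 kDa and 13.5 kDa, which is in the same range as the apparent fragment sizes. The estimate ignores composition-dependent mass differences and the limited accuracy of sizes read from SDS-PAGE.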

1. Anfinsen, C.B. Principles that govern the folding of protein chains. Science 181, 223–230 (1973).
2. Creighton, T.E. Protein folding. (W.H. Freeman, New York; 1992).
3. Jaenicke, R. Protein folding: local structures, domains, subunits, and assemblies. Biochemistry 30, 3147–3160 (1991).
4. Dill, K.A. & Chan, H.S. From Levinthal to pathways to funnels. Nature Struct. Biol. 4, 10–19 (1997).
5. Dobson, C.M., Sali, A. & Karplus, M. Protein-folding: a perspective from theory and experiment. Angew. Chem. Int. Edn. Engl. 37, 868–893 (1998).
6. Gething, M.-J. & Sambrook, J. Protein folding in the cell. Nature 355, 33–45 (1992).
7. Hartl, F.U. Molecular chaperones in cellular protein folding. Nature 381, 571–579 (1996).
8. Ellis, R.J. The “bio” in biochemistry: protein folding inside and outside the cell. Science 272, 1448–1449 (1996).
9. Bukau, B. & Horwich, A.L. The Hsp70 and Hsp60 chaperone machines. Cell 92, 351–366 (1998).
10. Netzer, W. & Hartl, F. Recombination of protein domains facilitated by cotranslational folding in eukaryotes. Nature 388, 343–349 (1997).
11. Conti, E., Franks, N.P. & Brick, P. Crystal structure of firefly luciferase throws light on a superfamily of adenylate-forming enzymes. Structure 4, 287–298 (1995).
12. McNew, J.A. & Goodman, J.M. The targeting and assembly of peroxisomal proteins: some old rules do not apply. Trends Biochem. Sci. 21, 54–58 (1996).
13. deWet, J.R., Wood, K.V., DeLuca, M., Helinsky, D.R. & Subramani, S. Firefly luciferase gene: structure and expression in mammalian cells. Mol. Cell. Biol. 7, 725–737 (1987).
14. Nimmesgern, E. & Hartl, F.U. ATP-dependent protein refolding activity in reticulocyte lysate. Evidence for the participation of different chaperone components. FEBS Lett. 331, 25–30 (1993).
15. Schumacher, R.J. et al. ATP-dependent chaperoning activity of reticulocyte lysate. J. Biol. Chem. 269, 9493–9499 (1994).
16. Freeman, B.C., Myers, M.P., Schumacher, R. & Morimoto, R.I. Identification of a regulatory motif in Hsp70 that affects ATPase activity, substrate-binding and interaction with Hdj-1. EMBO J. 14, 2281–2292 (1995).
17. Herbst, R., Schafer, U. & Seckler, R. Equilibrium intermediates in the reversible unfolding of firefly (Photinus pyralis) luciferase. J. Biol. Chem. 272, 7099–7105 (1997).
18. Frydman, J., Nimmesgern, E., Ohtsuka, K. & Hartl, F.U. Folding of nascent polypeptide chains in a high molecular mass assembly with molecular chaperones. Nature 370, 111–117 (1994).
19. Kolb, V.A., Makeyev, E.V. & Spirin, A.S. Folding of firefly luciferase during translation in a cell-free system. EMBO J. 13, 3631–3637 (1994).
20. Schneider, C. et al. Pharmacological shifting of a balance between protein refolding and degradation mediated by Hsp90. Proc. Natl. Acad. Sci. USA 93, 14536–14541 (1996).
21. Garel, J.R. In Protein folding (ed. Creighton, T.E.) 405–454 (W.H. Freeman, New York; 1992).
22. Taubes, G. Misfolding the way to disease. Science 271, 1493–1495 (1996).
23. Mitraki, A., Fane, B., Haasepettingell, C., Sturtevant, J. & King, J. Global suppression of protein folding defects and inclusion body formation. Science 253, 54–58 (1991).
24. Qu, B.H. & Thomas, P.J. Alteration of the cystic-fibrosis transmembrane conductance regulator folding pathway: effects of the delta-F508 mutation on the thermodynamic stability and folding yield of NBD1. J. Biol. Chem. 271, 7261–7264 (1996).
25. Lui, M., Tempst, P. & Erdjument-Bromage, H. Methodical analysis of protein–nitrocellulose interactions to design a refined digestion protocol. Anal. Biochem. 241, 156–166 (1996).
26. Elicone, C., Lui, M., Geromanos, S., Erdjument-Bromage, H. & Tempst, P. Microbore reversed-phase high-performance liquid chromatographic purification of peptides for combined chemical sequencing / laser-desorption mass spectrometric analysis. J. Chromatogr. 676, 121–137 (1994).
27. Erdjument-Bromage, H., Lui, M., Sabatini, D.M., Snyder, S.H. & Tempst, P. High-sensitivity sequencing of large proteins: partial structure of the rapamycin-FKBP12 target. Protein Sci. 3, 2435–2446 (1994).
28. Tempst, P. et al. In Mass spectrometry in the biological sciences (eds Burlingame, A.L. & Carr, S.A.) 105–133 (Humana Press, Totowa, New Jersey; 1996).


Acknowledgments

J.F. is supported by an NIH grant. P.T. is supported by an NSF grant and an NCI grant to the Sloan-Kettering Structural Chemistry Laboratory.

Received 9 October, 1998; accepted 20 January, 1999.
