Proc. Natl. Acad. Sci. USA Vol. 83, pp. 8594-8598, November 1986 Biophysics
Identification of two anti-parallel β-sheet conformations in the solution structure of murine epidermal growth factor by proton magnetic resonance
(NMR determination of protein structure/sequence-specific NMR assignments/oncogenesis/transforming growth factor/molecular design)
G. T. MONTELIONE†‡, K. WÜTHRICH‡, E. C. NICE§, A. W. BURGESS§, AND H. A. SCHERAGA†¶
†Baker Laboratory of Chemistry, Cornell University, Ithaca, NY 14853-1301; ‡Institut für Molekularbiologie und Biophysik, Eidgenössische Technische Hochschule-Hönggerberg, CH-8093 Zurich, Switzerland; and §Ludwig Institute for Cancer Research, Melbourne Tumour Biology Branch, Victoria 3050, Australia
Contributed by H. A. Scheraga, August 7, 1986
ABSTRACT Epidermal growth factor (EGF) is a small mitogenic protein. Proteins with sequence homology with EGF or with its membrane-bound protein receptor have been proposed to play a role in oncogenesis. This report describes solution NMR data that provide evidence that the solution conformation of murine EGF includes an anti-parallel β-sheet structure involving residues S2-P4, V19-I23, and S28-N32; a small anti-parallel β-sheet involving residues Y37-S38 and T44-R45; and a multiple-bend (or short irregular helix) structure for residues C6-C14 that is disulfide bonded to the V19-I23/S28-N32 β-sheet. Implications of these results for structure and function studies of EGF and for molecular design of EGF and homologous α-type transforming growth factors are discussed.

The murine epidermal growth factor (EGF) is a small 53-residue protein containing three disulfide bonds (1, 2). EGF stimulates cell proliferation in vitro and in vivo, and human EGF (urogastrone) also functions physiologically to inhibit gastric acid secretion (3). EGF has sequence (4) and functional (5) homology with α-type transforming growth factor (TGF-α), which has been isolated from virus-transformed mammalian cells (6) and from certain human melanoma cell lines (4, 7). TGF-α competes with ¹²⁵I-labeled EGF for EGF receptor sites (6, 7). Several larger proteins also have polypeptide sequences homologous to EGF (8). The EGF receptor protein has amino acid sequence homology with the v-erbB oncogene protein of avian erythroblastosis virus (9). On the basis of these and other observations, EGF and TGF-α have been proposed to play an important role in the molecular mechanisms controlling normal cell growth, oncogenesis (10, 11), and wound healing (12).

No crystal structure of EGF is available, but there are ¹H NMR data (13, 14) that indicate the presence of compact backbone structure, possibly corresponding to a "tiered β-sheet" (14). Its small size makes EGF a good candidate for complete solution structure determination by NMR spectroscopy (15). Such structural data could provide guidance for molecular design of EGF and TGF-α analogs. This paper is a preliminary report on two-dimensional (2D) NMR studies of the structure and dynamics of murine EGF in solution.

MATERIALS AND METHODS
Murine EGF (type α1) from male submaxillary glands was purified by methods described elsewhere (16, 17). These preparations were rechromatographed and found to be >99% homogeneous by both C18 reversed-phase and Pharmacia Mono-Q anion-exchange (pH 6.5) analytical HPLC using conditions identical to those described previously (16, 17). The amino acid composition of these preparations, subsequent to acid hydrolysis [Asx, 7.4 (7); Glx, 3.4 (3); Ser, 6.6 (6); Gly, 5.8 (6); His, 1.1 (1); Arg, 3.8 (4); Thr, 2.2 (2); Ala, 0.1 (0); Pro, 2.0 (2); Tyr, 4.5 (5); Val, 2.2 (2); Met, 0.9 (1); Cys, 6.1 (6); Ile, 2.2 (2); Leu, 3.8 (4); Phe, 0.1 (0); Lys, 0.1 (0)], is consistent with the published sequence (2). Although preliminary studies revealed chemical instability of murine EGF at neutral pH, analytical Mono-Q HPLC subsequent to NMR measurements at slightly acidic pH revealed negligible chemical decomposition.

Samples for NMR spectroscopy were prepared at 6-7 mM protein concentration at pH 3.0 ± 0.1. Protein concentration was determined using ε278 = 19,000 cm⁻¹ M⁻¹ (unpublished result). For ²H₂O solutions, the pH meter reading (pH*) is reported without correction for isotope effects. For measurements in ²H₂O after exchange of all labile protons, EGF solutions in ²H₂O were first heated to 55°C for 5 min at pH* 3.0 to accelerate hydrogen/deuterium exchange and then lyophilized and redissolved in 99.9% ²H₂O at pH* 3.0 ± 0.1. For measurements in H₂O the protein was dissolved directly in 85% H₂O/15% ²H₂O. Slowly exchanging amide protons were identified in freshly prepared EGF solutions in ²H₂O.

¹H-NMR spectra were obtained at 28 ± 1°C on Bruker spectrometers at 500, 360, and 300 MHz. Two-dimensional two-quantum-filtered correlation spectroscopy (2QF-COSY) (18), 2D nuclear Overhauser effect spectroscopy (NOESY) (19, 20), 2D relayed coherence-transfer spectroscopy (RELAYED-COSY) (21), and 2D two-quantum spectroscopy (2Q-spectroscopy) (22) were carried out as described elsewhere. For measurements in H₂O the solvent resonance was suppressed by selective saturation (23). All 2D-NMR data sets were obtained with time-proportional phase incrementation (24, 25) and transformed as phase-sensitive spectra.

Throughout the text, dAB(i,j) designates the distance between proton types A and B located, respectively, in the amino acid residues i and j, where N, α, and β denote the amide protons, CαH, and CβH, respectively. Examples used in the text are dαN(i,i+2), dαN(i,i+3), dNN(i,i+2), and the sequential distances dαN ≡ dαN(i,i+1), dNN ≡ dNN(i,i+1), and dβN ≡ dβN(i,i+1).
Abbreviations: EGF, epidermal growth factor; TGF-α, α-type transforming growth factor; 2D, two-dimensional; NOE, nuclear Overhauser effect; NOESY, 2D nuclear Overhauser effect spectroscopy; RELAYED-COSY, 2D relayed coherence-transfer spectroscopy; 2Q-spectroscopy, 2D two-quantum spectroscopy; 2QF-COSY, 2D two-quantum-filtered correlation spectroscopy.
¶To whom reprint requests should be addressed.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
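The consistency check between the amino acid analysis and the published sequence, reported in MATERIALS AND METHODS, can be sketched in a few lines. This is only an illustration: the one-letter sequence below is not given in this text and is typed in here as an assumption (the 53-residue murine EGF sequence of ref. 2), and the names `SEQUENCE`, `MEASURED`, `GROUPS`, `expected_composition`, and `largest_deviation` are hypothetical. Tryptophan is not listed in the measured composition and is therefore left out of the comparison.

```python
from collections import Counter

# ASSUMPTION: murine EGF sequence (ref. 2), typed in for illustration only;
# it does not appear in the text of this paper.
SEQUENCE = "NSYPGCPSSYDGYCLNGGVCMHIESLDSYTCNCVIGYSGDRCQTRDLRWWELR"

# Measured residues per molecule after acid hydrolysis (from the text).
MEASURED = {
    "Asx": 7.4, "Glx": 3.4, "Ser": 6.6, "Gly": 5.8, "His": 1.1, "Arg": 3.8,
    "Thr": 2.2, "Ala": 0.1, "Pro": 2.0, "Tyr": 4.5, "Val": 2.2, "Met": 0.9,
    "Cys": 6.1, "Ile": 2.2, "Leu": 3.8, "Phe": 0.1, "Lys": 0.1,
}

# Acid hydrolysis converts Asn to Asp and Gln to Glu, hence the pooled
# Asx and Glx entries.
GROUPS = {
    "Asx": "DN", "Glx": "EQ", "Ser": "S", "Gly": "G", "His": "H", "Arg": "R",
    "Thr": "T", "Ala": "A", "Pro": "P", "Tyr": "Y", "Val": "V", "Met": "M",
    "Cys": "C", "Ile": "I", "Leu": "L", "Phe": "F", "Lys": "K",
}


def expected_composition(sequence):
    """Count residues per analysis group from a one-letter sequence."""
    counts = Counter(sequence)
    return {name: sum(counts[letter] for letter in letters)
            for name, letters in GROUPS.items()}


def largest_deviation(measured, expected):
    """Return the group with the largest |measured - expected| and that value."""
    name = max(measured, key=lambda k: abs(measured[k] - expected[k]))
    return name, abs(measured[name] - expected[name])


expected = expected_composition(SEQUENCE)
worst_name, worst_diff = largest_deviation(MEASURED, expected)
print("residues:", len(SEQUENCE))
print("Met position:", SEQUENCE.index("M") + 1)
print("largest deviation:", worst_name, round(worst_diff, 1))
```

Under the assumed sequence, the integer counts returned by `expected_composition` are the values given in parentheses in the composition list, and the single methionine falls at position 21, as stated in the DISCUSSION.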
RESULTS
The experimental data presented in this paper deal primarily with the identification of regular backbone structure in the solution conformation of EGF. Using established methods (15, 26-28), sequential nuclear Overhauser effects (NOEs) observed in NOESY spectra recorded with τm = 200 ms (τm, mixing time in NOESY) were used to determine the sequence-specific ¹H-NMR assignments (unpublished results). These assignments form the basis for the present structural interpretation of the NMR data. In determining these sequence-specific resonance assignments, no inconsistencies with the published amino acid sequence were noted, although several weak resonances remain unassigned. These may correspond to protein chemical heterogeneity that is not resolved by HPLC, or may arise from multiple conformations of the EGF molecule in slow exchange on the proton chemical shift time scale.

As in previous work on other proteins (29-32), several NMR criteria were used for identifying regular backbone structures in EGF. Most important among these were the short backbone-proton-backbone-proton distances identified by NOESY. The short-range NMR data used for this conformational analysis are summarized in Fig. 1.

An empirical calibration of NOESY-cross-peak intensity versus backbone-proton-backbone-proton distance in EGF was obtained from the following considerations. Of the 44 sequential distances dαN (excluding those involving glycine αHs), which may range from 2.2 to 3.6 Å (27), 30 gave rise to clearly observable NOEs in the NOESY spectrum recorded with τm = 65 ms. The presence or absence of 11 cross peaks could not be determined unambiguously because of spectral overlap or other technical reasons. The remaining three NOEs corresponding to dαN(13,14), dαN(40,41), and dαN(43,44), which were not observed with τm = 65 ms, could be identified in a NOESY spectrum recorded with τm = 200 ms.
In addition, of the 47 sequential distances dNN, which can range from 2.0 to 4.8 Å (27), only 18 gave rise to observable NOEs in the 65 ms NOESY spectrum. We estimate from these observations that backbone-proton-backbone-proton distances ≲3.5 Å give rise to observable NOEs in the NOESY spectrum recorded with τm = 65 ms. Using this approximate upper bound for the proton-proton backbone distances associated with the τm = 65 ms NOESY
cross peaks, extended polypeptide backbone conformations were identified using the established criteria (33) that these sequences contain continuous segments of short dαN distances. Independent additional evidence for extended backbone conformation is the presence of several successive spin-spin coupling constants ³JHNα > 8.0 Hz (34). Fig. 1 shows that these two criteria coincide for residues 2-3, 14-15, 19-24, 26-35, 37-38, 44-46, and 50-53. Indirect support for extended backbone conformations in these regions comes from the absence of NOEs indicating medium-range proton-proton distances ≲3.5 Å, except in the polypeptide sequence residues 24-27 (discussed below). Involvement of these extended backbone conformations in anti-parallel and parallel β-structures can be assessed only with additional evidence for short backbone-proton-backbone-proton distances between neighboring strands (33). As an illustration of the quality of the 2D-NMR spectra recorded with EGF, Fig. 2 shows an expansion of a NOESY spectrum in which the cross peaks arising from short d...

FIG. 1. ... ³JHNα > 8.0 Hz at 28°C are designated by circles (× identifies residues for which ³JHNα could not be measured for technical reasons). ³JHNα values for glycine are not included. Sequential NOEs indicating distances dαN and dNN < 3.5 Å are indicated, respectively, by vertically and horizontally shaded bars, where × again indicates that the sequential NOE could not be determined for technical reasons. Sequential NOEs involving glycine α protons are not included. NOEs corresponding to medium-range interproton distances dαN(i,i+2), dαN(i,i+3), and dNN(i,i+2) are also indicated. All NOE data indicated were obtained from a NOESY spectrum recorded in H₂O at pH 3.0 and T = 28°C with τm = 65 ms, except for dNN(i,i+2), which could be observed only with τm = 200 ms.
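The two criteria for extended backbone conformation described in the text (a continuous run of short sequential dαN distances, and several successive ³JHNα > 8.0 Hz) amount to finding runs of residues where both flags are set. The sketch below illustrates that bookkeeping only. The per-residue flags are hypothetical toy values, not the EGF data of Fig. 1; assigning each sequential NOE to a single residue index and the minimum run length of two are simplifying assumptions; and `extended_segments` is an illustrative name.

```python
def extended_segments(short_alpha_n, large_j, min_len=2):
    """Return (first, last) residue ranges, 1-based and inclusive, in which
    both criteria hold for at least `min_len` consecutive residues."""
    both = [a and j for a, j in zip(short_alpha_n, large_j)]
    segments = []
    start = None
    for idx, flag in enumerate(both, start=1):
        if flag and start is None:
            start = idx                      # a new run begins here
        elif not flag and start is not None:
            if idx - start >= min_len:       # run covered start .. idx-1
                segments.append((start, idx - 1))
            start = None
    if start is not None and len(both) - start + 1 >= min_len:
        segments.append((start, len(both)))  # run reaching the last residue
    return segments


# HYPOTHETICAL toy input for a 10-residue peptide (not EGF data):
# True where a short sequential d_alphaN NOE was seen / where 3J(HN,alpha) > 8.0 Hz.
toy_short_alpha_n = [False, True, True, False, False, True, True, True, True, False]
toy_large_j = [False, True, True, False, True, True, True, False, True, True]

print(extended_segments(toy_short_alpha_n, toy_large_j))
```

Whether such a segment belongs to a β-sheet still requires interstrand NOEs, as the text notes; the run search only flags candidate strands.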
FIG. 2. Spectral region (ω1 = 4.0-5.5 ppm, ω2 = 4.0-5.5 ppm) from a contour plot of an absorption-mode 500 MHz ¹H NOESY spectrum of murine EGF. The spectrum was recorded in ²H₂O at pH* 3.1, 28°C, 6 mM protein concentration, using a mixing time of 100 ms. The cross peaks manifesting short distances dαα(i,j) are identified with the corresponding amino acid sequence numbers. The ambiguity in assigning the NOE associated with dαα(37,45) could be resolved by the unambiguous identification of other long-range NOEs (see Fig. 4).
... with the postulated hydrogen bonds between the S2-P4 and V19-I23 β-strands. There are two disulfide bonds C6-C20 and C14-C31 in the β-sheet structure shown in Fig. 4A (35). The constraints imposed on the backbone conformation by these disulfide bonds were evaluated by construction of a mechanical molecular model. The presence of the C6-C20 disulfide bond requires a kink in the polypeptide backbone at G5, which is independently indicated also by a strong NOE arising from a short dNN(5,6) distance (Fig. 1). The presence of the disulfide bond C14-C31 requires a chain reversal in the C14-G18 segment. The sequential and medium-range NOEs in this segment (Fig. 1) indicate that N16-G17 form a β-bend. Slow exchange of the G18 amide proton further points to the presence of a backbone-backbone hydrogen bond within this β-bend.

A second short β-sheet structure involves the residues Y37-S38 and T44-R45 as anti-parallel strands (Fig. 2). A schematic planar drawing presenting the interstrand distance constraints with which this structure was determined is shown in Fig. 4B. Two interstrand hydrogen bonds account for the slow exchange of the S38 and T44 amide protons. The intervening polypeptide segment G39-Q43 adopts an incompletely characterized five-residue chain reversal to which the disulfide bond C33-C42 (35) is connected. A mechanical model was used to complete the backbone conformational analysis for the C33-D46 polypeptide segment. With the C33-C42 disulfide bond as a fixed distance constraint, hydrogen-bonded interactions between V34 and Y37 were identified (Fig. 4B), which account for the slow exchange of the amide protons in these residues (Fig. 1). A NOE manifesting a short dαN(35,37) distance (Fig. 1) and NOEs reflecting short contacts between the side chains of V34 and Y37 indicate that the chain reversal in the segment 34-37 includes a β-bend at I35-G36. Overall, the polypeptide segment C33-D46 thus appears to form a disulfide-constrained "double-hairpin" structure with a β-bend at I35-G36 and a five-residue chain reversal at G39-Q43. Beyond residue 46 the presently available data are not sufficient for a conformational characterization of the C-terminal segment residues 47-53.

A third structural element identified from backbone-proton-backbone-proton NOEs involves residues C6-C14. Within this polypeptide segment, residues S8-C14 exhibit a continuous network of sequential NOEs indicating short dNN distances (Fig. 1). In addition, three NOEs indicating short dαN(i,i+2) and dNN(i,i+2) distances are also observed (Fig. 1). This indicates that the residues C6-C14 adopt a multiple-bend backbone conformation. Possibly this could also be characterized as a distorted helix. However, neither the NOEs associated with short dαN(i,i+3) and dαβ(i,i+3) distances characteristic of α-helices (33) nor the slow exchange anticipated for hydrogen-bonded protons were observed (Fig. 1). The segment residues 6-14 are connected with the anti-parallel β-sheet V19-N32 by the disulfide bonds C6-C20 and C14-C31.

FIG. 3. Diagonal plot of backbone-proton-backbone-proton NOEs in EGF. Both axes are calibrated with the amino acid sequence, and each filled square indicates that at least one NOE was observed between the two connected residues. Sequential NOEs corresponding to short dαN and dNN distances are plotted above and below the diagonal, respectively. Nonsequential interresidue NOEs indicating short dαN(i,j), dNN(i,j), or dαα(i,j) distances are plotted symmetrically with respect to the diagonal. Most of these NOEs were observed at pH* 3.1, 28°C in a NOESY spectrum recorded with a mixing time of 65 ms in H₂O. The exceptions are that the dαα(i,j)-type NOEs were measured with a mixing time of 100 ms in a ²H₂O solution after complete exchange of all labile protons, dαN(39,42), dαN(39,43), and dαN(39,44) in a freshly prepared ²H₂O solution with a mixing time of 200 ms, and dNN(3,22) and dNα(23,29) in H₂O with a mixing time of 200 ms.
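The layout of the diagonal plot of Fig. 3 can be sketched as a boolean grid. The example is deliberately incomplete: it contains only the residue pairs named explicitly in the text and captions, not the full data set of Fig. 1 and Fig. 3. Treating "above the diagonal" as row index smaller than column index is an assumption about the plotting convention, and `diagonal_plot` and the list names are illustrative.

```python
N_RESIDUES = 53  # length of murine EGF, from the text

# Residue pairs named explicitly in the text and captions (subset only).
SEQ_ALPHA_N = [(13, 14), (40, 41), (43, 44)]   # sequential d_alphaN
SEQ_NN = [(5, 6)]                              # sequential d_NN
NONSEQUENTIAL = [(3, 22), (23, 29), (35, 37), (37, 45),
                 (39, 42), (39, 43), (39, 44)]


def diagonal_plot(n, seq_alpha_n, seq_nn, nonsequential):
    """Build an (n+1) x (n+1) grid (index 0 unused) marking observed NOEs.

    Sequential d_alphaN contacts go above the diagonal (row < column),
    sequential d_NN contacts below it, and nonsequential contacts are
    entered symmetrically.
    """
    grid = [[False] * (n + 1) for _ in range(n + 1)]
    for i, j in seq_alpha_n:
        low, high = sorted((i, j))
        grid[low][high] = True
    for i, j in seq_nn:
        low, high = sorted((i, j))
        grid[high][low] = True
    for i, j in nonsequential:
        grid[i][j] = True
        grid[j][i] = True
    return grid


grid = diagonal_plot(N_RESIDUES, SEQ_ALPHA_N, SEQ_NN, NONSEQUENTIAL)
filled = sum(cell for row in grid for cell in row)
print("filled squares:", filled)
```

Off-diagonal clusters far from the diagonal, such as the (3,22) and (23,29) entries, are what identify pairing between strands that are distant in sequence.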
DISCUSSION
From an analysis of circular dichroism spectra, Holladay et al. (36) estimated that murine EGF contains ~22% β-sheet and no α-helix, which is consistent with our NMR results. Based on NMR measurements Mayo (14) proposed a β-sheet structure involving residues 29-37 and 46-53 as anti-parallel strands. The presence of this large β-sheet structure is not confirmed by the present analysis of a more complete set of NMR data.
FIG. 4. Secondary structures in murine EGF. (A) Anti-parallel β-sheet S2-P4/V19-I23/S28-N32. (B) Anti-parallel β-sheet Y37-S38/T44-R45. Sequential NOEs and spin-spin couplings ³JHNα for these peptide segments are shown in Fig. 1. The slowly exchanging amide protons (Fig. 1) are indicated by boldfaced lettering, and postulated interstrand hydrogen bonds by wavy lines. Interstrand NOEs observed either in H₂O solution with a mixing time of 65 ms or in ²H₂O with τm = 100 ms are indicated by solid arrows, those which could be observed only with τm = 200 ms either in H₂O or in a freshly prepared ²H₂O solution by dotted arrows.
One aim of this structural study is to provide guidance for structure-function studies of EGF and homologous TGF-α. Although reports of biologically active synthetic fragments of EGF (37) and TGF-α (38) have appeared in the literature, the relationship between growth factor structure and receptor binding remains uncertain. EGFs and TGF-αs from different species have especially strong sequence homology in the C33-L47 peptide segment, for which a "double-hairpin" backbone conformation is indicated in the native solution structure of murine EGF (Fig. 4B). Synthetic fragments corresponding to analogs of the disulfide loop C33-C42 exhibit only weak binding affinity for EGF receptors on human cells (38). This could be explained with the hypothesis that the conformation of this segment is also important for the
binding affinity. In the C33-D46 "double hairpin" of the EGF solution conformation (Fig. 4B), the segment C33-C42 interacts with the Q43-D46 sequence, which includes backbone-backbone hydrogen-bonded interactions between S38 and T44. Since the isolated synthetic C33-C42 disulfide loop peptides are too short to form this local structure, the molecular model of Fig. 4B offers an explanation for the reduced binding affinity of such model peptides.

An important tool for structure-function studies of EGF and TGF-α will be genetic engineering using site-directed mutagenesis. It is well known that high yields of cloned protein from biosynthetic precursor can be obtained by CNBr cleavage, provided that the protein is devoid of methionines, except for sequence position -1 (39). Murine, human, guinea pig, and rat EGFs all contain a single methionine (40) at position 21. From the structure presented in Fig. 4A, we can now advise the molecular biologists that in genetically engineered EGF the single methionine should be replaced with an amino acid residue that conserves the β-sheet propensity of methionine-21.

We thank T. W. Thannhauser for carrying out the amino acid analyses. This work was supported by Grant GM-24893 from the National Institute of General Medical Sciences, Project 3.198-9.85 from the Schweizerischer Nationalfonds, Grant DMB84-01811 from the National Science Foundation, and Grant RR-01317 from the National Institutes of Health Research Resource for Multinuclear Magnetic Resonance at Syracuse University. Support was also received from the National Foundation for Cancer Research and from the Cornell Biotechnology Center.

1. Cohen, S. (1962) J. Biol. Chem. 237, 1555-1562.
2. Savage, C. R., Jr., Inagami, T. & Cohen, S. (1972) J. Biol. Chem. 247, 7612-7621.
3. Gregory, H. (1975) Nature (London) 257, 325-327.
4. Marquardt, H., Hunkapiller, M. W., Hood, L. E., Twardzik, D. R., De Larco, J. E., Stephenson, J. R. & Todaro, G. J. (1983) Proc. Natl. Acad. Sci. USA 80, 4684-4688.
5. Smith, J. M., Sporn, M. B., Roberts, A. B., Derynck, R., Winkler, M. E. & Gregory, H. (1985) Nature (London) 315, 515-516.
6. De Larco, J. E. & Todaro, G. J. (1978) Proc. Natl. Acad. Sci. USA 75, 4001-4005.
7. Todaro, G. J., Fryling, C. & De Larco, J. E. (1980) Proc. Natl. Acad. Sci. USA 77, 5258-5262.
8. Banyai, L., Varadi, A. & Patthy, L. (1983) FEBS Lett. 163, 37-41.
9. Downward, J., Yarden, Y., Mayes, E., Scrace, G., Totty, N., Stockwell, P., Ullrich, A., Schlessinger, J. & Waterfield, M. D. (1984) Nature (London) 307, 521-527.
10. Sporn, M. B. & Roberts, A. B. (1985) Nature (London) 313, 745-747.
11. Sporn, M. B. & Todaro, G. J. (1980) N. Engl. J. Med. 303, 878-880.
12. Buckley, A., Davidson, J. M., Kamerath, C. D., Wolt, T. B. & Woodward, S. C. (1985) Proc. Natl. Acad. Sci. USA 82, 7340-7344.
13. DeMarco, A., Menegatti, E. & Guarneri, M. (1983) FEBS Lett. 159, 201-206.
14. Mayo, K. H. (1984) Biochemistry 24, 3783-3794.
15. Wüthrich, K. (1986) NMR of Proteins and Nucleic Acids (Wiley, New York).
16. Burgess, A. W., Knesel, J., Sparrow, L. G., Nicola, N. A. & Nice, E. C. (1982) Proc. Natl. Acad. Sci. USA 79, 5753-5757.
17. Burgess, A. W., Lloyd, C. J. & Nice, E. C. (1983) EMBO J. 2, 2065-2069.
18. Piantini, U., Sørensen, O. W. & Ernst, R. R. (1982) J. Am. Chem. Soc. 104, 6800-6801.
19. Macura, S. & Ernst, R. R. (1980) Mol. Phys. 41, 95-117.
20. Otting, G., Widmer, H., Wagner, G. & Wüthrich, K. (1986) J. Magn. Reson. 66, 187-193.
21. Wagner, G. (1983) J. Magn. Reson. 55, 151-156.
22. Wagner, G. & Zuiderweg, E. R. P. (1983) Biochem. Biophys. Res. Commun. 113, 854-860.
23. Wider, G., Hosur, R. V. & Wüthrich, K. (1983) J. Magn. Reson. 52, 130-135.
24. Redfield, A. G. & Kunz, S. (1975) J. Magn. Reson. 19, 250-254.
25. Marion, D. & Wüthrich, K. (1983) Biochem. Biophys. Res. Commun. 113, 967-974.
26. Wüthrich, K. (1983) Biopolymers 22, 131-138.
27. Billeter, M., Braun, W. & Wüthrich, K. (1982) J. Mol. Biol. 155, 321-346.
28. Wagner, G. & Wüthrich, K. (1982) J. Mol. Biol. 155, 347-366.
29. Zuiderweg, E. R. P., Kaptein, R. & Wüthrich, K. (1983) Proc. Natl. Acad. Sci. USA 80, 5837-5841.
30. Williamson, M. P., Marion, D. & Wüthrich, K. (1984) J. Mol. Biol. 173, 341-359.
31. Kline, A. D. & Wüthrich, K. (1985) J. Mol. Biol. 183, 503-507.
32. Wagner, G., Neuhaus, D., Worgotter, E., Vasak, M., Kagi, J. H. R. & Wüthrich, K. (1986) J. Mol. Biol. 187, 131-135.
33. Wüthrich, K., Billeter, M. & Braun, W. (1984) J. Mol. Biol. 180, 715-740.
34. Pardi, A., Billeter, M. & Wüthrich, K. (1984) J. Mol. Biol. 180, 741-751.
35. Savage, C. R., Jr., Hash, J. H. & Cohen, S. (1973) J. Biol. Chem. 248, 7669-7672.
36. Holladay, L. A., Savage, C. R., Jr., Cohen, S. & Puett, D. (1976) Biochemistry 15, 2624-2633.
37. Komoriya, A., Hortsch, M., Meyers, C., Smith, M., Kanety, H. & Schlessinger, J. (1984) Proc. Natl. Acad. Sci. USA 81, 1351-1355.
38. Nestor, J. J., Jr., Newman, S. R., DeLuttro, B., Todaro, G. J. & Schreiber, A. B. (1985) Biochem. Biophys. Res. Commun. 129, 226-232.
39. Brousseau, R., Scarpulla, R., Sung, W., Hsiung, H. M., Narang, S. A. & Wu, R. (1982) Gene 17, 279-289.
40. Simpson, R. J., Smith, J. A., Moritz, R. L., O'Hare, M. J., Rudland, P. S., Morrison, J. R., Lloyd, C. J., Grego, B., Burgess, A. W. & Nice, E. C. (1985) Eur. J. Biochem. 153, 629-637.