articles
Structural basis of substrate recognition and specificity in the N-end rule pathway
© 2010 Nature America, Inc. All rights reserved.
Edna Matta-Camacho1,2, Guennadi Kozlov1,2, Flora F Li1 & Kalle Gehring1,2
The N-end rule links the half-life of a protein to the identity of its N-terminal residue. Destabilizing N-terminal residues are recognized by E3 ubiquitin ligases, termed N-recognins. A conserved structural domain called the UBR box is responsible for their specificity. Here we report the crystal structures of the UBR boxes of the human N-recognins UBR1 and UBR2, alone and in complex with an N-end rule peptide, Arg-Ile-Phe-Ser. These structures show that the UBR box adopts a previously undescribed fold stabilized through the binding of three zinc ions to form a binding pocket for type 1 N-degrons. NMR experiments reveal a preference for N-terminal arginine. Peptide binding is abrogated by N-terminal acetylation of the peptide or loss of the positive charge of the N-terminal residue. These results rationalize and refine the empirical rules for the classification of type 1 N-degrons. We also confirm that a missense mutation in UBR1 that is responsible for Johanson-Blizzard syndrome leads to UBR box unfolding and loss of function.
The selective degradation of many short-lived proteins in eukaryotic cells is carried out by the ubiquitin system1. Although prokaryotes and eukaryotes use distinct proteolytic machineries for degradation of substrates, recent findings indicate that they share common principles of substrate recognition2. In eukaryotes, specific degradation signals (degrons) include a set of N-terminal signals called N-degrons. These signals are recognized by N-recognins, a class of E3 ubiquitin ligases that bind to specific destabilizing N-terminal residues of protein substrates2,3. The targeting by N-recognins is followed by polyubiquitination of substrates and their delivery to the 26S proteasome3. The N-degron is defined by a rule, termed the N-end rule, that relates the in vivo half-life of a protein to the identity of its N-terminal residue4.
N-degrons can be classified into type 1, composed of basic residues (arginine, lysine and histidine), and type 2, composed of bulky hydrophobic residues (phenylalanine, leucine, tryptophan, tyrosine and isoleucine)5. In mammals, at least four N-recognins mediate the N-end rule pathway, namely UBR1 (E3-α1), UBR2 (E3-α2), UBR4 (RBAF600 or ZUBR1) and UBR5 (EDD or HYD)6. These N-recognins share an ~70-residue zinc finger–like motif termed the ubiquitin-recognin (UBR) box (Fig. 1). The mammalian genome encodes at least three more UBR box–containing proteins (UBR3, UBR6 and UBR7); however, these UBRs cannot recognize type 1 or type 2 N-degrons and do not seem to be N-recognins 7,8. The UBR boxes of UBR1 and UBR2 are highly conserved (sequence identity between humans and mouse, 95%; humans and yeast, 46%) and they are thought to have a similar role in the N-end rule pathway 8. In Saccharomyces cerevisiae, this pathway is mediated by a single N-recognin, UBR1, a 225-kDa protein with distinct recognition sites for type 1, type 2 and internal degrons9.
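The type 1 and type 2 classification described above amounts to a lookup on the first residue of a sequence. The following is a minimal sketch of that lookup; the function name, the return labels and the one-letter encoding are illustrative assumptions, and the residue sets are copied from the classification in the text. It does not model secondary N-degrons, AcN-degrons or binding affinity.

```python
# Residue sets taken from the classification in the text (one-letter codes).
TYPE1_RESIDUES = frozenset("RKH")    # basic: Arg, Lys, His
TYPE2_RESIDUES = frozenset("FLWYI")  # bulky hydrophobic: Phe, Leu, Trp, Tyr, Ile


def classify_n_terminus(sequence: str) -> str:
    """Classify a peptide by its N-terminal residue (hypothetical helper).

    Returns "type 1", "type 2" or "other". "other" only means the residue
    is in neither of the two sets above.
    """
    if not sequence:
        raise ValueError("sequence must contain at least one residue")
    first = sequence[0].upper()
    if first in TYPE1_RESIDUES:
        return "type 1"
    if first in TYPE2_RESIDUES:
        return "type 2"
    return "other"


if __name__ == "__main__":
    for peptide in ("RIFS", "KIFS", "FIFS", "AIFS"):
        print(peptide, classify_n_terminus(peptide))
```

A lookup of this kind captures only the empirical rule; the binding data reported later in the paper show that residues within type 1 differ considerably in affinity.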
UBR-type N-recognins do not recognize either unmodified or N-terminally acetylated methionine and small, uncharged N-terminal residues such as alanine, valine, threonine, serine and cysteine. It was recently discovered that N-terminal acetylation of these residues creates specific degradation signals10. This new class of N-degrons, termed AcN-degrons, is recognized by N-recognins of a separate, non-UBR class, exemplified by the yeast Doa10 ubiquitin ligase, an integral membrane protein of the endoplasmic reticulum10.
Dysfunction in the N-end rule pathway leads to disease. Loss of UBR1 is responsible for a developmental disorder, Johanson-Blizzard syndrome (JBS), which is characterized by pancreatic insufficiency, nasal wing aplasia and, frequently, mental retardation11,12. In mice, loss of either UBR1 or UBR2 leads to infertility and reduced survival13,14. These phenotypes are due, in part, to defects in chromatin inactivation and transcriptional silencing through ubiquitination of histone H2A15,16. The double knockout, UBR1−/−UBR2−/−, results in early embryonic lethality7.
Previous studies have delineated UBR boxes as the site of recognition of type 1 N-degrons by UBR1 and UBR2, but little is known about the mechanism of recognition and specificity. Here we report the crystal structures of the UBR boxes of UBR1, UBR2 and a complex with a type 1 N-degron derived from a known N-end rule substrate. The structures identify residues involved in substrate recognition and refine the definition of N-degrons that are recognized by UBR1 and UBR2.
RESULTS
Structure of the UBR box reveals a new protein fold
We determined the structure of the UBR box from human N-recognins UBR1 and UBR2 (Table 1). The structures consist of
1Department of Biochemistry, McGill University, Montreal, Quebec, Canada. 2Groupe de recherche axé sur la structure des protéines, McGill University, Montreal, Quebec, Canada. Correspondence should be addressed to K.G. ([email protected]).
Received 13 June; accepted 20 July; published online 12 September 2010; doi:10.1038/nsmb.1894
VOLUME 17 NUMBER 10 OCTOBER 2010 nature structural & molecular biology
Figure 1 Global fold of the UBR boxes from UBR1 and UBR2. (a) Topology of the UBR box as two zinc fingers with three bound zinc ions. (b) Zinc coordination by cysteine and histidine residues. The first zinc finger is atypical, with the motif CX24CX2CX21CXCX11CX2H coordinating two adjacent zinc ions that share a coordinating residue (Cys127). The second zinc finger is more typical, with a single zinc and the consensus motif CX2CX20HX2H. (c) Sequence conservation between UBR boxes in human UBR1 through UBR7. The numbers in the squares represents the percentage of sequence identity. (d) Sequence alignment of the UBR boxes of human UBR1 to UBR7 with identification of the zinc-coordinating histidine and cysteine residues. Above the alignment, open boxes (α-helices) and arrows (β-strands) indicate the positions of regular secondary-structure elements in UBR1. Vertical arrows highlight the residues involved in recognition of N-degron elements: the N terminus (red) and the penultimate residue (black).
Table 1 Data collection and refinement statistics
                        UBR1 UBR box             UBR2 UBR box             UBR2 UBR box + RIFS
Data collection
Space group             P21                      P1                       P21
Cell dimensions
  a, b, c (Å)           29.70, 49.26, 43.83      29.39, 61.46, 72.81      29.12, 36.94, 29.82
  α, β, γ (°)           90.00, 100.51, 90.00     65.05, 89.98, 90.01      90.00, 109.60, 90.00
Resolution (Å)          50–2.08 (2.12–2.08)a     50–2.60 (2.64–2.60)      50–1.60 (1.63–1.60)
Rmerge                  0.115 (0.332)            0.076 (0.133)            0.049 (0.159)
I/σI                    20.5 (3.7)               12.1 (6.1)               38.6 (12.1)
Completeness (%)        94.8 (79.1)              99.2 (85.3)              98.7 (86.4)
Redundancy              7.0 (4.1)                2.1 (1.8)                7.0 (5.3)
Refinement
Resolution (Å)          43.1–2.09                66.1–2.61                28.1–1.60
No. reflections         7,043                    12,503                   7,577
Rwork / Rfree           0.194 / 0.248            0.230 / 0.288            0.190 / 0.210
No. of atoms            1,137                    4,357                    612
  Protein               1,091                    4,297                    576
  Water                 40                       36                       33
  Zinc ions             6                        24                       3
B-factors
  Protein               16.4                     32.6                     17.9
  Water                 19.4                     29.9                     28.07
  Zinc ions             10.6                     29.2                     16.08
R.m.s. deviations
  Bond lengths (Å)      0.012                    0.007                    0.009
  Bond angles (°)       1.47                     1.094                    1.170
aValues in parentheses are for the highest-resolution shell. Each data set was collected from a single crystal.
two antiparallel β-strands, β1 (Gly106–Cys112) and β2 (Tyr138–Ser143); two α-helices, α1 (Met125–Asp130) and α2 (Val132–Asn135); and two long ordered loops, L1 (Leu98–Gly106) and L2 (Ser143–Glu167) (Fig. 1a). In the case of the UBR2, a small extra β-sheet was present between residues in loops 1 and 2. The structures
of the two domains are similar, reflecting the high degree of conservation of the primary sequences (Supplementary Fig. 1). The UBR box tertiary structure is stabilized by three zinc ions, which produce two contiguous zinc fingers (Fig. 1b). The first zinc finger is unusual and consists of two zinc ions that are each tetrahedrally coordinated but share a cysteine ligand (Cys127). The second zinc finger is more typical: two of the zinc-coordinating residues are located in the linker between strand β1 and helix α1 (Cys112 and Cys115), another in the helix α2 (His133) and the last in the linker between α2 and β2 (His136) (Fig. 1b). These two zinc fingers form a rigid scaffold to frame the N-degron–binding site (see below).
Sequence comparison of the different UBR proteins shows that the domains of UBR1, UBR2 and UBR3 are the most similar (Fig. 1c). The UBR boxes of UBR1 and UBR2 share 76% identity over 69 residues. The next most closely related pair is UBR4 and UBR6, which together with UBR5 form a second group of more loosely related UBR boxes. All but one of the zinc-coordinating residues are conserved across the UBR family (Fig. 1d), which suggests that all of the UBR boxes fold in a similar manner. A structural similarity search using DaliLite v.3 (ref. 17) established that the UBR box is a novel structural fold and distinct from known zinc fingers or RING domains.
Crystal structure of the UBR box in complex with a peptide
A high-resolution (1.6 Å) crystal structure was obtained for the complex between the UBR box of UBR2 protein and a tetrapeptide (Arg-Ile-Phe-Ser, referred to as RIFS), from a known N-end rule substrate, Sindbis virus RNA polymerase18. The complex structure reveals the mechanism of specificity for N-end rule substrates (Fig. 2). The N-terminal arginine of the N-degron is rigidly positioned in a negatively charged groove on the surface of the UBR box opposite the zinc-binding sites.
The peptide N-terminal amino group packs tightly against a bulky hydrophobic residue, Phe148, forming an aromatic hydrogen bond19. The amino group makes two additional coordinating hydrogen bonds with Asp150 and the backbone carbonyl of Phe148 (Fig. 2c). Together, these interactions assure the absolute specificity of the UBR box for N-terminal substrates. The selectivity for basic amino acids arises from the large, negative surface adjacent to Phe148. In the cocrystal, the arginine side chain is rigidly positioned by hydrogen bonds to the side chains of UBR box residues Asp150 and Asp153. A bound water molecule mediates an additional hydrogen bond to Asp118 (Fig. 2c). The N-terminal
arginine has low B-factors (~16 Å2) with well-defined density for both the backbone and side chain (Fig. 2b).
The crystal structure revealed additional interactions at the second position in the N-degron. The peptide backbone for the first two amino acids is rigidly fixed, essentially as one strand of an intermolecular β-sheet by backbone hydrogen bonds to Thr120 and Phe148. This orients the side chain of the peptide isoleucine residue toward a hydrophobic collar formed by Phe103, Val122 and Thr120. Following the isoleucine residue, the peptide turns away from the UBR box and becomes less ordered.
We confirmed the mechanism of N-terminal selectivity by examining the structure of the free UBR box of UBR1 (Supplementary Fig. 2). The proximity of two protein molecules in the crystal asymmetric unit allowed the N-terminal portion of one molecule to bind to the substrate-binding pocket of the other. Notably, the backbone atoms of the first two residues show hydrogen bonds and intermolecular contacts identical to those observed in the UBR2–RIFS cocrystal.
Comparisons of the bound and unbound structures of the UBR boxes provide insight into the thermodynamics of recognition and binding. In both UBR1 and UBR2, the UBR box is rigid and undergoes only small rearrangements upon peptide binding. The binding site for the peptide amino group is preformed; the same side chain of Phe148 adopts the same conformation with or without peptide. Thus, the entropy cost of reorganization and rigidification of the macromolecule upon substrate binding is relatively small. Notably, calorimetry revealed a large entropy increase upon binding that is probably due to the release of bound water molecules in the UBR–N-degron complex.
Figure 2 Structure of the UBR box–N-degron complex. (a) Electrostatic potential surface representation of UBR2 UBR box with the bound RIFS peptide shown in green. Binding is primarily determined by a large, negatively charged pocket that binds the N-terminal residue and a more hydrophobic pocket for the penultimate residue. (b) Electron density omit map (2Fo – Fc, 1σ contour) of the RIFS peptide bound to the UBR box. The side chains of key UBR2 residues involved in peptide binding are shown. (c) UBR box recognition elements that form hydrogen bonds with and stabilize the N-terminal and second peptide residues. Atoms are coded by color: red, oxygen; blue, nitrogen; green, carbon (peptide); white, carbon (UBR2). (d) Schematic of the elements responsible for N-degron recognition. Hydrogen bonds are represented by dashed straight lines and hydrophobic interactions by dashed curves.
Table 2 UBR box substrate-binding specificity
                    Dissociation constant (μM)a
Ligandb             UBR1                UBR2
Amino acids
ArginineNH2         200 ± 100           170 ± 35
LysineNH2           530 ± 100           505 ± 50
HistidineNH2        NTc                 15,000 ± 8,000
Arginine            7,500 ± 750         5,600 ± 350
Peptides
RIFS                24 ± 4.7            19.3 ± 2.6
KIFS                NT                  110 ± 11.5
HIFS                NT                  370 ± 50
AcRIFS              NT                  21,000 ± 3,700
RDFS                10 ± 1.0            12.4 ± 1.2
REFS                38 ± 10             57.8 ± 7.1
RLFS                21 ± 2.1            35.3 ± 8.6
RWFS                30 ± 3.3            50 ± 5.0
RAFS                81 ± 20             98 ± 20
aAffinity and s.d. determined by NMR (Kd > 80 μM) or ITC (Kd < 80 μM). bNH2, C-terminally amidated; Ac, N-terminally acetylated. cNT, not tested.
Figure 3 UBR box binding of type 1 N-degrons. (a) 15N-1H NMR correlation spectra of the UBR1 UBR box titrated with increasing amounts of a tetrapeptide derived from an N-end rule substrate. Arrows indicate the direction of peak shifts. (b) Magnitude of the change in the 1H chemical shift of a selected peak plotted as a function of total ligand concentration. The curve of best fit is shown along with the binding affinity. (c) ITC trace for the binding of a tetrapeptide to the UBR2 UBR box. The upper curve shows the baseline-corrected thermogram, and the lower graph shows the integrated areas of the heat absorbed along with a fit from which the stoichiometry (N), molar association constant (K), enthalpy (ΔH) and entropy (ΔS) are calculated. Panel annotations: (b) RAFS:UBR1 UBR box, Kd = 81 ± 20 µM; (c) RIFS:UBR2 UBR box, Kd = 19 ± 2.6 µM.
How does the UBR box interact with type 1 N-degrons?
We used NMR spectroscopy and isothermal titration calorimetry (ITC) to characterize the ligand-binding properties of the UBR boxes of UBR1 and UBR2 (Table 2). The two techniques are complementary and allow affinity measurements over many orders of magnitude (Fig. 3). NMR is applicable for weak binding ligands and potentially allows identification of the binding surface. ITC requires higher affinities but allows decomposition of ΔG into enthalpic and entropic contributions.
We initially titrated the UBR boxes from UBR1 and UBR2 with the free amino acid arginine to determine the minimal binding element. Both UBR box domains gave excellent 1H-15N NMR spectra with specific amide chemical shift changes upon arginine addition; fitting of the chemical shift changes as a function of the arginine concentration yielded a Kd of around 6 mM for both domains (Table 2). The perturbations occurred in the fast-exchange regime because of the short lifetimes of the bound and unbound states. It is likely that the negative charge of the arginine carboxyl group inhibited binding, so we repeated the experiment using carboxyl amidated arginine (ArgNH2). This resulted in larger amide chemical shift perturbations (Supplementary Fig. 3) along with a 30-fold improvement in binding affinity. The charge of the amino acid side chain was essential for high-affinity binding. Titrations with amidated lysine showed somewhat weaker binding; amidated histidine showed binding only at millimolar concentrations (Supplementary Fig. 3). NMR titrations
with amidated phenylalanine and methionine showed no chemical shift changes (data not shown). We obtained essentially identical results for the UBR boxes of UBR1 and UBR2.
The fundamental role of the N-terminal positive charges was confirmed by studies with N-degron tetrapeptides. For these higher-affinity interactions, we used ITC to directly measure the heat (ΔH) released or absorbed upon peptide binding. These measurements were technically challenging because the ΔH of binding was small, slightly positive or negative depending on the peptide. The thermogram for RIFS binding to the UBR box of UBR2 showed unusual endothermic binding, with a ΔH of 315 ± 7 cal mol−1 and an affinity of 25 μM (Fig. 3c). Approximately one-third of the peptide-binding reactions were endothermic. In all cases, binding was entropically driven, with a ΔS that was between 16 and 23 cal mol−1 deg−1 (Supplementary Fig. 4) and provided between 75% and 100% of the binding free energy.
Comparison of peptides with different N-terminal residues recapitulated the results with amidated amino acids. Substitution of arginine by lysine led to a factor of 5 increase in the Kd. We observed relatively poor binding for a peptide with an N-terminal histidine, and we were unable to detect appreciable binding for tetrapeptides with glycine, phenylalanine or glutamate as the N-terminal residue.
The peptide amino group was equally important. Capping of the N terminus by an acetyl group completely blocked binding of tetrapeptide AcRIFS (Table 2). We obtained similar results with acetylated, amidated arginine and with a pentapeptide, AcGRIFS, with an internal arginine residue (data not shown).
Specificity for the second residue and beyond
We next tested the effect of the second position on the N-degron recognition. A series of tetrapeptides, RXFS, were compared, with aspartate, glutamate, isoleucine, leucine, tryptophan or alanine as the second amino acid.
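As a worked check on the ITC values quoted above for RIFS binding to the UBR box of UBR2, the free energy and entropy of binding can be recovered from the reported Kd and ΔH using ΔG = RT ln Kd and ΔG = ΔH − TΔS. The sketch below assumes a temperature of 298.15 K, which is not stated in this section, and the function name is illustrative.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1


def binding_thermodynamics(kd_molar, delta_h_cal, temperature_k=298.15):
    """Return (delta_g, t_delta_s, delta_s) for a 1:1 binding equilibrium.

    delta_g and t_delta_s are in cal mol^-1, delta_s in cal mol^-1 K^-1.
    temperature_k is an assumed value, not taken from the text.
    """
    delta_g = R_CAL * temperature_k * math.log(kd_molar)  # dG = RT ln(Kd)
    t_delta_s = delta_h_cal - delta_g                     # from dG = dH - T*dS
    delta_s = t_delta_s / temperature_k
    return delta_g, t_delta_s, delta_s


if __name__ == "__main__":
    # Kd = 25 uM and dH = +315 cal mol^-1 are the values reported in the text.
    dg, tds, ds = binding_thermodynamics(25e-6, 315.0)
    print(f"dG  = {dg / 1000:.2f} kcal/mol")
    print(f"TdS = {tds / 1000:.2f} kcal/mol")
    print(f"dS  = {ds:.1f} cal/mol/K")
```

Under the assumed temperature this gives a ΔG of about −6.3 kcal mol−1 and a ΔS of roughly 22 cal mol−1 K−1, which falls within the 16 to 23 cal mol−1 deg−1 range reported above and is consistent with binding being entropically driven.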
We observed minor differences in affinities to the UBR boxes of UBR1 and UBR2. For both domains, we observed the highest affinity with the peptide RDFS, which mimics the product of N-arginylation of secondary N-degrons (which are characterized by an acidic N-terminal residue)20. In agreement with the observation of hydrophobic contacts in the crystal structure, we observed the next best affinities with the peptides having long aliphatic side chains: RIFS and RLFS. Binding of RWFS was weaker. NMR titrations of RAFS
showed fast exchange kinetics and were used to measure its affinity (Fig. 3a,b). For the six amino acids tested, the penultimate position provided a range of affinities for the peptides in which the best affinity (RDFS) was eightfold higher than the worst (RAFS).
We also carried out experiments with a longer peptide, RIFSTDTGPGGC, from Sindbis virus RNA polymerase, which had been previously studied by surface plasmon resonance8. 15N-1H correlation spectra of the UBR boxes from both UBR1 and UBR2 showed nearly identical chemical shift changes in titrations with the 12-mer peptide and the truncated tetrapeptide RIFS. These results confirm, as observed in the crystal structure, that residues beyond position 4 do not contact the UBR box. Similarly, ITC measurements showed no significant difference in affinity between the tetrapeptide and 12-mer in binding to the UBR box of UBR2 (data not shown).
Mutagenesis of key residues in N-degron binding
To determine the functional significance of the atomic contacts observed in the cocrystal, we carried out mutational analysis of the UBR box from UBR1. On the basis of a previous mutagenesis study8, we generated eight mutants (D118A, D118E, D118F, D118L, H136R, G147F, F148A and D150A) and examined their folding and ability to bind the peptide RIFS using one-dimensional NMR spectroscopy.
The H136R mutation of UBR1 had previously been shown to give rise to the recessive genetic disease JBS12. As predicted, the loss of coordination of the zinc by His136 compromised domain folding. The mutant UBR box showed low solubility in solution and the 1H NMR spectrum was characteristic of an unfolded protein (Supplementary Fig. 5). We observed no chemical shift changes upon titration with the N-degron RIFS. The remaining mutant proteins were well folded, with characteristic, up-field methyl resonances at 0.2 p.p.m. and −0.2 p.p.m. In the wild-type protein, these resonances shifted upon ligand binding (Supplementary Fig. 5).
We observed differential effects at Asp118 depending on the size of the side chain. The mutations D118F and D118L abrogated binding, whereas the D118E and D118A mutant proteins were still able to bind, although with lower affinity. These results suggest that the side chain of Asp118 is important for positioning the water molecule that stabilizes the arginine side chain in the negatively charged pocket (Fig. 2d). Mutants G147F and D150A similarly did not bind peptides.
Unexpectedly, the F148A mutant retained the ability to interact with the RIFS peptide. We also tested whether this mutant could bind the N-terminally blocked ligand AcRIFS, bearing an α-amino acetyl group. We hypothesized that the loss of the Phe148 aromatic ring might alleviate a steric clash responsible for the lack of binding of the acetylated peptide. However, we did not observe binding, indicating that the specificity of the UBR box for N termini is likely to lie in the positive charge of the N-terminal amino group.
DISCUSSION
Structural implications for the family of UBR box proteins
The family of mammalian UBR box proteins comprises seven members. In the present study, we determined the structures of the UBR boxes from human UBR1 and UBR2 and show that they adopt a previously undescribed structural fold incorporating three zinc ions. The zinc ions are tetrahedrally coordinated by cysteine and histidine residues, but two of the zinc ions share a cysteine residue, so that the three zinc ions are coordinated by 11 ligands. Unexpectedly, the UBR boxes of UBR4 through UBR7 lack one coordination residue (corresponding to His166 of UBR1). These UBR box domains probably assume the same fold as UBR1 and UBR2, but either they complex only two zinc ions or they contain other elements to stabilize the third zinc ion.
Zinc is essential for the structural integrity of the UBR boxes. The loss of a single zinc ligand (H136R) in UBR1 leads to an unfolded domain in vitro and to a genetic disease in humans. Similarly, loss of a conserved cysteine residue in the UBR box of the Arabidopsis thaliana BIG protein (a structural and functional ortholog of mammalian UBR4) leads to a loss-of-function phenotype in plants21. This cysteine-to-threonine mutation (corresponding to C127T in UBR1) perturbs auxin transport and the expression of light-regulated genes. In the UBR1 and UBR2 structures, Cys127 has a unique role in coordinating the binding of two zinc ions.
Among the seven mammalian UBR box–containing proteins, the UBR boxes of UBR1 and UBR2 seem to be largely responsible for the recognition of type 1 N-degrons (those with a basic N-terminal residue). The human UBR3 UBR box contains many of the structural and functional elements required for type 1 N-degron recognition; however, a study of the full-length protein suggested that it does not bind N-degrons7. One possible explanation is the absence of Phe148 (Fig. 1d). In the cocrystal structure, the N-degron N terminus packs tightly against the phenylalanine side chain, which acts as a hydrogen bond acceptor. This interaction is predicted to contribute approximately 3 kcal mol−1 of stabilizing enthalpy19, although it is not essential for peptide binding. UBR3 also lacks an aspartic acid residue, Asp118, which contributes to arginine binding in the acidic pocket through a bound water molecule (Fig. 2c). The glycine residue that precedes Phe148 is also likely to be important for peptide binding. Although it is present in UBR3, this glycine is absent from UBR4 through UBR7. In the cocrystal structure, Gly147 has an unusual backbone phi angle of 90°, which allows proper positioning of the peptide into the substrate-binding groove. This is a disallowed angle for nonglycine residues, and changes to phenylalanine, as in the UBR boxes of UBR4, UBR6 and UBR7, or even alanine, in the case of UBR5, are likely to strongly alter the local protein conformation.
UBR box binding specificity
The specificity of the isolated UBR boxes of UBR1 and UBR2 mirrors the specificity of the N-end rule pathway. By NMR and ITC, we observed an absolute requirement for a free N-terminal amino group. Acetylation of the N-degron N terminus decreased binding
by three orders of magnitude (Table 2). We also observed a strong bias in UBR specificity toward arginine as the N-terminal amino acid. Pioneering studies in yeast previously showed that proteins with N-terminal arginine residues have the shortest half-lives4. Studies with the mammalian N-recognins measured the highest affinity for Arg-Ala dipeptide binding to UBR1 (ref. 8). Both as carboxylated amino acids and in peptides, we found a threefold to fivefold preference for binding of arginine over lysine. Although an N-terminal histidine is classified as a type 1 N-degron, we observed weak affinity for amidated histidine to the UBR boxes of either UBR1 or UBR2. In the context of a peptide, this difference was smaller. Nonetheless, histidine as an N-degron signal was 15 times less effective in binding to the UBR2 UBR domain (Table 2). The poor affinity for histidine is probably due to the absence of a positive charge. The pKa of histidine is approximately 6 and, as an N-terminal amino acid, the charged α-amino group further decreases the pKa. At neutral pH, the imidazole side chain would carry little charge. The UBR boxes of UBR1 and UBR2 select ligands based in large part on charge complementarity at the N terminus. The penultimate position has a much smaller role in modulating the affinity. The binding site presents a mix of hydrophobic and hydrophilic residues that promote the binding of both large hydrophobic residues and small acidic residues. It is noteworthy that the highest-affinity ligand, RDFS, mimics the product of the enzyme arginyl tRNA protein transferase (ATE1), which transfers arginine to an acceptor protein with an acidic N-terminal residue. ATE1 is conserved across eukaryotes and has diverse and essential roles in tissue development and signaling22–25. The acidic N-terminal residue is considered a secondary N-degron, and its arginylation is an example of regulation of N-end rule degradation20. 
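Two back-of-the-envelope calculations make the quantities in this discussion more concrete: the free-energy difference implied by a fold change in Kd (ΔΔG = RT ln of the ratio), and the fraction of histidine side chains carrying a positive charge at a given pH (Henderson-Hasselbalch). The sketch below is illustrative only. The temperature of 298.15 K and the function names are assumptions, the pKa of 6 and the fold changes are the approximate values quoted in the text, and pH 7.0 is used as a stand-in for "neutral pH".

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free-energy difference (kcal mol^-1) for a given ratio of Kd values.

    temperature_k is an assumed value, not taken from the text.
    """
    return R_KCAL * temperature_k * math.log(fold_change)


def fraction_protonated(pka, ph):
    """Fraction of a titratable group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Fold changes quoted in the text: Arg vs Lys (about 5), His (about 15),
    # N-terminal acetylation (about three orders of magnitude).
    for label, fold in (("Lys vs Arg", 5), ("His vs Arg", 15), ("acetylation", 1000)):
        print(f"{label}: ddG = {ddg_from_fold_change(fold):.2f} kcal/mol")
    # Histidine side chain with pKa of about 6 at pH 7.0.
    print(f"His protonated fraction: {fraction_protonated(6.0, 7.0):.3f}")
```

With a pKa of 6, fewer than one in ten imidazole side chains is protonated at pH 7, and the fraction would be lower still if the N-terminal amino group depresses the pKa as the text suggests.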
Conclusions
The crystal structures of the UBR boxes from human ubiquitin ligases UBR1 and UBR2 reveal the basis for type 1 N-degron recognition. Binding to the acidic UBR pocket underlies the principal requirements for N-end rule substrates: a free N-terminal amino group and a positively charged N-terminal side chain. The identification of mutations that specifically abrogate the type 1 N-degron recognition without disruption of the domain structure can be used in future studies to delineate the role of UBR boxes in other N-end rule pathways.
Methods
Methods and any associated references are available in the online version of the paper at http://www.nature.com/nsmb/.
Accession codes. Protein Data Bank: Coordinates and structure factors for UBR boxes of UBR1, UBR2 and the UBR2–RIFS peptide complex have been deposited under accession numbers 3NY1, 3NY2 and 3NY3, respectively.
Note: Supplementary information is available on the Nature Structural & Molecular Biology website.
Acknowledgments
We thank J.-F. Trempe and M. Ménade for technical assistance and helpful discussions. E.M.C. is funded by the Canadian Institutes of Health Research (CIHR) and McGill University. Data acquisition at the Macromolecular Diffraction (MacCHESS) facility at the Cornell High Energy Synchrotron Source (CHESS) was supported by US National Science Foundation award DMR 0225180 and US National Institutes of Health award RR-01646. This study was funded by CIHR grant MOP-14219.
COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
Published online at http://www.nature.com/nsmb/. Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/.
1. Hershko, A. & Ciechanover, A. The ubiquitin system. Annu. Rev. Biochem. 67, 425–479 (1998).
2. Bachmair, A., Finley, D. & Varshavsky, A. In vivo half-life of a protein is a function of its amino-terminal residue. Science 234, 179–186 (1986).
3. Varshavsky, A. The N-end rule. Cold Spring Harb. Symp. Quant. Biol. 60, 461–478 (1995).
4. Bachmair, A. & Varshavsky, A. The degradation signal in a short-lived protein. Cell 56, 1019–1032 (1989).
5. Varshavsky, A. Discovery of cellular regulation by protein degradation. J. Biol. Chem. 283, 34469–34489 (2008).
6. Tasaki, T. et al. Biochemical and genetic studies of UBR3, a ubiquitin ligase with a function in olfactory and other sensory systems. J. Biol. Chem. 282, 18510–18520 (2007).
7. Tasaki, T. et al. A family of mammalian E3 ubiquitin ligases that contain the UBR box motif and recognize N-degrons. Mol. Cell. Biol. 25, 7120–7136 (2005).
8. Tasaki, T. et al. The substrate recognition domains of the N-end rule pathway. J. Biol. Chem. 284, 1884–1895 (2009).
9. Xia, Z. et al. Substrate-binding sites of UBR1, the ubiquitin ligase of the N-end rule pathway. J. Biol. Chem. 283, 24011–24028 (2008).
10. Hwang, C.S., Shemorry, A. & Varshavsky, A. N-terminal acetylation of cellular proteins creates specific degradation signals. Science 327, 973–977 (2010).
11. Johanson, A. & Blizzard, R. A syndrome of congenital aplasia of the alae nasi, deafness, hypothyroidism, dwarfism, absent permanent teeth, and malabsorption. J. Pediatr. 79, 982–987 (1971).
12. Zenker, M. et al. Deficiency of UBR1, a ubiquitin ligase of the N-end rule pathway, causes pancreatic dysfunction, malformations and mental retardation (Johanson-Blizzard syndrome). Nat. Genet. 37, 1345–1350 (2005).
13. Kwon, Y.T., Xia, Z., Davydov, I.V., Lecker, S.H. & Varshavsky, A. Construction and analysis of mouse strains lacking the ubiquitin ligase UBR1 (E3α) of the N-end rule pathway. Mol. Cell. Biol. 21, 8007–8021 (2001).
14. Kwon, Y.T. et al. Female lethality and apoptosis of spermatocytes in mice lacking the UBR2 ubiquitin ligase of the N-end rule pathway. Mol. Cell. Biol. 23, 8255–8271 (2003).
15. An, J.Y. et al. UBR2 mediates transcriptional silencing during spermatogenesis via histone ubiquitination. Proc. Natl. Acad. Sci. USA 107, 1912–1917 (2010).
16. Ouyang, Y. et al. Loss of Ubr2, an E3 ubiquitin ligase, leads to chromosome fragility and impaired homologous recombinational repair. Mutat. Res. 596, 64–75 (2006).
17. Holm, L., Kaariainen, S., Rosenstrom, P. & Schenkel, A. Searching protein structure databases with DaliLite v.3. Bioinformatics 24, 2780–2781 (2008).
18. de Groot, R.J., Rumenapf, T., Kuhn, R.J., Strauss, E.G. & Strauss, J.H. Sindbis virus RNA polymerase is degraded by the N-end rule pathway. Proc. Natl. Acad. Sci. USA 88, 8967–8971 (1991).
19. Levitt, M. & Perutz, M.F. Aromatic rings act as hydrogen bond acceptors. J. Mol. Biol. 201, 751–754 (1988).
20. Tasaki, T. & Kwon, Y.T. The mammalian N-end rule pathway: new insights into its components and physiological roles. Trends Biochem. Sci. 32, 520–528 (2007).
21. Gil, P. et al. BIG: a calossin-like protein required for polar auxin transport in Arabidopsis. Genes Dev. 15, 1985–1997 (2001).
22. Kwon, Y.T. et al. An essential role of N-terminal arginylation in cardiovascular development. Science 297, 96–99 (2002).
23. Brower, C.S. & Varshavsky, A. Ablation of arginylation in the mouse N-end rule pathway: loss of fat, higher metabolic rate, damaged spermatogenesis, and neurological perturbations. PLoS ONE 4, e7757 (2009).
24. Graciet, E. et al. The N-end rule pathway controls multiple functions during Arabidopsis shoot and leaf development. Proc. Natl. Acad. Sci. USA 106, 13618–13623 (2009).
25. Holman, T.J. et al. The N-end rule pathway promotes seed germination and establishment through removal of ABA sensitivity in Arabidopsis. Proc. Natl. Acad. Sci. USA 106, 4549–4554 (2009).
nature structural & molecular biology VOLUME 17 NUMBER 10 OCTOBER 2010
ONLINE METHODS
Protein expression and purification. The sequences encoding the UBR boxes from human UBR1 (residues Gln97–Glu167) and UBR2 (residues Leu98–Glu167) were cloned into a pGEX-6P-1 vector (GE Healthcare) and expressed in Escherichia coli BL21 (DE3) in rich LB medium as fusions with an N-terminal glutathione S-transferase (GST) tag. Site-directed mutagenesis was performed using QuikChange (Stratagene) and confirmed by DNA sequencing. For NMR experiments, proteins were labeled by growth of E. coli BL21 in M9 medium with 15N-ammonium sulfate and 13C-glucose as the sole sources of nitrogen and carbon. GST fusion proteins were purified by affinity chromatography on glutathione-Sepharose resin, and the tag was removed by cleavage with PreScission Protease (GE Healthcare), leaving a Gly-Pro-Leu-Gly-Ser N-terminal extension. The UBR boxes were further purified by gel filtration (Superdex 75; GE Healthcare) in buffer containing 20 mM Tris, pH 7.6, 10 mM NaCl and 2 mM β-mercaptoethanol. The peptides were synthesized by fluorenylmethyloxycarbonyl solid-phase peptide synthesis and purified by reverse-phase chromatography on a C18 column (Vydac). The composition and purity of the peptides were verified by ion-spray quadrupole MS. Tetrapeptides were C-terminally amidated to avoid a C-terminal negative charge.

Protein crystallization. Crystallization conditions were screened by hanging drop vapor diffusion using the Classics II Suite kit (Qiagen). The best crystals of the UBR1 UBR box were obtained by equilibrating a 1 μl drop of the protein mixture (8 mg ml−1) in 20 mM Tris-HCl, pH 7.6, 10 mM NaCl and 2 mM β-mercaptoethanol, mixed with 1 μl of reservoir solution containing 0.1 M Bis-Tris, pH 6.5, and 25% (w/v) PEG 3350 and suspended over 1 ml of reservoir solution. Crystals grew in 3 d at 20 °C and were cryoprotected by addition of 15% (v/v) glycerol.
The crystals belong to the primitive monoclinic space group P21 with two protein molecules per asymmetric unit and a solvent content of 33.7%. UBR2 UBR box crystals were obtained by equilibrating the protein in 20 mM Tris-HCl, pH 7.6, 10 mM NaCl and 2 mM β-mercaptoethanol with a reservoir solution containing 0.96 M sodium citrate, pH 7.0. Crystals grew in 2 d at 20 °C and were cryoprotected by addition of 15% (v/v) ethylene glycol. UBR2 crystallized in the P1 space group with eight protein molecules per asymmetric unit and a solvent content of 29.9%. UBR2–peptide crystals were obtained by equilibrating a 1:1.5 ratio of the UBR box–RIFS peptide mixture in 20 mM Tris-HCl, pH 7.6, 10 mM NaCl and 2 mM β-mercaptoethanol with a reservoir solution containing 0.1 M Bis-Tris, pH 5.5, and 25% (w/v) PEG 3350. Crystals grew in 12 h at 20 °C and were cryoprotected by addition of 15% (v/v) ethylene glycol. The crystals belong to the primitive monoclinic space group P21 with one protein molecule per asymmetric unit and a solvent content of 65.4%.

Structure refinement. Diffraction data from a single crystal of the UBR1 UBR box were collected on an ADSC Quantum-210 CCD detector (Area Detector Systems Corp.) at beamline F2 at the Cornell High-Energy Synchrotron Source (CHESS) in Ithaca, New York. Data processing and scaling were performed with HKL2000 (ref. 26). The structure of the UBR1 UBR box was determined by SAD phasing using the program Solve/Resolve27. Data at the maximum peak wavelength (λ = 1.2836 Å) were used to locate six zinc atoms in the asymmetric unit. The initial model obtained from Resolve was extended manually using Coot28 and improved by several cycles of refinement using REFMAC29. At the last stage of refinement, we also applied the translation-libration-screw (TLS) option30. The final model has good stereochemistry, with 91.7% of residues in the most favorable regions of the
Ramachandran plot, 8.3% in additionally allowed regions and 0% in generously allowed or disallowed regions according to the PDB validation server (http://deposit.pdb.org/validate/). Diffraction data on the free UBR2 UBR box were collected on an ADSC Quantum-210 CCD detector (Area Detector Systems Corp.) at CHESS beamline A1 (λ = 0.9769 Å). Diffraction data from the UBR2 UBR box–RIFS crystal were collected on a Rigaku R-Axis IV++ image plate detector at the GRASP X-ray Diffraction Facility (λ = 1.5418 Å). Molecular replacement using the UBR1 UBR box structure was used to determine both UBR2 structures. The initial model obtained from Phaser was improved by several cycles of refinement using REFMAC and model refitting. The model was extended manually into the extra density corresponding to the bound peptide using Coot28. Structure figures were made with PyMOL (http://www.pymol.org/). Ramachandran statistics report 97.2% of residues in most favorable, 2.8% in additionally allowed and 0% in generously allowed and disallowed regions for the free UBR2 UBR box. The corresponding values for the complex are 93.1%, 6.9%, 0% and 0%. Refinement statistics for all three crystals are given in Table 1.

Nuclear magnetic resonance spectroscopy. NMR samples were prepared in 90% NMR buffer (20 mM Tris-HCl, pH 7.6, 10 mM NaCl and 2 mM β-mercaptoethanol) and 10% D2O. Partial NMR resonance assignments of the UBR1 UBR box were carried out using HNCACB, CBCA(CO)NH and 15N-correlated NOESY experiments on 13C,15N-labeled and 15N-labeled protein. For NMR titrations, unlabeled N-degron amino acids were added stepwise to 0.2 mM 15N-labeled protein and chemical shift changes were fit to the following equation:

$$C = C_{\max}\,\frac{(K_{d} + P_{\mathrm{tot}} + L_{\mathrm{tot}}) - \sqrt{(K_{d} + P_{\mathrm{tot}} + L_{\mathrm{tot}})^{2} - 4P_{\mathrm{tot}}L_{\mathrm{tot}}}}{2P_{\mathrm{tot}}}$$
where C is the chemical shift perturbation, Cmax the chemical shift perturbation at saturation, Kd the dissociation constant, Ptot the total concentration of the labeled protein and Ltot the total ligand concentration. NMR experiments were performed at 28 °C on a Bruker 600-MHz spectrometer. Spectra were processed using NMRPipe31 and analyzed with XEASY32.

Isothermal titration calorimetry. Experiments were carried out on a MicroCal iTC200 titration calorimeter (MicroCal) in 50 mM Tris-HCl buffer, pH 7.6, 10 mM NaCl and 10 μM ZnCl2 at 20 °C. The reaction cell contained 200 μl of 0.4 mM protein and was titrated with 19 injections of 2 μl of 5.0 mM peptide. The binding isotherm was fit with a binding model that uses a single set of independent sites to determine the thermodynamic binding constants and stoichiometry.

26. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326 (1997).
27. Terwilliger, T.C. Maximum-likelihood density modification. Acta Crystallogr. D Biol. Crystallogr. 56, 965–972 (2000).
28. Emsley, P. & Cowtan, K. Coot: model-building tools for molecular graphics. Acta Crystallogr. D Biol. Crystallogr. 60, 2126–2132 (2004).
29. Murshudov, G.N., Vagin, A.A., Lebedev, A., Wilson, K.S. & Dodson, E.J. Efficient anisotropic refinement of macromolecular structures using FFT. Acta Crystallogr. D Biol. Crystallogr. 55, 247–255 (1999).
30. Winn, M.D., Murshudov, G.N. & Papiz, M.Z. Macromolecular TLS refinement in REFMAC at moderate resolutions. Methods Enzymol. 374, 300–321 (2003).
31. Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
32. Bartels, C., Xia, T.H., Billeter, M., Guntert, P. & Wuthrich, K. The program XEASY for computer-supported NMR spectral analysis of biological macromolecules. J. Biomol. NMR 6, 1–10 (1995).
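Illustrative note on the NMR titration fit. The 1:1 binding equation given under "Nuclear magnetic resonance spectroscopy" can be explored numerically. The following is a minimal sketch, not the fitting procedure used in this study: it evaluates the equation and recovers Kd from synthetic data by a log-spaced grid search over Kd, solving for Cmax by linear least squares at each grid point. Only the protein concentration (0.2 mM) comes from the Methods; the function names, the ligand concentrations, and the "true" Kd and Cmax values are hypothetical and chosen purely for illustration.

```python
import math


def fraction_bound(kd, p_tot, l_tot):
    """Fraction of protein bound for 1:1 binding.

    All concentrations must share the same units (mM here).
    """
    s = kd + p_tot + l_tot
    return (s - math.sqrt(s * s - 4.0 * p_tot * l_tot)) / (2.0 * p_tot)


def shift_perturbation(kd, p_tot, l_tot, c_max):
    """Chemical shift perturbation C = Cmax * (fraction of protein bound)."""
    return c_max * fraction_bound(kd, p_tot, l_tot)


def fit_kd(l_tots, shifts, p_tot, kd_min=1e-4, kd_max=1e2, n_grid=2000):
    """Estimate (Kd, Cmax) by grid search over Kd.

    For each candidate Kd the model is linear in Cmax, so the best Cmax
    follows from linear least squares; the pair with the smallest sum of
    squared residuals is returned. Grid limits are assumptions.
    """
    best = None
    for i in range(n_grid):
        kd = kd_min * (kd_max / kd_min) ** (i / (n_grid - 1))
        f = [fraction_bound(kd, p_tot, l) for l in l_tots]
        denom = sum(x * x for x in f)
        if denom == 0.0:
            continue
        c_max = sum(x * y for x, y in zip(f, shifts)) / denom
        sse = sum((c_max * x - y) ** 2 for x, y in zip(f, shifts))
        if best is None or sse < best[0]:
            best = (sse, kd, c_max)
    _, kd_best, c_max_best = best
    return kd_best, c_max_best


# Synthetic, noise-free titration (hypothetical values, for illustration only).
P_TOT = 0.2        # mM; protein concentration stated in the Methods
TRUE_KD = 0.05     # mM; hypothetical
TRUE_CMAX = 0.30   # ppm; hypothetical
ligand = [0.0, 0.05, 0.1, 0.2, 0.4, 0.8, 1.6]  # mM; hypothetical titration points
observed = [shift_perturbation(TRUE_KD, P_TOT, l, TRUE_CMAX) for l in ligand]

kd_fit, cmax_fit = fit_kd(ligand, observed, P_TOT)
print(f"fitted Kd = {kd_fit:.4f} mM, fitted Cmax = {cmax_fit:.4f} ppm")
```

With real, noisy data a nonlinear least-squares routine would normally replace the grid search; the grid is used here only to keep the sketch free of external dependencies.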
doi:10.1038/nsmb.1894