Proc. Natl. Acad. Sci. USA Vol. 93, pp. 5223-5228, May 1996
Biochemistry
A structural model for a homeotic protein-extradenticle-DNA complex accounts for the choice of HOX protein in the heterodimer (cooperative DNA binding/pbx/labial/Ultrabithorax/MATa1)
SIU-KWONG CHAN AND RICHARD S. MANN*
Department of Biochemistry and Molecular Biophysics, Columbia University College of Physicians and Surgeons, 630 West 168th Street, New York, NY 10032
Communicated by Richard Axel, Columbia University College of Physicians and Surgeons, New York, NY, January 29, 1996 (received for review December 20, 1995)
ABSTRACT The genes of the homeotic complex (HOX) encode DNA binding homeodomain proteins that control developmental fates by differentially regulating the transcription of downstream target genes. Despite their unique in vivo functions, disparate HOX proteins often bind to very similar DNA sequences in vitro. Thus, a critical question is how HOX proteins select the correct sets of target genes in vivo. The homeodomain proteins encoded by the Drosophila extradenticle gene and its mammalian homologues, the pbx genes, contribute to HOX specificity by cooperatively binding to DNA with HOX proteins. For example, the HOX protein labial cooperatively binds with extradenticle protein to a 20-bp oligonucleotide that is sufficient to direct a labial-like expression pattern in Drosophila embryos. Here we have analyzed the protein-DNA interactions that are important for forming the labial-extradenticle-DNA complex. The data suggest a model in which labial and extradenticle, separated by only 4 bp, bind this DNA as a heterodimer in a head-to-tail orientation. We have confirmed several aspects of this model by characterizing extradenticle-HOX binding to mutant oligonucleotides. Most importantly, mutations in base pairs predicted to contact the HOX N-terminal arm resulted in a change in HOX preference in the heterodimer, from labial to Ultrabithorax. These results demonstrate that extradenticle prefers to bind cooperatively with different HOX proteins depending on subtle differences in the heterodimer binding site.
Throughout the animal kingdom the choice between alternative developmental pathways is governed by a set of transcription factors encoded by homeotic complex (HOX) genes (1). HOX proteins all contain a homeodomain close to their C termini that directs sequence-specific DNA binding (2, 3). However, in part because the HOX family of homeodomains have similar amino acid sequences (4), HOX proteins often bind to similar DNA sequences in vitro (see, for example, refs. 5 and 6). Thus, an important question is how HOX proteins select and regulate the correct sets of target genes in vivo. The genetic characterization of the extradenticle gene (exd) of Drosophila melanogaster indicated that it may play a role in HOX specificity because its gene product appeared to modify the activity, but not the expression, of the HOX genes (7, 8). exd encodes a homeodomain protein (extradenticle protein, EXD) with extensive identity to the products of three human pbx genes (9, 10). Interestingly, pbx-1 was independently identified because of its association with pre-B cell leukemias when fused to the E2A gene, again suggesting an important role in controlling cell fates (9, 10). Consistent with their proposed role as HOX cofactors, EXD/PBX proteins cooperatively bind to DNA with HOX proteins (11-17). For many of the cooperative interactions described to date, the interaction with EXD/PBX requires a short stretch of amino acids present in many HOX proteins known as the hexapeptide (also called the YPWM, or pentapeptide motif) (13, 14, 17). However, in these DNA binding assays heterodimer formation does not show specificity for different hexapeptide-containing HOX proteins. Moreover, the only DNA sequence required for these hexapeptide-dependent interactions is the in vitro-derived EXD/PBX consensus binding site, 5'-ATCAATCAA (17-19). Thus, these studies could not address whether EXD/PBX proteins contribute to HOX specificity.
In contrast to the consensus EXD/PBX binding site, two natural DNA sequences have been described that promote interactions between EXD/PBX and specific HOX proteins (11, 16). One of these sequences, a 20-bp oligonucleotide identified in the 5' region of the mouse Hoxb-1 gene (repeat 3), promotes cooperative DNA binding between EXD and Hoxb-1 but not with Hoxb-4 (16). In mouse embryos, repeat 3 can direct expression in rhombomere 4 of the mouse hindbrain, where Hoxb-1 is expressed (16). In Drosophila, repeat 3 generates an expression pattern that is very similar to labial, the Drosophila homologue of Hoxb-1, and, in vitro, EXD and labial protein (LAB) cooperatively bind to this sequence as a heterodimer (20). Because of the specificity exhibited by repeat 3 in vivo and in vitro, we have analyzed in detail how EXD and LAB bind to this sequence. The data suggest a model for the HOX-EXD-DNA complex that is tested by characterizing the effects of point mutations in repeat 3.
Abbreviations: HOX, homeotic complex; LAB, labial protein; EXD, extradenticle protein; EMSA, electrophoretic mobility shift assay; DMS, dimethyl sulfate; ELbs, EXD-LAB binding site; UBX, Ultrabithorax.
*To whom reprint requests should be addressed. e-mail: rsm10@columbia.edu.
The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.
MATERIALS AND METHODS
Proteins. EXD protein refers to a histidine-tagged, homeodomain-containing 74 amino acid fragment (11). Although other work has shown that sequences C terminal to the PBX-1 homeodomain are important for cooperative binding with HOX proteins to a consensus binding site (17), we find that longer forms of EXD are not significantly better at forming complexes with LAB on repeat 3 (data not shown). LAB or LABAAA proteins refer to histidine-tagged, homeodomain-containing, 477 amino acid portions of labial (20). In LABAAA, the sequence YKWM in the hexapeptide was changed to AAAM (20). UBX protein refers to the entire UBX-Ia open reading frame fused to glutathione-S-transferase in pGEX2KT (21). Both UBX and LAB contain the wild-type hexapeptide motifs. All proteins were purified to at least 60% homogeneity after their overexpression in Escherichia coli by standard techniques (data not shown) (16).
DNase I Footprinting. Probes for the top or bottom strand were generated by XhoI or BamHI digestion of pBS(rpt3) (containing a single repeat 3 oligonucleotide cloned into the EcoRV site of pBluescript), end-filling with [32P]dCTP or -dGTP, redigesting with BamHI or XhoI, and gel-purifying from a 6% polyacrylamide gel. Binding was in a volume of 20 µl [10 mM Tris·HCl/15 mM Hepes, pH 7.9/50 mM NaCl/5 mM MgCl2, 1 mM DTT, 20 µg BSA/100 ng poly(dG-dC)/5% glycerol] for 20 min at room temperature. The reactions were incubated on ice for 5 min before 2.5 ng pancreatic DNase I (Worthington) was added. Digestion was stopped after 5 min by adding 200 µl stop buffer (2.5 M NH4OAc/0.025 mg of tRNA per ml). DNAs were precipitated, resuspended in formamide loading buffer, and resolved on an 8 M urea/15% polyacrylamide gel.
Hydroxyl Radical Footprinting. The probes were prepared as above. Binding was the same as for DNase I footprinting except that the volume was 50 µl and the glycerol concentration was 0.5%. The reactions were performed as described (22). DNAs were purified and resolved by PAGE as above.
Interference Studies. The probes were generated by digesting pBS(rpt3) with EcoRI or HindIII, end-filling with [32P]dCTP or -dATP, redigesting with HindIII or EcoRI, and gel-purifying as above. Dimethyl sulfate (DMS) methylates adenine in the minor groove and guanine in the major groove. The methylation, pyrimidine elimination by hydrazine, and strand cleavage reactions were performed as described (23, 24).
Electrophoretic Mobility Shift Assay (EMSA). All probes (listed in Fig. 4A) were double-stranded 20-bp oligonucleotides that were labeled by end-filling a single cytosine overhang with [32P]dGTP. Special care was taken to generate probes with similar specific activities, which were confirmed by autoradiography after gel purification. One nanogram of probe was used for each binding reaction. Reactions and EMSA were as described (16).
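As a rough illustration of the interference chemistry used in these assays, the following Python sketch lists which positions of repeat 3 are candidates for each modification: purines (G, A) for DMS methylation and pyrimidines (C, T) for hydrazine. The top-strand sequence is the one listed for repeat 3 in Fig. 4A, with positions numbered from the left end of the duplex; the function and variable names are hypothetical and are not part of the original methods.

```python
# Top strand of repeat 3 (5'->3'), as listed in Fig. 4A.
REPEAT3_TOP = "GGGGTGATGGATGGGCGCTG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def bottom_strand_by_position(top):
    """Return the bottom-strand base paired with each top-strand position.

    The result is indexed like the top strand (position 1 is the left
    end of the duplex), so it is the complement, not the reverse
    complement.
    """
    return "".join(COMPLEMENT[base] for base in top)


def modifiable_positions(strand, bases):
    """Return the 1-based positions in `strand` whose base is in `bases`."""
    return [i for i, base in enumerate(strand, start=1) if base in bases]


bottom = bottom_strand_by_position(REPEAT3_TOP)

for name, strand in (("top", REPEAT3_TOP), ("bottom", bottom)):
    dms = modifiable_positions(strand, "GA")        # purines: DMS targets
    hydrazine = modifiable_positions(strand, "CT")  # pyrimidines: hydrazine targets
    print(f"{name} strand  DMS (G+A): {dms}")
    print(f"{name} strand  hydrazine (C+T): {hydrazine}")
```

Because every base pair carries one purine and one pyrimidine, each position of the duplex is probed by DMS on one strand and by hydrazine on the other.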
RESULTS AND DISCUSSION
To characterize how EXD and LAB bind to repeat 3, we first used the DNase I footprinting technique (Fig. 1A; the sequence of the oligo is shown in Fig. 2). We measured protection from DNase I cleavage due to EXD, LAB, or EXD+LAB (summarized in Fig. 2). On its own, EXD protected positions 4-16 of the top strand and positions 1-15 of the bottom strand; ≈100 to 200 nM EXD was required for half-maximal protection (Fig. 1A, lanes 14-18 and 25-28). The sequence protected by EXD includes a 7/9 match to the previously defined EXD/PBX consensus binding site, 5'-ATCAATCAA (compared with base pairs 4-12 of the bottom strand; Fig. 2) (18, 19). On its own, LAB weakly protected positions 4-9 of the top strand only at high concentrations (>100 nM) (Fig. 1A, lanes 2-7). No specific protection by LAB of the bottom strand was observed at concentrations up to 200 nM (data not shown). Strikingly, in the presence of 100 nM EXD, ≈0.5 nM LAB was required for half-maximal protection (Fig. 1A, lanes 8-13 and 20-24). Thus, the presence of EXD stimulates LAB binding to repeat 3 by ≈200-fold. Protection by the heterodimer extended from positions 4 to 20 of the top strand and from positions 1 to 17 of the bottom strand (Fig. 2).
FIG. 1. Defining the EXD-LAB binding site in repeat 3. (A) DNase I protection. 3'-End-labeled top strand (Left) and bottom strand (Right) DNAs were digested with DNase I in the presence of varying concentrations of EXD and/or LAB as indicated. The sequence of the labeled strand is indicated to the left of each panel; lanes 1 and 19 show guanine (G) and cytosine plus thymine (C+T) sequencing reactions, respectively. (B) Hydroxyl radical protection. 3'-End-labeled top strand (Left) and bottom strand (Right) DNAs were cleaved by hydroxyl radicals in the absence (lanes F) or presence (lanes B) of LAB+EXD. The sequences of the labeled strand are indicated to the left of each panel; the lanes labeled G and C+T refer to sequencing reactions. (C) Interference due to DMS methylation or hydrazine modification. Probes for the top or bottom strand were either 3' or 5' end-labeled as indicated. The labeled probes were modified by DMS (the panels labeled G+A) or hydrazine (the panels labeled C+T) and then used to form complexes with LABAAA (LAAA), EXD, or LAB+EXD (L+E). Protein-DNA complexes were separated from the unbound probe (lanes F) by EMSA, and DNAs were purified, cleaved at the sites of modification, and resolved by PAGE.
[Fig. 2 graphic: the repeat 3 sequence, top strand 5'-GGGGTGATGGATGGGCGCTG-3' over bottom strand 3'-CCCCACTACCTACCCGCGAC-5', with base pairs numbered 1-20 from left to right.]
FIG. 2. Summary of the interference and protection data for EXD, LAB (or LABAAA), and LAB + EXD. Purines that, when methylated by DMS, interfered with binding are indicated by circles and pyrimidines that, when modified by hydrazine, interfered with binding are indicated by triangles. The size of the symbol correlates with the amount of interference. Bases protected by the heterodimer from cleavage by hydroxyl radicals are boxed; protection from digestion by DNase I is summarized by the thick black lines. ELbs refers to the EXD-LAB binding site as determined by the limits of protection from hydroxyl radical cleavage.
DNase I digestion is partially dependent on the DNA sequence and, because the enzyme is large, cannot accurately determine protein-DNA contacts. A sequence-nonspecific method that detects contacts between proteins and the phosphate backbone of DNA measures protection from cleavage by hydroxyl radicals (22). Using this technique, the EXD-LAB heterodimer protected bases 5-14 of the top strand and bases 7-16 of the bottom strand (Fig. 1B). From these data, we define the minimal protected region, from base pairs 5-16, as the ELbs (Fig. 2).
Chemical modification of DNA is a complementary approach for analyzing protein-DNA interactions. Here we have determined which guanine or adenine bases, when methylated by DMS, and which cytosine or thymine bases, when modified by hydrazine, interfere with binding (Fig. 1C; summarized in Fig. 2). These interference studies were performed with the EXD-LAB heterodimer, EXD alone, and a mutant form of LAB, LABAAA, which has three alanine substitutions within the hexapeptide. It was necessary to use LABAAA instead of wild-type LAB because, in the absence of EXD, binding of LAB to repeat 3 is inhibited by the hexapeptide, thus preventing the isolation of repeat 3-LAB complexes by EMSA (20). Several features of the modification data are noteworthy. First, the data suggest that EXD and LAB bind to the left and right sides of the ELbs, respectively, because modification of the top strand at positions 5 or 6 blocked EXD binding whereas modification at positions 8, 10, or 11 blocked LABAAA binding (Fig. 2). Second, for both strands, the interference patterns around the left edge (positions 1 to 6) of the ELbs were similar for the binding of EXD alone and for the binding of the heterodimer (Fig. 2). Specifically, for both EXD and the heterodimer, there was no interference due to modification at positions 1 to 4, and interference was apparent due to modification at positions 5 or 6. These data suggest that EXD binds to the left side of the ELbs and that it binds in a similar manner with or without LAB. Third, for both strands, the interference pattern at the right side of the ELbs was more extensive for the heterodimer than it was for either EXD or LABAAA by themselves (Fig. 2). These differences suggest that in the heterodimer LAB binds to repeat 3 differently than LABAAA binds on its own. Finally, the heterodimer appears to make
additional DNA contacts that are not made by either protein alone. For example, modification at positions 7, 9, or 12 of the top strand interfered with heterodimer binding but did not interfere with LABAAA or EXD binding (Fig. 2). These additional contacts may account for an increase in the stability of the heterodimer-DNA complex, as observed previously with other HOX-EXD/PBX complexes (11, 15).
A Model for the EXD-LAB-Repeat 3 Complex. We have used features that are common to all homeodomain-DNA interactions (2, 25-27), in particular the conserved contact between Asn-51 and an adenine, to generate a model for the EXD-LAB-repeat 3 complex. Within the ELbs there are four adenines: positions 7 and 11 of the top strand and positions 8 and 12 of the bottom strand. The interference data suggest that EXD and LAB bind to the left and right sides of the ELbs, respectively. Thus Asn-51 of EXD probably contacts A7 or A8 and Asn-51 of LAB contacts A11 or A12. This suggests four possible orientations (Fig. 3A). Of these four, two are less likely due to steric clashes whereas two orientations appear to have no steric hindrance (models 1 and 2, Fig. 3 A and B). Although the data presented above cannot rule out either model, the hydroxyl radical protection data favor model 1 (Fig. 3B). Specifically, hydroxyl radical cleavage in the presence of the heterodimer generated a staggered protection pattern: the top strand was protected from positions 5 to 14 and the bottom strand was protected from positions 7 to 16. When these data are projected upon a three-dimensional representation of the two models, they are more consistent with the orientation and shape of the heterodimer in model 1 (Fig. 3B). Conversely, the hydroxyl radical protection data are difficult to reconcile with model 2.
Additional support for model 1 comes from a comparison with how the yeast homeodomain protein MATa1 binds to DNA. The EXD and MATa1 homeodomains are 65% identical (9, 10, 29, 30). According to model 1, EXD and MATa1 also bind to similar DNA sequences. Specifically, biochemical and structural studies have demonstrated that Asn-51 of MATa1 contacts the adenine in the sequence 5'-TGATG (28, 31) (Fig. 3C). The left side of the ELbs also contains the sequence 5'-TGATG and model 1 predicts that Asn-51 of EXD contacts this adenine (Fig. 3C). Furthermore, Arg-55 of MATa1, which is conserved in EXD and all three PBX proteins, contacts the 5' guanine of this sequence in the major groove (31). Thus, in addition to EXD and MATa1 having similar homeodomains, they may also recognize similar DNA sequences.
[Fig. 3 graphic. Panel A orientations, top to bottom: EXD-A7, LAB-A12; EXD-A8, LAB-A11; EXD-A7, LAB-A11 (model 1); EXD-A8, LAB-A12 (model 2).]
FIG. 3. Models for EXD-LAB binding to repeat 3. (A) EXD (striped) and LAB (black) are represented by arrows; the back of the arrow represents the N-terminal arm and the arrowhead represents the third α-helix. The Asn-51-contacted adenines are circled. The top two orientations are less likely due to steric clashes in either the major groove (top orientation) or minor groove (second orientation). (B) A three-dimensional representation of models 1 and 2. Based on the data presented here, we propose that model 1 best describes the EXD-LAB-DNA complex. The cylinders represent helices 1, 2, and 3 and N-terminal arms are indicated by the thick lines. Asn-51 contacts are indicated by a large A. Portions of the DNA backbone that were protected by the heterodimer from cleavage by hydroxyl radicals are shown as striped ribbons, and portions that were accessible to cleavage are shown as dark gray ribbons. Methylated bases that interfered with heterodimer formation are circled. The numbers to the left of the models refer to the base pairs in repeat 3 as in Fig. 2. (C) The top strand of repeat 3 is aligned with the top strand of a haploid-specific gene (hsg) operator (28) and identities are highlighted by the gray box. Contacts for MATa1 (a1), MATα2 (α2), and (according to model 1) EXD and LAB are indicated.
Tests of the Model. To test this model, we have used EMSA to characterize complex formation with mutant repeat 3 oligonucleotides. For comparison, wild-type repeat 3 weakly formed complexes with EXD alone (Fig. 4B, lane 1 and Fig. 4C, lane 2) but did not form complexes with LAB alone (Fig. 4B, lane 2 and Fig. 4C, lane 3). When EXD and LAB were both present in the reaction, complexes were readily formed (Fig. 4B, lanes 3 to 5 and Fig. 4C, lane 5). These EXD- and LAB-dependent complexes contain both EXD and LAB as determined by antibody supershift experiments (20).
EXD binds to the left side of the ELbs. We first tested if EXD binds to the left side of the ELbs. For this test, we mutated position 6 of repeat 3 because, from the modification studies, both strands appeared to be critical for EXD binding at this position (Fig. 1). Further, based on comparisons with the MATa1 structure (see above), G6 is predicted to be contacted by Arg-55 of EXD. The mutant oligonucleotide, repeat 3(G6A) (Fig. 4A), was unable to bind EXD under standard conditions (Fig. 4B). In addition, this mutation dramatically reduced the ability to form EXD-LAB complexes (Fig. 4B). We also tested repeat 3(G6A) for enhancer activity in vivo by cloning three copies of this oligonucleotide upstream of a minimal promoter driving lacZ. Unlike wild-type repeat 3, which drives expression in a pattern that is very similar to the labial expression pattern (20), repeat 3(G6A) failed to generate any expression in vivo (data not shown). These data are consistent
with the conclusion that EXD binds to the left side of the ELbs and that EXD binding is necessary for enhancer activity.
Mutating the Asn-51 contacts. One potentially simple way to distinguish between models 1 and 2 would be to mutate the two different adenines (A11 or A12) predicted to be contacted by Asn-51 of LAB. Unfortunately, this approach could not distinguish between these models because the ability to form EXD-LAB heterodimers was destroyed by mutating either of these adenines (data not shown). In contrast, EXD binding to these oligonucleotides was not eliminated, supporting the view that EXD binds to the left side of the ELbs.
Changing the specificity of heterodimer formation. The experiments described below examined the basis of HOX binding specificity. Whereas repeat 3 promotes heterodimer formation between EXD and LAB or its mouse homologue Hoxb-1, it does not bind EXD + Hoxb-4 (16). In addition, repeat 3 poorly binds the Drosophila HOX protein Ultrabithorax (UBX) in the absence or presence of EXD (Fig. 4C, lanes 4 and 6). Without any cofactor, UBX prefers to bind the sequence 5'-TAATGG (32), with Asn-51 of UBX contacting the second A. Based on structural studies with other homeodomains, the UBX N-terminal arm makes minor groove contacts with the two 5' base pairs (TA). If this sequence preference also holds true when UBX binds as a heterodimer with EXD, it should be possible to distinguish between models 1 and 2 because they make different predictions for how to increase UBX binding to repeat 3. Specifically, in model 1 the HOX N-terminal arm interacts with base pairs 9 and 10 whereas in model 2 it interacts with base pairs 13 and 14 (Fig. 3B). Model 1 predicts that UBX binding to repeat 3 should increase if the top strand bases GG (positions 9 and 10) were changed to TA to generate the sequence 5'-TAATGG [repeat 3(G9T, G10A); Fig. 4A]. In contrast, model 2 predicts that UBX binding to repeat 3 may increase if the bottom strand
bases CC (positions 13 and 14) were changed to TA to generate the sequence 5'-TAATCC [repeat 3(G13T, G14A); Fig. 4A]. We note, however, that the 3' CC of this sequence makes this a poor UBX monomer binding site (32). Of these two oligos, only repeat 3(G9T, G10A) bound UBX better than wild-type repeat 3, a result that favors model 1 (Fig. 4C, lanes 7-12 and 25-30). LAB+EXD also efficiently bound to repeat 3(G13T, G14A) but, interestingly, not to repeat 3(G9T, G10A) (Fig. 4C, lanes 11 and 29). In addition, repeat 3(G9T, G10A), which contains a consensus UBX binding site 5'-TAATGG, also bound UBX in the absence of EXD (Fig. 4C, lane 28).
Because positions 9 and 10 appeared to be important for altering HOX specificity, we made additional substitutions at these positions. Repeat 3(G9T) was qualitatively similar to wild-type repeat 3 because LAB, but not UBX, efficiently formed heterodimers with EXD (Fig. 4C, lanes 13 to 18). EXD bound to repeat 3(G9T) significantly better than to wild-type repeat 3 (compare lanes 2 and 14), accounting for a modest increase in LAB+EXD and UBX+EXD complex formation (compare lanes 5 and 6 with 17 and 18). Strikingly, a single G → T mutation at position 10 [repeat 3(G10T)] eliminated the formation of LAB+EXD complexes and partially reduced the formation of UBX+EXD complexes (Fig. 4C, lanes 19-24; after a 3-fold longer exposure of this autoradiogram UBX+EXD complexes, but not LAB+EXD complexes, were visible).
Most interestingly, we examined repeat 3(G9T, G10T), which combines the previous two G → T mutations, and found that its HOX preference was reversed from that of wild-type repeat 3: LAB+EXD complexes were undetectable and UBX+EXD complexes formed efficiently (Fig. 4C, lanes 31-36). Moreover, UBX did not efficiently bind this oligo in the absence of EXD (lane 34).
Fig. 4A sequences (dots indicate bases identical to wild type):
+            GGGGTGATGGATGGGCGCTG
G6A          .....A..............
G9T,G10A     ........TA..........
G13T,G14A    ............TA......
G9T          ........T...........
G10T         .........T..........
G9T,G10T     ........TT..........
FIG. 4. Tests of the models. (A) The sequence of repeat 3 (line +) and all variants. Only bases that are different from the wild-type sequence are shown. (B) G6 is important for EXD binding. An EMSA experiment showing wild-type repeat 3 or repeat 3(G6A) binding to EXD, LAB, or EXD+LAB. Proteins used were as follows: lanes 1 and 6, 50 ng EXD; lanes 2 and 7, 80 ng LAB; lanes 3 and 8, 50 ng EXD plus 20 ng LAB; lanes 4 and 9, 50 ng EXD plus 40 ng LAB; lanes 5 and 10, 50 ng EXD plus 80 ng LAB. E and E+L indicate EXD and EXD plus LAB complexes, respectively. (C) Changing the specificity of heterodimer formation. Shown are autoradiograms of EMSA experiments carried out in parallel using the indicated probes bound to EXD, LAB, UBX, EXD+LAB, or EXD+UBX. Proteins used were as follows: lanes 1, 7, 13, 19, 25, and 31, no protein; lanes 2, 8, 14, 20, 26, and 32, 50 ng EXD; lanes 3, 9, 15, 21, 27, and 33, 80 ng LAB; lanes 4, 10, 16, 22, 28, and 34, 80 ng UBX; lanes 5, 11, 17, 23, 29, and 35, 50 ng EXD plus 80 ng LAB; lanes 6, 12, 18, 24, 30, and 36, 50 ng EXD plus 80 ng UBX. In the binding reactions in B and C, 50 ng EXD ≈ 180 nM, 80 ng LAB ≈ 58 nM, and 80 ng UBX ≈ 42 nM. E and E+H indicate EXD and EXD+HOX (LAB or UBX) complexes, respectively.
These results illustrate that changes in the center of the ELbs can lead to dramatic differences in the specificity of heterodimer formation. Whereas the sequence 5'-TGATGGATGG showed a clear preference for LAB+EXD, the sequence 5'-TGATTTATGG showed a clear preference for UBX+EXD. These changes in heterodimer specificity provide further support for model 1. In this model, it is the N-terminal arm of the HOX protein that is contacting the specificity-determining base pairs (9 and 10) whereas in model 2 it is the third α-helix that would contact them (Fig. 3). Within their N-terminal arms LAB and UBX differ in six of nine amino acids, including residues 3 and 7, which make base-specific contacts in other homeodomain-DNA structures (2, 25-27). In contrast, their third α-helices differ in only 3 of 17 amino acids and none of these residues make base-specific contacts in other homeodomain-DNA structures. Thus, it is more likely that differences between the N-terminal arms of these HOX proteins distinguish differences at base pairs 9 and 10.
Assuming that model 1 is correct, these results suggest a novel aspect to homeodomain-DNA recognition. In particular, the sequence specificity of UBX as a monomer is different from its specificity as a heterodimer with EXD. While the sequence 5'-TTATGG is a poor UBX monomer binding site (Fig. 4C, lane 34) (32), it is a good binding site for UBX+EXD. We interpret this change in specificity by suggesting that the presence of EXD alters the conformation of the HOX N-terminal arm, the DNA, or both, thus changing how the N-terminal arm contacts DNA.
FIG. 5. Generalized binding site for EXD/PBX-HOX heterodimers. Based on the EXD-LAB binding site defined here, we propose that EXD and its mammalian PBX homologues bind to the sequence 5'-TGATNN (indicated by the box to the left) and HOX proteins bind to the overlapping sequence 5'-NNATNN (indicated by the box to the right). The third helix of the EXD/PBX homeodomain may only weakly contribute to sequence specificity (indicated by the dashed arrows) because position 50 in these homeodomains, which usually plays an important role in DNA recognition, is a Gly (4).
By biochemically analyzing the protein-DNA contacts made by a LAB-EXD heterodimer we have obtained a model for how these proteins bind to repeat 3 (model 1 of Fig. 3). In this model, the two proteins bind in the same orientation and are spaced apart by only 4 bp. In contrast, the two homeodomains in the MATa1/α2 heterodimer are separated by 12 bp (28, 31) (Fig. 3C). This model places the N-terminal arm of LAB in the center of the complex, interacting with base pairs in the minor groove that also have the potential to interact with the EXD third α-helix in the major groove. These base pairs are critical for discriminating between heterodimers formed between EXD and different HOX proteins. Because the consensus EXD/PBX binding site is capable of promoting heterodimer formation between EXD/PBX and a wide spectrum of HOX proteins (13, 15, 17, 33), we suggest that this model will apply to many EXD/PBX-HOX interactions (Fig. 5). One important implication of these results is that, in conjunction with EXD, subtle differences in the heterodimer binding site may be sufficient to distinguish between the binding of many or possibly all HOX proteins.
Thus, these findings confirm previous assertions (11, 16) that in addition to increasing the affinity of HOX binding, EXD adds specificity to how HOX proteins bind to DNA.
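The sequence logic behind the mutant oligonucleotides can be sketched in a few lines of Python. This is only an illustration, assuming the top-strand sequence of repeat 3 from Fig. 4A and the position numbering of Fig. 2; the helper names (`mutate`, `reverse_complement`, `count_matches`, `segment`) are hypothetical. It reproduces the 7/9 match to the EXD/PBX consensus and shows the core sequence that each mutant presents across positions 5-14.

```python
import re

# Top strand of repeat 3 (5'->3'), positions numbered 1-20 as in Fig. 2.
REPEAT3_TOP = "GGGGTGATGGATGGGCGCTG"
EXD_PBX_CONSENSUS = "ATCAATCAA"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the opposite strand read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def mutate(seq, *mutations):
    """Apply point mutations written like 'G9T' (wild type, position, new base)."""
    bases = list(seq)
    for mutation in mutations:
        old, pos, new = re.fullmatch(r"([ACGT])(\d+)([ACGT])", mutation).groups()
        index = int(pos) - 1
        if bases[index] != old:
            raise ValueError(f"{mutation}: position {pos} is {bases[index]}, not {old}")
        bases[index] = new
    return "".join(bases)


def count_matches(a, b):
    """Number of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b))


def segment(seq, start, end):
    """Return positions start..end (1-based, inclusive)."""
    return seq[start - 1:end]


# Bottom strand, base pairs 4-12, read 5'->3' (from position 12 down to 4).
exd_site = reverse_complement(segment(REPEAT3_TOP, 4, 12))
print(exd_site, count_matches(exd_site, EXD_PBX_CONSENSUS), "of 9 match the consensus")

variants = {
    "wild type": REPEAT3_TOP,
    "G6A": mutate(REPEAT3_TOP, "G6A"),
    "G9T,G10A": mutate(REPEAT3_TOP, "G9T", "G10A"),
    "G13T,G14A": mutate(REPEAT3_TOP, "G13T", "G14A"),
    "G9T": mutate(REPEAT3_TOP, "G9T"),
    "G10T": mutate(REPEAT3_TOP, "G10T"),
    "G9T,G10T": mutate(REPEAT3_TOP, "G9T", "G10T"),
}

for name, seq in variants.items():
    # Positions 5-14 span the two overlapping half-sites of Fig. 5
    # (5'-TGATNN at 5-10 and 5'-NNATNN at 9-14).
    print(f"{name:10s} {seq}  core 5-14: {segment(seq, 5, 14)}")
```

In this sketch `mutate` raises a `ValueError` when the stated wild-type base does not match the sequence, which guards against numbering mistakes when new variants are written down.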
We gratefully acknowledge Heike Popperl, Robb Krumlauf, and other members of the Krumlauf laboratory for first identifying repeat 3 and thank them for initiating a collaboration with us. We also thank Aneel Aggarwal and Dimitris Thanos for helpful discussions and comments on this paper. This work was supported by grants from the National Institutes of Health (HD27986) and the Searle Scholars Program awarded to R.S.M.
1. McGinnis, W. & Krumlauf, R. (1992) Cell 68, 283-302.
2. Gehring, W. J., Qian, Y. Q., Billeter, M., Furukubo-Tokunaga, K., Schier, A. F., Resendez-Perez, D., Affolter, M., Otting, G. & Wuthrich, K. (1994) Cell 78, 211-223.
3. Laughon, A. (1991) Biochemistry 30, 11357-11367.
4. Burglin, T. (1994) in A Comprehensive Classification of Homeobox Genes, ed. Duboule, D. (Oxford Univ. Press, Oxford), pp. 26-71.
5. Kalionis, B. & O'Farrell, P. H. (1993) Mech. Dev. 43, 57-70.
6. Ekker, S. C., Jackson, D. G., von Kessler, D. P., Sun, B. I., Young, K. E. & Beachy, P. A. (1994) EMBO J. 13, 3551-3560.
7. Jurgens, G., Wieschaus, E., Nüsslein-Volhard, C. & Kluding, H. (1984) Roux's Arch. Dev. Biol. 193, 283-295.
8. Peifer, M. & Wieschaus, E. (1990) Genes Dev. 4, 1209-1223.
9. Kamps, M. P., Murre, C., Sun, X.-h. & Baltimore, D. (1990) Cell 60, 547-555.
10. Nourse, J., Mellentin, J., Galili, N., Wilkinson, J., Stanbridge, E., Smith, S. & Cleary, M. (1990) Cell 60, 535-545.
11. Chan, S.-K., Jaffe, L., Capovilla, M., Botas, J. & Mann, R. S. (1994) Cell 78, 603-615.
12. van Dijk, M. & Murre, C. (1994) Cell 78, 617-624.
13. Neuteboom, S., Peltenburg, L., van Dijk, M. & Murre, C. (1995) Proc. Natl. Acad. Sci. USA 92, 9166-9170.
14. Phelan, M. L., Rambaldi, I. & Featherstone, M. (1995) Mol. Cell. Biol. 15, 3989-3997.
15. Lu, Q., Knoepfler, P., Scheele, J., Wright, D. & Kamps, M. (1995) Mol. Cell. Biol. 15, 3786-3795.
16. Popperl, H., Bienz, M., Studer, M., Chan, S.-K., Aparicio, S., Brenner, S., Mann, R. & Krumlauf, R. (1995) Cell 81, 1031-1042.
17. Chang, C.-P., Shen, W.-F., Rozenfeld, S., Lawrence, H. J., Largman, C. & Cleary, M. (1995) Genes Dev. 9, 663-674.
18. LeBrun, D. P. & Cleary, M. L. (1994) Oncogene 9, 1641-1647.
19. van Dijk, M. A., Voorhoeve, P. M. & Murre, C. (1993) Proc. Natl. Acad. Sci. USA 90, 6061-6065.
20. Chan, S.-K., Popperl, H., Krumlauf, R. & Mann, R. S. (1996) EMBO J. 15, 2477-2488.
21. Kaelin, W. G., Krek, W., Sellers, W., DeCaprio, J., Ajchenbaum, F., Fuchs, C., Chittenden, T., Li, Y., Farnham, P., Blanar, M., Livingston, D. & Flemington, E. (1992) Cell 70, 351-364.
22. Dixon, W., Hayes, J., Levin, J., Weidner, M., Dombroski, B. & Tullius, T. (1991) in Hydroxyl Radical Footprinting, ed. Sauer, R. (Academic, San Diego), Vol. 208, pp. 380-413.
23. Holler, M., Westin, G., Jiricny, J. & Schaffner, W. (1988) Genes Dev. 2, 1127-1135.
24. Brunelle, A. & Schleif, R. (1987) Proc. Natl. Acad. Sci. USA 84, 6673-6676.
25. Wolberger, C., Vershon, A. K., Liu, B., Johnson, A. D. & Pabo, C. O. (1991) Cell 67, 517-528.
26. Billeter, M., Qian, Y. Q., Otting, G., Muller, M., Gehring, W. & Wuthrich, K. (1993) J. Mol. Biol. 234, 1084-1093.
27. Kissinger, C. R., Liu, B., Martin-Blanco, E., Kornberg, T. B. & Pabo, C. O. (1990) Cell 63, 579-590.
28. Goutte, C. & Johnson, A. D. (1994) EMBO J. 13, 1434-1442.
29. Rauskolb, C., Peifer, M. & Wieschaus, E. (1993) Cell 74, 1-20.
30. Flegel, W. A., Singson, A. W., Margolis, J. S., Bang, A. G., Posakony, J. W. & Murre, C. (1993) Mech. Dev. 41, 155-161.
31. Li, T., Stark, M., Johnson, A. & Wolberger, C. (1995) Science 270, 262-269.
32. Ekker, S. C., Young, K. E., von Kessler, D. P. & Beachy, P. A. (1991) EMBO J. 10, 1179-1186.
33. van Dijk, M., Peltenburg, L. & Murre, C. (1995) Mech. Dev. 52, 99-108.