Supplemental Data
Role for first zinc finger of WT1 in DNA sequence specificity: Denys-Drash syndrome-associated WT1 mutant in ZF1 enhances affinity for a subset of WT1 binding sites
Dongxue Wang1, John R. Horton2, Yu Zheng3, Robert M. Blumenthal4, Xing Zhang2, and Xiaodong Cheng1,2,*
1 Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA
2 Department of Molecular and Cellular Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
3 RGENE, Inc., 953 Indiana Street, San Francisco, California 94107, USA
4 Department of Medical Microbiology and Immunology, and Program in Bioinformatics, The University of Toledo College of Medicine and Life Sciences, Toledo, OH 43614, USA
* To whom correspondence should be addressed. Email: [email protected]
5 figures and 2 tables
[Figure S1 image not reproduced. Recoverable labels: (A) sequence logos (y-axis in bits, positions 1-12, with ZF1 to ZF4 labels) for ChIP-chip (Hartwig et al., 2010), ChIP-seq (Motamedi et al., 2014), the site GCGTGGGCG with the triplet motif for ZF1, T/G G/A/T T/G (Hamilton et al., 1995), and the predictions by Persikov et al. (2014) and (2015) using the polynomial SVM model and the linear expanded SVM model. (B) SDS-PAGE gel with lanes labeled ZF1-4 and M342R under -KTS and +KTS, and markers at 66.4, 55.6, 42.7, 34.6, 27.0, 20.0 and 14.3 kDa.]
Figure S1. (A) Known WT1 binding consensus motifs. From top to bottom: WT1 consensus binding motifs as determined by ChIP-chip (1) (see Figure 6A), ChIP-seq (2) (see Figure 6B), an in vitro DNA binding site selection assay (3), and predictions made by the polynomial and linear expanded SVM-based algorithms (4,5). (B) SDS-PAGE gel showing the purified recombinant proteins used in this study.
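The logos in panel A are plotted in bits. As a minimal sketch of how per-position information content is commonly computed for such logos, the example below assumes a uniform background over A, C, G and T and applies no small-sample correction; neither choice is stated in the source. The sites in `TOY_SITES` are invented for illustration only and are not data from the cited studies.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Return the information content (bits) of each column of aligned sites.

    Assumes a uniform background and no small-sample correction.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    max_bits = math.log2(len(alphabet))
    bits = []
    for column in zip(*sites):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the observed base frequencies in this column.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        bits.append(max_bits - entropy)
    return bits


# Hypothetical aligned sites (illustration only, not experimental data).
TOY_SITES = ["GCGTGGGCG", "GCGTGGGAG", "GCGGGGGCG", "GCGTGGGCG"]
BITS = information_content(TOY_SITES)
```

An invariant column reaches the 2-bit maximum, while a column with one deviating base out of four scores about 1.19 bits, which is why degenerate positions appear shorter in a logo.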
Figure S2

Class in Vertebrata   Species                GenBank #
Mammalia              Homo sapiens           AAH32861
Aves                  Gallus gallus          XP_015141620
Reptilia              Anolis carolinensis    XP_016848374
Amphibia              Xenopus tropicalis     NP_001135625
Osteichthyes          Lepisosteus oculatus   XP_006642581
Chondrichthyes        Rhincodon typus        XP_020366639

Class            Spacer  ZF1                      Spacer   ZF2                      Spacer   ZF3                    Spacer   ZF4                      Spacer
Mammalia         EKRPFM  CAYPGCNKRYFKLSHLQMHSRKH  TGEKPYQ  CDFKDCERRFSRSDQLKRHQRRH  TGVKPFQ  CKTCQRKFSRSDHLKTHTRTH  TGEKPFS  CRWPSCQKKFARSDELVRHHNMH  QRNMTK
Aves             EKRPFM  CAYPGCNKRYFKLSHLQMHSRKH  TGEKPYQ  CDFKDCERRFSRSDQLKRHQRRH  TGVKPFQ  CKTCQRKFSRSDHLKTHTRTH  TGEKPFS  CRWPSCQKKFARSDELVRHHNMH  QRNMTK
Reptilia         EKRPFM  CVYPGCNKRYFKLSHLQMHGRKH  TGEKPYQ  CDFKDCERRFSRSDQLKRHQRRH  TGVKPFQ  CKTCQRKFSRSDHLKTHTRTH  TGEKPFS  CRWPSCQKKFARSDELVRHHNMH  NRNMTK
Amphibia         EKRPFM  CAYPGCNKRYFKLSHLQMHSRKH  TGEKPYQ  CDFKDCERRFSRSDQLKRHQRRH  TGIKPFQ  CKTCQRKFSRSDHLKTHTRTH  TGEKPFS  CRWPSCQKKFARSDELVRHHNMH  QRNMTK
Osteichthyes     EKRPFM  CAYPGCNKRYFKLSHLQMHSRKH  TGEKPYQ  CDFTDCGRRFSRSDQLKRHQRRH  TGVKPFQ  CETCQRKFSRSDHLKTHTRTH  TGEKPFN  CRWPNCQKKFARSDELVRHHNMH  QRNLTK
Chondrichthyes   EKRPFM  CPYPGCNKRYFKLSHLQMHSRKH  TGEKPYQ  CDFKDCGRRFSRSDQLKRHQRRH  TGVKPFQ  CKTCQRKFSRSDHLKTHTRTH  TGEKPFS  CRWPNCQKKFARSDELVRHHNMH  QRNMTK

Substitutions relative to Mammalia
                 ZF1  ZF2  ZF3  ZF4
Aves             0    0    0    0
Reptilia         2    0    0    0
Amphibia         0    0    0    0
Osteichthyes     0    2    1    1
Chondrichthyes   1    1    0    1
Figure S2. Conservation of WT1 orthologs across Vertebrata. One of the top hits to the human WT1 ZF region, from each other vertebrate class, is shown. The zinc fingers are in black, with DNA base-recognizing residues highlighted in yellow. Spacers between ZFs are shown in gray letters, and blue highlighting indicates substitutions relative to the human sequence. Reptilia shows the highest rate of substitution in ZF1, and that is only two changes, with none in the other ZFs. Aves and Amphibia have no substitutions within the ZFs, Osteichthyes has four overall but none in ZF1, and Chondrichthyes has one each in ZFs 1, 2, and 4. No good WT1 ortholog could be found in the available Agnatha sequences. [We did find, in the Agnatha lamprey species Lethenteron camtschaticum, an ortholog for Egr1, which has similar ZF sequences to WT1 ZF2-4 but lacks ZF1.]
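The substitution counts quoted in the caption can be reproduced directly from the aligned ZF sequences. The sketch below transcribes the four ZF segments from Figure S2 and counts mismatches against the human (Mammalia) sequence; the names `ZF_SEQS`, `hamming` and `SUBS` are illustrative and not part of the original analysis, and spacers are deliberately left out because the figure tabulates substitutions for the ZFs only.

```python
# Order of entries in every list below.
CLASSES = ["Mammalia", "Aves", "Reptilia", "Amphibia", "Osteichthyes", "Chondrichthyes"]

# ZF sequences transcribed from Figure S2 (spacers omitted).
ZF_SEQS = {
    "ZF1": ["CAYPGCNKRYFKLSHLQMHSRKH", "CAYPGCNKRYFKLSHLQMHSRKH",
            "CVYPGCNKRYFKLSHLQMHGRKH", "CAYPGCNKRYFKLSHLQMHSRKH",
            "CAYPGCNKRYFKLSHLQMHSRKH", "CPYPGCNKRYFKLSHLQMHSRKH"],
    "ZF2": ["CDFKDCERRFSRSDQLKRHQRRH", "CDFKDCERRFSRSDQLKRHQRRH",
            "CDFKDCERRFSRSDQLKRHQRRH", "CDFKDCERRFSRSDQLKRHQRRH",
            "CDFTDCGRRFSRSDQLKRHQRRH", "CDFKDCGRRFSRSDQLKRHQRRH"],
    "ZF3": ["CKTCQRKFSRSDHLKTHTRTH", "CKTCQRKFSRSDHLKTHTRTH",
            "CKTCQRKFSRSDHLKTHTRTH", "CKTCQRKFSRSDHLKTHTRTH",
            "CETCQRKFSRSDHLKTHTRTH", "CKTCQRKFSRSDHLKTHTRTH"],
    "ZF4": ["CRWPSCQKKFARSDELVRHHNMH", "CRWPSCQKKFARSDELVRHHNMH",
            "CRWPSCQKKFARSDELVRHHNMH", "CRWPSCQKKFARSDELVRHHNMH",
            "CRWPNCQKKFARSDELVRHHNMH", "CRWPNCQKKFARSDELVRHHNMH"],
}


def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# For each ZF, substitutions of every non-mammalian class relative to human,
# in the order Aves, Reptilia, Amphibia, Osteichthyes, Chondrichthyes.
SUBS = {zf: [hamming(seqs[0], s) for s in seqs[1:]] for zf, seqs in ZF_SEQS.items()}

for zf, counts in SUBS.items():
    print(zf, dict(zip(CLASSES[1:], counts)))
```

Summing the Osteichthyes column over ZF2, ZF3 and ZF4 gives the four substitutions mentioned in the caption.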
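[Figure S3 image not reproduced. Recoverable labels: panels A, B and C; PDB 2PRT; Mol A and Mol C; linker; ZF1, ZF2 and ZF3; symmetry-related molecules; K336, H339 and A12, with distance labels 3.6, 3.2 and 3.6 (PDB 2PRT). Schematic from the figure:

COOH-ZF4-ZF3-ZF2-ZF1-NH2
     ||| ||| |||
5’-C-GCG-GGG-GCG-TCT-G-3’
3’-G-CGC-CCC-CGC-AGA-C-5’
     123 456 789 012]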
Figure S3. ZF1 stacks between two DNA molecules. (A) In a previous study (6), ZF1 was found at the junction of two tail-to-tail DNA molecules (PDB 2PRT). (B) The stacking distance (~3.6 Å) between A12 and H339, as well as between H339 and K336, is equivalent to one DNA helical rise. (C) Superimposition of structure 2PRT (in grey) with Mol A and Mol C (this study) reveals an additional conformation of ZF1.

[Figure S4 image not reproduced. Recoverable labels: panels A, B and C; TFIIIA-5S RNA (PDB 1UN6); Lys118 and G75; Thr176 and C10.]
Figure S4. TFIIIA-RNA interaction. (A) Crystal structure of TFIIIA-5S RNA, adapted from PDB 1UN6 (7). (B) Lys118 of ZF4 interacts with the RNA backbone phosphate and stacks with the G75 base. (C) Thr176 of ZF6 stacks with the C10 base. We note that both Lys118 and Thr176 are located in the DNA-base interacting position of each ZF, i.e., the first residue immediately before the helix.
Figure S5. Re-analysis of ChIP-seq data from leukemic K562 cells expressing biotinylated WT1 ±KTS isoforms (8). (A) Schematic showing the reduction of potential binding sites, from the large number of observed binding sequences (available in the Gene Expression Omnibus (GEO) under accession number GSE81009) to a much smaller set, by applying fold-enrichment cutoffs and using the motif-finding program MEME Suite. (B) At a cutoff of >200-fold enrichment, four common sequence motifs could be identified for both the biotinylated WT1-KTS and WT1+KTS isoforms. (C) At a cutoff of >400-fold enrichment, a single sequence motif based on 265 sequences of biotinylated WT1-KTS (middle panel) matches previously published ChIP-seq data, for example the in vivo ChIP-seq analysis of E18.5 kidneys (2) (top panel). Under the same cutoff, the biotinylated WT1+KTS binding sites contain two weakly related repeats (bottom panel).
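The filtering step in panel A can be pictured as a simple threshold on fold enrichment applied before motif discovery. The sketch below is illustrative only: the `Peak` record, the `filter_peaks` helper and the entries in `TOY_PEAKS` are hypothetical and are not taken from GSE81009, and the motif search itself (run with the MEME Suite in the original analysis) is not reproduced here.

```python
from typing import List, NamedTuple


class Peak(NamedTuple):
    """Hypothetical record for one ChIP-seq peak."""
    sequence: str
    fold_enrichment: float


def filter_peaks(peaks: List[Peak], cutoff: float) -> List[Peak]:
    """Keep only peaks whose fold enrichment is strictly above the cutoff."""
    return [p for p in peaks if p.fold_enrichment > cutoff]


# Invented example peaks (illustration only, not experimental data).
TOY_PEAKS = [
    Peak("GCGTGGGAGTGT", 150.0),
    Peak("GCGTGGGAGGGT", 250.0),
    Peak("GCGTGGGCGTGT", 450.0),
    Peak("GCGGGGGCGTGT", 800.0),
]

ABOVE_200 = filter_peaks(TOY_PEAKS, 200.0)
ABOVE_400 = filter_peaks(TOY_PEAKS, 400.0)
print(len(TOY_PEAKS), len(ABOVE_200), len(ABOVE_400))
```

Because the comparison is strict, every peak that passes the >400 cutoff also passes the >200 cutoff, so the stricter set is always a subset of the more permissive one.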
References
1. Hartwig, S., Ho, J., Pandey, P., Macisaac, K., Taglienti, M., Xiang, M., Alterovitz, G., Ramoni, M., Fraenkel, E. and Kreidberg, J.A. (2010) Genomic characterization of Wilms' tumor suppressor 1 targets in nephron progenitor cells during kidney development. Development, 137, 1189-1203.
2. Motamedi, F.J., Badro, D.A., Clarkson, M., Lecca, M.R., Bradford, S.T., Buske, F.A., Saar, K., Hubner, N., Brandli, A.W. and Schedl, A. (2014) WT1 controls antagonistic FGF and BMP-pSMAD pathways in early renal progenitors. Nat Commun, 5, 4444.
3. Hamilton, T.B., Barilla, K.C. and Romaniuk, P.J. (1995) High affinity binding sites for the Wilms' tumour suppressor protein WT1. Nucleic Acids Res, 23, 277-284.
4. Persikov, A.V., Wetzel, J.L., Rowland, E.F., Oakes, B.L., Xu, D.J., Singh, M. and Noyes, M.B. (2015) A systematic survey of the Cys2His2 zinc finger DNA-binding landscape. Nucleic Acids Res, 43, 1965-1984.
5. Persikov, A.V. and Singh, M. (2014) De novo prediction of DNA-binding specificities for Cys2His2 zinc finger proteins. Nucleic Acids Res, 42, 97-108.
6. Stoll, R., Lee, B.M., Debler, E.W., Laity, J.H., Wilson, I.A., Dyson, H.J. and Wright, P.E. (2007) Structure of the Wilms tumor suppressor protein zinc finger domain bound to DNA. J Mol Biol, 372, 1227-1245.
7. Lu, D., Searles, M.A. and Klug, A. (2003) Crystal structure of a zinc-finger-RNA complex reveals two modes of molecular recognition. Nature, 426, 96-100.
8. Ullmark, T., Jarvstrat, L., Sanden, C., Montano, G., Jernmark-Nilsson, H., Lilljebjorn, H., Lennartsson, A., Fioretos, T., Drott, K., Vidovic, K. et al. (2017) Distinct global binding patterns of the Wilms tumor gene 1 (WT1) -KTS and +KTS isoforms in leukemic cells. Haematologica, 102, 336-345.
Table S1. Summary of crystallization, X-ray data collection from SERCAT beamlines at wavelength=1Å and refinement statistics (*)
WT1(-KTS) ZF1-4         M342                        M342                        M342                        R342 (3 crystals)
DNA (5’-3’)             AGCGTGGGAGTGT               AGCGTGGGAGGGT               AGCGTGGGAGTGTT              AGCGTGGGAGGGTTA
    (3’-5’)             CGCACCCTCACAT               CGCACCCTCCCAT               CGCACCCTCACAAT              CGCACCCTCCCAATT
Crystallization         30% (w/v) PEG 3350          20% (w/v) PEG 3350          25% (w/v) PEG 3350          30% (w/v) PEG 2000 MME
                        0.2 M ammonium acetate      70 mM Bis-Tris Propane      0.2 M Li2SO4                0.1 M potassium cyanide (KCN)
                        0.1 M Bis-Tris, pH 5.5      30 mM citric acid, pH 7.6   0.1 M Bis-Tris, pH 6.5
PDB                     6B0O                        6B0P                        6B0Q                        6B0R
SERCAT beamline         22-BM                       22-BM                       22-ID                       22-ID
Space group             P1                          P21212                      P63                         C2
Cell dimensions (Å)     36.93, 42.66, 71.82         75.80, 84.40, 85.97         161.66, 161.66, 42.18       153.94, 35.11, 87.15
α, β, γ (°)             88.3, 77.7, 85.8            90, 90, 90                  90, 90, 120                 90, 110.0, 90
Resolution (Å)          29.59-1.55 (1.61-1.55)      37.88-2.08 (2.15-2.08)      38.83-2.79 (2.68-2.79)      31.35-1.85 (1.92-1.85)
a Rmerge                0.044 (0.427)               0.124 (0.622)               0.114 (0.522)               0.121 (0.659)
Rpim                    0.028 (0.327)               0.043 (0.322)               0.051 (0.394)               0.030 (0.301)
CC1/2, CC               (0.793, 0.940)              (0.851, 0.959)              (0.686, 0.902)              (0.850, 0.959)
b <I/σI>                19.1 (2.5)                  17.5 (2.9)                  13.5 (2.0)                  21.3 (3.0)
Completeness (%)        96.8 (95.0)                 100.0 (100.0)               99.8 (100)                  99.9 (99.3)
Redundancy              3.3 (3.3)                   9.0 (8.8)                   5.9 (5.4)                   17.4 (9.4)
Observed reflections    196,483                     306,172                     94,211                      694,051
Unique reflections      59,787 (5883)               33,923 (3354)               16,012 (1516)               39,284 (3843)
Refinement
Resolution (Å)          1.55                        2.08                        2.79                        1.85
No. reflections         59,690                      33,860                      15,866                      39,239
c Rwork / d Rfree       0.193 / 0.207               0.188 / 0.225               0.238 / 0.250               0.208 / 0.234
No. Atoms
  Protein               1946                        2012                        1957                        1964
  DNA                   1077                        995                         1077                        1212
  Zn(II)                8                           8                           8                           8
  Solvent               380                         331                         70                          190
B Factors (Å2)
  Protein               36.1                        38.7                        48.5                        52.0
  DNA                   29.1                        42.3                        44.5                        59.2
  Zn(II)                29.2                        30.1                        28.3                        40.9
  Solvent               39.4                        41.9                        56.0                        45.7
R.m.s. deviations
  Bond lengths (Å)      0.003                       0.003                       0.006                       0.005
  Bond angles (˚)       0.5                         0.5                         0.5                         0.5

* Values in parentheses correspond to the highest resolution shell.
a Rmerge = Σ | I - <I> | / Σ I, where I is the observed intensity and <I> is the averaged intensity from multiple observations.
b <I/σI> = averaged ratio of the intensity (I) to the error of the intensity (σI).
c Rwork = Σ | Fobs - Fcal | / Σ | Fobs |, where Fobs and Fcal are the observed and calculated structure factors, respectively.
d Rfree was calculated using a randomly chosen subset (5%) of the reflections not used in refinement.
PEG = polyethylene glycol; MME = methyl ether
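The paired correlation values in the "CC1/2, CC" row of Table S1 can be checked for internal consistency. The sketch below assumes that the second value in each pair is CC*, the estimate of the correlation with the true signal derived from CC1/2 as CC* = sqrt(2 CC1/2 / (1 + CC1/2)); the table header only says "CC", so this reading is an assumption. The name `SHELL_CC` and the 0.002 tolerance (to absorb rounding to three decimals) are illustrative choices.

```python
import math


def cc_star(cc_half):
    """CC* estimated from CC1/2 (assumed relation, see lead-in)."""
    return math.sqrt(2.0 * cc_half / (1.0 + cc_half))


# (CC1/2, second reported value) pairs transcribed from Table S1.
SHELL_CC = {
    "6B0O": (0.793, 0.940),
    "6B0P": (0.851, 0.959),
    "6B0Q": (0.686, 0.902),
    "6B0R": (0.850, 0.959),
}

# Absolute difference between the computed CC* and the tabulated value.
DEVIATIONS = {
    pdb: abs(cc_star(cc_half) - reported)
    for pdb, (cc_half, reported) in SHELL_CC.items()
}

for pdb, dev in DEVIATIONS.items():
    print(pdb, f"computed CC* = {cc_star(SHELL_CC[pdb][0]):.3f}", f"deviation = {dev:.4f}")
```

All four tabulated pairs agree with this relation to within rounding, which supports (but does not prove) reading the second value as CC*.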
Table S2. Summary of X-ray data collection from SERCAT beamline (22-ID) at wavelength=1Å and refinement statistics (*)
WT1(+KTS) ZF1-4 M342R mutant
                        1 A:T (15+1)                3 A:T (17+1) (2 crystals)   3 A:T (19+1)                3 A:T (19+1) (2 crystals)
DNA (5’-3’)             AGCGATGGGAGGGTTA            AGCGAAATGGGAGGGTTA          ATAGCGAAATGGGAGGGTTA        ATAGCGAAATGGGAGGGTTA
    (3’-5’)             CGCTACCCTCCCAATT            CGCTTTACCCTCCCAATT          ATCGCTTTACCCTCCCAATT        ATCGCTTTACCCTCCCAATT
Crystallization         20% (w/v) PEG 3350          30% (w/v) PEG 3000          30% (w/v) PEG 3000          28% (w/v) PEG 3000
                        0.2 M CaCl2                 0.1 M CHES:NaOH, pH 9.5     0.1 M BIS-TRIS, pH 5.2      0.1 M CHES:NaOH, pH 9.2
PDB                     -                           6BLW                        -                           -
Space group             P21212                      C2221                       C2                          C2
Cell dimensions (Å)     85.1, 125.7, 47.1           63.56, 95.58, 83.34         125.1, 34.4, 70.5           125.9, 34.3, 115.5
α, β, γ (°)             90, 90, 90                  90, 90, 90                  90, 121.9, 90               90, 92.6, 90
Resolution (Å)          34.06-3.09 (3.72-3.09)      29.70-1.85 (1.92-1.85)      32.70-3.29 (3.41-3.29)      33.50-3.99 (4.13-3.99)
a Rmerge                0.108 (1.459)               0.132 (0.937)               0.136 (0.450)               0.180 (0.856)
Rpim                    0.029 (0.425)               0.022 (0.210)               0.061 (0.268)               0.084 (0.484)
CC1/2, CC               (0.870, 0.965)              (0.909, 0.976)              (0.893, 0.971)              (0.798, 0.942)
b <I/σI>                19.9 (1.4)                  25.7 (2.4)                  10.8 (2.4)                  9.3 (1.8)
Completeness (%)        98.3 (90.5)                 96.6 (80.5)                 81.8 (45.5)                 100.0 (100.0)
Redundancy              14.5 (10.9)                 32.7 (14.0)                 4.5 (2.4)                   5.5 (4.0)
Observed reflections    139,555                     706,666                     15,358                      25,038
Unique reflections      9,637 (855)                 21,595 (1764)               3,415 (184)                 4,563 (454)

Refinement (6BLW only). For the 1 A:T (15+1) crystal and for both 3 A:T (19+1) crystals, refinement was not continued because of lack of electron density for ZF4.
Resolution (Å)          1.85
No. reflections         21,549
c Rwork / d Rfree       0.222 / 0.247
No. Atoms
  Protein               863
  DNA                   673
  Zn(II)                4
  Solvent               93
B Factors (Å2)
  Protein               61.4
  DNA                   57.8
  Zn(II)                48.2
  Solvent               53.3
R.m.s. deviations
  Bond lengths (Å)      0.005
  Bond angles (˚)       0.5

* Values in parentheses correspond to the highest resolution shell.
a Rmerge = Σ | I - <I> | / Σ I, where I is the observed intensity and <I> is the averaged intensity from multiple observations.
b <I/σI> = averaged ratio of the intensity (I) to the error of the intensity (σI).
c Rwork = Σ | Fobs - Fcal | / Σ | Fobs |, where Fobs and Fcal are the observed and calculated structure factors, respectively.
d Rfree was calculated using a randomly chosen subset (5%) of the reflections not used in refinement.