A Comprehensive Model for the Recognition of Human Telomeres by TRF1
Michael Garton and Charles Laughton School of Pharmacy and Centre for Biomolecular Sciences, University of Nottingham, University Park, Nottingham NG7 2RD, UK
Correspondence to Charles Laughton:
[email protected] http://dx.doi.org/10.1016/j.jmb.2013.05.005 Edited by D. Case
Abstract
Eukaryotic chromosomes are capped by telomeres, nucleoprotein complexes that prevent chromosome end-to-end fusions and control cell ageing. Two proteins in this complex, telomere repeat binding factors (TRF1 and TRF2), specifically recognise the double-stranded TTAGGG tandem repeat sequence. TRF1 is a homodimer with roles governing DNA architecture and negatively regulating telomere length. We explore the conformational space of this protein–DNA complex using molecular dynamics and, for the first time, generate a complete model of TRF1–DNA recognition that has not been possible on the basis of crystallographic and NMR data alone. The results reconcile previous conflicting experimental models for the sequence selectivity of the recognition process, by confirming many of the findings while identifying important new interactions and behaviour. This improved characterisation also reveals extensive indirect readout, which suggests that recognition will be affected by changes to DNA helical parameters such as bending. © 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license.
Introduction
Telomeres are nucleoprotein complexes that protect the ends of linear chromosomes in eukaryotes. They perform a number of functions including the prevention of chromosome end-to-end fusions 1 that occur when DNA ends are recognised as double-strand breaks. They also act as a buffer to solve the end replication problem, which means that their length is one determinant of cellular age. 2 Activation of telomerase during carcinogenesis promotes telomere elongation and cell immortalisation, currently making telomeres a subject of intense interest in cancer biology. 3–5 Telomeric DNA is always guanine rich, but its sequence varies depending on the organism. In humans, it consists of tandem TTAGGG repeats, which in double-stranded form are specifically recognised by the telomere repeat binding factors TRF1 and TRF2. 6,7 TRF1 is a homodimer, and flexible linkers between the DNA binding and dimerisation domains imbue it with remarkable spatial flexibility. 8,9 This flexibility allows its two Myb-like DNA binding homeodomains to direct telomere architecture. 10,11 TRF1 is also a negative regulator of telomere length 12 and recruits
other shelterin components to the DNA, ostensibly via association with TIN2. 13 The TRF1–DNA interaction has been characterised by both X-ray crystallography 14 and NMR. 15 In common with most examples of protein–DNA recognition, these structures reveal both specific contacts between the protein and DNA bases, and nonspecific contacts between the protein and the sugar-phosphate backbone. The former are readily related to the mechanisms through which the protein reads the DNA sequence, while the latter can only do so indirectly—through sequence-dependent distortions in the DNA backbone geometry. While in general crystallography and NMR reveal similar features, in detail the situation is less harmonious. The crystal structure 14 shows that two base pairs in the TTAGGGTTA binding motif are not involved in any direct DNA–protein contacts, while mutagenesis 15 finds that binding activity is lost when these bases are mutated. At the same time, bases interacting with the protein via minor-groove contacts could be mutated without an effect on binding activity. The NMR structure 15 deposited in the form of 20 conformations is discordant with the crystal structure but better explains the mutagenesis results, hinting
0022-2836 © 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license.
J. Mol. Biol. (2013) 425, 2910–2921
Telomere Recognition by TRF1
strongly that a better understanding of dynamic aspects of the complex may be required to fully appreciate the recognition process. Two recent molecular dynamics (MD) studies have used the TRF1–DNA complex to examine water mediation in protein–DNA complexes and also explore flexibility change driven by complex formation. 16,17 However, neither of these studies focused specifically on TRF1–DNA recognition and in fact no single model that fully describes recognition is available. Here, we assess the contribution of dynamic interaction to TRF1–DNA recognition, using an extensive set of MD simulations to explore the conformational space. Based on analysis of the resulting trajectories in conjunction with previous characterisations, we propose a reconciliatory model of TRF1–telomere recognition. The model builds on many findings from the crystal structure, mutagenesis data, and NMR structure, also adding some important, completely new interactions. This model is the most complete characterisation to date of the TRF1–DNA complex and includes both hydrophobic and water-mediated interactions. Specifically, this model explains why DNA–protein recognition is compromised by a transversion in the DNA sequence at position 6 (i.e., from TTAGGGTTA to TTAGGCTTA), and why it is much less sensitive to transversion at position 8 (i.e., from TTAGGGTTA to TTAGGGTAA).
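The two sequence changes named above can be written out explicitly. The following is a minimal sketch, assuming that each mutation swaps the base at the given 1-based position for its Watson-Crick partner (G to C, T to A), which is what the two quoted sequences show. The function name `transvert` and the lookup table are hypothetical helpers introduced only for illustration.

```python
# Complementary-base transversions (purine <-> pyrimidine), as in G6C and T8A.
# The table and function name are illustrative, not taken from the paper.
TRANSVERSION = {"G": "C", "C": "G", "A": "T", "T": "A"}


def transvert(seq: str, position: int) -> str:
    """Return seq with the base at the 1-based position replaced by its
    Watson-Crick partner (one of the two possible transversions)."""
    if not 1 <= position <= len(seq):
        raise ValueError("position is outside the sequence")
    i = position - 1
    return seq[:i] + TRANSVERSION[seq[i]] + seq[i + 1:]


WILD_TYPE = "TTAGGGTTA"          # binding motif quoted in the text
G6C = transvert(WILD_TYPE, 6)    # transversion at position 6
T8A = transvert(WILD_TYPE, 8)    # transversion at position 8
print(G6C, T8A)
```

Numbering the motif from 1 keeps the code aligned with the position labels (G6, T8) used throughout the text.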
Results
Calculation of root-mean-square deviations, principal component analysis, and clustering confirm equilibration and adequate sampling of conformational space by the MD simulations
As explained in Methods, our approach yields six independent replicate simulations of the protein–DNA interface for each DNA sequence modelled (WT, G6C, and T8A). Calculating the root-mean-square deviations (RMSDs) for each replicate simulation using protein alpha carbons and DNA phosphates (with the first MD snapshot as the reference frame) provides a preliminary evaluation of simulation quality. For the wild-type system, Fig. S1 shows that site A of replicate 1 takes 25–30 ns to properly equilibrate, but the remainder are equilibrated by around 3 ns. Minimal fluctuation is observed across the trajectories, indicating stable simulations. RMSD is an inherently crude metric, and thus, to provide a more detailed analysis of the way the molecules explored their conformational space, we used principal component analysis (PCA) to isolate the first two principal motions of each protein–DNA complex. Figure 1 shows the extent of this exploration for each system. Good equilibration and reproducibility are observed for four of the six binding-site
replicates. Binding site A in both replicates 1 and 2 (broken lines) shows a large conformational spread because TRF1 partially dissociates from the DNA over the course of the simulation. These replicates were removed before further analysis, on the grounds that this behaviour is likely to be an artefact of simulating the interaction of the protein with a short DNA oligomer rather than with an extended length of telomeric DNA. The remaining trajectories were combined and clustered to identify the most populated conformations. Figure 2a–e shows representative structures from the most significantly populated clusters (a = 33.3%, b = 19.2%, c = 10.5%, d = 9.8%, e = 9.1%). In fact, there is little variation between these structures, with a maximum RMSD of 1.79 Å between any two. Some difference in the N-terminal arm position is observed and is discussed later. Taken together, RMSD, PCA, and clustering confirm the reliability and stability of the selected MD calculations, providing an appropriate level of confidence in proposing a more complete model of TRF1–DNA recognition. The equivalent analysis of the G6C and T8A mutant simulations is included in the Supplementary Material (Figs. S2–S5).
The MD model identifies new contacts and describes a highly dynamic interface
The MD data were used to construct a map describing TRF1–telomeric DNA recognition (Fig. 3). The map details hydrogen bonds with nitrogenous bases, contacts with the sugar phosphate backbone, water-mediated interactions, and hydrophobic patches. Figures 4 and 5 show analogous maps generated in the same way from the crystal structure and NMR data. MD predictions are in much better agreement with the NMR model than the crystal structure. This is probably because both NMR and MD explicitly describe conformational variability.
The agreement between the simulations and the NMR data is evident from the observation that every conformer is sampled by MD to within an RMSD of 1.4–1.6 Å (Table S1, Supplementary Material). There is no evidence that there is a correspondence between the deposited NMR structures and the centroids of the clusters identified above (Table S2, Supplementary Material); however, this could simply reflect a difference in the criteria used to select the NMR structures for deposition. Good sampling is further confirmed by the fact that though the MD simulations were initiated from crystal structure coordinates, many “NMR only” interactions are reproduced, notably the important Lys43–A3 and Asp44–C6 contacts. These were deemed critical for the interpretation of mutagenesis experiments but not identified in the crystal structure. Completely new interactions determined by
Fig. 1. PCA of all six binding sites for the wild-type TRF1–DNA complex. Four out of six replicates (solid lines) are closely clustered, indicating well-reproduced MD. Site A of replicates 1 and 2 (broken lines) spread outside of the cluster because DNA binding is partially lost; data from these replicates were discarded.
MD further reinforce mutagenesis results, particularly the interaction between Arg47 and G6, which suggests why binding activity is lost upon G6 mutation. The MD model identifies many new contacts, which cumulatively show that every base pair in the TTAGGGTTA binding motif is involved in direct (base-specific) readout. Added to this, the Thr48–A7 interaction extends the major-groove binding motif by one more base pair than previously determined. Altogether, this means a significantly greater incidence of both direct and indirect readout interactions (Table 1). In addition to new contacts, the MD model augments its predecessors by confirming and increasing the extent of dynamic interaction at the interface. NMR suggests three oscillatory dinucleotide contacts, through which a single amino acid can interact alternately with both base pairs in a base pair step. MD confirms these and identifies an additional oscillatory contact of Arg47 with G5 and G6. This is particularly interesting as the newly found interaction between these bases is implicated in correctly orienting TRF1 binding (see the next section). The MD map also reveals that though, as
the crystal structure shows, the N-terminal Arg2 can make contacts simultaneously with three bases, in reality, it is likely to be a highly dynamic interaction. Approximately half of the total hydrogen bonds identified exploit the sugar phosphate backbone, suggesting that recognition and binding affinity are dependent on DNA helix morphology. Helical parameter changes such as untwisting, bending, and kinking will disrupt the network of contacts with ramifications for the binding affinity. 18 Indirect readout can however still contribute to the sequence specificity of protein–DNA recognition because DNA helical morphology is sequence directed as a consequence of base stacking interaction differences. 19 In terms of binding-site hydration, this model is in agreement with very recent MD studies that have focused on the microscopic dynamic properties of water surrounding the TRF1–DNA complex. 16,17 Specific, persistent sites of solvent density shown in Fig. 6a–f confirm the rigid water layer described in these studies. Red spheres represent 10 crystallographic waters identified at the interface. MD agrees remarkably well, confirming these and predicting an additional water-bridged contact between three residues
Fig. 2. Structures (a–e) representing (in descending order) the most highly populated conformations from cluster analysis. For clarity, the DNA bases are colour coded in accordance with contact map colouring: adenine, yellow; guanine, green; cytosine, blue; and thymine, purple.
Fig. 3. Map describing every interaction between TRF1 and telomeric DNA as determined by MD. Direct readout (red line), indirect readout (black dotted line), hydrophobic patches (black broken line), and water-mediated contacts (blue line) are all delineated and show many more interactions than either the crystal or NMR structures (Table 1). Direct and indirect contacts are annotated with percentage occupancy and standard deviation.
and G6. Interestingly, though the sites of solvent density are clear and sharp, an analysis of residence times extracted from the MD data (Table 2) shows that, in all cases, waters exchange rapidly at an average rate of 21.9 different molecules per nanosecond, a value in line with other MD studies on biomolecular hydration. 17,20,21
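The exchange rate and residence times discussed here can be illustrated with a minimal sketch. It assumes, hypothetically, that each hydration site has already been reduced to one water residue ID per saved frame (`None` when the site is empty) and that a residence time is the length of an uninterrupted run of the same water. The function name `residence_stats`, the toy data, and this run-length definition are illustrative assumptions and not necessarily the analysis performed for this work; only the 2-ps frame spacing is taken from Methods.

```python
from itertools import groupby

FRAME_PS = 2.0  # snapshot spacing stated in Methods


def residence_stats(occupant_ids, frame_ps=FRAME_PS):
    """Summarise water exchange at one hydration site.

    occupant_ids holds one water residue ID per frame, or None if the
    site is empty in that frame (a hypothetical pre-processed input).
    """
    # Collapse the series into runs of consecutive identical occupants.
    runs = [(wid, sum(1 for _ in grp)) for wid, grp in groupby(occupant_ids)]
    times = [n * frame_ps for wid, n in runs if wid is not None]
    if not times:
        raise ValueError("the site is never occupied")
    total_ns = len(occupant_ids) * frame_ps / 1000.0
    unique = len({wid for wid in occupant_ids if wid is not None})
    return {
        "unique_per_ns": unique / total_ns,
        "max_residence_ps": max(times),
        "mean_residence_ps": sum(times) / len(times),
    }


# Toy series: 500 frames (1 ns) visited by three different waters.
toy = [101] * 20 + [None] * 5 + [202] * 30 + [101] * 10 + [303] * 435
stats = residence_stats(toy)
print(stats)
```

Counting a returning water (ID 101 above) once for the exchange rate but twice for residence times is a design choice of this sketch; other conventions, such as tolerating brief excursions, would give longer residence times.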
Court et al. 14 in discussing the crystal structure did not specifically identify hydrophobic group interactions; however, in our analysis of this structure, we did find a single instance of close proximity between hydrophobic moieties in the DNA and protein, and six in the case of the NMR models. In this case, our MD supports the crystallography in identifying only one
Fig. 4. Map describing the interactions between TRF1 and telomeric DNA as determined by crystallography (Protein Data Bank entry 1W0T 14). Direct readout (red line), indirect readout (black dotted line), hydrophobic patches (black broken line), and water-mediated contacts (blue line) are delineated, showing significantly less interaction than MD (Table 1). No occupancy information can be determined from the single conformation.
good hydrophobic contact, with the caveat that TIP3P is a simple water model and not thoroughly validated in reproducing protein hydrophobicity. Time-dependent measurements indicate that most instances of proximity between TRF1 and thymine methyl groups are transient. The relatively few NMR conformations appeared to capture these transient states.
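The distinction between a persistent and a transient hydrophobic contact can be sketched as follows. The example assumes a hypothetical per-frame distance series between a thymine methyl carbon and a non-polar side-chain carbon, and applies the 5 Å cutoff for non-polar contacts given in Methods. The function name `contact_occupancy` and the toy distances are illustrative only.

```python
CUTOFF_A = 5.0  # non-polar contact cutoff (Å) given in Methods


def contact_occupancy(distances, cutoff=CUTOFF_A):
    """Return (percentage of frames in contact, longest unbroken run of frames)."""
    in_contact = [d < cutoff for d in distances]
    occupancy = 100.0 * sum(in_contact) / len(in_contact)
    longest = current = 0
    for flag in in_contact:
        # Extend the current run while in contact, otherwise reset it.
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return occupancy, longest


# Toy distances in Å; not simulation output.
toy = [4.2, 4.8, 6.1, 7.3, 4.9, 6.5, 8.0, 7.7, 4.4, 4.6]
occ, longest = contact_occupancy(toy)
print(occ, longest)
```

A contact that is present in half of the frames but never for more than a couple of consecutive frames, as in this toy series, would be a transient one in the sense used above.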
Newly identified sequence-specific contacts with the G6/C6 base pair are vital to DNA backbone clamping
While the newly identified contacts between TRF1 and G6/C6 in the wild-type sequence make it much easier to explain why transversion at this site
Fig. 5. Map describing the interactions between TRF1 and telomeric DNA as determined by solution NMR (Protein Data Bank entry 1IV6 15). Direct readout (red line), indirect readout (black dotted line), and hydrophobic patches (black broken line) are delineated. Direct and indirect contacts are annotated with their percentage occupancy and standard deviation using the 20 conformations published. With the exception of hydrophobic patches, less interaction is captured compared with MD (Table 1) and no water-mediated interaction could be analysed.
markedly reduces interaction, 15 confirmation of this requires analysis of the mutant system (G6C) to ensure that no alternative, compensatory interactions are possible. This is indeed the situation.
Table 1 and Fig. S6 show that though the interaction of Arg47 with the guanine base is apparently able to “follow” its transversion, the interaction of Asp44 with the cytosine counterpart cannot. In addition,
Table 1. Comparing the number of contacts for each model including crystal structure and solution NMR together with MD wild-type and mutated DNA models

Model             | H-bonds to bases | H-bonds to backbone | Hydrophobic contacts | Water-mediated contacts
Crystal structure | 6                | 6                   | 1                    | 10
NMR               | 7                | 8                   | 6                    | NM
MD (wild type)    | 10               | 12                  | 1                    | 11
MD (G6C mutant)   | 7                | 7                   | 1                    | 7
MD (T8A mutant)   | 10               | 12                  | 0                    | 8
NM, not measured. MD captures a much broader set of hydrogen bonds than the other methods while indicating less involvement of the hydrophobic effect. The loss of binding engendered by G6-to-cytosine mutation is reflected in a reduced interface, while the ineffectual T8-to-adenine mutation shows little reduction.
many contacts to neighbouring, non-mutated base pairs in the DNA are also lost. Figure 7 shows the effect of this mutation on major-groove width, measured using Curves+. Characteristic DNA backbone clamping around the insertion helix is greatly reduced, explaining the loss of non-mutated contacts and the dramatic loss of binding activity reported from mutagenesis. DNA backbone narrowing is a classic feature of helix–turn–helix (HTH)–DNA recognition, 22 and its impairment by mutation provides an explanation for the severe binding reduction. Importantly, there is little evidence that changes in the intrinsic structure or elastic properties of the local DNA, brought about by the change in sequence, contribute much to this selective recognition. Thus, comparison of both mutated and wild-type free-DNA helical parameters indicates that the G6C mutation does little to change DNA morphology in the unbound state or, as evidenced by the error bars for major-groove width and helical twist, relevant elastic constants (Fig. 8). It seems therefore that recognition is “direct”: the change in TRF1 orientation engendered by Arg47 relocation causes the misalignment of other critical contacts.
N-terminal promiscuity explains tolerance to mutation at positions 8 and 9
Binding to the telomeric minor groove by TRF1 occurs via the homeodomain N-terminal arm. While the significance of this arm in binding affinity is well established, 6,23 the argument that it contributes a high level of specificity is disputed here. Biophysical work has shown that mutation of T8 and A9 engenders no significant loss of binding activity. 15 Our simulations of the T8A model show that when T8/A8 undergo transversion, the long flexible N-terminal arm is able to simply relocate with the same affinity to different thymine or adenine bases in the minor groove (S7, Supplementary Material). This promiscuity is also evident by comparison of the MD (Figs. 2 and 3), NMR (Fig.
5), and crystal models (Fig. 4), which show significant variation between models in terms of bases contacted by N-terminal Arg2. While the arm undoubtedly contributes to
binding affinity, we predict that its only specific discriminatory function will be between T + A and G + C. Functionally, by retaining some protein flexibility on binding, TRF1 may mitigate the entropic penalty normally imposed by immobilisation at the DNA interface. 24 Arm flexibility also means that any conformational adjustment in this region to optimise interactions with the local DNA sequence does not influence critical aspects of the corresponding HTH domain orientation or conformation. Table 1 shows that mutation of the minor groove (T8A) and subsequent N-terminal relocation have no effect on the overall number of hydrogen bonds involved in total recognition.
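The hydrogen-bond totals behind this comparison can be tallied directly from Table 1. The sketch below assumes that each flattened row of numbers in the table corresponds to one contact type listed across the five models in header order; the dictionary layout and the helper name `hbond_summary` are illustrative choices.

```python
# Hydrogen-bond counts transcribed from Table 1 (bases, backbone).
TABLE_1 = {
    "Crystal structure": {"bases": 6, "backbone": 6},
    "NMR": {"bases": 7, "backbone": 8},
    "MD (wild type)": {"bases": 10, "backbone": 12},
    "MD (G6C mutant)": {"bases": 7, "backbone": 7},
    "MD (T8A mutant)": {"bases": 10, "backbone": 12},
}


def hbond_summary(counts):
    """Return (total hydrogen bonds, fraction made to the backbone)."""
    total = counts["bases"] + counts["backbone"]
    return total, counts["backbone"] / total


summary = {model: hbond_summary(c) for model, c in TABLE_1.items()}
for model, (total, frac) in summary.items():
    print(f"{model:18s} total={total:2d} backbone fraction={frac:.2f}")
```

On these numbers the wild-type and T8A models both total 22 hydrogen bonds, whereas G6C drops to 14, and backbone contacts account for 12 of the 22 wild-type bonds.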
Discussion
We have used an extensive set of MD simulations to explore in detail the conformational space of telomeric DNA in complex with TRF1. A comprehensive picture of protein–DNA recognition has been produced, which includes both water-mediated interactions and hydrophobic group involvement in addition to direct protein–DNA hydrogen bonding. The simulations emphasise how individual amino acids can contribute to the specific recognition of more than one DNA base, not only in a static manner, through bifurcated interactions, but also in a dynamic manner, by oscillatory interactions. Analysis of hydration density maps, integrated over 200 ns of MD, identifies regions of intransient solvent mediation. These regions persist across a range of bound state conformations and therefore provide a different perspective to the X-ray picture of individual molecules of hydration, stabilised at a single crystallographic conformation. In view of this, it is quite remarkable that the MD predicts a pattern of water-mediated contacts that matches the crystallographic data very well; just one extra high-occupancy water is predicted, interacting with G6. Mutation of the G6/C6 base pair by transversion causes loss of the characteristic width reduction in the DNA major groove that optimises DNA–protein interactions and highlights the importance of the (MD
Fig. 6. Map showing hydration density at the TRF1–DNA interface from six different viewpoints (a–f). Areas of intransient solvent density as determined by MD (grey mesh) are largely consistent with the crystallographic waters (red spheres). The N-terminal arm is colour coded orange; helices 2 and 3 are purple and green, respectively. On the DNA: adenine is yellow; thymine is purple; guanine is green; cytosine is blue. These correspond to the contact map colour coding.
Table 2. Dynamic characteristics of water-mediated protein–DNA interactions

Site | Between residues                | Unique WAT/ns | Maximum residence time (ps) | Average residence time (ps)
1    | Ser39 and T1                    | 18.6          | 680                         | 57
2    | Ser39 and T2                    | 22.6          | 387                         | 45
3    | Ser39 and A3                    | 22.1          | 441                         | 45
4    | Lys43 and A3                    | 20.8          | 336                         | 48
5    | Arg47 and Asp44 (A7)            | 19.3          | 585                         | 52
6    | Arg47 and Trp46 and Asp44 (G6)  | 24.0          | 304                         | 42
7    | A7 (Arg47 and Asp44)            | 25.9          | 337                         | 39
8    | Met41 and C5, C6                | 21.3          | 519                         | 47
9    | G6 (Arg47 and Trp46 and Asp44)  | 26.1          | 362                         | 38
10   | Asp44 (C4 and C5)               | 20.0          | 539                         | 50
11   | C4 and C5 (Asp44)               | 20.7          | 411                         | 48
Sites are numbered as in Fig. 3. Hydration sites shown in bold were only observed in the MD simulations, all others correlate with crystallographically observed sites.
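The average exchange rate quoted in the text can be checked against the per-site values of Table 2. This is a minimal sketch that assumes the three flattened rows of numbers list sites 1 to 11 in order; the variable names are illustrative.

```python
from statistics import mean

# Per-site values transcribed from Table 2, sites 1 to 11 in order.
UNIQUE_WAT_PER_NS = [18.6, 22.6, 22.1, 20.8, 19.3, 24.0, 25.9, 21.3, 26.1, 20.0, 20.7]
MAX_RESIDENCE_PS = [680, 387, 441, 336, 585, 304, 337, 519, 362, 539, 411]
AVG_RESIDENCE_PS = [57, 45, 45, 48, 52, 42, 39, 47, 38, 50, 48]

mean_exchange = mean(UNIQUE_WAT_PER_NS)            # waters per ns, averaged over sites
longest_stay = max(MAX_RESIDENCE_PS)               # ps, longest single residence
# 1-based index of the site with the fastest water turnover.
fastest_site = UNIQUE_WAT_PER_NS.index(max(UNIQUE_WAT_PER_NS)) + 1

print(f"mean exchange rate: {mean_exchange:.1f} waters/ns")
print(f"longest residence: {longest_stay} ps; fastest-exchanging site: {fastest_site}")
```

The mean over the 11 sites reproduces the 21.9 waters per nanosecond given in the text, and the site with the fastest turnover is also the one with the shortest average residence time.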
identified) Arg47–G5/G6 dinucleotide interaction. The widespread loss of other hydrogen bonds as a consequence of this loss is consistent with experimental findings 15 that, for this mutation, report a severe reduction in binding affinity. What we cannot yet fully quantify is whether the failure to achieve a narrow major groove is an intrinsic property of the mutated DNA sequence (“the groove fails to narrow, so good contacts cannot be formed”) or of the recognition requirements of the protein–DNA interaction (“good contacts cannot be formed, so the groove fails to narrow”). Experimental observations indicating that the interaction of the homeodomain N terminus with the telomeric DNA is actually rather non-sequence-specific 15 are confirmed by our model. We show that N-terminal arm flexibility allows it to effectively relocate to more favourable bases in the event of minor-groove transversion. This promiscuity does
not affect the rest of the binding-site orientation as damping by the flexible arm ensures that the HTH hydrogen-bonding network remains intact. More than half of the identified hydrogen bonds are contacts to the sugar phosphate backbone. Recognition of DNA by TRF1 and subsequent binding affinity are therefore likely to be sensitive to DNA morphological changes that disrupt this network. 18 This is in line with experimental observations that TRF1 avoids telomeric T- and D-loop regions. The predominant localisation of TRF2 over TRF1 at these structures may indicate a different binding mode for TRF2 and work is ongoing to establish this. The ability of MD simulations to couple the atomic detail of crystallographic studies with the dynamic insights from NMR, and through doing so to reconcile apparently conflicting aspects of a molecular recognition problem, reinforces the view that a combination of experimental and theoretical techniques is preferred
Fig. 7. DNA major-groove width using the trajectories of TRF1-bound wild type and G6C DNA compared with free telomeric DNA. Sugar phosphate backbone narrowing around the HTH insertion helix is an important feature of binding, which is lost by mutation of G6 to a cytosine.
Fig. 8. Comparison of helical parameters: twist, major-groove width, minor-groove width, roll, and rise for wild type and G6C mutant DNA, both complexed and uncomplexed. Bars showing standard deviation for twist and major-groove width indicate that G6C mutation does little to alter flexibility.
—indeed maybe necessary—to accurately characterise biological interactions.
Methods
MD calculations were performed on the TRF1–TAGGGTTA complex using Protein Data Bank file 1W0T 14 as a starting point. The crystal structure is composed of a 19-nucleotide section of DNA with two TRF1 proteins bound to adjacent binding sites. Models were constructed for both wild-type DNA and two mutations: G6C and T8A (G6C: TTAGGG → TTAGGC, T8A: TTAGGG → TAAGGG). Mutation choices were based on apparent discrepancies between the crystal structure and mutagenesis work: 15 G6C resulted in dramatic loss of binding activity and T8A at T8 reportedly had no effect. In both cases, the opposite effect was predicted by the crystal structure while NMR did not adequately explain such a severe drop-off in affinity. The models were prepared for MD simulation using the WHATIF
web interface 25 to build in any missing atoms and identify protonation states. They were then explicitly solvated (approximately 11,500 waters per model) in a 10-nm³ box of TIP3P water using TLEAP in AMBER 10. 26 Sodium counterions were added for overall charge neutrality, and periodic boundary conditions were applied with the following box dimensions: 63 nm × 66 nm × 71 nm. Bonds to hydrogen were constrained using SHAKE 27 to permit a 2-fs time step, and the particle mesh Ewald 28 algorithm was used to treat long-range electrostatic interactions. The non-bonded cutoff was set at 12.0 Å. Systems were energy minimised using a combination of steepest descent and conjugate gradient methods. MD calculations were carried out with the PMEMD module of AMBER 10 in conjunction with the FF99 Barcelona force field, 29 which is specifically customised for nucleic acids. The FF99 Stony Brook force field 30 was used for the protein. Each system was equilibrated and heated over 100 ps to 300 K and positional restraints were gradually removed. A Berendsen thermostat and barostat were used throughout for both temperature and pressure
regulation. 31 Production MD runs of 3 × 50 ns replicates for each system were obtained from randomised starting velocities. Since each simulation features two independent copies of the protein in complex with its cognate (or mutated) DNA sequence (which we refer to as site A and site B), in this way, a total of 300 ns of conformational space exploration was obtained for each system. Exact protocols for the minimisations, equilibrations, heating, and production runs are delineated in the Supplementary Material. During calculations, a snapshot was saved every 2 ps. RMSDs (Fig. 1a) and PCA (Fig. 1b) were used to assess the equilibration and reproducibility of each replicate by employing an in-house code: PCAZIP. 32 Contour map outlines were generated from two-dimensional histograms of the first two PCA projections using matplotlib. 33 Subdivided into a 50 × 50 grid, the outline is drawn around all grid bins that have a data point occupancy ≥0.025% of the total points. PCA revealed instability in two of the six binding-site replicates and these were discarded before further analysis. RMS clustering of the trajectory frames was carried out using the MMTSB toolset 34 with the radius set to 2 Å and maxerr set to 1. Bonds were identified and quantified using PTRAJ in AMBER together with VMD. 35 Hydrogen bond length cutoff was set at 3.5 Å including water-mediated contacts, and groups participating in hydrophobic effects were defined as non-polar, non-bonded contacts < 5 Å. DNA helical parameters were derived using Curves+. 36 PTRAJ was also used to map hydration density using the “Grid” function and the protocol for this is described in the Supplementary Material. Structural alignments were performed and RMSDs were calculated using PyMOL 37 from selected structures representing the most highly populated clusters.
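The PCA projection and occupancy-outline step described above can be sketched with NumPy. This is a stand-in under stated assumptions, not the PCAZIP code used in this work: it assumes the trajectory is already aligned and flattened to one row of coordinates per frame, obtains the principal axes by singular value decomposition, and marks the bins of a 50 × 50 histogram whose occupancy is at least 0.025% of all points. The function names and the random toy data are hypothetical.

```python
import numpy as np


def first_two_projections(coords):
    """Project frames onto the first two principal components.

    coords is an (n_frames, n_features) array of aligned coordinates.
    """
    centred = coords - coords.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:2].T


def occupancy_mask(proj, bins=50, min_percent=0.025):
    """Return a boolean grid of bins holding >= min_percent of all points."""
    counts, xedges, yedges = np.histogram2d(proj[:, 0], proj[:, 1], bins=bins)
    percent = 100.0 * counts / counts.sum()
    return percent >= min_percent, xedges, yedges


# Toy data standing in for a trajectory: 5000 frames, 12 coordinates.
rng = np.random.default_rng(0)
toy = rng.normal(size=(5000, 12)) * np.linspace(3.0, 0.5, 12)
proj = first_two_projections(toy)
mask, xedges, yedges = occupancy_mask(proj)
print(proj.shape, mask.shape, int(mask.sum()))
```

An outline like those in Fig. 1 could then be drawn around the `True` region of `mask`, for example with matplotlib's contour plotting; with 5000 points the threshold corresponds to bins containing at least two frames.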
Acknowledgements
This work was supported by the University of Nottingham and Regain, the charity for sports tetraplegics. Computational facilities were supported by the Biotechnology and Biological Sciences Research Council (Grant reference BB/F011407/1).
Supplementary Data
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jmb.2013.05.005
Received 23 February 2013; Received in revised form 10 May 2013; Accepted 13 May 2013
Available online 20 May 2013
Keywords: telomeric; simulation; dynamics; interaction; map
Abbreviations used: MD, molecular dynamics; PCA, principal component analysis; HTH, helix–turn–helix.
References
1. Bourgain, F. M. & Katinka, M. D. (1991). Telomeres inhibit end-to-end fusion and enhance maintenance of linear DNA-molecules injected into the Paramecium primaurelia macronucleus. Nucleic Acids Res. 19, 1541–1547.
2. Levy, M. Z., Allsopp, R. C., Futcher, A. B., Greider, C. W. & Harley, C. B. (1992). Telomere end-replication problem and cell aging. J. Mol. Biol. 225, 951–960.
3. Willeit, P., Willeit, J., Kloss-Brandstatter, A., Kronenberg, F. & Kiechl, S. (2011). Fifteen-year follow-up of association between telomere length and incident cancer and cancer mortality. JAMA, 306, 42–44.
4. Perona, R. (2010). The Nobel Prize in Physiology or Medicine 2009 “for telomere biology” and its relevance to cancer and related diseases. Clin. Transl. Oncol. 12, 647–649.
5. Heaphy, C. M. & Meeker, A. K. (2011). The potential utility of telomere-related markers for cancer diagnosis. J. Cell. Mol. Med. 15, 1227–1238.
6. Konig, P., Fairall, L. & Rhodes, D. (1998). Sequence-specific DNA recognition by the Myb-like domain of the human telomere binding protein TRF1: a model for the protein–DNA complex. Nucleic Acids Res. 26, 1731–1740.
7. Zhong, Z., Shiue, L., Kaplan, S. & de Lange, T. (1992). A mammalian factor that binds telomeric TTAGGG repeats in vitro. Mol. Cell. Biol. 12, 4834–4843.
8. Bianchi, A., Stansel, R. M., Fairall, L., Griffith, J. D., Rhodes, D. & de Lange, T. (1999). TRF1 binds a bipartite telomeric site with extreme spatial flexibility. EMBO J. 18, 5735–5744.
9. Griffith, J., Bianchi, A. & de Lange, T. (1998). TRF1 promotes parallel pairing of telomeric tracts in vitro. J. Mol. Biol. 278, 79–88.
10. Bianchi, A., Smith, S., Chong, L., Elias, P. & de Lange, T. (1997). TRF1 is a dimer and bends telomeric DNA. EMBO J. 16, 1785–1794.
11. Rhodes, D., Fairall, L., Simonsson, T., Court, R. & Chapman, L. (2002). Telomere architecture. EMBO Rep. 3, 1139–1145.
12. van Steensel, B. & de Lange, T. (1997). Control of telomere length by the human telomeric protein TRF1. Nature, 385, 740–743.
13. O'Connor, M. S., Safari, A., Xin, H. W., Liu, D. & Songyang, Z. (2006). A critical role for TPP1 and TIN2 interaction in high-order telomeric complex assembly. Proc. Natl Acad. Sci. USA, 103, 11874–11879.
14. Court, R., Chapman, L., Fairall, L. & Rhodes, D. (2005). How the human telomeric proteins TRF1 and TRF2 recognize telomeric DNA: a view from high-resolution crystal structures. EMBO Rep. 6, 191.
15. Nishikawa, T., Okamura, H., Nagadoi, A., Konig, P., Rhodes, D. & Nishimura, Y. (2001). Solution structure of a telomeric DNA complex of human TRF1. Structure, 9, 1237–1251.
16. Sinha, S. K. & Bandyopadhyay, S. (2011). Conformational fluctuations of a protein–DNA complex and the structure and ordering of water around it. J. Chem. Phys. 24, 245104.
17. Sinha, S. K. & Bandyopadhyay, S. (2011). Dynamic properties of water around a protein–DNA complex from molecular dynamics simulations. J. Chem. Phys. 13, 135101.
18. Lawson, C. L. & Berman, H. M. (2008). Indirect Readout of DNA Sequence by Proteins. Royal Soc Chemistry, Cambridge, UK.
19. Hunter, C. A. (1993). Sequence-dependent DNA structure: the role of base stacking interactions. J. Mol. Biol. 230, 1025–1054.
20. Castrignano, G. C., Chillemi, G. & Desideri, A. (2000). Structure and hydration of BamHI DNA recognition site: a molecular dynamics investigation. Biophys. J. 79, 1263–1272.
21. Gorfe, A. A., Caflisch, A. & Jelesarov, I. (2004). The role of flexibility and hydration on the sequence-specific DNA recognition by the Tn91 integrase protein: a molecular dynamics analysis. J. Mol. Recognit. 17, 120–131.
22. Blackburn, G. M., Gait, M. J., Loakes, D. & Williams, D. M. (2006). Protein–Nucleic Acid Interactions. Royal Soc Chemistry, Cambridge, UK.
23. Ades, S. E. & Sauer, R. T. (1995). Specificity of minor-groove and major-groove interactions in a homeodomain–DNA complex. Biochemistry, 34, 14601–14608.
24. Tóth-Petróczy, A., Simon, I., Fuxreiter, M. & Levy, Y. (2009). Disordered tails of homeodomains facilitate DNA recognition by providing a trade-off between folding and specific binding. J. Am. Chem. Soc. 42, 15084–15085.
25. Vriend, G. (1990). WHAT IF: a molecular modeling and drug design program. J. Mol. Graphics, 8, 52.
26. Case, D. A., Cheatham, T. E., Darden, T., Gohlke, H., Luo, R., Merz, K. M. et al. (2005). The Amber biomolecular simulation programs. J. Comput. Chem. 26, 1668–1688.
27. Ryckaert, J. P., Ciccotti, G. & Berendsen, H. J. C. (1977). The SHAKE algorithm. J. Comput. Phys. 23, 327–341.
28. Toukmaji, A., Sagui, C., Board, J. & Darden, T. (2000). Efficient particle-mesh Ewald based approach to fixed and induced dipolar interactions. J. Chem. Phys. 113, 1091–1093.
29. Perez, A., Marchan, I., Svozil, D., Sponer, J., Cheatham, T. E., Laughton, C. A. & Orozco, M. (2007). Refinement of the AMBER force field for nucleic acids: improving the description of alpha/gamma conformers. Biophys. J. 92, 3817–3829.
30. Hornak, V., Abel, R., Okur, A., Strockbine, B., Roitberg, A. & Simmerling, C. (2006). Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins, 65, 712–725.
31. Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., DiNola, A. & Haak, J. R. (1984). Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690.
32. Meyer, T., Ferrer-Costa, C., Perez, A., Rueda, M., Bidon-Chanal, A., Luque, F. J. et al. (2006). Essential dynamics: a tool for efficient trajectory compression and management. J. Chem. Theor. Comput. 2, 251–258.
33. Barrett, P., Hunter, J. D. & Greenfield, P. (2004). Matplotlib: Portable Python Plotting Package. Astronomical Data Analysis Software & Systems XIV.
34. Feig, M., Karanicolas, J. & Brooks, C. L., III. (2001). MMTSB Tool Set. MMTSB NIH Research Resource, The Scripps Research Institute.
35. Humphrey, W., Dalke, A. & Schulten, K. (1996). VMD: visual molecular dynamics. J. Mol. Graphics, 14, 33.
36. Lavery, R., Moakher, M., Maddocks, J. H., Petkeviciute, D. & Zakrzewska, K. (2009). Conformational analysis of nucleic acids revisited: curves. Nucleic Acids Res. 37, 5917–5929.
37. Schrödinger, L. The PyMOL Molecular Graphics System, Version 1.2r3pre.