BIOINFORMATICS
DIAMOD: display and modeling of DNA bending
M.Dlakic and R.E.Harrington
Abstract
Motivation: DIAMOD (Display and modeling of DNA) was created as user-friendly software for exploring and better understanding DNA structural variations, particularly DNA bending. It was intended to be as open as possible so that any of the existing or future predictive models can be used with it.
Results: DIAMOD features graphic display and interactive manipulation of DNA molecules on the screen. Since it works with di-, tri- or tetranucleotide models supplied as external files of angular parameters, it was recently used to evaluate critically all available predictive models for DNA bending. The program has a unique option to insert bends at defined positions in a DNA sequence independently of the currently used model, which enables the simulation of both intrinsic and protein-induced kinking. Finally, many output file formats facilitate the sharing of data with other programs and the creation of visually pleasing images.
Availability: The program is available on request to academic users free of charge. It will be distributed via the WWW (http://www-personal.umich.edu/~mensur/software.html). Users with no network access can get a copy directly from the author.
Contact:
[email protected]
1Present address for correspondence: Howard Hughes Medical Institute, 4570 MSRB II, 1150 W. Medical Center Drive, Ann Arbor, MI 48109-0650, USA
Introduction
Sequence-dependent structural variations of DNA have been demonstrated using various experimental techniques over the past couple of decades (Travers, 1989; Crothers et al., 1990; Hagerman, 1990; Palecek, 1991; Dickerson, 1992; Harrington and Winicov, 1994; Lilley et al., 1996). Although the smaller number of DNA building blocks and relatively simple secondary structure imply a more constricted range of conformations compared with proteins, subtle variations in DNA microstructure add up to produce molecules of complex shapes and with distinctive roles in transcriptional regulation. The configuration of DNA is a vectorial sum of
small helical deflections (bends) associated with individual base steps. One of the most important roles of intrinsic DNA bending, besides architectural, is the ability to form custom sites for protein binding, thus lowering the thermodynamic requirements for DNA deformation and increasing the chance for recognition by proteins (Drew and Travers, 1985; Erie et al., 1994; Parvin et al., 1995; Starr et al., 1995). For that reason, knowledge of intrinsic DNA structural properties would ultimately help elucidate the code of DNA recognition by proteins (Nekludova and Pabo, 1994; Juo et al., 1996). Despite the significant research and theoretical efforts, the matter of the origin of DNA bending is not completely resolved at present (Dickerson et al., 1996; Dlakic and Harrington, 1996; Dlakic et al., 1996). Although this question remains the ultimate goal of DNA structural biologists, it has been recognized for quite some time that even seemingly opposite models can reach similar conclusions if certain structural assumptions are satisfied (Haran et al., 1994; Harvey et al., 1995; Dlakic et al., 1996). Several groups have developed their own predictive models based on the crystallographic or NMR data, by deduction from theoretical molecular mechanics and gel mobility studies, or from the analysis of DNase I cutting profiles (Satchwell et al., 1986; De Santis et al., 1990; Bolshoy et al., 1991; Goodsell and Dickerson, 1994; Bansal et al., 1995; Brukner et al., 1995; Gorin et al., 1995; Ulyanov and James, 1995). Our intention in developing DIAMOD (Display and modeling of DNA) was to incorporate all these models and use them for any given DNA sequence to aid comparison with experimental data. We accomplished this by using external ASCII files that contain angular parameters from various models. In addition, the program can generate user-defined bends on top of used structural models. 
We initially implemented this option to study intrinsic and protein-induced kinks, but it later proved to be useful in reconstructing DNA structures for which angular parameters associated with each dinucleotide are known. DIAMOD is user friendly, with many help screens, and runs on IBM PC or compatible computers with any graphic card, thus eliminating the need for expensive equipment. It features an interactive graphic display where molecules can be rotated about all three axes, zoomed, shown in stereo and viewed either in perspective or orthographic projection. Another practical value of this program is that it produces many different output files that can be viewed and manipulated under various operating systems.
Oxford University Press
M.Dlakic and R.E.Harrington
System and methods
DIAMOD was written in Borland Turbo Pascal 6.0 for the IBM PC and compatible computers. Although DIAMOD is a DOS application, it runs under Windows 3.1 and Windows 95. The program does not require a math co-processor, but it takes full advantage of one if present, particularly during the calculations needed for object movements on the graphic screen. It has been tested only with EGA and VGA graphic cards, but has built-in support for all standard display devices. A help window is invoked from any text or graphic screen by pressing the F1 function key. Simple printouts capturing the graphic screen can be made directly from the program on Hewlett Packard LaserJet or compatible laser printers. All output file types produced by DIAMOD are platform independent and can usually be processed further using publicly available or copyrighted freeware programs that exist for nearly all types of operating systems. For example, PDB files can be displayed and further modified in a variety of programs, one of which is freely available: RasMol (Sayle and Milner-White, 1995). Kinemage files can also be viewed under different operating systems (Richardson and Richardson, 1992, 1994). Persistence of Vision (PoV) files serve as input for the PoV-Ray rendering package, which produces photographic quality images (The Persistence of Vision Development Team, 1991–97). PostScript files can be printed directly on PostScript-capable printers, while HPGL files are primarily intended for Hewlett Packard plotters and compatible printers. Finally, PCX graphic files can be displayed in and printed from virtually any commercial graphic program, and can be imported into most word processors. All these files can be used in their respective programs without modifications from their original DIAMOD output. However, PDB files may require minimization depending on the type of structure generated.
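As an illustration of why PDB output is portable across viewers such as RasMol, the sketch below formats one fixed-column PDB ATOM record in Python. It is not DIAMOD's own writer (the program is written in Turbo Pascal and its source is not shown here); the function name `pdb_atom_line`, the example atom and the occupancy and B-factor defaults are assumptions made for this example only.

```python
def pdb_atom_line(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Format one fixed-column PDB ATOM record; `name` is the 4-character atom field."""
    # Columns: 1-6 record, 7-11 serial, 13-16 atom name, 18-20 residue name,
    # 22 chain, 23-26 residue number, 31-54 coordinates, 55-60 occupancy,
    # 61-66 temperature factor, 77-78 element symbol.
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain:1s}{res_seq:4d}{'':4s}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}{'':10s}{element:>2s}"
    )


# Hypothetical example: a single C1* atom of an adenine nucleotide in chain A.
line = pdb_atom_line(1, " C1*", "A", "A", 1, 1.0, -2.5, 3.38, "C")
print(line)
print("END")
```

Because every field sits at a fixed column, any program that reads the PDB format can parse such a record without knowing which program wrote it.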
Algorithm
DIAMOD begins with idealized Cartesian coordinates of a base pair (Chandrasekaran and Arnott, 1996), which are defined as a local system of coordinates. Since it was assumed that generated structures will not be strictly regular, the use of a global coordinate system (one defined with regard to the overall DNA helix) is not advisable. The position of the next base pair in the sequence is completely determined by three rotations about and three translations along the principal axes of the base pair (Dickerson et al., 1989). However, the three rotations (tilt, roll and twist) and one translation (rise) are most important in generating the global shape of DNA molecules and were the only ones used. Angular values for tilt, roll and twist were taken from predictive models, while the rise per base pair was constant at 3.38 Å (Chandrasekaran and Arnott, 1996). Once the coordinates of the Nth base pair are defined, they are calculated for the (N + 1)th base pair by applying a twist-angle/2 rotation about the z-axis, a roll-angle rotation about the y-axis, a tilt-angle rotation about the x-axis, another twist-angle/2 rotation about the z-axis and, finally, a 3.38 Å translation in the positive direction of the z-axis [see Shpigelman et al. (1993) for a more detailed description of the applied transformations].
Implementation
After the introductory screen, the program offers the choice between di-, tri- and tetranucleotide models, and lists the available model files. The user moves through the list with the cursor and selects one. Next, the user chooses a DNA sequence file, and the molecule is drawn on the graphic screen. Displayed molecules can be rotated, zoomed and drawn in stereo. Before quitting the graphic screen, it is important to select the most informative view, because all output files will have that same orientation. After creating the appropriate output file(s), the user can repeat the process with another DNA sequence or quit the program. Figure 1 shows images produced by rendering DIAMOD output files in PoV Raytracer (The Persistence of Vision Development Team, 1991–97). Figure 1A is a smooth cylinder representation that shows only the global shape of the molecule. The depiction in Figure 1B was inspired by the earlier theoretical treatment of DNA as a worm-like chain; segments in that image correspond to base pairs. These two representations work best for longer DNA molecules. Figure 1C and D offer space-filling all-atom representations of a 21 bp DNA molecule. Atoms in Figure 1C are color coded according to standard nomenclature for DNA, while in Figure 1D the color distinction was made only between bases and the backbone. Although this representation can be used with DNA sequences of any size, it is most effective with lengths up to 50 bp.
Results and discussion
Figure 2 shows modeling of the repetitive DNA sequence (gggccctagaggggccctagaggggccctaga)₄ (Brukner et al., 1994; Dlakic and Harrington, 1995) using dinucleotide and trinucleotide models (Satchwell et al., 1986; De Santis et al., 1990; Bolshoy et al., 1991; Goodsell and Dickerson, 1994; Ulyanov and James, 1995). The advantage of the program is that it supports all currently available predictive models and will also accommodate future ones as long as input files describing them are supplied in an ASCII format that is readable by DIAMOD. This unique facility of the program enables quick and comprehensive evaluation of all predictive models against available experimental data, which has been used to ascertain the usefulness of individual models in predicting DNA structure (M.Dlakic and R.E.Harrington, in preparation).
Fig. 1. Graphic representations of DNA molecules created by rendering the DIAMOD output files in PoV Raytracer (The Persistence of Vision Development Team, 1991–97). Examples in (A), (C) and (D) are actually color images on a black background, but were adapted to grayscale images on a white background for the purposes of this figure. Color figures can be seen at http://www-personal.umich.edu/~mensur/software.html.
DNA kinking is a localized form of DNA bending, usually within one dinucleotide step, but typically of much larger magnitude. It usually involves unstacking of adjacent bases, which need not be the case for bending. For that reason, DNA kinking is considered a form of DNA bending, but not vice versa (Werner et al., 1996). Kinking is observed in many protein–DNA complexes (reviewed in Werner et al., 1996), and recently also in free DNA under certain conditions (Han et al., 1997a,b). DIAMOD was designed to incorporate such local distortions at user-specified locations regardless of the model used to generate the structure of the rest of the DNA sequence. This is achieved by placing a defined string of characters between bases in DNA sequence files, which makes the program assign them angular parameters from an independent file. Although this option was originally implemented to simulate intrinsic and protein-induced kinks in a DNA sequence, it can also
be used to regenerate known DNA structures. This process begins by determining the angular parameters for each dinucleotide of a known structure. We arbitrarily selected a B-DNA decamer with a central TA step (Goodsell et al., 1994) and calculated local dinucleotide parameters using CURVES (Lavery and Sklenar, 1988). Successive ‘kinks’ were then inserted using these dinucleotide parameters and a new molecule was created. The overlap of the experimental and reconstructed molecules is shown in Figure 3A. The RMS deviation between C1* atoms of these two molecules is 0.88 Å. We consider that good agreement, since DIAMOD uses only three rotations and assumes a constant rise per residue. Another use of this program feature is shown in Figure 3B. DNA from the yeast TFIIA/TBP/DNA complex (Tan et al., 1996) was reconstructed and extended on both sides by 32 bp of B-DNA. A view from the side (lower part of Figure 3B) makes it easier to see that TBP bends DNA by ~80°. In addition, a view from the top (upper part of Figure 3B) shows that cumulative DNA unwinding of the binding site (>100°) brings individual kinks into one plane, thus yielding a directional bend instead of a writhe (Kim,J.L. et al., 1993; Kim,Y.C. et al., 1993; Juo et al., 1996).
Fig. 2. Graphical representation of DNA sequence (gggccctagaggggccctagaggggccctaga)₄ (Brukner et al., 1994; Dlakic and Harrington, 1995) using: (A) dinucleotide model of Bolshoy et al. (1991); (B) dinucleotide model of De Santis et al. (1990); (C) dinucleotide model of Ulyanov and James (1995); (D) trinucleotide model defined by Goodsell and Dickerson (1994) from experimental data of Satchwell et al. (1986).
The examples presented above illustrate the universal applicability of DIAMOD with all predictive models. DIAMOD is fully interactive and enables complete examination of displayed molecules without any external programs. However, several standard output files provide an interface with other programs for additional modeling and/or the production of publication quality images.
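The stepping rule given in the Algorithm section (twist/2 about z, roll about y, tilt about x, twist/2 about z, then a 3.38 Å rise along z) and the idea of overriding the model at chosen steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not DIAMOD's code: the rotations are assumed to compose about the moving local axes with right-handed signs, `TABLE` holds placeholder angles rather than values from any published model, and the `kinks` argument is a hypothetical stand-in for the marker string DIAMOD places in sequence files.

```python
import math

RISE = 3.38  # rise per base pair in angstroms, as quoted in the text

# Placeholder table: every dinucleotide gets (tilt, roll, twist) = (0, 0, 36) degrees.
# These are NOT values from any published model; a real table would come from a file.
TABLE = {a + b: (0.0, 0.0, 36.0) for a in "ACGT" for b in "ACGT"}


def rot(axis, deg):
    """Right-handed 3x3 rotation matrix about the x, y or z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]
    if axis == "y":
        return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def step_matrix(tilt, roll, twist):
    """Rz(twist/2) * Ry(roll) * Rx(tilt) * Rz(twist/2), assumed about local axes."""
    m = rot("z", twist / 2)
    for r in (rot("y", roll), rot("x", tilt), rot("z", twist / 2)):
        m = matmul(m, r)
    return m


def steps_for(sequence, table, kinks=None):
    """Angles for each base step; `kinks` maps a step index to override angles."""
    seq, kinks = sequence.upper(), kinks or {}
    return [kinks[i] if i in kinks else table[seq[i:i + 2]]
            for i in range(len(seq) - 1)]


def build_frames(steps):
    """Return one (rotation, origin) pair per base pair, starting at the origin."""
    rotation = rot("z", 0.0)  # identity
    origin = [0.0, 0.0, 0.0]
    frames = [(rotation, origin)]
    for tilt, roll, twist in steps:
        rotation = matmul(rotation, step_matrix(tilt, roll, twist))
        # Translate along the new local z axis (third column of the rotation).
        origin = [origin[i] + RISE * rotation[i][2] for i in range(3)]
        frames.append((rotation, origin))
    return frames


def bend_angle(frames):
    """Angle in degrees between the local z axes of the first and last base pairs."""
    z_first = [frames[0][0][i][2] for i in range(3)]
    z_last = [frames[-1][0][i][2] for i in range(3)]
    dot = sum(a * b for a, b in zip(z_first, z_last))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


straight = build_frames(steps_for("GGGCCCTAGA", TABLE))
kinked = build_frames(steps_for("GGGCCCTAGA", TABLE, kinks={4: (0.0, 45.0, 36.0)}))
print(round(bend_angle(straight), 3), round(bend_angle(kinked), 3))
```

With the uniform placeholder table the helix axis stays straight, so any overall bend in the second molecule comes entirely from the single overridden step. This mirrors how a kink can be imposed independently of the model used for the rest of the sequence.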
Acknowledgements
This work was supported by grants CA70274 and GM53517 from the NIH, and by USDA Hatch Project NEV032D through the Nevada Agricultural Experiment Station (R.E.H.). M.D. acknowledges conceptual advice from Dima Rekesh and financial help from the Open Society Institute.
Fig. 3. (A) DNA structure from Goodsell et al. (1994) superimposed with the structure regenerated by DIAMOD. Angular parameters for original DNA structure were calculated using CURVES (Lavery and Sklenar, 1988). The RMS deviation between C1* atoms is 0.88 Å and 1.22 Å for all atoms. (B) DNA structure from a ternary complex with yeast TBP and TFIIA (Tan et al., 1996) was regenerated and extended by 32 bp on both sides. Top and bottom views are related by 90° rotation from the plane of the paper.
References
Bansal,M., Bhattacharyya,D. and Ravi,B. (1995) NUPARM and NUCGEN: Software for analysis and generation of sequence dependent nucleic acid structures. Comput. Applic. Biosci., 11, 281–287.
Bolshoy,A., McNamara,P.T., Harrington,R.E. and Trifonov,E.N. (1991) Curved DNA without AA: experimental estimation of all 16 DNA wedge angles. Proc. Natl Acad. Sci. USA, 88, 2312–2316.
Brukner,I., Susic,S., Dlakic,M., Savic,A. and Pongor,S. (1994) Physiological concentration of magnesium ions induces a strong macroscopic curvature in GGGCCC-containing DNA. J. Mol. Biol., 236, 26–32.
Brukner,I., Sanchez,R., Suck,D. and Pongor,S. (1995) Trinucleotide models for DNA bending propensity: Comparison of models based on DNase I digestion and nucleosome packaging data. J. Biomol. Struct. Dynam., 13, 309–317.
Chandrasekaran,R. and Arnott,S. (1996) The structure of B-DNA in oriented fibers. J. Biomol. Struct. Dynam., 13, 1015–1027.
Crothers,D.M., Haran,T.E. and Nadeau,J.G. (1990) Intrinsically bent DNA. J. Biol. Chem., 265, 7093–7096.
De Santis,P., Palleschi,A., Savino,M. and Scipioni,A. (1990) Validity of the nearest neighbor approximation in the evaluation of the electrophoretic manifestations of DNA curvature. Biochemistry, 29, 9269–9273.
Dickerson,R.E. (1992) DNA structure from A to Z. Methods Enzymol., 211, 67–111.
Dickerson,R.E. et al. (1989) Definitions and nomenclature of nucleic acid structure parameters. EMBO J., 8, 1–4.
Dickerson,R.E., Goodsell,D. and Kopka,M.L. (1996) MPD and DNA bending in crystals and in solution. J. Mol. Biol., 256, 108–125.
Dlakic,M. and Harrington,R.E. (1995) Bending and torsional flexibility of G/C-rich sequences as determined by cyclization assays. J. Biol. Chem., 270, 29945–29952.
Dlakic,M. and Harrington,R.E. (1996) The effects of sequence context on DNA curvature. Proc. Natl Acad. Sci. USA, 93, 3847–3852.
Dlakic,M., Park,K., Griffith,J.D., Harvey,S.C. and Harrington,R.E. (1996) The organic crystallizing agent 2-methyl-2,4-pentanediol reduces DNA curvature by means of structural changes in A-tracts. J. Biol. Chem., 271, 17911–17919.
Drew,H.R. and Travers,A.A. (1985) DNA bending and its relation to nucleosome positioning. J. Mol. Biol., 186, 733–790.
Erie,D.A., Yang,G.L., Schultz,H.C. and Bustamante,C. (1994) DNA bending by Cro protein in specific and nonspecific complexes: Implications for protein site recognition and specificity. Science, 266, 1562–1566.
Goodsell,D.S. and Dickerson,R.E. (1994) Bending and curvature calculations in B-DNA. Nucleic Acids Res., 22, 5497–5503.
Goodsell,D.S., Kaczor-Grzeskowiak,M. and Dickerson,R.E. (1994) The crystal structure of C-C-A-T-T-A-A-T-G-G: Implications for bending of B-DNA at T-A steps. J. Mol. Biol., 239, 79–96.
Gorin,A.A., Zhurkin,V.B. and Olson,W.K. (1995) B-DNA twisting correlates with base-pair morphology. J. Mol. Biol., 247, 34–48.
Hagerman,P.J. (1990) Sequence-directed curvature of DNA. Annu. Rev. Biochem., 59, 755–781.
Han,W., Dlakic,M., Zhu,Y.J., Lindsay,S.M. and Harrington,R.E. (1997a) Strained DNA is kinked by low concentrations of Zn2+. Proc. Natl Acad. Sci. USA, 94, 10565–10570.
Han,W., Lindsay,S.M., Dlakic,M. and Harrington,R.E. (1997b) Kinked DNA. Nature, 386, 563.
Haran,T.E., Kahn,J.D. and Crothers,D.M. (1994) Sequence elements responsible for DNA curvature. J. Mol. Biol., 244, 135–143.
Harrington,R.E. and Winicov,I. (1994) New concepts in protein–DNA recognition: sequence directed DNA bending and flexibility. Prog. Nucleic Acids Res. Mol. Biol., 47, 195–270.
Harvey,S.C., Dlakic,M., Griffith,J., Harrington,R., Park,K., Sprous,D. and Zacharias,W. (1995) What is the basis of sequence-directed curvature in DNAs containing A tracts? J. Biomol. Struct. Dynam., 13, 301–307.
Juo,Z.S., Chiu,T.K., Leiberman,P.M., Baikalov,I., Berk,A.J. and Dickerson,R.E. (1996) How proteins recognize the TATA box. J. Mol. Biol., 261, 239–254.
Kim,J.L., Nikolov,D.B. and Burley,S.K. (1993) Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature, 365, 520–526.
Kim,Y.C., Geiger,J.H., Hahn,S. and Sigler,P.B. (1993) Crystal structure of a yeast TBP/TATA-box complex. Nature, 365, 512–520.
Lavery,R. and Sklenar,H. (1988) The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dynam., 6, 63–91.
Lilley,D.M.J., Chen,D.R. and Bowater,R.P. (1996) DNA supercoiling and transcription: Topological coupling of promoters. Q. Rev. Biophys., 29, 203–225.
Nekludova,L. and Pabo,C.O. (1994) Distinctive DNA conformation with enlarged major groove is found in Zn-finger DNA and other protein–DNA complexes. Proc. Natl Acad. Sci. USA, 91, 6948–6952.
Palecek,E. (1991) Local supercoil-stabilized DNA structures. CRC Crit. Rev. Biochem. Mol. Biol., 26, 151–226.
Parvin,J.D., McCormick,R.J., Sharp,P.A. and Fisher,D.E. (1995) Pre-bending of a promoter sequence enhances affinity for the TATA-binding factor. Nature, 373, 724–727.
The Persistence of Vision Development Team (1991–97) The Persistence of Vision Raytracer. http://www.povray.org/
Richardson,D.C. and Richardson,J.S. (1992) The kinemage: a tool for scientific communication. Protein Sci., 1, 3–9.
Richardson,D.C. and Richardson,J.S. (1994) Kinemages: simple macromolecular graphics for interactive teaching and publication. Trends Biochem. Sci., 19, 135–138.
Satchwell,S., Drew,H.R. and Travers,A.A. (1986) Sequence periodicities in chicken nucleosome core DNA. J. Mol. Biol., 191, 659–675.
Sayle,R.A. and Milner-White,E.J. (1995) RASMOL: biomolecular graphics for all. Trends Biochem. Sci., 20, 374.
Shpigelman,E.S., Trifonov,E.N. and Bolshoy,A. (1993) CURVATURE: Software for the analysis of curved DNA. Comput. Applic. Biosci., 9, 435–440.
Starr,D.B., Hoopes,B.C. and Hawley,D.K. (1995) DNA bending is an important component of site-specific recognition by the TATA binding protein. J. Mol. Biol., 250, 434–446.
Tan,S., Hunziker,Y., Sargent,D.F. and Richmond,T.J. (1996) Crystal structure of a yeast TFIIA/TBP/DNA complex. Nature, 381, 127–134.
Travers,A.A. (1989) DNA conformation and protein binding. Annu. Rev. Biochem., 58, 427–452.
Ulyanov,N.B. and James,T.L. (1995) Statistical analysis of DNA duplex structural features. Methods Enzymol., 261, 90–120.
Werner,M.H., Gronenborn,A.M. and Clore,G.M. (1996) Intercalation, DNA kinking, and the control of transcription. Science, 271, 778–784.