
Technology Report pubs.acs.org/jchemeduc

Use of Freely Available and Open Source Tools for In Silico Screening in Chemical Biology

Gareth W. Price,† Phillip S. Gould,‡ and Andrew Marsh*,§

†MOAC Doctoral Training Centre, Senate House, ‡School of Life Sciences, §Department of Chemistry, University of Warwick, Coventry CV4 7AL, United Kingdom

Supporting Information

ABSTRACT: Automated computational docking of large libraries of chemical compounds to a protein can aid in pharmaceutical drug design and gives scientists with basic computer experience a tool to help plan wet laboratory investigations when exploring the combination of chemical and pharmacological spaces. The use of open source tools to develop and select ligands for subsequent screening is outlined. A protocol leveraging the power of Open Babel and AutoDock Vina to perform file conversion, minimization, and docking implemented as a Python script is offered.

KEYWORDS: Graduate Education/Research, Upper-Division Undergraduate, Biochemistry, Chemoinformatics, Collaborative/Cooperative Learning, Discovery Learning, Drugs/Pharmaceuticals, Medicinal Chemistry

Informatics is central to the way chemists learn and work today. Over a decade and a half ago, this Journal published a paper entitled “Computational Chemistry for the Masses”1 outlining the diversity and applicability of computational chemistry. There is now a wealth of commercial and open source applications that aid in the modeling of chemical systems, finding applications from pharmaceutical drug design to materials science. The imperative for enabling students to use them2 to explore the therapeutic drug-target landscape3−5 using burgeoning resources such as PubChem and genomic and other functional databases is clear.6 Tools such as NCBI BLAST7 have revolutionized the way biological data is accessed and manipulated. However, combining this with individual chemical properties available through, for example, RSC ChemSpider and NCBI PubChem presents ongoing conceptual and computational challenges.8

This report aims to introduce an open source, high-throughput protocol to chemists with a basic grasp of informatics from undergraduate level onward, enabling them to perform automated docking9 of a library of chemical compounds to a receptor protein. By using it, students and researchers will more easily be able to uncover molecular structures that bind to targets with the ultimate aim of linking them to screening data held by ChEMBL10 and PubChem. The protocol flow diagram is shown in Figure 1. Open Babel11,12 is extensively used to convert between file formats, and AutoDock Vina13 is used for the docking of ligands to a protein receptor. AutoDock Vina, previously identified as having potential as a teaching tool,6 can take advantage of multiple CPU cores and is open source. Sequencers, such as Accelrys’ Pipeline Pilot or the open source graphical user interface (GUI)-based Knime, allow similar processes to be implemented,14 and integration within the Knime Cheminformatics Nodes15 would undoubtedly add value to this approach.

© 2014 American Chemical Society and Division of Chemical Education, Inc.
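As a rough orientation to what automated docking of a library involves, the following minimal Python sketch writes a Vina configuration file and builds one `vina` command per ligand PDBQT file in a directory. It is an illustration under stated assumptions, not the authors' ODScreen script: the function names `write_vina_config` and `docking_commands`, the file names, and the box values are hypothetical. The configuration keys (`receptor`, `center_x`, `size_x`, `exhaustiveness`) and the `--config`, `--ligand`, and `--out` options are those of AutoDock Vina's command-line interface and should be checked against the installed version. The box and exhaustiveness settings are the ones discussed under SCREENING.

```python
from pathlib import Path
import tempfile


def write_vina_config(path, receptor, center, size, exhaustiveness=8):
    """Write a Vina configuration file describing the receptor and search box.

    `center` and `size` are (x, y, z) tuples in angstroms. The default of 8
    is Vina's own default; the text recommends far higher values.
    """
    cx, cy, cz = center
    sx, sy, sz = size
    text = "\n".join([
        f"receptor = {receptor}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {sx}",
        f"size_y = {sy}",
        f"size_z = {sz}",
        f"exhaustiveness = {exhaustiveness}",
    ]) + "\n"
    path = Path(path)
    path.write_text(text)
    return path


def docking_commands(ligand_dir, config, out_dir):
    """Build one vina command per ligand PDBQT file; nothing is executed here."""
    commands = []
    for ligand in sorted(Path(ligand_dir).glob("*.pdbqt")):
        out = Path(out_dir) / f"{ligand.stem}_out.pdbqt"
        commands.append([
            "vina", "--config", str(config),
            "--ligand", str(ligand), "--out", str(out),
        ])
    return commands


if __name__ == "__main__":
    # Demonstration with empty placeholder ligand files (hypothetical names).
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "lig_a.pdbqt").write_text("")
        (tmp / "lig_b.pdbqt").write_text("")
        conf = write_vina_config(tmp / "conf.txt", "receptor.pdbqt",
                                 (0.0, 0.0, 0.0), (20, 20, 20))
        for cmd in docking_commands(tmp, conf, tmp / "docked"):
            print(" ".join(cmd))
```

Keeping command construction separate from execution makes the sketch easy to inspect on a machine without Vina installed; each command list could later be passed to `subprocess.run(cmd, check=True)` or written to a cluster job file.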



PREPARATION OF FILES

Starting from a lead compound, it is usual to generate a library of similar structures either in a two-dimensional visual format or in SMILES (simplified molecular-input line-entry system) text format. Services such as the PubChem REST API,16 which might be called from within the script, or Web services that aim to produce similar compounds by exploring new chemical space, such as ChemNProp,17,18 can output a text file of SMILES compounds.19 From here, Open Babel11,12 was used to convert the library to MOL2 and PDB files that state the constituent atoms, their coordinates in three-dimensional space, and chemical bonding. The command-line obabel program within Open Babel was particularly useful for its ability to perform a genetic algorithm conformer search, allowing all conformations to be systematically determined, their in vacuo free energy calculated, and the lowest selected. Open Babel also installs obminimize, a program that minimizes the energy of the structure in the input file. AutoDock Vina requires input of PDBQT files that extend the standard PDB format with partial charges and atom types. Open Babel was again used for this conversion.

Published: February 14, 2014

dx.doi.org/10.1021/ed400302u | J. Chem. Educ. 2014, 91, 602−604
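The Open Babel steps described under PREPARATION OF FILES can be sketched as a small Python helper that only builds the `obabel` command lines for one ligand. This is a minimal sketch under stated assumptions rather than the authors' script: the function names `ligand_prep_commands` and `run_commands`, the file naming scheme, the one-ligand-per-SMILES-file layout, and the conformer count are hypothetical. The options used (`--gen3d`, `--conformer`, `--nconf`, `--score energy`) follow the Open Babel command-line documentation and should be verified against the installed version.

```python
from pathlib import Path
import shutil
import subprocess


def ligand_prep_commands(smiles_file, workdir, n_conformers=30):
    """Return obabel command lines taking one SMILES file to a PDBQT file.

    Assumes one ligand per SMILES file; all output names are illustrative.
    """
    workdir = Path(workdir)
    stem = Path(smiles_file).stem
    mol2_3d = workdir / f"{stem}_3d.mol2"
    mol2_best = workdir / f"{stem}_best.mol2"
    pdbqt = workdir / f"{stem}.pdbqt"
    return [
        # 1. SMILES -> MOL2 with generated three-dimensional coordinates
        ["obabel", str(smiles_file), "-O", str(mol2_3d), "--gen3d"],
        # 2. Genetic-algorithm conformer search, scored by energy
        ["obabel", str(mol2_3d), "-O", str(mol2_best),
         "--conformer", "--nconf", str(n_conformers), "--score", "energy"],
        # 3. MOL2 -> PDBQT, the input format AutoDock Vina requires
        ["obabel", str(mol2_best), "-O", str(pdbqt)],
    ]


def run_commands(commands):
    """Execute the commands in order, stopping at the first failure."""
    if shutil.which("obabel") is None:
        raise RuntimeError("obabel was not found on PATH")
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_commands(...) to execute them.
    for cmd in ligand_prep_commands("ligand.smi", "prepared"):
        print(" ".join(cmd))
```

Looping this helper over every file in a library directory gives the kind of non-interactive, scriptable workflow the report advocates, and the same pattern applies to any additional minimization step.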


As an alternative to Open Babel for the PDBQT conversion, a separate Python script that can be downloaded with MGLTools can be used. The same procedure was then applied to the “receptor”, the protein to which the ligands are docked. Hydrogens should be added and the structure subsequently saved in PDBQT format using AutoDock Tools.20 Rotatable bonds are not explicitly selected because Vina can regard the side chains of selected residues of the receptor as being flexible. This is an improvement on early attempts at docking tools, where the receptor is kept rigid; however, a fully flexible protein would provide a more realistic model.

It is worth emphasizing the power and simplicity of command-line tools combined with their automation via a simple shell or Python script. The most useful tools are those that require no interaction, yet output files that are accurate and representative of the result of manual labor. Furthermore, languages such as shell script or Python are often operating-system independent, allowing them to be run on different platforms with little reconfiguration.

Figure 1. A flow diagram showing the path from library generation (orange) to screening (violet) and visualization and analysis (green).

SCREENING

AutoDock Vina13 was chosen for docking because it is independently validated21 and actively maintained with a user community that is swift to answer any questions. It also includes a rotamer search, to explore dihedrals specified in the PDBQT. Note that Web alternatives, such as DOCK Blaster9 (which integrates with the ZINC database)22 and idock,23 have the advantage of using more powerful computational facilities than are typically available to novice users. Once the PDBQT files were prepared, the size and center point of a three-dimensional box in which the ligands were docked were determined. The only other option required is the exhaustiveness of the docking, a somewhat ambiguous parameter, which can be thought of as the amount of effort Vina puts into searching all conformations, orientations, and positions of the ligand. Using an optimized cluster of workstations over a prolonged period of time allows this to be set in the thousands more easily. Although it is important to run the analysis with high exhaustiveness (>1000) using a box closely defining the likely binding site, this is nonetheless achievable on a desktop computer.

Vina results are in the form of a PDBQT file with multiple modes. These modes are descriptions of the orientation, position, and conformation of the ligand. Due to their nonstandard format, many visualization programs do not read these files correctly, but AutoDock Tools and the popular PyMOL24 are both freely available options. Code that extracts the modes to individual PDB files for viewing is also provided in the Supporting Information.

The primary script provided in the Supporting Information that automates the protocol described above is ODScreen. The user supplies the directory containing the compounds to be screened, the type of compound files, the name of the receptor protein, and the name of the AutoDock Vina configuration file. Optional arguments include the exhaustiveness and, for more advanced users, the generation of a job file that can be submitted to MOAB/Torque cluster queues.

EXAMPLE USE

A walk-through example based on the discovery of compounds similar to a known inhibitor of the plant enzyme TIR1 ubiquitin ligase,25 using PDB file 2P1Q.pdb, is provided. Classroom use might include an extension to the exploration of molecular interactions in the known inhibitor−enzyme complex with discussion of in silico methods, including similarity searching, used in generating lead compounds.6

ASSOCIATED CONTENT

Supporting Information

A straightforward protocol written in Python that depends on Open Babel being installed is available. Vina is also provided, but if an installed version is found, that is used in preference. The code is commented and will be maintained and updated through www.opendiscovery.org.uk. This material is available via the Internet at http://pubs.acs.org.

AUTHOR INFORMATION

Corresponding Author

*A. Marsh. E-mail: [email protected].

Notes

The authors declare the following competing financial interest(s): G.W.P. and P.S.G. declare no competing interest. A.M. is a shareholder in Tangent Reprofiling Limited, a subsidiary of SEEK, a trading style of PepTCell Limited, and declares no other competing interest.



ACKNOWLEDGMENTS

G.W.P. thanks EPSRC for a Ph.D. studentship through Warwick MOAC DTC (EP/F500378/1). P.S.G. acknowledges BBSRC (BB/1022880/1) for funding. We thank Warwick Centre for Scientific Computing (CSC) and staff for access to parallel computing clusters, and P. M. Rodger (CSC and Department of Chemistry), D. Bray (Department of Chemistry), and A. J. Easton (School of Life Sciences) for helpful discussions. We also thank M. Quareshy for testing the code and for providing the example results and referees for constructive feedback.

REFERENCES

(1) JCE staff. Computational Chemistry for the Masses. J. Chem. Educ. 1996, 73, 104.
(2) Manallack, D. T.; Chalmers, D. K.; Yuriev, E. Using the β2-Adrenoceptor for Structure-Based Drug Design. J. Chem. Educ. 2010, 87, 625−627.
(3) Hopkins, A. L. Network pharmacology: the next paradigm in drug discovery. Nat. Chem. Biol. 2008, 4, 682−690.
(4) Brown, J. B.; Okuno, Y. Systems biology and systems chemistry: new directions for drug discovery. Chem. Biol. 2012, 19, 23−28.
(5) Ortí, L.; Carbajo, R. J.; Pieper, U.; Eswar, N.; Maurer, S. M.; Rai, A. K.; Taylor, G.; Todd, M. H.; Pineda-Lucena, A.; Sali, A.; Marti-Renom, M. A. A Kernel for Open Source Drug Discovery in Tropical Diseases. PLoS Neglected Trop. Dis. 2009, 3, e418, DOI: 10.1371/journal.pntd.0000418.
(6) Sutch, B. T.; Romero, R. M.; Neamati, N.; Haworth, I. S. Integrated Teaching of Structure-Based Drug Design and Biopharmaceutics: A Computer-Based Approach. J. Chem. Educ. 2012, 89, 45−51.
(7) Altschul, S. F.; Gish, W.; Miller, W.; Myers, E. W.; Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 1990, 215, 403.
(8) Ekins, S.; Waller, C. L.; Bradley, M. P.; Clark, A. M.; Williams, A. J. Four disruptive strategies for removing drug discovery bottlenecks. Drug Discovery Today 2013, 18, 265−271.
(9) Irwin, J. J.; Shoichet, B. K.; Mysinger, M. M.; Huang, N.; Colizzi, F.; Wassam, P.; Cao, Y. Automated Docking Screens: A Feasibility Study. J. Med. Chem. 2009, 52, 5712−5720.
(10) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2011, 40, 1100−1107.
(11) O’Boyle, N. M.; Banck, M.; James, C. A.; Morley, C.; Vandermeersch, T.; Hutchison, G. R. Open Babel: An open chemical toolbox. J. Cheminf. 2011, 3, 33, DOI: 10.1186/1758-2946-3-33.
(12) The Open Babel Package, version 2.3.1. http://openbabel.org (accessed Feb 2014).
(13) Trott, O.; Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading. J. Comput. Chem. 2010, 31, 455−461.
(14) Liao, C.; Sitzmann, M.; Pugliese, A.; Nicklaus, M. C. Software and resources for computational medicinal chemistry. Future Med. Chem. 2011, 3, 1057−1085.
(15) Konstanz Information Miner. http://tech.knime.org/cheminformatics-extensions (accessed Feb 2014).
(16) PUG REST. http://pubchem.ncbi.nlm.nih.gov/pug_rest/PUG_REST.html (accessed Feb 2014).
(17) Cincilla, G.; Thormann, M.; Pons, M. Structuring Chemical Space: Similarity-Based Characterization of the PubChem Database. Mol. Inf. 2009, 29, 37−49.
(18) ChemNProp − Chemical names to properties. http://chemnprop.irbbarcelona.org (accessed Feb 2014).
(19) Vidal, D.; Thormann, M.; Pons, M. LINGO, an efficient holographic text based method to calculate biophysical properties and intermolecular similarities. J. Chem. Inf. Model. 2005, 45, 386−393.
(20) Morris, G. M.; Huey, R.; Lindstrom, W.; Sanner, M. F.; Belew, R. K.; Goodsell, D. S.; Olson, A. J. Autodock4 and AutoDockTools4: automated docking with selective receptor flexibility. J. Comput. Chem. 2009, 16, 2785−91.
(21) Docking At UTMB - DUD Results. http://docking.utmb.edu/dudresults/ (accessed Feb 2014).
(22) Irwin, J. J.; Sterling, S.; Mysinger, M. M.; Bolstad, E. S.; Coleman, R. G. ZINC: A Free Tool to Discover Chemistry for Biology. J. Chem. Inf. Model. 2012, 52, 1757−1768.
(23) Li, H.; Leung, K.-S.; Wong, M.-H. idock: A Multithreaded Virtual Screening Tool for Flexible Ligand Docking. Proc. IEEE Symp. Comput. Intell. Bioinf. Comput. Biol. 2012, 77−84, DOI: 10.1109/CIBCB.2012.6217214.
(24) DeLano, W. L.; et al. The PyMOL Molecular Graphics System, version 1.5.0.4; Schrödinger: New York, 2002.
(25) Calderón Villalobos, L. I. A.; Lee, S.; De Oliveira, C.; Ivetac, A.; Brandt, W.; Armitage, L.; Sheard, L. B.; Tan, X.; Parry, G.; Mao, H.; Zheng, N.; Napier, R.; Kepinski, S.; Estelle, M. A combinatorial TIR1/AFB−Aux/IAA co-receptor system for differential sensing of auxin. Nat. Chem. Biol. 2012, 8, 477−485.
