
CALCON 2014


Molecular Docking Study for Functional Annotation of a Plant Protein

Anamika Basu and Anasua Sarkar*, Senior Member, IEEE

Abstract: Computational approaches such as homology modelling and molecular docking are now frequently used for the sequence analysis and functional characterization of proteins. When the crystal structure of a protein of interest is unknown, computational tools can illustrate many of its structural and physicochemical properties. We use computational methods as a cost-effective alternative for analysing the structure and function of the plant cell wall binding protein AtEXPA23. In this in silico study, a near-exact structure of the protein is determined, along with its structure-function relationship and its ligand-protein interactions. Because structure reflects function, the 3D structure of the protein is established using a homology modelling approach. The predicted structure is then docked with xylose as the ligand to study the ligand-protein interaction sites. The study leads to a better understanding of the principles of ligand-protein interaction and of the mechanism of rigid plant cell wall expansion.

Index Terms: Expansin, homology modelling, ligand-protein interaction, molecular docking, plant cell wall protein.

I. INTRODUCTION

For many interesting proteins, experimental structures are not available. In principle, homology modelling can provide a structure for the functional annotation of such proteins, and docking can then be applied to identify the ligand binding sites of the modelled proteins. Two different strategies are employed to generate the model structure: a fully automated modelling method, which employs a prediction server, and a traditional homology modelling procedure. In the recent past, many methods have been developed to predict protein-ligand binding sites. These methods fall into two groups: geometry-based and energy-based. The energy-based methods, such as PocketFinder [1] and Q-SiteFinder [2], identify the binding site using a model of the energetics. The geometry-based algorithms, e.g. POCKET [3], LIGSITE [4] and SURFNET [5], rely on the observation that binding sites are located on concave surfaces that resemble a pocket or a cleft. For a protein structure, not only can the functional site be identified, but the ligands (for enzymes, the substrates) that bind to that site can also be predicted.

Microarray analysis shows that the ATEXPA23 (AT5G39280) protein, which causes loosening and extension of plant cell walls, is differentially expressed during different stages of plant embryogenesis. However, the exact mechanism of plant cell wall extension by expansin is unknown to date. In the present study, we analyse the peripheral membrane protein ATEXPA23 from Arabidopsis thaliana using homology modelling and molecular docking. It contains one expansin-like CBD domain and one expansin-like EG45 domain. ATEXPA23 belongs to the Expansin A subfamily of the expansin family. Xylose is a carbohydrate molecule present in the plant cell wall. The refined 3D model is used to explore the xylose binding characteristics of ATEXPA23 using SwissDock. The docking analysis shows that the surface-exposed amino acid residues Arg 174 and Gly 49 interact with the ligand xylose through hydrogen bonding.

II. METHODOLOGY

The in silico analysis of ATEXPA23 involves various desktop-based applications for homology modelling and protein-ligand binding, including SWISS-MODEL Version 8.05 [6], BioSerf v2.0 (Automated Homology Modelling) and DomSerf v2.0 (Automated Domain Modelling by Homology), both from The PSIPRED Protein Sequence Analysis Workbench [7], and GalaxyWEB [8]. BioSerf is an automated homology and de novo modelling server that utilises Modeller, PSI-BLAST, pGenTHREADER and HHBlits. DomSerf is used for automated homology modelling of protein domains. Our study also involves online applications for docking, including SwissDock, a web service that predicts the molecular interactions that may occur between a target protein and a small molecule. It is based on the docking software EADock DSS [9] and uses the CHARMM force field for its calculations [10]. The most favourable clusters can be visualized using UCSF Chimera [11]. The SignalP 4.0 server [12] is used to identify the cleavage site of the extracellular transport signal. The physico-chemical parameters of the protein sequence, which include amino-acid and atomic compositions, molecular weight and isoelectric point (pI), are computed by FFPred v2.0 (Eukaryotic Function Prediction), obtained from The PSIPRED Protein Sequence Analysis Workbench [7]. The PredictProtein server [13] is used to predict the effect of point mutations for EXP23_ARATH. Secondary structure analyses of the query protein are performed by the PSIPRED server.
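As a rough illustration of two of the physico-chemical parameters mentioned above (amino-acid composition and molecular weight), the following Python sketch computes them directly from a sequence. It is not the FFPred implementation. The function names, the example peptide (a hypothetical fragment, not the ATEXPA23 sequence) and the use of standard average residue masses are assumptions made for illustration only.

```python
from collections import Counter

# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N- and C-termini


def composition(sequence):
    """Return the percentage of each amino acid present in the sequence."""
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in sorted(counts.items())}


def molecular_weight(sequence):
    """Return the average molecular weight (Da) of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Hypothetical example peptide, not taken from ATEXPA23.
peptide = "MGRAGSWGGA"
print(composition(peptide))
print(round(molecular_weight(peptide), 2))
```

The isoelectric point is left out of this sketch because its value depends on the set of pKa values chosen, which the text does not specify.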
The sequence is then submitted to MEMSAT3 and MEMSAT-SVM (Membrane Helix Prediction) to predict membrane helices [14]. It is also submitted to PDBsum [15], a pictorial database that provides an at-a-glance overview of the contents of each 3D structure deposited in the Protein Data Bank (PDB), to evaluate the detailed topology and to identify the clefts in the protein. The FASTA sequence of the protein is entered into the Conserved Domains Database [16] at NCBI for domain prediction and analysis. Particular motifs that reflect specific functions of the protein are searched for with the Motif Search Library (http://www.genome.jp/tools/motif/). To check the quality and reliability of the predicted model, the evaluation tools ERRAT (version 2.0) [17] and Rampage [18] are used.

The Dali program is run to identify templates [19]. The PDB IDs of the selected templates are 2hcz-X and 3d30A. PDB '2hcz-X' is the crystal structure of EXPB1 (Zea m 1), a beta-expansin and group-1 pollen allergen from maize, which induces extension and stress relaxation of grass cell walls. PDB '3d30A' is the crystal structure, at 1.9 Å resolution, of the expansin-like protein YoaJ (EXLX1) from Bacillus subtilis, a bacterial expansin that promotes root colonization. Table I lists the two templates with their Z-scores and % identity to our protein ATEXPA23.

TABLE I
TEMPLATES FOR ATEXPA23 WITH Z-SCORE AND % IDENTITY

Sl No.   PDB ID   Z Score   %ID
1        2hcz-X   33.6      28
2        3d30A    19.7      22

Fig. 1. Ligand xylose binding within protein ATEXPA23.

Fig. 2. Ligand xylose binding within protein ATEXPA23 with H-bonding and bond lengths shown.

Prior to the docking procedure, xylose is identified as the ligand for the three-dimensional structure of ATEXPA23 using GalaxySite [20], a tool on the GalaxyWEB server [8] that predicts ligand binding sites from a given protein structure. Docking of ATEXPA23 with the ligand xylose is carried out with the SwissDock web server, based on EADock DSS [9]. Many binding modes are generated in the vicinity of all target cavities (blind docking). Simultaneously, their CHARMM energies are estimated on a grid with the CHARMM force field [10] on external computers at the Swiss Institute of Bioinformatics. The binding modes with the most favourable energies are evaluated with FACTS [21] and then clustered. Molecular complexes are ranked by the most favourable binding energies. Among those, we select the one structure representing the best binding mode, based on an energy average value corresponding to the first five ranked structures. The most favourable clusters are visualized with the UCSF Chimera software [11]. Structures are analysed for their stereochemical properties through MolProbity [22] and the NIH server.
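The selection of the best binding mode from the ranked docking results, as described in the docking procedure above, can be sketched in Python as follows. This is one possible reading of that step, not the SwissDock implementation: it assumes that poses are grouped by cluster, that lower energies are more favourable, and that a cluster is scored by the mean energy of its five best-ranked poses. The function name `select_cluster` and all energy values are hypothetical.

```python
from statistics import mean


def select_cluster(poses, top_n=5):
    """Pick the cluster whose top_n most favourable poses have the lowest mean energy.

    poses: iterable of (cluster_id, energy_kcal_per_mol); lower energy is better.
    Returns (best_cluster_id, {cluster_id: mean_energy_of_top_n}).
    """
    by_cluster = {}
    for cluster_id, energy in poses:
        by_cluster.setdefault(cluster_id, []).append(energy)
    # Average only the top_n lowest energies of each cluster (assumed reading).
    averages = {cid: mean(sorted(e)[:top_n]) for cid, e in by_cluster.items()}
    best = min(averages, key=averages.get)
    return best, averages


# Hypothetical energies (kcal/mol), invented for illustration only.
poses = [
    (0, -6.8), (0, -6.7), (0, -6.6), (0, -6.5), (0, -6.4), (0, -5.0),
    (1, -6.9), (1, -6.0), (1, -5.9), (1, -5.8), (1, -5.7),
]
best, averages = select_cluster(poses)
print(best, averages)
```

Averaging over several top-ranked poses, rather than taking the single best pose, makes the choice less sensitive to one unusually favourable outlier, as cluster 1 in the hypothetical data shows.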

The MolProbity server computes Ramachandran values for dihedral angles, poor rotameric conformations, Cβ deviations, and bad angles and bond lengths for all residues. The NIH server embeds evaluation tools such as ERRAT [17].

III. RESULTS AND DISCUSSION

Docking of ATEXPA23 with the ligand xylose is carried out with the SwissDock web server, based on EADock DSS [9], and 42 binding clusters are generated in the vicinity of all target cavities (blind docking). From those clusters, cluster 0 is selected, with a FullFitness of 10008.32 kcal/mol and an estimated ΔG of -6.76 kcal/mol. Protein-ligand binding for this cluster is visualized using UCSF Chimera [11], as shown in Figures 1 and 2. In Figure 1, Arg 173 and Arg 174 of ATEXPA23 are circled in red and the ligand xylose is shown as sticks. Figure 2 shows the hydrogen bonds between the O atom of Gly 49 and the H8 atom of xylose, and between the H atom of Arg 174 and the O1 atom of xylose, together with their bond lengths. Residue-by-residue analysis of the Ramachandran plot from the PROCHECK server shows that Arg 174 lies in the core beta region, in an extended strand that participates in a beta ladder. However, it also shows that Gly 49 is not present in any predicted secondary structure region. The presence of the EXPANSIN_EG45 motif, found for the hydrophilic residue Arg 174 in the PROSITE profiles of the Motif Search Library (http://www.genome.jp/tools/motif/), supports the conclusion that ATEXPA23 interacts with the plant cell wall through xylose via its family-45 endoglucanase-like domain.

REFERENCES

[1] J. An, M. Totrov, R. Abagyan, “Pocketome via comprehensive identification and classification of ligand binding envelopes”, Molecular and Cellular Proteomics, vol. 4, no. 6, pp. 752–761, 2005.

[2] A.T.R. Laurie, R.M. Jackson, “Q-SiteFinder: an energy-based method for the prediction of protein-ligand binding sites”, Bioinformatics, vol. 21, no. 9, pp. 1908–1916, 2005.

[3] D.G. Levitt, L.J. Banaszak, “POCKET: a computer graphics method for identifying and displaying protein cavities and their surrounding amino acids”, Journal of Molecular Graphics, vol. 10, no. 4, pp. 229–234, 1992.

[4] M. Hendlich, F. Rippmann, G. Barnickel, “LIGSITE: automatic and efficient detection of potential small molecule-binding sites in proteins”, Journal of Molecular Graphics and Modelling, vol. 15, no. 6, pp. 359–363, 1997.

[5] R.A. Laskowski, “SURFNET: a program for visualizing molecular surfaces, cavities, and intermolecular interactions”, Journal of Molecular Graphics, vol. 13, no. 5, pp. 323–330, 1995.

[6] K. Arnold, L. Bordoli, J. Kopp, T. Schwede, “The SWISS-MODEL Workspace: A web-based environment for protein structure homology modelling”, Bioinformatics, vol. 22, pp. 195–201, 2006.

[7] D.W.A. Buchan, F. Minneci, T.C.O. Nugent, K. Bryson, D.T. Jones, “Scalable web services for the PSIPRED Protein Analysis Workbench”, Nucleic Acids Research, vol. 41, no. W1, pp. W340–W348, 2013.

[8] W.H. Shin, G.R. Lee, L. Heo, H. Lee, C. Seok, “Prediction of Protein Structure and Interaction by GALAXY protein modeling programs”, Bio Design, vol. 2, no. 1, pp. 1–11, 2014.

[9] A. Grosdidier, V. Zoete, O. Michielin, “SwissDock, a protein-small molecule docking web service based on EADock DSS”, Nucleic Acids Res, vol. 39, pp. 270–277, 2011.

[10] K. Vanommeslaeghe, E. Hatcher, C. Acharya, S. Kundu, S. Zhong et al., “CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields”, J Comput Chem, vol. 31, pp. 671–690, 2010.

[11] E.F. Pettersen, T.D. Goddard, C.C. Huang, G.S. Couch, D.M. Greenblatt, E.C. Meng, T.E. Ferrin, “UCSF Chimera: a visualization system for exploratory research and analysis”, J Comput Chem, vol. 25, no. 13, pp. 1605–1612, 2004.

[12] T.N. Petersen, S. Brunak, G. von Heijne, H. Nielsen, “SignalP 4.0: discriminating signal peptides from transmembrane regions”, Nature Methods, vol. 8, pp. 785–786, 2011.

[13] B. Rost, G. Yachdav, J. Liu, “The PredictProtein server”, Nucleic Acids Res, vol. 32, pp. 321–326, 2004.

[14] T. Nugent, D.T. Jones, “Transmembrane protein topology prediction using support vector machines”, BMC Bioinformatics, vol. 10, pp. 159, 2009.

[15] R.A. Laskowski, “PDBsum: summaries and analyses of PDB structures”, Nucleic Acids Res, vol. 29, pp. 221–222, 2001.

[16] A. Marchler-Bauer et al., “CDD: a Conserved Domain Database for the functional annotation of proteins”, Nucleic Acids Res, vol. 39, no. D, pp. 225–229, 2011.

[17] C. Colovos, T.O. Yeates, “Verification of protein structures: patterns of nonbonded atomic interactions”, Protein Sci, vol. 2, no. 9, pp. 1511–1519, 1993.

[18] S.C. Lovell, I.W. Davis, W.B. Arendall III, P.I.W. de Bakker, J.M. Word, M.G. Prisant, J.S. Richardson, D.C. Richardson, “Structure validation by Calpha geometry: phi, psi and Cbeta deviation”, Proteins: Structure, Function & Genetics, vol. 50, pp. 437–450, 2002.

[19] L. Holm, P. Rosenström, “Dali server: conservation mapping in 3D”, Nucleic Acids Res, vol. 38, pp. W545–W549, 2010.

[20] L. Heo, W.-H. Shin, M.S. Lee, C. Seok, “GalaxySite: Ligand-binding site prediction by using molecular docking”, Nucleic Acids Res, accepted, 2014.

[21] B.R. Brooks, C.L. Brooks III, A.D. Mackerell, L. Nilsson, R.J. Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch, A. Caflisch, L. Caves, Q. Cui, A.R. Dinner, M. Feig, S. Fischer, J. Gao, M. Hodoscek, W. Im, K. Kuczera, T. Lazaridis, J. Ma, V. Ovchinnikov, E. Paci, R.W. Pastor, C.B. Post, J.Z. Pu, M. Schaefer, B. Tidor, R.M. Venable, H.L. Woodcock, X. Wu, W. Yang, D.M. York, M. Karplus, “CHARMM: The Biomolecular simulation Program”, J Comput Chem, vol. 30, pp. 1545–1615, 2009.

[22] V.B. Chen, W.B. Arendall III, J.J. Headd, D.A. Keedy, R.M. Immormino, G.J. Kapral, L.W. Murray, J.S. Richardson, D.C. Richardson, “MolProbity: all-atom structure validation for macromolecular crystallography”, Acta Crystallographica, vol. D66, pp. 12–21.

[23] http://www.genome.jp/tools/motif/
