Indian Journal of Biotechnology Vol 12, January 2013, pp 40-45
Molecular modeling and functional characterization of a pertinent enzyme in Streptococcus pneumoniae serotype-2: A potential target for the development of novel pneumonia drug
Simrika Thapa1,3, Md Asraful Alum1, Chinmoy Saha1, Abdullah Zubaer1,3, Arzuba Akter2 and Shakhinur Islam Mondal1*
1Department of Genetic Engineering and Biotechnology and 2Department of Biochemistry and Molecular Biology, Shahjalal University of Science and Technology, Sylhet 3114, Bangladesh
3Swapnojaatra Bioresearch Laboratory, DataSoft Systems, Dhaka 1215, Bangladesh
Streptococcus pneumoniae is the causative agent of pneumonia, which accounts for a substantial portion of childhood deaths in Bangladesh. Although drugs are available in sufficient quantity, the emergence of multi-drug resistant varieties of S. pneumoniae has led to a search for novel drug targets. The metabolic pathways of the host (Homo sapiens) and the pathogen (S. pneumoniae serotype-2) were compared, and six biochemical pathways of S. pneumoniae that are distinct from human pathways were identified. These pathways comprised 20 unique enzymes which, being non-homologous to human host proteins, can be considered probable drug targets. Among them, the 3D structure of an uncharacterized protein was built by homology modeling, and the binding pockets of the protein responsible for specific functions were identified. These structural and functional characterizations of the protein, which is unique to S. pneumoniae relative to the host, open the possibility of rational drug design.
Keywords: Drug target, homology modeling, KEGG, metabolic pathway, S. pneumoniae
*Author for correspondence: Tel: +880-181-8237189 Email: [email protected]
Introduction
Pneumonia remains a leading cause of mortality in the pediatric population of Bangladesh1, accounting for about 52 thousand childhood deaths in 20082. A published comparison of incidence between developing and developed countries indicates that 90 to 95% of clinical pneumonia occurs in developing countries1. Pneumonia is defined as the presence of cough or difficulty in breathing, with other clinical symptoms and signs. The pathogens that cause pneumonia are Haemophilus influenzae, Streptococcus pneumoniae, Staphylococcus aureus, and Gram-negative bacteria such as Escherichia coli and Klebsiella spp. Among them, S. pneumoniae serotype-2 is the most prevalent pathogen in Bangladesh3,4. It is a Gram-positive bacterium that frequently colonizes the nasopharynx. Invasive infection can develop in a variety of body compartments including blood, lungs, cerebrospinal fluid and middle ear5. Although different drugs are available for treatment, the increasing incidence of multiple-antimicrobial drug resistance among S. pneumoniae isolates is becoming a problem throughout the world6. Therefore, the development of novel drugs is important. As most currently known antibacterial drugs are essentially inhibitors of certain bacterial enzymes, all enzymes specific to a bacterium can be considered as drug targets7. To find attractive drug targets, the enzymes in the pathways of S. pneumoniae that show no similarity to any enzyme or protein from the host must be identified8. Analysis of the metabolic pathways of host and pathogen allows assembling a list of potential enzymes/proteins that function solely for the pathogen’s viability or infectivity8. This strategy narrows the search to a few potential drug targets and discards the larger list. It thus eliminates pseudo-drug targets early, which matters because the cost of investigating drug targets is prohibitive. In the present study, protein targets that do not have a human homologue were identified from the unique pathways of S. pneumoniae serotype-2, and one uncharacterized target protein was characterized structurally and functionally, making it a potential and novel target for an effective drug design program.
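The subtractive strategy outlined above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `Enzyme` record and the toy pathway numbers and accessions are hypothetical, and the only value taken from the paper is the e-value inclusion threshold of 0.005 given in Materials and Methods. The sketch first keeps pathways found in the pathogen but not in the host, then keeps enzymes from those pathways that have no human BLASTp hit below the threshold.

```python
from dataclasses import dataclass
from typing import Optional

E_VALUE_CUTOFF = 0.005  # inclusion threshold quoted in Materials and Methods


def unique_pathways(pathogen_ids, host_ids):
    """Return pathway numbers present in the pathogen but absent from the host."""
    return sorted(set(pathogen_ids) - set(host_ids))


@dataclass
class Enzyme:
    accession: str                      # hypothetical identifier
    pathway: str                        # pathway number the enzyme belongs to
    best_human_evalue: Optional[float]  # None means no human hit was returned


def candidate_targets(enzymes, unique, cutoff=E_VALUE_CUTOFF):
    """Keep enzymes from unique pathways with no human hit below the cutoff."""
    unique = set(unique)
    keep = []
    for enz in enzymes:
        if enz.pathway not in unique:
            continue  # pathway is shared with the host
        has_homologue = (
            enz.best_human_evalue is not None
            and enz.best_human_evalue < cutoff
        )
        if not has_homologue:
            keep.append(enz.accession)
    return keep


# Toy input (illustrative placeholders, not data from the study).
pathogen = ["00010", "00550", "00660"]
host = ["00010", "00020"]
unique = unique_pathways(pathogen, host)

enzymes = [
    Enzyme("ENZ_A", "00550", None),   # no human hit at all
    Enzyme("ENZ_B", "00660", 1e-30),  # strong human hit, discarded
    Enzyme("ENZ_C", "00010", None),   # pathway shared with host, discarded
    Enzyme("ENZ_D", "00660", 0.5),    # hit above the cutoff, retained
]
targets = candidate_targets(enzymes, unique)
print(unique, targets)
```

In practice the pathway lists and e-values would come from KEGG and from a BLASTp search restricted to H. sapiens; here they are hard-coded so that the filtering logic can be read in isolation.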
THAPA et al: MODELING AND CHARACTERIZATION OF DRUG TARGET FOR S. PNEUMONIAE SEROTYPE-2
Materials and Methods
Identification of Potential Drug Targets
The KEGG2 (Kyoto Encyclopedia of Genes and Genomes) pathway database was used as a source of metabolic pathway information9. Metabolic pathway identification numbers of the host Homo sapiens and the pathogen S. pneumoniae serotype-2 were extracted from the KEGG database. Pathways that are present in the pathogen but do not appear in the host according to KEGG database annotation were identified as pathways unique to S. pneumoniae as compared to the host H. sapiens7. The corresponding protein sequences were retrieved from the KEGG database. They were subjected to a BLASTp10 search against the non-redundant database with the e-value inclusion threshold set to 0.005. The search was restricted to proteins from H. sapiens through an option available in the NCBI Basic Local Alignment Search Tool (BLAST), which allows selecting the organism to which the search should be restricted. In the current context, the objective was to find only those targets that did not have detectable human homologues. Enzymes that did not have hits below the e-value inclusion threshold of 0.005 were chosen as potential drug targets.
Homology Modeling and Validation
The three-dimensional (3D) structure of the hypothetical protein found in C5-branched dibasic acid metabolism, a possible drug target in S. pneumoniae, was modeled using the homology modeling resources SWISS-MODEL Repository11 and the Protein Model Portal12. The quality of the resulting model was validated with the PROCHECK, ERRAT and PROVE programs from “SAVES: MetaServer Structure Analysis” on the NIH MBI Laboratory Server (http://nihserver.mbi.ucla.edu/SAVES/). Moreover, the model was analyzed in SuperPose13. The overall stereochemical quality of the protein was assessed by the Ramachandran plot analysis provided by PROCHECK.
In silico Functional Characterization
The SOSUI server (http://bp.nuap.nagoya-u.ac.jp/sosui/) was used to identify the nature (soluble or membrane) and function of the protein, with evaluation of transmembrane regions for membrane proteins14. Based on the result from SOSUI, sequence conservation was analyzed with the ConSurf database15. Significant active sites were predicted with CASTp (Computed Atlas of Surface Topography of Proteins)16.
Results and Discussion
Identification of Potential Drug Targets
S. pneumoniae infection is a major cause of morbidity and mortality in the pediatric population1,4,5. The dramatic rise of multi-drug resistant strains of this pathogen6 has renewed efforts to identify novel drug targets for future rational drug design. For drug target identification, biosynthetic pathways are a reliable starting point, considering that proteins necessary for the pathogen’s viability and infectivity are uncommon in the host8. Discarding pathogen enzymes that share similarity with host proteins ensures that the targets have nothing in common with host proteins, thereby eliminating undesired host protein-drug interactions7. Using the KEGG2 pathway database on the KEGG server, all pathways associated with S. pneumoniae were analyzed and compared with those of the host (H. sapiens). The comparison revealed six metabolic pathways that are present only in the pathogen (Table 1). Enzymes involved in these pathways were compared with proteins from the host. A total of 20 enzymes were found to be non-homologous to host proteins and were considered potential drug targets (Table 1). This approach has been successful in listing many potential targets from the S. pneumoniae proteome that are involved in vital aspects of the pathogen’s metabolism, persistence, virulence and cell wall biosynthesis.
Homology Modeling and Validation of 3D Structure
From the 20 identified drug targets, the current study focused on an uncharacterized protein responsible for C5-branched dibasic acid metabolism in S. pneumoniae, with the KEGG accession SPD_1113 (Table 1). A homology model of SPD_1113 was built in the SWISS-MODEL Repository and remodeled with the Protein Model Portal, using the structure of the 3-isopropylmalate dehydratase small subunit from S. mutans (PDB: 2HCU) as a template (Fig. 1a). The template shared the highest percentage of amino acid sequence identity (78%) with SPD_1113 according to the “consistency result of Model Quality” of the Protein Model Portal. The level of sequence identity between target protein and template is the first indicator of the expected accuracy of the model12. The predicted model was assessed using SuperPose, which showed close superposition of the model on the template (local and global RMSD both 0.41). The visual output generated by SuperPose is shown in Fig. 1b. The quality of the model was validated on the structural analysis and verification server (SAVES) with the PROCHECK, ERRAT and PROVE programs [http://nihserver.mbi.ucla.edu/SAVES/]. The parameters reported by SAVES include Ramachandran plot quality, peptide bond planarity, side chain parameters, main chain bond length and overall quality factor (ERRAT)17. The stereochemical quality and accuracy of the predicted model were evaluated after the refinement process using the Ramachandran map calculations computed within PROCHECK18. The Ramachandran plot provided by the SAVES analysis is shown in Fig. 1c. In a Ramachandran plot, a good quality model is expected to have more than 90% of amino acids in the most favored regions19. In the generated model this value was 92.7% (Fig. 1c), suggesting a high quality model. For a high quality model, the PROVE analysis is expected to give an RMS Z-score close to 114,20; for our generated model it was 1.33 (Fig. 1e). As reported by ERRAT, the overall quality factor was 97.619 (Fig. 1d); high resolution structures generally produce values around 95% or higher17. Thus, the predicted structure of SPD_1113 conformed well to the expected stereochemistry, indicating that it is a reasonably good quality model.

Table 1: Targets from unique pathways that do not have human homologues as obtained from pathway analysis using KEGG database

| Acc. no. | Gene | Description | SwissProt ID |
|---|---|---|---|
| A. Peptidoglycan metabolism | | | |
| 1. SPD_0967 | MurA | UDP-N-acetylglucosamine-1-carboxyvinyltransferase | Q04KK5_STRP2 |
| 2. SPD_1222 | MurB | UDP-N-acetylmuramate dehydrogenase | MURB_STRP2 |
| 3. SPD_1349 | MurC | UDP-N-acetylmuramate--alanine ligase | MURC_STRP2 |
| 4. SPD_0598 | MurD | UDP-N-acetylmuramoylalanine--D-glutamate ligase | MURD_STRP2 |
| 5. SPD_1359 | MurE | UDP-N-acetylmuramoyl-L-alanyl-D-glutamate-L-lysine ligase | MURE_STRP2 |
| 6. SPD_1483 | MurF | UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate-D-alanyl-D-alanine ligase | Q04J99_STRP2 |
| 7. SPD_0599 | MurG | UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase | MURG_STRP2 |
| 8. SPD_0417 | bacA | Undecaprenyl-diphosphatase | UPPP_STRP2 |
| 9. SPD_0535 | MurM | Serine/alanine adding enzyme | Q04LR2_STRP2 |
| 10. SPD_0536 | MurN | Beta-lactam resistance factor | Q04LR1_STRP2 |
| 11. SPD_1416** | | Hypothetical protein | Q04JG3_STRP2 |
| B. D-alanine metabolism | | | |
| 1. SPD_2003 | dltC | D-alanine-poly(phosphoribitol) ligase | DLTC_STRP2 |
| C. Polyketide sugar unit biosynthesis | | | |
| 1. SPD_0329 | rfbC | dTDP-4-dehydrorhamnose 3,5-epimerase | Q9ZIH5_STRP2 |
| D. C5-Branched dibasic acid metabolism | | | |
| 1. SPD_1224 | budA | Alpha-acetolactate decarboxylase | Q04JV7 (Q04JV7_STRP2) |
| 2. SPD_1113** | | Hypothetical protein | Q04K65 (Q04K65_STRP2) |
| E. Methane metabolism | | | |
| 1. SPD_0953 | Ppc | Phosphoenolpyruvate carboxylase | Q04KL8 (CAPP_STRP2) |
| 2. SPD_1011 | | Glycerate kinase | Q04KG4 (Q04KG4_STRP2) |
| 3. SPD_0526 | fba | Fructose-bisphosphate aldolase | Q04LS0 (Q04LS0_STRP2) |
| 4. SPD_1853 | ackA | Acetate kinase | Q04IC8 (ACKA_STRP2) |
| 5. SPD_0985 | eutD | Phosphotransacetylase | Q04KI7 (Q04KI7_STRP2) |
| F. Phosphonate and phosphinate metabolism | | | |
| 1. SPD_1427 | phnA | PhnA protein | Q04JF4 (Q04JF4_STRP2) |
| 2. SPD_0319 | cps2E | Undecaprenylphosphate glucosephosphotransferase Cps2E | Q9ZII5 (Q9ZII5_STRP2) |

The two targets for which PDB structure is not available are marked with **.
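The acceptance criteria quoted in this section (more than 90% of residues in the most favored Ramachandran regions, an ERRAT quality factor around 95% or higher, and a PROVE RMS Z-score close to 1) can be collected into a small check. The Python sketch below is only an illustration: the function name is hypothetical, and the tolerance used to interpret "close to 1" for the PROVE score is an assumption, since the text gives no numeric bound.

```python
def assess_model(rama_favoured_pct, errat_quality, prove_rms_z, z_tolerance=0.5):
    """Compare model statistics against the thresholds quoted in the text.

    z_tolerance is an assumed bound for "RMS Z-score close to 1";
    the paper does not state a numeric limit.
    """
    return {
        "ramachandran": rama_favoured_pct > 90.0,   # >90% in most favored regions
        "errat": errat_quality >= 95.0,             # around 95% or higher
        "prove": abs(prove_rms_z - 1.0) <= z_tolerance,
    }


# Values reported for the SPD_1113 model (Fig. 1c, 1d and 1e).
report = assess_model(rama_favoured_pct=92.7, errat_quality=97.619, prove_rms_z=1.33)
for check, passed in report.items():
    print(f"{check}: {'pass' if passed else 'fail'}")
```

With the reported values all three checks pass under the assumed tolerance; a stricter tolerance (for example 0.25) would flag the PROVE score, which shows how much that conclusion depends on how "close to 1" is read.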
Fig. 1 (a-e): (a) The resulting model for SPD_1113 generated by SWISS MODEL and Protein Model Portal; (b) The overlapped comparison of model with template verified in SuperPose program showing main chain in green and template in red; (c) Ramachandran plot: Red regions in the graph indicate the most allowed regions, whereas the yellow regions represent allowed regions (92.7, 6.1, 0.0 and 1.2% of the residues were located in the most favorable, additionally allowed, generously allowed and disallowed regions, respectively); (d) ERRAT shows model quality of 97%; and (e) PROVE shows RMS Z-score close to 1.
In Silico Functional Characterization
SOSUI distinguishes between membrane and soluble proteins from amino acid sequences, and predicts the transmembrane helices for the former14. The SOSUI analysis designated the hypothetical protein as a soluble protein rather than a transmembrane one. As the protein is soluble, the complementary program ConSurf was used to show the distribution of functional and structural residues of the modeled 3D structure and to calculate the level of evolutionary conservation for each amino acid21. The functional and structural residues of SPD_1113 estimated by ConSurf are shown in Fig. 2. ConSurf is an automated web-based tool for the identification of functionally important regions in proteins by surface mapping of the level of evolutionary conservation at each amino acid position. It is well established that residues buried in the protein core are conserved throughout evolution15. Computed Atlas of Surface Topography of Proteins (CASTp)22,23 is used for visualization of annotated functional residues, with emphasis on mapping to surface pockets and interior voids. CASTp predicts the active site and the number of amino acids involved in it24. It can locate functionally important residues and so helps build an understanding of the structural basis of protein function. The two best binding pockets with high area and volume, one with area 169.9 and volume 337.9 (Fig. 3A) and another with area 175.8 and volume 373.0 (Fig. 3B), were predicted by CASTp as active sites. Based on the CASTp prediction, these two binding pockets appear to be the potential active sites for further docking analysis. Thus, the present model could be further explored in in silico docking studies with suitable inhibitors. Computational approaches like molecular docking could be adopted for screening inhibitors that can bind to targets with experimental or modeled structures.
Fig. 2: ConSurf results showing the conservation quality of amino acid residues. It indicates the distribution of structural and functional residues over the structure. According to the neural-network algorithm, ‘e’ indicates an exposed residue, ‘b’ indicates a buried residue, ‘f’ indicates a predicted functional residue (highly conserved and exposed), and ‘s’ indicates a predicted structural residue (highly conserved and buried).
Fig. 3 (A & B): The generated model subjected to CASTp showing two significant binding pockets: (A) One with area 169.9 and volume 337.9 (indicated in green); and (B) the other with area 175.8 and volume 373.0 (indicated in blue). The amino acid residues forming the respective binding pockets are shown below each pocket.
Conclusion
The aim of the present study was the identification of possible drug targets and the homology modeling of a metabolic protein in S. pneumoniae serotype-2, which is prevalent in Bangladesh. A comparative metabolic pathway analysis of the host H. sapiens and the pathogen S. pneumoniae was performed. Using the KEGG server, a total of 20 enzymes were identified as non-homologous to human protein sequences; these are involved in the pathogen’s metabolism, persistence, virulence and cell wall biosynthesis. Among the identified targets, one uncharacterized enzyme was characterized structurally and functionally. On the basis of the present knowledge, further studies can be conducted towards the development of new drugs against S. pneumoniae serotype-2.
Acknowledgment
The authors would like to thank Dr Ahmad Faisal Karim, Case Western Reserve University (CWRU), Cleveland, Ohio, USA for his inspiring discussion. They are also grateful to the Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Bangladesh for supporting this research.
References
1 Brooks W A, Breiman F R, Goswami D, Hossain A, Alam K et al, Invasive pneumococcal disease burden and implications for vaccine policy in urban Bangladesh, Am J Trop Med Hyg, 77 (2007) 795-801.
2 Black R E, Cousens S, Johnson H L, Lawn J E, Rudan I et al, Global, regional, and national causes of child mortality in 2008: A systematic analysis, Lancet, 375 (2010) 1969-1987.
3 Asghar R, Banajeh S, Egas J, Hibberd P, Iqbal I et al, Chloramphenicol versus ampicillin plus gentamicin for community acquired very severe pneumonia among children aged 2-59 months in low resource settings: Multicentre randomised controlled trial (SPEAR study), Br Med J, 336 (2008) 80-84.
4 Alam M R, Saha S K, Nasreen T, Latif F, Rahman S R et al, Detection, antimicrobial susceptibility and serotyping of Streptococcus pneumoniae from cerebrospinal fluid specimens from suspected meningitis patients, Bangladesh J Microbiol, 24 (2007) 24-29.
5 Moschioni M, Pansegrau W & Barocchi M A, Adhesion determinants of the Streptococcus species, Microb Biotechnol, 3 (2010) 370-388.
6 Aspa J, Rajas O, Castro F R & Blanquer J, Drug-resistant pneumococcal pneumonia: Clinical relevance and related factors, Clin Infect Dis, 38 (2004) 787-798.
7 Morya V K, Dewaker V, Mecarty S D & Singh R, In silico analysis of metabolic pathways for identification of putative drug targets for Staphylococcus aureus, J Comput Sci Syst Biol, 3 (2010) 062-069.
8 Mandage R H & Wadnerkar A S, Subtractive genomics approach to identify potential therapeutic targets in Leishmania donovani, Int J Pharm Bio Sci, 1 (2010) 1-6.
9 Kanehisa M, Goto S, Kawashima S & Nakaya A, The KEGG database at GenomeNet, Nucleic Acids Res, 30 (2002) 42-46.
10 Altschul S F, Madden T L, Schäffer A A, Zhang J, Zhang Z et al, Gapped BLAST and PSI-BLAST: A new generation of protein database search programs, Nucleic Acids Res, 25 (1997) 3389-3402.
11 Kiefer F, Arnold K, Künzli M, Bordoli L & Schwede T, The SWISS-MODEL repository and associated resources, Nucleic Acids Res, 37 (2009) D387-D392.
12 Arnold K, Kiefer F, Kopp J, Battey J N D, Podvinec M et al, The protein model portal, J Struct Funct Genomics, 10 (2009) 1-8.
13 Maiti R, Domselaar G H, Zhang H & Wishart D S, SuperPose: A simple server for sophisticated structural superposition, Nucleic Acids Res, 32 (2004) W590-W594.
14 Sahay A & Shakya M, In silico analysis and homology modelling of antioxidant proteins of spinach, J Proteomics Bioinf, 3 (2010) 148-154.
15 Glaser F, Pupko T, Paz I, Bell R E, Shental D B et al, ConSurf: Identification of functional regions in proteins by surface-mapping of phylogenetic information, Bioinformatics, 19 (2003) 163-164.
16 Dundas J, Ouyang Z, Tseng J, Binkowski A, Turpaz Y et al, CASTp: Computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues, Nucleic Acids Res, 34 (2006) W116-W118.
17 Colovos C & Yeates T, Verification of protein structures: Patterns of nonbonded atomic interactions, Protein Sci, 2 (1993) 1511-1519.
18 Laskowski R A, MacArthur M W, Moss D S & Thornton J M, PROCHECK: A program to check the stereochemical quality of protein structures, J Appl Cryst, 26 (1993) 283-291.
19 Ramachandran G N, Ramakrishnan C & Sasisekharan V, Stereochemistry of polypeptide chain configurations, J Mol Biol, 7 (1963) 95-99.
20 Mulakayala C, Banaganapalli B N, Anuradha C M & Chitta S K, Insights from Streptococcus pneumoniae glucose kinase structural model, Bioinformation, 3 (2009) 308-310.
21 Chen J & Shen B, Computational analysis of amino acid mutation: A proteome wide perspective, Curr Proteomics, 6 (2009) 228-234.
22 Binkowski T A, Naghibzadeh S & Liang J, CASTp: Computed atlas of surface topography of proteins, Nucleic Acids Res, 31 (2003) 3352-3355.
23 Liang J, Edelsbrunner H & Woodward C, Anatomy of protein pockets and cavities: Measurement of binding site geometry and implications for ligand design, Protein Sci, 7 (1998) 1884-1897.
24 Singh S, Kumar A, Patel A, Tripathi A, Kumar D et al, In silico 3D structure prediction and comparison of nucleocapsid protein of H1N1, J Modell Simul Syst, 1 (2010) 108-111.