Microbial Pathogenesis 121 (2018) 238–244
Contents lists available at ScienceDirect
Microbial Pathogenesis journal homepage: www.elsevier.com/locate/micpath
In silico identification of molecular mimics involved in the pathogenesis of Clostridium botulinum ATCC 3502 strain
Tulika Bhardwaj a, Shafiul Haque b, Pallavi Somvanshi a,∗
a Department of Biotechnology, 10, Institutional Area, Vasant Kunj, TERI School of Advanced Studies, New Delhi 110070, India
b Research and Scientific Studies Unit, College of Nursing & Allied Health Sciences, Jazan University, Jazan 45142, Saudi Arabia
ARTICLE INFO
Keywords: C. botulinum ATCC 3502; Food poisoning; Molecular mimicry; All-against-all BLAST; Bit-score; Enrichment analysis; Virulence; Homology analysis

ABSTRACT
Bacterial pathogens invade the host and disrupt its defense system by means of proteins that are structurally similar to host proteins at both the global and the local level. This sharing of homologous sequences between the host and the pathogenic bacterium mediates infection and defines the concept of molecular mimicry. In this study, various computational approaches were employed to elucidate the pathogenicity of Clostridium botulinum ATCC 3502 at the genome-wide level. The genome-wide analysis revealed that the pathogen mimics the host (Homo sapiens) and helped unravel the complex pathway by which it causes infection. Comparative ‘omics’ approaches enabled selective screening of ‘molecular mimicry’ candidates, followed by qualitative assessment of their virulence potential and functional enrichment. Overall, this study provides insight into the emergence and surveillance of infections caused by multidrug-resistant C. botulinum ATCC 3502. This is the first report to identify enriched similarities between the C. botulinum ATCC 3502 proteome and human host proteins; it identified 20 potential mimicry candidates, which were further characterized qualitatively by sub-cellular localization prediction and functional annotation. The study opens avenues for future work on infectious agents, host-pathogen interactions and the evolution of pathogenesis.
∗ Corresponding author.
E-mail address: [email protected] (P. Somvanshi).
https://doi.org/10.1016/j.micpath.2018.05.017
Received 28 March 2018; Received in revised form 10 May 2018; Accepted 11 May 2018; Available online 12 May 2018
0882-4010/© 2018 Published by Elsevier Ltd.

1. Introduction
The development of multi-drug resistance among pathogenic microorganisms results in sudden hospital outbreaks [49], repeated onset of viral attacks [59], and the prevalence of severe infectious agents [29]. The rising number of food poisoning cases across the world has prompted efforts to elucidate the underlying complexity of microbial pathogenesis. Food poisoning is suspected when an acute illness caused by pathogenic bacteria, viz. Clostridium [5,14,31], Salmonella [52] or Staphylococcus [28], disturbs the gastrointestinal or neurological system. Clostridium botulinum ATCC 3502, an anaerobic, Gram-positive, spore-forming bacterium and the causal agent of food-borne botulism, is a formidable pathogen. C. botulinum produces heat-resistant spores that exist widely in the environment; in the absence of oxygen they germinate, grow and then excrete toxins [43]. The inherent ability of bacteria to overcome the host's immune defense mechanism by means of virulence factors, i.e., adhesins [34], colonization factors [53], effectors [47], invasins [36], toxins [35], capsular polysaccharides [39] and siderophores [26], is well characterized and validated. In addition, bacterial pathogens adopt a common strategy of subverting the host immune system by means of molecular mimicry [11]. Mimicry is defined as an ecological phenomenon representing the competitive ability of organisms for food and resources [17]. Pathogen mimicry refers to the sequence and structural similarity between pathogen-encoded proteins and host proteins that aids in the disruption of host functions and manifests as disease [13]. The role of mimicry candidates in helping pathogens invade the human host and survive its immune response during infection is evident in Coxiella burnetii, which encodes two eukaryote-like sterol reductases [15,32], and in Legionella sp. [33], which encodes proteins mimicking human guanine nucleotide exchange factors (GEFs). In the case of viruses, ‘molecular mimicry’ is defined by the sharing of antigens between host and virus, acquired when the virus takes up genetic material from the host during virion formation [23]; this allows the virus to evade the human immune system and cause autoimmune diseases. Two evolutionary mechanisms are considered responsible for the origin of molecular mimicry candidates: (a) lateral gene transfer and (b) convergent or parallel evolution [25,45,48]. Independent repeats in pathogen proteins, viz. collagen and leucine-rich repeats, resemble host proteins and thus assist in bacterial adherence. Convergent evolution [60] mediates local similarity between the pathogen and the host proteins (sharing of motifs and domains) rather than detectable
homology shared between the host and the pathogen proteins through lateral gene transfer [41,48]. The structural similarity between clostridial toxin and mammalian collagen points to the possible emergence and survival of Clostridium sp. in the host to cause infection [12]. The present study lays the foundation for identifying potential mimicry candidates in order to elucidate the pathogenicity of the C. botulinum ATCC 3502 strain. To counter the rapid evolution of drug resistance, several labor-intensive, iterative gene screening procedures have been formalized, yet they have resulted mainly in relentless growth of bacterial genomic data. As a solution, a computational pipeline was used to handle the large datasets and accelerate the selective discovery of potential mimicry candidates shared between the host and the C. botulinum ATCC 3502 proteome. In this study, a comparative analysis was performed between two orthologous gene sets (host vs. pathogen and host vs. non-pathogens/controls) using ‘all-against-all’ BLAST [11]. Computation of a bit score then enabled selective screening of potential mimicry candidates between the host and the pathogenic strain. Qualitative assessment of the identified mimics revealed their functional roles, virulence potential and subcellular localization. Overall, this genome-wide analysis opens new avenues for understanding the pathogenesis of Clostridia sp. and for possible therapeutic interventions.
Table 1. List of control organisms (non-pathogenic) for the screening of potential mimic candidates among human and query organism (C. botulinum ATCC 3502).
2. Materials and methodology
2.1. Primary dataset collection
The complete proteome sequences of the query pathogen (C. botulinum ATCC 3502) and the human host (Homo sapiens) were retrieved from UniProtKB (www.uniprot.org/help/uniprotkb). An in-house Perl script was developed to mine non-pathogenic organisms from the NCBI Genome Database (https://www.ncbi.nlm.nih.gov/genome), and a temporary relational database of these control organisms (non-pathogenic strains) was generated using MySQL (https://www.phpmyadmin.net) for the ‘all-against-all’ BLAST sequence similarity search (Table 1).
1. Acidithiobacillus ferrooxidans ATCC 23270
2. Aromatoleum aromaticum EbN1
3. Acinetobacter ADP1
4. Bacillus halodurans C 125
5. Bacteroides thetaiotaomicron VPI 5482
6. Bifidobacterium longum NCC2705
7. Bradyrhizobium japonicum USDA 110
8. Buchnera aphidicola str. Sg (Schizaphis graminum)
9. Candidatus Blochmannia floridanus
10. Caulobacter crescentus CB15
11. Clostridium acetobutylicum ATCC 824
12. Corynebacterium efficiens YS 314
13. Deinococcus radiodurans R1
14. Desulfovibrio vulgaris Hildenborough
15. Escherichia coli K 12 substr MG1655
16. Gloeobacter violaceus PCC 7421
17. Geobacter sulfurreducens PCA
18. Hyphomonas neptunium ATCC 15444
19. Lactobacillus plantarum WCFS1
20. Magnetococcus MC 1
21. Mesorhizobium loti MAFF303099
22. Methylococcus capsulatus Bath
23. Myxococcus xanthus DK 1622
24. Nitrosomonas europaea ATCC 19718
25. Pseudomonas fluorescens SBW25
26. Pseudomonas putida KT2440
27. Ralstonia solanacearum GMI1000
28. Rhodopirellula baltica SH 1
29. Ruegeria pomeroyi DSS 3
30. Staphylococcus epidermidis ATCC 12228
31. Streptococcus thermophilus CNRZ1066
32. Streptomyces avermitilis MA 4680
33. Synechocystis PCC 6803
34. Thermoanaerobacter tengcongensis MB4
35. Thermus thermophilus HB27
36. Thiobacillus denitrificans ATCC 25259
37. Wolinella succinogenes DSM 1740
38. Xanthomonas axonopodis citri 306
39. Yersinia pestis biovar Microtus 91001
40. Zymomonas mobilis ZM4
2.2. Sequence similarity search analysis
The standalone program BLAST v2.2.16 was used to perform an ‘all-against-all’ BLAST analysis of the query pathogen and the control organisms against the human proteome at default parameters. The similarity search compared the complete proteome of the query pathogen with the human proteome using a reciprocal BLAST hit strategy. Proteins showing the best bit score in the reciprocal BLAST (greater than 10; default) were prioritized as conserved among all proteomes. Based on compositional bias, a presence/absence matrix was generated at an E-value cutoff of 1e−06.

2.3. Screening of potential mimics
The scoring matrix assisted in identifying potential mimics shared between the query pathogen and the human host. A bit score of 10 defined the presence of a specific potential mimic, based on three criteria: absence or rarity in non-pathogens, more than 70% similarity between the pathogen and the host proteins, and expression in pathogens. The bit score was computed as the fraction of pathogen species with a hit detected by BLAST (P_i) divided by the fraction of non-pathogens containing a hit (NP_i), and was used to screen for potential mimicry candidates between the C. botulinum ATCC 3502 strain and the human host [11].

\[ \text{Bit Score} = \frac{P_i}{NP_i} = \frac{\text{pathogen species with a hit detected by BLAST}}{\text{fraction of non-pathogenic hits}} \]

Conditions:
If score difference > 10, then pathogenic mimic hit.
If the score lies between 2 and 10, then non-pathogen hit.

2.4. Qualitative assessment of potential mimics
(a) Homology analysis: The identified pathogenic mimics were searched against the Virulence Factor Database (VFDB) by BLAST at default parameters with E < 1e-06.
(b) Enrichment analysis: DAVID, a tool for functional gene enrichment, was used for functional annotation of the predicted mimics on the basis of the GOTERM_BP_ALL, GOTERM_CC_ALL, GOTERM_MF_ALL, PANTHER_BP_ALL, PANTHER_MF_ALL, BIOCARTA, KEGG_PATHWAY and PANTHER_PATHWAY ontologies at EASE = 0.1.
(c) Prediction of sub-cellular organization: The sub-cellular localization of the potential mimics was predicted using PSORTb, a tool based on machine learning algorithms (Support Vector Machine: SVM, STMHMM, and SCL-BLAST).

3. Results
3.1. Primary dataset collection
The circular genome of C. botulinum ATCC 3502 is 3.9 Mb in size with a GC content of 28.19%. The GC content of a bacterium is the proportion of guanine and cytosine bases in its DNA and reflects the taxonomic classification, evolutionary pathogenomics and ecological niche survival of prokaryotes [46]. The total proteome
Table 2. List of potential mimics identified between the pathogen (C. botulinum ATCC 3502) and the host (Homo sapiens).
No. | Host protein | Pathogen protein | NCBI ID (host) | NCBI ID (pathogen) | Alignment E-value | Locus tag (pathogen)
1 | Enabled homolog | ATP-dependent DNA helicase | 578802282 | 148378683 | 3.56E-17 | CBO0683
2 | ATPase, H+ transporting, lysosomal V0 subunit a1 | V-type ATP synthase subunit I | 530412425 | 14838056 | 5.52E-07 | CBO2629
3 | Coiled-coil domain containing 88C | Hypothetical protein | 148762940 | 148381075 | 2.31557E-06 | CBO3124
4 | HECT, UBA and WWE domain containing 1, E3 ubiquitin protein ligase | Hypothetical protein | 530426356 | 148380147 | 1.00902E-05 | CBO2193
5 | G protein-coupled receptor 125 | Hypothetical protein | 59823631 | 148379742 | 4.5557E-06 | CBO1782
6 | UPF2 regulator of nonsense transcripts homolog | Peptidoglycan hydrolase | 11693132 | 148378507 | 1.76638E-05 | CBO0504
7 | Myosin XVIIIA | Methyl accepting chemotaxis protein | 28416946 | 148381161 | 3.40374E-05 | CBO3213
8 | Laminin, alpha 4 | Methyl accepting chemotaxis protein | 380503843 | 148380765 | 3.80129E-05 | CBO2805
9 | DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 | ATP-dependent helicase | 578822806 | 148378514 | 1.77588E-05 | CBO0511
10 | Hook microtubule-tethering protein 3 | Arginine deaminase | 14165274 | 148379553 | 2.77983E-05 | CBO1587
11 | Microphthalmia-associated transcription factor | Phage protein | 38156703 | 148379696 | 9.06996E-05 | CBO1737
12 | Decapping enzyme, scavenger | Molecular chaperone GroEL | 7661734 | 148381241 | 5.45814E-05 | CBO3298
13 | WW domain binding protein 11 | Alpha-ribazole-5′-phosphate phosphatase | 7706501 | 148378811 | 8.94391E-05 | CBO0819
14 | Leucine-rich repeats and IQ motif containing 3 | Capsular polysaccharide biosynthesis protein | 530361767 | 148381055 | 6.47853E-05 | CBO3104
15 | DNA-directed RNA polymerase III subunit RPC10 | DNA Topoisomerase I | 302058278 | 148380395 | 2.01234E-05 | CBO2437
16 | Chromosome 9 open reading frame 69 | Hypothetical protein | 374671824 | 148379080 | 4.39979E-05 | CBO1092
17 | Inositol-trisphosphate 3-kinase A | PTS system L-ascorbate family transporter subunit IIB | 530405779 | 148380107 | 9.30944E-05 | CBO2151
18 | Neural retina leucine zipper | Transcriptional regulator | 578825835 | 148378271 | 6.45463E-05 | CBO0269
19 | High mobility group nucleosomal binding domain 3 | Hypothetical protein | 578813069 | 148379214 | 9.80374E-05 | CBO1231
20 | Sodium/hydrogen exchanger 9B1-like | Transporter monovalent cation:proton antiporter-2 (CPA2) family | 578820067 | 148379810 | 9.67207E-05 | CBO1850
4. Discussion
repertoire of the pathogenic organism, containing 3581 proteins, was downloaded from UniProtKB (www.uniprot.org/help/uniprotkb) in .ftp format. A temporary proteome database of fully sequenced non-pathogenic organisms mined from NCBI Genome was developed using MySQL for the subsequent non-homology search analysis.
The advent of next-generation sequencing technology has generated a large amount of data that is freely available in public repositories. Such data assist in resolving the complex pathogenesis of infectious diseases and support drug discovery, biofuel production, pharmacogenomics, nutrigenomics etc. Various computational, systems biology and bioinformatics strategies have been employed in the past to meet the increasing pressure of surveillance of antibiotic-resistant organisms. However, public health concerns have been exacerbated worldwide by the emergence of antibiotic resistance and the decline in the development of novel antibiotics in this post-antibiotic era [8]. Several researchers have targeted the arsenal of virulence factors to disarm pathogenic bacteria, but the resulting selection pressure still poses a challenge to the research community. To overcome such limitations, molecular mimicry candidates of pathogens and hosts were identified to strengthen the measures taken against antimicrobial resistance and pathogenicity. The resemblance between host and pathogen proteins in terms of structure and sequence composition is termed molecular ‘mimicry’. These mimicry candidates subvert the host's defense system by camouflaging the pathogen's surface proteins as host antigens [21]. The possible mechanisms by which mimicry candidates emerge, (a) horizontal gene transfer and (b) convergence, have been reported earlier in parasites, bacteria and viruses [11]. For example, T. cruzi requires trans-sialidase for survival in the tsetse fly, where it transfers sialic acid from the host cells to the surface of the parasite, while trans-sialidase is virulent in mammals [40].
Convergent evolution of mimic macromolecules between the host and the pathogen is illustrated by the classical example of the similarity shared by the cytoadhesive region of mammalian thrombospondin, which binds to hepatocytes, with the 18-amino-acid motif of P. falciparum CSP [38] glycolipid that assists in cell adhesion and the formation of tight junctions. Additionally, Forssman antigen was found to be synthesized by pathogenic helminths [44], and the CRIT gene shares 98% similarity with the human orthologous
3.2. Screening of potential mimics
An orthology search was carried out by ‘all-against-all’ BLAST analysis of the host (Homo sapiens) proteome against the query pathogen (C. botulinum ATCC 3502) and against the non-pathogenic organisms (control), respectively. Potential mimicry candidates were screened selectively on the basis of a bit score generated for each individual alignment; together these scores constitute the complete scoring matrix. A bit score ranging from 0 to 10 enabled similarity hits to be categorized as pathogenic or non-pathogenic. This iterative screening required n² independent BLAST runs and identified 20 potential hits that act as mimicry candidates between C. botulinum ATCC 3502 and the human host (Table 2).
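The screening rule of Section 2.3 can be illustrated with a minimal Python sketch. It assumes that the score is the plain ratio P_i / NP_i and applies the two stated conditions (greater than 10, and between 2 and 10). The function names, the treatment of a zero denominator and the "unclassified" label are illustrative assumptions, not part of the published pipeline.

```python
from math import inf


def presence_ratio(pathogen_hits, n_pathogens, control_hits, n_controls):
    """Ratio of the fraction of pathogens with a BLAST hit (P_i)
    to the fraction of non-pathogenic controls with a hit (NP_i)."""
    p_i = pathogen_hits / n_pathogens
    np_i = control_hits / n_controls
    if np_i == 0:
        # Assumption: a hit that is absent from every control is treated
        # as maximally pathogen-specific (the source does not say how
        # a zero denominator is handled).
        return inf if p_i > 0 else 0.0
    return p_i / np_i


def classify(score):
    """Apply the two conditions stated in Section 2.3."""
    if score > 10:
        return "pathogenic mimic hit"
    if 2 <= score <= 10:
        return "non-pathogen hit"
    # Assumption: scores below 2 are not covered by the stated conditions.
    return "unclassified"


if __name__ == "__main__":
    # Hypothetical counts: one query pathogen and the 40 controls of Table 1.
    for control_hits in (0, 2, 10, 40):
        score = presence_ratio(1, 1, control_hits, 40)
        print(control_hits, score, classify(score))
```

With a single query pathogen and 40 controls, a protein exceeds the threshold of 10 under this reading only when fewer than four control proteomes contain a hit, which matches the "absence or rarity in non-pathogens" criterion.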
3.3. Qualitative assessment of the potential mimics
Homology analysis against the Virulence Factor Database (VFDB) was performed using BLAST at E < 1e-06 to assess the virulence potential of the mimics and to screen them as candidate virulence targets. The numbers of mimics associated with the cytoplasm, cytoplasmic membrane, extracellular matrix and cell wall were 11, 7, 1 and 1, respectively (Table 3 and Fig. 2). The biological role of the mimics was evaluated statistically by computing enrichment p-values with the multiple-hypothesis correction of Benjamini and Hochberg. At a critical value of 0.5 for the null hypothesis, significant biological roles [16] were assigned to the potential mimics along with their respective superfamilies (Table 4).
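For readers unfamiliar with the Benjamini and Hochberg correction mentioned above, the following is a minimal Python sketch of the standard step-up adjustment. The function name and the example p-values are hypothetical and are not taken from Table 4; DAVID performs its own correction internally, so this only illustrates the arithmetic.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"raw={p:.3f} adjusted={q:.3f}")
```

A term is then reported as enriched when its adjusted value falls below the chosen critical value.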
Table 3. Predicted sub-cellular localization of 20 identified potential mimics.
No. | Host protein | Pathogen protein | Sub-cellular localization | Localization score
1 | Enabled homolog | ATP-dependent DNA helicase | Cytoplasmic | 9.97
2 | ATPase, H+ transporting, lysosomal V0 subunit a1 | V-type ATP synthase subunit I | Cytoplasmic membrane | 9.99
3 | Coiled-coil domain containing 88C | Hypothetical protein | Cytoplasmic | 9.67
4 | HECT, UBA and WWE domain containing 1, E3 ubiquitin protein ligase | Hypothetical protein | Cytoplasmic membrane | 9.99
5 | G protein-coupled receptor 125 | Hypothetical protein | Cytoplasmic | 9.91
6 | UPF2 regulator of nonsense transcripts homolog | Peptidoglycan hydrolase | Extracellular | 9.98
7 | Myosin XVIIIA | Methyl accepting chemotaxis protein | Cytoplasmic membrane | 9.61
8 | Laminin, alpha 4 | Methyl accepting chemotaxis protein | Cytoplasmic membrane | 9.51
9 | DEAD/H (Asp-Glu-Ala-Asp/His) box helicase 11 | ATP-dependent helicase | Cytoplasmic | 7.50
10 | Hook microtubule-tethering protein 3 | Arginine deaminase | Cytoplasmic | 7.50
11 | Microphthalmia-associated transcription factor | Phage protein | Cytoplasmic membrane | 9.99
12 | Decapping enzyme, scavenger | Molecular chaperone GroEL | Cytoplasmic | 9.22
13 | WW domain binding protein 11 | Alpha-ribazole-5′-phosphate phosphatase | Cytoplasmic | 9.67
14 | Leucine-rich repeats and IQ motif containing 3 | Capsular polysaccharide biosynthesis protein | Cell wall | 8.97
15 | DNA-directed RNA polymerase III subunit RPC10 | DNA Topoisomerase I | Cytoplasmic | 9.67
16 | Chromosome 9 open reading frame 69 | Hypothetical protein | Cytoplasmic membrane | 9.98
17 | Inositol-trisphosphate 3-kinase A | PTS system L-ascorbate family transporter subunit IIB | Cytoplasmic | 7.89
18 | Neural retina leucine zipper | Transcriptional regulator | Cytoplasmic | 7.50
19 | High mobility group nucleosomal binding domain 3 | Hypothetical protein | Cytoplasmic | 6.89
20 | Sodium/hydrogen exchanger 9B1-like | Transporter monovalent cation:proton antiporter-2 (CPA2) family | Cytoplasmic membrane | 9.99
identified 20 potential hits that act as mimicry candidates between C. botulinum ATCC 3502 and the human host (Table 2). Further qualitative assessment of the identified mimics, including virulence homology analysis, subcellular localization prediction and enrichment analysis, provides new avenues for unraveling the complexities of pathogenesis. Functional annotation was performed using DAVID, and machine learning platforms were used for subcellular localization prediction. Subcellular localization identifies the placement of the target macromolecule and so enables a better understanding of the host-pathogen interaction (Table 3). A coordinated response of the pathogen with the host is necessary for the survival and multiplication of the bacteria. When pathogen proteins localize in cellular compartments containing genetic material (nucleus and mitochondria), they affect the cell's complete regulatory machinery. Bacterial proteins interfere with the host's cellular mechanisms by adhering to the cell surface by means of receptors or via transport mechanisms. The evolution of proteins occurs in humans and pathogens
[24]. The molecular mimicry candidates identified in the above-mentioned reports help in unraveling host-pathogen interactions and thus provide new insights into complex infectious disease pathways. In this study, a computational pipeline was used for the identification and characterization of mimicry candidates between C. botulinum ATCC 3502 and the human host (Fig. 1). C. botulinum is a causative agent of food poisoning at the global level. Among the seven serotypes A-G, group A is considered the most harmful to human health. To avoid false negatives arising from similarity shared with non-pathogenic bacteria, and the related side-effects, a non-similarity search was performed against the database of non-pathogenic organisms. This database acted as a control for the selective screening of potential mimicry candidates between the query organism and the host. A complete list of the control organisms is given in Table 1. Bit score computation then enabled screening of the potential mimicry candidates of the pathogen and the host proteome content and
Fig. 1. Representation of the methodology adopted for the screening of the potential mimics.
Fig. 2. List of sub-cellular localization of 20 identified potential mimics.
signaling by regulating the levels of a large number of inositol polyphosphates, which in turn are maintained by both calcium/calmodulin and protein phosphorylation mechanisms [22]. Additionally, mammalian hook microtubule-tethering protein 3, which mediates organelle binding, mimics the pathogen's arginine deaminase (#10), which belongs to the amidinotransferase superfamily and participates in the bacterial arginine deaminase pathway. Arginine deaminase is a membrane-bound protein that provides energy by converting L-arginine into L-citrulline and ammonia. The released ammonia provides bacterial tolerance against acidic environments [57]. In sequence composition and structure, this protein mimics the cytosolic coiled-coil proteins (Hook proteins), which contain conserved N-terminal domains that attach to microtubules and mediate binding to other organelles. Hook proteins are major components of the FTS/Hook/FHIP complex, which assists in the positioning or formation of the aggresome (pericentriolar accumulations of misfolded proteins, proteasomes and chaperones) [50]. The pathogen protein alpha-ribazole-5′-phosphate phosphatase (#13) mimics WW domain binding protein 11 of the human host. The active participation of alpha-ribazole-5′-phosphate phosphatase in the biosynthesis of adenosylcobalamin (Coenzyme B12) by hydrolyzing
independently, but they share some similarity due to the presence of such repetitive structures. Repeats such as leucine-rich repeats and coiled-coil structures serve as potential targets for pathogen mimicry of the host and subsequent pathogenesis. Characterizing the biological role of the potential structures helped in evaluating host-species specificity with respect to proteins essential for pathogen survival. Most of the identified mimics were associated with major metabolic cycles indispensable for the survival of the pathogen. Proteins with similarity based on sequence composition and structure were considered mimics and are denoted by serial numbers (#) in the results (Tables 2–4). For example, the predicted mimics #17 (PTS system L-ascorbate family transporter subunit IIB) and #19 (hypothetical protein) participate in active carbohydrate metabolism and transport. EII comprises an L-ascorbate-specific permease with two cytoplasmic subunits (IIA and IIB) and one transmembrane subunit (IIC). These subunits are encoded by the genes of the E. coli sgaTBA operon. The PTS system serves as a complex kinase system regulating the expression of genes participating in metabolic, transport and mutagenic processes [1]. This protein mimics mammalian inositol-trisphosphate 3-kinase, which helps in cellular
Table 4. List of functionally enriched 20 identified potential mimics.
No. | Potential mimic | Functional enrichment | Superfamily | Enrichment p-value
1 | ATP-dependent DNA helicase | DNA metabolism, DNA replication, recombination, and repair | UvrD superfamily | 4.8e-4
2 | V-type ATP synthase subunit I | Energy production and conversion | V-type ATPase_I | 1.21e-5
3 | Hypothetical protein | Cell cycle control, cell division, chromosome partitioning | SMC_N superfamily | 8.3e-6
4 | Hypothetical protein | Unknown | SLC5-6-like_sbd superfamily | 8.3e-6
5 | Hypothetical protein | Transcription | LRR_8 superfamily | 1.43e-5
6 | Peptidoglycan hydrolase | Unknown but found in several lipoproteins | YgiM superfamily | 2.47e-5
7 | Methyl accepting chemotaxis protein | Chemotaxis | MCP signal superfamily | 1.81e-5
8 | Methyl accepting chemotaxis protein | Chemotaxis | MCP signal superfamily | 2.7e-5
9 | ATP-dependent helicase | Replication, recombination and repair | PolC | 2.51e-5
10 | Arginine deaminase | Amino acid transport and metabolism | Amidinotransferase superfamily | 8.17e-5
11 | Phage protein | N/A | N/A |
12 | Molecular chaperone GroEL | Protein folding and adhesions | Chaperonin like_superfamily | 7.21e-7
13 | Alpha-ribazole-5′-phosphate phosphatase | Biosynthesis of cofactors, prosthetic groups, and carriers, Heme, porphyrin, and cobalamin | Histidine Phosphatase superfamily | 7.23e-4
14 | Capsular polysaccharide biosynthesis protein | Transcription | LRR_8 superfamily | 1.19e-2
15 | DNA Topoisomerase I | Replication, recombination and repair | TOP1Ac superfamily | 4.29e-2
16 | Hypothetical protein | Membrane anchored protein, ABC transporter permease | YitT_membrane superfamily | 1.57e-8
17 | PTS system L-ascorbate family transporter subunit IIB | Carbohydrate transport and metabolism | PTS_IIB superfamily | 2.65e-3
18 | Transcriptional regulator | Transcription | HTH_6 superfamily | 4.29e-8
19 | Hypothetical protein | Carbohydrate transport and metabolism | EpsL superfamily | 1.2e-5
20 | Transporter monovalent cation:proton antiporter-2 (CPA2) family | Transporter | CPA2 superfamily | 3.8e-4
Interestingly, the bacterial phage protein (#11) that mimics mammalian microphthalmia-associated transcription factor shares structural similarity with it but performs no biological role. Two bacterial proteins (a hypothetical protein and peptidoglycan hydrolase) mimic HECT, UBA and WWE domain containing 1, E3 ubiquitin protein ligase (#4) and UPF2 regulator of nonsense transcripts homolog (#6), respectively. Peptidoglycan hydrolase, which cleaves covalent bonds in the peptidoglycan layer and thus regulates cell wall growth [54], shares sequence similarity with UPF2 regulator of nonsense transcripts homolog, a protein involved in both mRNA nuclear export and mRNA surveillance. mRNA surveillance detects exported mRNAs and regulates nonsense-mediated mRNA decay (NMD) in post-splicing events. NMD cleaves mRNAs containing premature stop codons prior to translation [7]. A bacterial hypothetical protein (#3), which is involved in chromatin and DNA dynamics [2] and contains ATP-binding domains at the N- and C-termini and two extended coiled-coil domains separated by a hinge in the middle, mimics coiled-coil domain containing 88C, which negatively regulates the Wnt signaling pathway and interacts with the dishevelled protein [3]. The C. botulinum ATCC 3502 molecular chaperone GroEL (#12), which plays a major role in protein refolding and unfolding, mimics the human scavenger decapping enzyme, which participates in pre-mRNA splicing. As a member of the chaperone family required for proper protein folding in bacteria, GroEL is formed of seven-membered rings containing three domains, (a) equatorial, (b) apical and (c) intermediate, with separate binding sites for substrates. Similarly, as a member of the histidine triad family, the decapping enzyme contains two conserved domains with independent sites capable of binding and hydrolyzing the cognate substrate prior to mRNA degradation [30].
Two mimics, #20 and #1, were identified between the human host and the pathogenic bacterium on the basis of structural similarity; they assist in transport and energy generation processes, respectively. Overall, this analysis will serve as a starting point for researchers investigating novel ways to combat recurring infections caused by antimicrobial-resistant C. botulinum ATCC 3502, by elucidating the complexity of host-pathogen interactions and the possible pathways adopted by the bacterium to subvert the host during pathogenesis.
adenosylcobalamin-5′-phosphate resulted in adenosylcobalamin and phosphate [51]. This protein showed sequence and structural similarity with a spliceosome-associated protein that promotes pre-mRNA splicing by enabling cross-intron bridging of U1 and U2 snRNPs in the mammalian A complex [4]. The complete transcriptional machinery depends on the active participation of proteins involved in replication, transcription and translation. These constitute the minimum gene content for pathogen survival and evolution. Such pathogen proteins show sequence and structural similarity with essential human proteins. This essentiality at the individual level underlies the complex pathogenesis. ATP-dependent helicases (#1 and #9) and DNA Topoisomerase I (#15) in the pathogen mimic enabled homolog, DEAD/H (Asp-Glu-Ala-Asp/His) box helicase and DNA-directed RNA polymerase III subunit RPC10, respectively. DNA topoisomerases relax DNA by cutting one of its strands and then reannealing it, while the mimic, DNA-directed RNA polymerase III subunit RPC10, catalyzes the transcription of DNA into RNA using the four ribonucleoside triphosphates as substrates and is involved in pol III transcription re-initiation and RNA cleavage during transcription termination [27]. DEAD box proteins, which carry the conserved motif (Asp-Glu-Ala-Asp/His) and are involved in processes that alter RNA secondary structure, such as translation initiation, nuclear and mitochondrial splicing, and ribosome and spliceosome assembly, mimic ATP-dependent helicases. These helicases are required for genome maintenance: they unwind the DNA helix or self-annealed RNA molecules during major cellular processes such as replication, transcription and translation, and are therefore considered essential for bacterial survival and pathogenic potential [18].
Several mimicry candidates identified between the host and the pathogen play a major role in the transcriptional machinery. Capsular polysaccharide biosynthesis protein (#14) shares functional similarity with leucine-rich repeats and IQ motif containing 3. Leucine-rich repeat proteins are the second most abundant proteins in the mammalian extracellular matrix and assist in pathogen detection by supporting adherence to and invasion of host cells. Similarly, capsular polysaccharide biosynthesis protein plays critical roles in bacterial–host interactions by directing coordinated polysaccharide polymerization and export to the cell surface [19,56]. Mammalian G protein-coupled receptor 125, which assists in the generation of functional multipotent adult stem cells [42], mimics a hypothetical protein (#5), and neural retina leucine zipper mimics a transcriptional regulator (#18). Neural retina leucine zipper protein regulates the expression of several rod-specific genes (RHO and PDE6B) [20]. Bacteria move towards attractants and away from repellents by a sensing mechanism termed chemotaxis [58]. Bacterial surface proteins respond to environmental stimuli by changing their conformation, which enables their direct interaction with plant and mammalian hosts [9]. Therefore, proteins involved in chemotaxis are considered important for the survival, motility and virulence of bacteria. Two methyl-accepting chemotaxis bacterial proteins, which belong to the methyl-accepting chemotaxis protein (MCP) signal superfamily, mimic myosin XVIIIA (#7) and laminin alpha 4 (#8). Myosin XVIIIA is a mammalian protein that acts as a molecular motor, interacting with filamentous F-actin by using the chemical energy generated by ATP hydrolysis [6], and assists in vesicle budding from the Golgi apparatus. This allows Golgi membrane trafficking and subsequent fusion with the Golgi apparatus [10]. Mammalian myosin, a ubiquitous protein that assists in motor function in diverse movements such as cytokinesis, muscle contraction etc.
[55] mimics methyl-accepting chemotaxis bacterial protein, which similarly mediates bacterial chemotaxis towards diverse environments and chemical attractants. In addition, (#8) laminin, alpha4 that assists in varied biological processes including cell adhesion, migration, signaling, and metastasis mimics structurally and functionally with another methyl accepting chemotaxis bacterial protein causing mobility [37].
5. Conclusion
This study employs a systematic bioinformatics approach for the identification of potential molecular mimicry candidates between the C. botulinum ATCC 3502 strain and the human host. Twenty potential mimicry candidates were screened by computing bit scores based on a generated scoring matrix. Qualitative assessment then prioritized them as novel targets for understanding pathogenesis. The genome-wide analysis opens new avenues for understanding clostridial pathogenesis and for therapeutic interventions. This approach overcomes the limitations of labor-intensive bench-top methodologies used to screen for proteins that share structural similarity and sequence composition between the query and the host organism. This study paves the way for future studies related to infectious agents, host–pathogen interactions and the evolution of pathogenesis.
Appendix A. Supplementary data
Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.micpath.2018.05.017.
References
[1] E. Ab, G.K. Schuurman-Wolters, D. Nijlant, K. Dijkstra, M.H. Saier, G.T. Robillar, R.M. Scheek, NMR structure of cysteinyl-phosphorylated enzyme IIB of the N,N'-diacetylchitobiose-specific phosphoenolpyruvate-dependent phosphotransferase system of Escherichia coli, J. Mol. Biol. 308 (5) (2001) 993–1009.
[2] Y. Akai, R. Kanai, N. Nakazawa, M. Ebe, C. Toyoshima, M. Yanagida, ATPase-dependent auto-phosphorylation of the open condensin hinge diminishes DNA binding, Open Biol. 4 (12) (2014) 140193.
[3] N. Aznar, K.K. Midde, Y. Dunkel, I. Lopez-Sanchez, Y. Pavlova, A. Marivin, et al., Daple is a novel non-receptor GEF required for trimeric G protein activation in Wnt signaling, eLife 4 (2015) e07091.
[4] M.T. Bedford, R. Reed, P. Leder, WW domain-mediated interactions reveal a spliceosome-associated protein that binds a third class of proline-rich motif: the proline glycine and methionine-rich motif, Proc. Natl. Acad. Sci. U.S.A. 95 (18) (1998) 10602–10607.
[5] T. Bhardwaj, P. Somvanshi, Pan-genome analysis of Clostridium botulinum reveals unique targets for drug development, Gene 623 (2017) 48–62.
[6] M.J. Bloemink, M.A. Geeves, Shaking the myosin family tree. Biochemical kinetics defines four types of myosin motor, Semin. Cell Dev. Biol. 22 (2011) 961–967.
[7] M. Clerici, A. Deniaud, V. Boehm, N.H. Gehring, C. Schaffitzel, S. Cusack, Structural and functional analysis of the three MIF4G domains of nonsense-mediated decay factor UPF2, Nucleic Acids Res. 42 (4) (2014) 2673–2686.
[8] J. Conly, B. Johnston, Where are all the new antibiotics? The new antibiotic paradox, Can. J. Infect Dis. Med. Microbiol. 16 (2005) 159–160.
[9] S. de Weert, H. Vermeiren, I. Mulders, I. Kuiper, N. Hendrickx, G. Bloemberg, J. Vanderleyden, R.D. Mot, B.J. Lugtenberg, Flagella-driven chemotaxis towards exudate components is an important trait for tomato root colonization by Pseudomonas fluorescens, Mol. Plant Microbe Interact. 15 (2002) 1173–1180.
[10] H.C. Dippold, M.M. Ng, S.E. Farber-Katz, S.K. Lee, M.L. Kerr, M.C. Peterman, R. Sim, P.A. Wiharto, K.A. Galbraith, S. Madhavarapu, G.J. Fuchs, T. Meerloo, M.G. Farquhar, H. Zhou, S.J. Field, GOLPH3 bridges phosphatidylinositol-4-phosphate and actomyosin to stretch and shape the Golgi to promote budding, Cell 139 (2009) 337–351.
[11] A.C. Doxey, B.J. McConkey, Prediction of molecular mimicry candidates in human pathogenic bacteria, Virulence 4 (6) (2013) 453–466.
[12] A.C. Doxey, M.D. Lynch, K.M. Müller, E.M. Meiering, B.J. McConkey, Insights into the evolutionary origins of clostridial neurotoxins from analysis of the Clostridium botulinum strain A neurotoxin gene cluster, BMC Evol. Biol. 8 (2008) 316.
[13] N.C. Elde, H.S. Malik, The evolutionary conundrum of pathogen mimicry, Nat. Rev. Microbiol. 7 (2009) 787–797.
[14] J.C. Freedman, A. Shrestha, B.A. McClane, Clostridium perfringens enterotoxin: action, genetics, and translational applications, Toxins 8 (3) (2016) 73.
[15] S.D. Gilk, P.A. Beare, R.A. Heinzen, Coxiella burnetii expresses a functional Δ24 sterol reductase, J. Bacteriol. 192 (2010) 6154–6159.
[16] K. Glass, M. Girvan, Annotation enrichment analysis: an alternative method for evaluating the functional properties of gene sets, Sci. Rep. 4 (2014) 4191.
[17] U. Gowthaman, V.P. Eswarakumar, Molecular mimicry: good artists copy, great artists steal, Virulence 4 (6) (2013) 433–434.
[18] L.M. Granato, S.C. Picchi, M. Andrade, O. de, M.A. Takita, A.A. de Souza, N. Wang, M.A. Machado, The ATP-dependent RNA helicase HrpB plays an important role in motility and biofilm formation in Xanthomonas citri subsp. citri, BMC Microbiol. 16 (2016) 55.
[19] A. Guidolin, J.K. Morona, R. Morona, D. Hansman, J.C. Paton, Nucleotide sequence analysis of genes essential for capsular polysaccharide biosynthesis in Streptococcus pneumoniae type 19F, Infect. Immun. 62 (12) (1994) 5384–5396.
[20] H. Hao, P. Tummala, E. Guzman, R.S. Mali, J. Gregorski, A. Swaroop, K.P. Mitton, The transcription factor neural retina leucine zipper (NRL) controls photoreceptor-specific expression of myocyte enhancer factor Mef2c from an alternative promoter, J. Biol. Chem. 286 (40) (2011) 34893–34902.
[21] F.O. Hebert, L. Phelps, I. Samonte, M. Panchal, S. Grambauer, I. Barber, M. Kalbe, C.R. Landry, N. Aubin-Horth, Identification of candidate mimicry proteins involved in parasite-driven phenotypic changes, Parasites Vectors 8 (2015) 225.
[22] C. Hoofd, D. Fabienne, L. Deneubourg, S. Deleu, T.M.U. Nguyen, K. Sermon, et al., A specific increase in inositol 1,4,5-trisphosphate 3-kinase B expression upon differentiation of human embryonic stem cells, Cell. Signal. 24 (7) (2012) 1461–1470.
[23] A. Hurford, T. Day, Immune evasion and the evolution of molecular mimicry in parasites, Evolution 67 (10) (2013) 2889–2904.
[24] J.M. Inal, K.M. Hui, S. Miot, S. Lange, M.I. Ramirez, et al., Complement C2 receptor inhibitor trispanning: a novel human complement inhibitory receptor, J. Immunol. 174 (2005) 356–366.
[25] E.V. Koonin, K.S. Makarova, L. Aravind, Horizontal gene transfer in prokaryotes: quantification and classification, Annu. Rev. Microbiol. 55 (2001) 709–742.
[26] I.L. Lamont, P.A. Beare, U. Ochsner, A.I. Vasil, M.L. Vasil, Siderophore-mediated signaling regulates virulence factor production in Pseudomonas aeruginosa, Proc. Natl. Acad. Sci. U.S.A. 99 (10) (2002) 7072–7077.
[27] E. Landrieux, N. Alic, C. Ducrot, J. Acker, M. Riva, C. Carles, A subcomplex of RNA polymerase III subunits involved in transcription termination and reinitiation, EMBO J. 25 (2006) 118–128.
[28] G.C. Lima, M.R. Loiko, L.S. Casarin, E.C. Tondo, Assessing the epidemiological data of Staphylococcus aureus food poisoning occurred in the State of Rio Grande do Sul, Southern Brazil, Braz. J. Microbiol. 44 (3) (2013) 759–763.
[29] J. Liu, H. Ai, Y. Xiong, F. Li, Z. Wen, W. Liu, et al., Prevalence and correlation of infectious agents in hospitalized children with acute respiratory tract infections in Central China, PLoS One 10 (3) (2015) e0119170.
[30] S.-W. Liu, X. Jiao, H. Liu, M. Gu, C.D. Lima, M. Kiledjian, Functional analysis of mRNA scavenger decapping enzymes, RNA 10 (9) (2004) 1412–1422, http://dx.doi.org/10.1261/rna.7660804.
[31] B.M. Lund, M.W. Peck, A possible route for foodborne transmission of Clostridium difficile? Foodborne Pathog. Dis. 12 (3) (2015) 177–182.
[32] A. Omsland, R.A. Heinzen, Life on the outside: the rescue of Coxiella burnetii from its host cell, Annu. Rev. Microbiol. 65 (2011) 111–128.
[33] R.C. Orchard, N.M. Alto, Mimicking GEFs: a common theme for bacterial pathogens, Cell Microbiol. 14 (1) (2012) 10–18.
[34] J. Pizarro-Cerdá, P. Cossart, Bacterial adhesion and entry into host cells, Cell 124 (4) (2006) 715–727.
[35] G. Ramachandran, Gram-positive and gram-negative bacterial toxins in sepsis, 5 (1) (2014).
[36] D. Ribet, P. Cossart, How bacterial pathogens colonize their hosts and invade deeper tissues, Microb. Infect. 17 (3) (2015) 173–183.
[37] A.J. Richards, L. al-Imara, N.P. Carter, J.C. Lloyd, M.A. Leversha, F.M. Pope, Localization of the gene (LAMA4) to chromosome 6q21 and isolation of a partial cDNA encoding a variant laminin A chain, Genomics 22 (1) (1994) 237–239.
[38] K.J. Robson, J.R. Hall, M.W. Jennings, T.J. Harris, K. Marsh, et al., A highly conserved amino-acid sequence in thrombospondin, properdin and in proteins from sporozoites and blood stages of a human malaria parasite, Nature 335 (1988) 79–82.
[39] D. Roy, J.-P. Auger, M. Segura, N. Fittipaldi, D. Takamatsu, M. Okura, M. Gottschalk, Role of the capsular polysaccharide as a virulence factor for Streptococcus suis serotype 14, Can. J. Vet. Res. 79 (2) (2015) 141–146.
[40] S.S. Rubin-de-Celis, H. Uemura, N. Yoshida, S. Schenkman, Expression of trypomastigote trans-sialidase in metacyclic forms of Trypanosoma cruzi increases parasite escape from its parasitophorous vacuole, Cell Microbiol. 8 (2006) 1888–1898.
[41] N.A. Sallee, G.M. Rivera, J.E. Dueber, D. Vasilescu, R.D. Mullins, B.J. Mayer, et al., The pathogen protein EspF(U) hijacks actin polymerization using mimicry and multivalency, Nature 454 (2008) 1005–1008.
[42] M. Seandel, D. James, S.V. Shmelkov, I. Falciator, Generation of functional multipotent adult stem cells from GPR125+ germline progenitors, Nature 449 (7160) (2007) 346–350.
[43] M. Sebaihia, M.W. Peck, N.P. Minton, N.R. Thomson, M.T.G. Holden, W.J. Mitchell, et al., Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes, Genome Res. 17 (7) (2007) 1082–1092.
[44] H.L. Shear, R.S. Nussenzweig, C. Bianco, Immune phagocytosis in murine malaria, J. Exp. Med. 149 (1979) 1288–1298.
[45] S. Sikora, A. Strongin, A. Godzik, Convergent evolution as a mechanism for pathogenic adaptation, Trends Microbiol. 13 (2005) 522–527.
[46] P. Šmarda, P. Bureš, L. Horová, I.J. Leitch, L. Mucina, E. Pacini, L. Tichý, V. Grulich, O. Rotreklová, Ecological and evolutionary significance of genomic GC content diversity in monocots, Proc. Natl. Acad. Sci. U.S.A. 111 (39) (2014) 4096–4102.
[47] E.B. Speth, Y.N. Lee, S.Y. He, Pathogen virulence factors as molecular probes of basic plant cellular functions, Curr. Opin. Plant Biol. 10 (6) (2007) 580–586.
[48] C.E. Stebbins, J.E. Galán, Structural mimicry in bacterial virulence, Nature 412 (2001) 701–705.
[49] E.R.M. Sydnor, T.M. Perl, Hospital epidemiology and infection control in acute-care settings, Clin. Microbiol. Rev. 24 (1) (2011) 141–173.
[50] G. Szebenyi, W.C. Wigley, B. Hall, A.J. Didier, M. Yu, P. Thomas, H. Kraemer, Hook 2 contributes to aggresome formation, BMC Cell Biol. 8 (19) (2007).
[51] S.V. Ghodge, A.A. Fedorov, E.V. Fedorov, B. Hillerich, R. Seidel, S.C. Almo, F.M. Raushel, Structural and mechanistic characterization of L-histidinol phosphate phosphatase from the PHP family of proteins, Biochemistry 52 (6) (2013) 1101–1112.
[52] P.C. Turnbull, Food poisoning with special reference to Salmonella – its epidemiology, pathogenesis and control, Clin. Gastroenterol. 8 (3) (1979) 663–714.
[53] L. van Alphen, H.M. Jansen, J. Dankert, Virulence factors in the colonization and persistence of bacteria in the airways, Am. J. Respir. Crit. Care Med. 151 (6) (1995) 2094–2099, discussion 2099–2100.
[54] W. Vollmer, B. Joris, P. Charlier, S. Foster, Bacterial peptidoglycan (murein) hydrolases, FEMS Microbiol. Rev. 32 (2) (2008) 259–286.
[55] A. Weiss, L.A. Leinwand, The mammalian myosin heavy chain gene family, Annu. Rev. Cell Dev. Biol. 12 (1996) 417–439.
[56] R. Woodward, W. Yi, L. Li, G. Zhao, H. Eguchi, R.S. Perali, et al., In vitro bacterial polysaccharide biosynthesis: defining the functions of Wzy and Wzz, Nat. Chem. Biol. 6 (6) (2010) 418–423.
[57] L. Xiong, J.L.L. Teng, M.G. Botelho, R.C. Lo, S.K.P. Lau, P.C.Y. Woo, Arginine metabolism in bacterial pathogenesis and cancer therapy, Int. J. Mol. Sci. 17 (3) (2016) 363.
[58] J. Yao, C. Allen, Chemotaxis is required for virulence and competitive fitness of the bacterial wilt pathogen Ralstonia solanacearum, J. Bacteriol. 188 (10) (2006) 3697–3708.
[59] W. Yao, L. Hertel, L. Wahl, Dynamics of recurrent viral infection, Proc. Biol. Sci. 273 (1598) (2006) 2193–2199.
[60] D. Yu, Z. Yin, Y. Jin, J. Zhou, H. Ren, M. Hu, et al., Evolution of bopA gene in Burkholderia: a case of convergent evolution as a mechanism for bacterial autophagy evasion, BioMed Res. Int. (2016) 6745028.