JOURNAL OF BACTERIOLOGY, Dec. 2010, p. 6126–6135 0021-9193/10/$12.00 doi:10.1128/JB.01081-10 Copyright © 2010, American Society for Microbiology. All Rights Reserved.

Vol. 192, No. 23

Molecular Evolution of the Helicobacter pylori Vacuolating Toxin Gene vacA†

Kelly A. Gangwer,1,2 Carrie L. Shaffer,1 Sebastian Suerbaum,3 D. Borden Lacy,1,2,4 Timothy L. Cover,1,5,7* and Seth R. Bordenstein6*

Department of Microbiology and Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee1; Center for Structural Biology, Vanderbilt University School of Medicine, Nashville, Tennessee2; Institute of Medical Microbiology and Hospital Epidemiology, Hannover Medical School, Hannover, Germany3; Department of Biochemistry, Vanderbilt University School of Medicine, Nashville, Tennessee4; Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee5; Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee6; and Veterans Affairs Tennessee Valley Healthcare System, Nashville, Tennessee7

Received 10 September 2010/Accepted 15 September 2010

Helicobacter pylori is a genetically diverse organism that is adapted for colonization of the human stomach. All strains contain a gene encoding a secreted, pore-forming toxin known as VacA. Genetic variation at this locus could be under strong selection as H. pylori adapts to the host immune response, colonizes new human hosts, or inhabits different host environments. Here, we analyze the molecular evolution of VacA. Phylogenetic reconstructions indicate the subdivision of VacA sequences into three main groups with distinct geographic distributions. Divergence of the three groups is principally due to positively selected sequence changes in the p55 domain, a central region required for binding of the toxin to host cells. Divergent amino acids map to surface-exposed sites in the p55 crystal structure. Comparative phylogenetic analyses of vacA sequences and housekeeping gene sequences indicate that vacA does not share the same evolutionary history as the core genome. Further, rooting the VacA tree with outgroup sequences from the close relative Helicobacter acinonychis reveals that the ancestry of VacA is different from the African origin that typifies the core genome. Finally, sequence analyses of the virulence determinant CagA reveal three main groups strikingly similar to the three groups of VacA sequences. Taken together, these results indicate that positive selection has shaped the phylogenetic structure of VacA and CagA, and each of these virulence determinants has evolved separately from the core genome.

* Corresponding author. Mailing address for T. L. Cover: Division of Infectious Diseases, A2200 Medical Center North, Vanderbilt University School of Medicine, Nashville, TN 37232. Phone: (615) 322-2035. Fax: (615) 343-6160. E-mail: [email protected]. Mailing address for S. R. Bordenstein: Department of Biological Sciences, Vanderbilt University, Box 351634, Station B, Nashville, TN 37235-1634. Phone: (615) 322-9087. Fax: (615) 343-6707. E-mail: s.bordenstein@vanderbilt.edu.
† Supplemental material for this article may be found at http://jb.asm.org/.
Published ahead of print on 24 September 2010.

FIG. 1. Analysis of VacA phylogeography. (A) The vacA gene encodes a 140-kDa protoxin, which undergoes cleavage to yield a signal sequence, a secreted 88-kDa toxin, a secreted alpha-peptide (SAP), and a C-terminal β-barrel domain. The mature 88-kDa VacA toxin contains two domains, designated p33 and p55. The midregion sequence that defines type m1 and m2 forms of VacA is located within p55. A 21-amino-acid insertion is present in m2 forms but not m1 forms of VacA. (B) Neighbor-joining phylogenetic tree of 100 amino acid sequences of VacA. Three major groups (designated groups 1 to 3) are evident. The chart shows the number of strains analyzed and characteristics of VacA protein sequences in each group of the tree. Group 1 comprises type m1 sequences mainly from non-Asian strains, group 2 comprises m1 sequences from Asian strains, and group 3 comprises m2 sequences from both Asian and non-Asian strains. See Fig. S1 in the supplemental material for a ladder-type version of this tree.

Helicobacter pylori is a Gram-negative bacterium that persistently colonizes the human stomach. H. pylori induces a gastric mucosal inflammatory response known as superficial gastritis and is a risk factor for the development of peptic ulcer disease, gastric adenocarcinoma, and gastric mucosa-associated lymphoid tissue (MALT) lymphoma (2, 43). H. pylori is present in about half of all humans throughout the world. H. pylori strains from unrelated humans exhibit a high level of genetic diversity (5, 44). The population structure of H. pylori is panmictic, and the rate of recombination in H. pylori is reported to be among the highest in the Eubacteria (17, 44). Multilocus sequence analysis of housekeeping genes has revealed the presence of at least nine different H. pylori populations or subpopulations that are localized to distinct geographic regions (12, 27, 31). Analysis of these sequences suggests that H. pylori has spread throughout the world concurrently with the major events of human dispersal, and thus H. pylori is potentially a useful marker for the geographic migrations of human populations (12).

One of the important virulence determinants of H. pylori is a secreted toxin known as VacA. VacA is a pore-forming toxin that causes multiple alterations in human cells, including cell vacuolation, depolarization of membrane potential, alteration of mitochondrial membrane permeability, apoptosis, activation of mitogen-activated protein kinases, inhibition of antigen presentation, and inhibition of T-cell activation and proliferation (8, 10, 15). Secreted by an autotransporter (type Va) secretion mechanism, VacA is translated as a 140-kDa protoxin that undergoes N- and C-terminal cleavage during the secretion process to yield an N-terminal signal sequence, a mature 88-kDa secreted toxin known as p88, a small secreted peptide with no known function (termed secreted alpha peptide, or SAP) (7), and a C-terminal beta-barrel domain (41, 47) (Fig. 1A). Two domains of p88 VacA, p33 and p55, have been identified based on partial proteolysis of p88 into fragments of 33 kDa and 55 kDa, respectively (47) (Fig. 1A). The N-terminal p33 domain (residues 1 to 311) is involved in pore formation while the p55 domain (residues 312 to 821) contains one or more cell-binding domains (14, 48). The isolated p55 domain binds to host cells less avidly than does the full-length p88 protein, and in contrast to p88, the isolated p55 domain is not internalized by cells (18, 48). These observations suggest that sequences in both the p33 and p55 domains mediate VacA interactions with the surface of cells.

All strains of H. pylori contain a chromosomal vacA gene, but individual strains differ considerably in levels of VacA activity (3, 8). Two studies analyzed vacA sequence encoding a fragment of the p33 domain and did not detect any recognizable phylogenetic structure (star or bush-type pattern), presumably due to the presence of extensive recombination (19, 44). Other studies analyzed different regions of VacA and detected polymorphisms that allow classification of vacA alleles into distinct families (designated s1/s2, i1/i2, and m1/m2) depending on the presence of signature sequences in different regions of VacA (3, 4, 39). Geographic differences have been detected within several of these vacA regions (22, 24, 29, 37, 51, 52, 55). In general, strains containing vacA alleles classified as s1, i1, or m1 have been associated with an increased risk of ulcer disease or gastric cancer compared to strains containing vacA alleles classified as s2, i2, or m2 (3, 13, 39).

Another important H. pylori virulence factor is the secreted CagA effector protein. The cagA gene is localized within a


40-kb chromosomal region known as the cag pathogenicity island (PAI) (20). H. pylori strains expressing CagA are associated with a significantly increased risk for development of ulcer disease or gastric cancer compared to strains that lack the cagA gene (6). Upon entry into cells, CagA undergoes phosphorylation by host cell kinases and induces numerous alterations in cellular signaling, leading to the designation of CagA as a “bacterial oncoprotein” (20, 32). H. pylori strains that produce an active VacA protein (type s1 VacA) typically express CagA, and strains that produce inactive VacA proteins (type s2 VacA) typically lack the cagA gene (3). vacA and the cag PAI localize to distant sites on the H. pylori chromosome, and, therefore, the basis for this association has been unclear. Recently, several studies have reported that there are complex relationships between the cellular effects of VacA and CagA, whereby VacA can downregulate CagA’s effects on epithelial cells, or vice versa (1, 35, 46, 56). This functional interaction between VacA and CagA may represent a mechanism that allows H. pylori to minimize damage to gastric epithelial cells or minimize mucosal inflammation, thereby allowing it to persistently colonize the stomach. Although VacA is considered an important H. pylori virulence factor and hundreds of studies have classified H. pylori strains based on a vacA typing scheme, there has been very little effort to investigate the forces that drive vacA diversification, to analyze the evolutionary history of vacA, or to correlate vacA diversity with features of the VacA three-dimensional structure. Several important questions remain in studying the vacA gene: (i) Are the s1, i1, and m1 alleles (which are associated with an increased risk of gastroduodenal disease) more recently derived than the s2, i2, and m2 alleles? (ii) Are the geographic differences in vacA alleles driven by adaptive evolution or genetic drift? 
(iii) Does the evolutionary history of the vacA gene parallel the evolutionary history of the core genes used for MLST analysis, which are markers for ancient migrations of human populations? In the current study, we present a comprehensive analysis of the molecular evolution of vacA. Our analysis of VacA diversity indicates that VacA sequences are clustered into three main groups with distinct geographic distributions. By analyzing topological differences between vacA and housekeeping gene phylogenetic trees, we demonstrate that the vacA gene does not share the same evolutionary history as the core genome of H. pylori. We report that the evolution of VacA has been shaped by positive selection, and adaptive evolution is restricted to the p55 domain. Most of the sequence divergence corresponds to surface-exposed amino acids in the three-dimensional structure of the p55 domain. Finally, we note that there are similarities between the phylogenetic structure of the VacA and CagA trees, and we discuss the roles that positive selection pressures have played in the evolution of these two virulence determinants.

MATERIALS AND METHODS

VacA reference sequences. VacA from strain 60190 (GenBank accession number Q48245) was used as the reference sequence for amino acid numbering, in which residue 1 refers to alanine-1 of the secreted 88-kDa VacA protein. It is the prototype s1/m1 form of VacA, and the crystal structure of the p55 domain of VacA from this strain has been determined previously (14). VacA sequences from strain 95-54 (GenBank accession number U95971) and strain Tx30a


(GenBank accession number Q48253) were used as reference sequences for s1/m2 and s2/m2 proteins, respectively (3, 36).

Delineation of VacA domains. VacA domains analyzed in this study correspond to the following amino acid sequence numbers in VacA from H. pylori strain 60190: p33, residues 1 to 311; p55, residues 312 to 821; secreted alpha peptide, residues 822 to 954 (7); and the C-terminal beta-barrel domain, residues 955 to 1254. The signal sequence corresponds to residues preceding the p33 domain.

Selection of VacA and CagA sequences for phylogenetic analysis. One hundred deduced VacA amino acid sequences (86 full-length gene sequences and 14 that were complete within the region encoding the p55 domain) (for strain names see Fig. S1 in the supplemental material) were identified by a BLAST search of GenBank using the two prototype VacA sequences listed above. These sequences originated from H. pylori strains that were isolated from humans in many different regions of the world. To obtain additional VacA sequences of African origin, we analyzed vacA in five H. pylori strains that were isolated from patients in Africa and previously classified by multilocus sequence typing (MLST) analysis as HpAfrica2 (strains 191.9 and 501.9), HspSAfrica (cc2c), or HspWAfrica (D1a and D1b) (12, 27). The vacA locus was amplified using an Expand Long Template PCR System (Roche Applied Science) with the primers described in Table S1 in the supplemental material, and the vacA sequences were determined. The vacA sequences from strains D1a and D1b each contained a frameshift mutation and were excluded from subsequent phylogenetic analyses. Sequences were aligned with MUSCLE and edited manually in MacClade, version 4.08 (http://macclade.org/macclade.html) (28). The total length of aligned sequences was 1,354 amino acids.
All insertions/deletions (indels) and hypervariable regions were removed manually from the alignments, resulting in a final alignment length of 1,135 amino acids for the unrooted analysis and 971 amino acids for the rooted analysis. For analysis of CagA sequences, we evaluated the group of 100 strains from which VacA sequences were available and identified 46 strains for which full-length CagA sequences were also available.

Criteria for classification of VacA sequences. VacA sequences were classified as m1 or m2 based on the absence or presence, respectively, of a 21-amino-acid insert within the p55 domain (between amino acids 475 and 476) (Fig. 1A) (3). We identified and excluded two m1/m2 chimeric VacA proteins (from strains ch2 and v225) in which tracts of recombination between m1 and m2 sequences were identifiable by eye (4, 14, 54). VacA sequences were classified as s1 or s2 based on the absence or presence, respectively, of a 9-amino-acid insertion in the signal sequence region (3). VacA sequences were classified as i1 or i2 based on amino acid substitutions that fall into two clusters, previously denoted as clusters B and C (39).

Phylogenetic analyses. Unrooted phylogenetic distance trees based on VacA and CagA protein sequences were created using the neighbor-joining method with a Jukes-Cantor genetic distance model in Geneious, version 4.6.5 (A. J. Drummond, B. Ashton, M. Cheung, J. Heled, M. Kearse, R. Moir, T. Stones-Havas, T. Thierer, and A. Wilson, Biomatters, Auckland, New Zealand). Support for nodes on the neighbor-joining trees was assessed by 2,000 replicates of bootstrap, and the majority rule consensus trees are shown. Maximum-likelihood (ML) trees based on DNA sequences were used for Shimodaira-Hasegawa (SH) statistical tests of topological congruence.
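As an illustration of the Jukes-Cantor distance named above, the following Python sketch computes the proportion of differing sites between two aligned sequences and applies the Jukes-Cantor correction. It is a minimal sketch, not the Geneious implementation: the function names are hypothetical, gapped columns are simply skipped, and the use of 20 equally frequent states for protein data is an assumption.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


def jukes_cantor_distance(seq_a: str, seq_b: str, n_states: int = 20) -> float:
    """Jukes-Cantor corrected distance.

    n_states=20 is an assumed setting for amino acids; use 4 for nucleotides.
    Returns infinity when the sequences are saturated.
    """
    p = p_distance(seq_a, seq_b)
    b = (n_states - 1) / n_states
    if p >= b:
        return math.inf
    return -b * math.log(1.0 - p / b)
```

The corrected distance is always at least as large as the raw proportion of differences, because the correction accounts for multiple substitutions at the same site.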
Prior to ML analyses, a DNA substitution model for each data set was selected using jModelTest, version 0.1.1 (http://darwin.uvigo.es/software/modeltest.html) (38) with the corrected Akaike information criterion (AICc). ML heuristic searches were performed using 100 random taxon addition replicates with tree bisection and reconnection branch swapping. ML bootstrap support was determined using 100 bootstrap replicates, each using 10 random taxon addition replicates with tree bisection and reconnection (TBR) branch swapping. Searches were performed in parallel on a Beowulf cluster using the clusterpaup program, written by A.G. McArthur, and PAUP, version 4.0b10 (45).

Shimodaira-Hasegawa test. The Shimodaira-Hasegawa test (42) is used to compare the topology of a maximum-likelihood (best) tree to that of an alternate evolutionary hypothesis for tree topology. We tested the significance of topological differences between the vacA and MLST phylogenetic trees and between the vacA and cagA trees using the SH test (42). The test compares the likelihood score (−lnL) of a given sequence alignment across its ML tree versus the −lnL of that data set across alternative topologies, which in this case are the ML phylogenies for other data sets. The differences in the −lnL values are evaluated for statistical significance using bootstrap (1,000 replicates) based on two methods, the resampling estimated log-likelihood (RELL) method and the more extensive full optimization. These two approaches yielded similar results.

Reconstruction of a vacA pseudogene from Helicobacter acinonychis. The entire vacA pseudogene of H. acinonychis, corresponding to approximately nucleotides 443900 to 439500 in the genome sequence of strain Sheeba (9, 11), was translated

in all three reading frames, and the translated fragments with homology to H. pylori VacA were then concatenated. The VacA protein encoded by the reconstructed H. acinonychis vacA pseudogene consists of 1,310 amino acids. A BLAST search indicates that the reconstructed H. acinonychis VacA sequence exhibits 64% amino acid identity to its closest match in H. pylori and retains a high level of relatedness to H. pylori VacA throughout the sequence.

Analysis of housekeeping genes. Nucleotide sequences of housekeeping genes were retrieved from the H. pylori multilocus sequence typing database (http://pubmlst.org/helicobacter). This database contains nucleotide sequence data (398 to 627 nucleotides per gene) for seven housekeeping genes (atpA, efp, mutY, ppa, trpC, ureI, and yphC) from each H. pylori strain included in the database (12). Concatenated nucleotide sequences were aligned using MUSCLE and edited manually in MacClade, version 4.08 (28). To permit rooting of a tree of concatenated housekeeping genes, we retrieved orthologous sequences from the H. acinonychis genome (11). PhyloBayes and MrBayes inference methods were used to generate the rooted housekeeping gene trees and posterior probability values.

Rooted phylogenetic analyses of VacA sequences. PhyloBayes, version 2.3 (http://megasun.bch.umontreal.ca/People/lartillot/www/index.htm), was used to reconstruct the VacA rooted trees based on various inference methods. These analyses were performed by applying several models of molecular evolution to the sequence alignments, including the site-homogeneous models of Jones-Taylor-Thornton (JTT) and Whelan and Goldman (WAG) and the category amino acid site-heterogeneous mixture model (CAT), to suppress tree artifacts associated with long-branch attraction (26). For all analyses, at least two independent runs were performed with free equilibrium frequencies inferred from the data and gamma distributed rate variation with four discrete categories.
Burn-ins up to 20% of the sampled trees were used until a maximum difference (MaxDiff) value of <0.15 was achieved to ensure chain equilibration.

Population genetic tests of selection. A sliding-window analysis of the ratio of nonsynonymous to synonymous substitutions (dN/dS) was performed using VacA sequences from strains 60190 (m1 type) and 95-54 (m2 type) with the program DnaSP (http://www.ub.edu/dnasp) (40). Sliding-window parameters included a window size of 50 bases and a step size of 10 bases. For further analyses, a total of 45 VacA sequences, corresponding to 15 VacA amino acid sequences from each VacA group, were retrieved from GenBank. These strains are shown in Fig. S1 in the supplemental material (in boldface). Additionally, a total of 32 CagA sequences, corresponding to CagA amino acid sequences from each CagA group (6 from group 1, 15 from group 2, and 11 from group 3), were retrieved. Sequences were assembled and aligned with Geneious and edited manually in MacClade, version 4.08 (28). All indels and hypervariable regions were removed manually. The standard McDonald-Kreitman test (http://mkt.uab.es/mkt/) (30) was carried out on full-length vacA, individual regions of vacA, and full-length cagA sequences with the exclusion of low-frequency variants less than or equal to 15% to reduce artifacts associated with detecting adaptive evolution. The neutrality index (NI) was calculated from the ratio of the number of polymorphisms to the number of substitutions as follows: NI = (Pn/Ps)/(Dn/Ds), where P is polymorphic within the population, D is divergence or fixed difference between populations, n is nonsynonymous, and s is synonymous.

Nucleotide sequence accession numbers. Sequences of the vacA genes determined in this study were deposited in GenBank under accession numbers HQ287752, HQ287753, and HQ287754.
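The neutrality index defined for the McDonald-Kreitman test above can be computed directly from the four counts. The Python sketch below is illustrative only: the function name and the example counts are hypothetical and are not taken from this study.

```python
def neutrality_index(pn: int, ps: int, dn: int, ds: int) -> float:
    """Return NI = (Pn/Ps)/(Dn/Ds) from McDonald-Kreitman counts.

    pn, ps: nonsynonymous and synonymous polymorphisms within a population
    dn, ds: nonsynonymous and synonymous fixed differences between populations
    """
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("NI is undefined when Ps, Dn, or Ds is zero")
    return (pn / ps) / (dn / ds)


# Hypothetical counts, for illustration only (not data from the paper).
example_ni = neutrality_index(pn=10, ps=40, dn=30, ds=20)
```

An NI below 1 reflects an excess of nonsynonymous fixed differences relative to polymorphism, the pattern expected under positive selection, whereas an NI above 1 reflects an excess of nonsynonymous polymorphism.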

RESULTS AND DISCUSSION

Phylogenetic analysis of VacA. As a first approach for studying phylogenetic features of VacA, we analyzed 100 complete or nearly complete VacA amino acid sequences that were available in GenBank. An unrooted phylogenetic analysis demonstrated that most of the sequences clustered into three distinct groups (Fig. 1B; see also Fig. S1 in the supplemental material), corresponding to non-Asian strains (predominantly from Australia, Kenya, the United States, and Europe; group 1), Asian strains (predominantly from China and Japan; group 2), and strains with a worldwide distribution (both Asian and non-Asian; group 3). Based on an analysis of indels that are diagnostic of previously described VacA families (3, 4, 39), all of the sequences in group 1 and group 2 were classified as type m1, and all of the sequences in group 3 were classified as type


FIG. 2. Neighbor-joining phylogeny of the VacA p55 domain and VacA p33 domain. (A) Three main groups (designated groups 1 to 3) are detected within this tree. The chart shows the number of strains analyzed and characteristics of VacA sequences in each group of the tree. This tree maintains the same pattern as the VacA full-length tree shown in Fig. 1. The nomenclature for the primary VacA p55 groups (groups 1, 2, and 3) is consistent with the nomenclature of groups in the full-length VacA tree (Fig. 1). (B) Two main groups are evident, designated group Ap33 and group Bp33. The chart shows the number of strains analyzed and characteristics of VacA sequences in each group of the tree. The sequences in group Ap33 were localized in groups 1, 2, and 3 of the full-length VacA tree (Fig. 1B) and groups 1, 2, and 3 of the p55 tree (panel A), and sequences in group Bp33 were all localized in group 3 of the full-length VacA tree and group 3 of the p55 tree. Divergence between group Ap33 and group Bp33 reflects differences within the VacA intermediate region (39). The sequences in group Ap33 are characterized as type i1, with the exception of two sequences that appear to be i1-i2 hybrids, and sequences in group Bp33 are exclusively characterized as type i2.

m2 (Fig. 1B). All of the VacA sequences in groups 1 and 2 contain a type s1 signal sequence region; three sequences within group 3 contain a type s2 signal sequence, and the remaining sequences contain type s1 signal sequences (Fig. 1B). Within group 3 there is a subgroup of four sequences (from strains CHN5147, CHN1811a, CHN5114a, and CHN3295b; designated subgroup II), all of which were from H. pylori strains isolated in Shanghai, China (24). The VacA sequence from strain Shi470, which was isolated from an Amerindian patient in the Amazon (25), was located in the tree between group 1 and group 2 sequences.

Phylogenetic analysis of VacA structural domains. To determine which of the structural domains in VacA have shaped the tree into three phylogeographic groups, we performed phylogenetic analyses on five putative structural domains (Fig. 1A): p55, p33, signal sequence region, secreted alpha-peptide (SAP), and the C-terminal β-barrel region (Fig. 2, p55 and p33; see also Fig. S2A to C, respectively, in the supplemental material for the other domains). There was marked variation in the general appearance of these trees. Of particular interest, the tree structures of the two domains comprising the secreted VacA toxin (p33 and p55) were markedly different from each other (Fig. 2). Only the tree for the p55 domain (427 aligned amino acids of 1,135 total amino acids) yielded a three-group pattern (Fig. 2A) that overlaps with the phylogeography of full-length VacA (Fig. 1B). The other regions exhibited tree structures (Fig. 2B; see also Fig. S2A to C) substantially different from those of full-length VacA or p55 trees. Therefore, the localization of full-length VacA sequences to three main groups (Fig. 1) is determined primarily by protein sequence

divergence in the p55 region. In each of the trees, we noted that particular groups of sequences had distinct geographic distributions. Within the p55 tree, a group of sequences of Asian origin, classified as group 2 (m1 Asian) in Fig. 2, have been assigned a variety of different labels in previous publications, including m1b, m1T, and m3 (21, 29, 37, 53). Phylogenetic incongruence between the trees of vacA and housekeeping genes. Previous studies have classified H. pylori strains into a set of population groups with distinct geographic distributions, based on MLST analyses of seven housekeeping genes (12, 27, 31). To investigate relationships between the phylogeny of housekeeping genes and the three-group structure of the VacA phylogeny (Fig. 1B), we analyzed the nucleotide sequences of vacA and housekeeping gene sequences from 12 strains for which both sets of sequences were available. In addition, we determined the vacA nucleotide sequences of two strains previously classified as HpAfrica2 by MLST analysis since this population group is known to exhibit a relatively high level of divergence from other H. pylori population groups (12, 27, 31). Housekeeping genes from different strains do not differ substantially from one another at the protein level, and, therefore, this comparative analysis required the use of nucleotide sequences rather than protein sequences. The overall topology of the vacA tree was completely dissimilar from that of the housekeeping gene tree (see Fig. S3 in the supplemental material). To statistically evaluate the topological incongruence or congruence between the vacA and housekeeping gene phylogenies, we compared the maximum-likelihood (ML) phylogenies of the 14 taxa common to both data sets using the Shimodaira-Hasegawa (SH) test. This analysis confirmed that


TABLE 1. Results of Shimodaira-Hasegawa test of alternative tree topologies for housekeeping and vacA genes^a

                    Likelihood score for data set
Topology            Core genes         vacA
Core gene tree      −9,773.96          −14,725.85*
vacA tree           −10,156.12*        −14,247.38

^a Data set denotes the alignment of the concatenated core genes and the vacA gene. Topology denotes the maximum-likelihood trees shown in Fig. S3 in the supplemental material. The likelihood scores (−lnL) are shown in the table and are based on comparing each data set across its own ML tree topology and the alternative topology. The lowest (best) likelihood scores are indicated in boldface for each data set. Significance of the likelihood differences from the comparisons of a common data set across different topologies was measured using a bootstrap approach with RELL sampling and full optimization for 1,000 replicates. For example, the score from the comparison of the core genes data set against the core gene topology (−9,773.96) is significantly better than the score from the alternative comparison of the core genes data set against the vacA gene topology (−10,156.12). *, P < 0.001.
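The quantity that the SH test evaluates for each data set is the difference in log-likelihood between its own ML topology and the alternative topology. The short Python sketch below computes those differences from the values reported in Table 1; the dictionary layout and function name are hypothetical, and the bootstrap assessment of significance is not reproduced here.

```python
# Log-likelihood values as reported in Table 1, keyed by data set and topology.
LOG_LIKELIHOODS = {
    "core genes": {"core gene tree": -9773.96, "vacA tree": -10156.12},
    "vacA": {"core gene tree": -14725.85, "vacA tree": -14247.38},
}


def likelihood_differences(scores: dict) -> dict:
    """Return, per topology, how much worse it is than the best topology."""
    best = max(scores.values())
    return {topology: round(best - value, 2) for topology, value in scores.items()}


core_deltas = likelihood_differences(LOG_LIKELIHOODS["core genes"])
vaca_deltas = likelihood_differences(LOG_LIKELIHOODS["vacA"])
```

Under this calculation the core gene alignment fits the vacA topology worse by 382.16 log-likelihood units, and the vacA alignment fits the core gene topology worse by 478.47 units.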

the vacA and housekeeping tree topologies are significantly different (P < 0.0001) (Table 1; see also Fig. S3), indicating that the vacA toxin gene has a different evolutionary history from that of the core genes of H. pylori.

Rooted phylogenetic analyses of VacA and housekeeping genes. To further compare the ancestry of vacA with that of the MLST core genes, we generated and compared a rooted tree of full-length VacA protein sequences with a rooted tree of concatenated housekeeping gene sequences, using the same strains used in the SH test described above. We used the corresponding nucleotide sequences from housekeeping genes from the close relative H. acinonychis to root the housekeeping gene tree (11), and we used the deduced protein sequence of a reconstructed vacA pseudogene in H. acinonychis as an outgroup to root the VacA tree (11). We could not use vacA nucleotide sequences in this analysis because of the extremely high nucleotide divergence between the outgroup sequence and the ingroup. Nonetheless, the VacA protein and vacA nucleotide trees of the ingroup recapitulate the same three-group phylogenies (data not shown).

The Bayesian root for the MLST housekeeping gene tree is confidently positioned in taxa classified as HpAfrica2, a population currently found almost exclusively in South Africa (12, 27) (Fig. 3A). We performed a second, rooted analysis using a larger MLST data set consisting of 61 sequences from representative H. pylori strains that previously had been classified into nine geographically distinct populations and subpopulations (12). The root was again positioned in taxa classified as HpAfrica2; the next most closely related taxa are also from Africa and are classified as HspSouth Africa or HspWest Africa subpopulations. Results were similar between the smaller and larger data sets, and thus there was no effect of taxon selection on the placement of the MLST root (compare Fig. 3A and Fig. S4 in the supplemental material). To further confirm the rooting position in African populations, we excluded the HpAfrica2 taxa and repeated the analysis. In this case, the root is positioned in taxa classified as HspSouth Africa subpopulation (data not shown). Taken together, the confident placement of the MLST rooting in the African taxa confirms previous reports of an ancient African origin for H. pylori in humans (12, 27). A 971-amino-acid sequence alignment of the reconstructed VacA amino acid sequence of H. acinonychis with 14 ingroup taxa yields a Bayesian phylogeny (Fig. 3B) with the VacA root confidently positioned at the B38 taxon (an s2/m2 form of VacA from a strain classified as HpEurope based on MLST analysis). We also created a rooted tree for a larger data set using 24 ingroup taxa (see Fig. S5 in the supplemental material), corresponding to H. pylori VacA sequences that were representative of VacA groups 1 to 3 in the unrooted tree (Fig. 1). In this analysis, the VacA root is confidently positioned in the CHN3295 and CHN5147 taxa (see Fig. S5). 
These two strains also have m2 sequence characteristics and belong to the group 3 subgroup II of the full-length VacA tree (Fig. 1). Thus, the root of the VacA tree is unexpectedly positioned in m2 taxa of Chinese origin, rather than taxa of African origin, with the next branch consisting of m2 sequences of non-Asian origin. This analysis does not allow us to determine with confidence

FIG. 3. Comparative phylogenetic analyses of housekeeping gene sequences and VacA sequences. Rooted MrBayes trees of concatenated housekeeping gene sequences and VacA sequences. (A) Nucleotide sequences of seven housekeeping genes (atpA, efp, mutY, ppa, trpC, ureI, and yphC) from 14 strains of H. pylori and one outgroup, H. acinonychis, were analyzed. A classification of H. pylori strains into populations or subpopulations based on MLST analysis (12, 27) is shown in parentheses. Housekeeping genes are referred to as core genes. (B) Deduced amino acid sequences of VacA from the same 14 strains of H. pylori and one outgroup, H. acinonychis, were analyzed. The numbers represent Bayesian posterior probability values for each node.

FIG. 4. Sliding-window analysis of vacA from H. pylori strains 60190 and 95-54. vacA sequences from strains 60190 (non-Asian m1 type) and 95-54 (non-Asian m2 type) were aligned, and dN/dS ratios were calculated using DnaSP with a sliding window of 50 bases and a 10-bp step size. A dN/dS value of >1 indicates positive selection.

whether the s1, i1, and m1 forms of VacA are more recently derived than the s2, i2, and m2 forms. However, these data suggest that m2 forms of VacA were present at the time when H. pylori and H. acinonychis diverged from a common ancestor. Three observations suggest that the VacA rooting position is accurate. First, the VacA rooting position in m2 Asian sequences is supported by two different inference methods, MrBayes and PhyloBayes. Second, neither the use of three different models of evolution (CAT, WAG, and JTT) nor the removal or addition of other m2 Asian sequences and hypervariable regions alters the VacA rooting position; in particular, the probabilistic inference model, CAT, accounts for across-site heterogeneities and can handle model misspecifications associated with long-branch attraction artifacts (26). Third, the maximum amino acid identity of the VacA outgroup with Asian m2 (65.4%) or non-Asian m2 sequences (64.0%) is greater than that of the outgroup with Asian m1 (57.4%) or non-Asian m1 sequences (51.1%).

Positive selection in VacA. A relatively high level of divergence within the p55 VacA cell-binding domain compared to other domains may reflect relaxed constraint on that portion of the sequence or positive selection if amino acid replacements confer a selective advantage. In the latter case, we expect to observe an accumulation of nonsynonymous changes (dN) at a rate higher than that of synonymous changes (dS). Previous studies failed to detect positive selection in VacA based on analyses of dN/dS ratios (4), but such analyses are known to lack sensitivity when applied to large segments of a gene. As another approach for investigating the evolutionary pressures acting on VacA, we first analyzed vacA sequences for positive selection (dN/dS of >1) using a sliding-window analysis with full-length vacA sequences from strains 60190 (type m1 non-Asian, group 1) and 95-54 (type m2, group 3).
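The windowing scheme used in a scan of this kind (50-base windows advanced in 10-bp steps) can be illustrated with a minimal Python sketch. This is not the DnaSP procedure: the per-window statistic here is a simple proportion of differing sites, substituted for dN/dS only to keep the example short, and the function names and toy sequences are hypothetical.

```python
def sliding_windows(length, window=50, step=10):
    """Yield (start, end) half-open coordinates of every full window."""
    for start in range(0, length - window + 1, step):
        yield start, start + window


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def scan(seq_a, seq_b, window=50, step=10):
    """Return (start, end, statistic) for each window of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (start, end, p_distance(seq_a[start:end], seq_b[start:end]))
        for start, end in sliding_windows(len(seq_a), window, step)
    ]


# Toy aligned sequences (hypothetical): 10 differing sites at positions 60-69.
toy_a = "A" * 100
toy_b = "A" * 60 + "C" * 10 + "A" * 30
result = scan(toy_a, toy_b)
for start, end, value in result:
    print(f"{start:3d}-{end:3d}  {value:.2f}")
```

In a real analysis, the statistic computed per window would be a dN/dS estimate, and windows with values above 1 would be the ones flagged as candidates for positive selection.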
The crystal structure of the p55 domain is available for VacA from strain 60190 (14), and VacA from strain 95-54 is known to exhibit a different cell type specificity from that of VacA from strain 60190 (36). dN/dS ratios greater than 1 were observed in mainly one portion of the vacA sequence, the p55 cell-binding domain (Fig. 4). To follow up the observation of elevated dN/dS ratios in the portion of vacA encoding the p55 cell-binding domain, we

collected full-length DNA sequences of 15 vacA alleles from each of the three main groups (Fig. 1, groups 1 to 3). We used the McDonald-Kreitman test (MKT) (30) to investigate whether adaptive evolution in the p55 domain is driving the divergence of the three groups. The MKT tests the neutral theory prediction that the ratio of synonymous-to-nonsynonymous polymorphism (Ps/Pn) within groups should be the same as the ratio of synonymous-to-nonsynonymous divergence (Ds/Dn) between groups. It was used previously to detect positive selection in an H. pylori sel1 homolog (33). The results indicate a significant deviation from neutrality when full-length vacA sequences and the p55 domain (Table 2) (P < 0.001) are analyzed but not when the p33 domain or other regions are analyzed. Excess nonsynonymous fixation, one signature of adaptive protein evolution, causes the neutrality index (NI) in the MKT to be less than 1. For all statistically significant MKT comparisons, the NI was <0.53 (Table 2). These results confirm the sliding-window analysis and indicate that the divergence in the p55 cell-binding domain is due to strong positive selection. Serum antibody responses to VacA are known to be directed predominantly against the p55 domain rather than the p33 domain (16). Therefore, immune selective pressure could potentially be one of the important forces that drive positive selection within the p55 domain. In addition, it is possible that diversification represents functional adaptation of the p55 domain to interact with different receptors or targets in host cells (36).

Surface exposure of divergent amino acids within the p55 domain. Comparative sequence analyses revealed three main families of p55 domain sequences (Fig. 2), and divergence within this domain is the result of positive selection. To investigate the location of divergent amino acids within the three-dimensional structure of the p55 domain, we used the sequence of the p55 domain from H.
pylori strain 60190 as a reference for VacA sequences classified as group 1 (m1 nonAsian) in Fig. 1 and 2 since a crystal structure is available for the p55 domain from this strain. All of the VacA sequences from group 2 (Fig. 1, m1 Asian) were aligned to generate an m1 Asian consensus sequence, and, similarly, all of the VacA sequences from group 3 (Fig. 1, m2) were aligned to generate an m2 consensus sequence. We then compared the reference (group 1) sequence with the two consensus sequences and identified the sites of divergent amino acids within the p55 crystal structure. In a comparison of the reference group 1 (m1 non-Asian) VacA sequence with that of the group 2 (m1 Asian) consensus sequence, 30 sites differed, and 28 of these sites were surface exposed (Fig. 5A). A total of 109 sites differed when the reference VacA sequence was compared with the group 3 (m2) consensus sequence, and 95 were surface exposed (Fig. 5B). Interestingly, 17 of the 30 divergent amino acids identified in the first comparison (reference versus m1 Asian) were also divergent in the second comparison (reference versus m2), and 15 of these correspond to surface-exposed residues. At 10 of these 17 sites, each of the three populations contains a distinct amino acid substitution, which suggests that the observed divergence has resulted from multiple independent bouts of evolutionary changes. Divergent amino acids often appeared as contiguous sites (or clusters of amino acids) within the p55 crystal structure (Fig. 5). In particular, VacA sequences clas-

TABLE 2. Analysis of positive selection in vacA using the McDonald-Kreitman test(a)

Domain of VacA(b)         Dn      Ds    Pn    Ps   P value      NI  α-Value(c)

Group 2 vs group 3
  Full-length         146.06   68.03   248   286         0   0.408       0.591
  Signal sequence          0       0    17    19        NA      NA          NA
  p33                      0       0    57    75        NA      NA          NA
  p55                 153.49   75.68    99    92     0.001    0.53       0.469
  SAP                      1       0    22    22     0.322       0           1
  β-Barrel                 0       0    72   100        NA      NA          NA

Group 1 vs group 3
  Full-length         132.02   63.09   353   393         0   0.429        0.57
  Signal sequence          0       0    21    19        NA      NA          NA
  p33                      3    2.01    78   104     0.446   0.501       0.498
  p55                 130.93    61.1   143   140         0   0.476       0.523
  SAP                   3.02    3.07    33    29     0.862   1.157      −0.157
  β-Barrel                 1       1    78   101     0.856   0.774       0.225

Group 2 vs group 1
  Full-length          46.51   31.81   194   314         0   0.422       0.577
  Signal sequence          0       0    12    16        NA      NA          NA
  p33                   4.01    5.08    31    75     0.348   0.523       0.476
  p55                  32.64   17.62    70   110     0.001   0.343       0.656
  SAP                   6.08    2.03    27    29     0.154   0.311       0.688
  β-Barrel              4.01    7.18    54    84     0.828    1.15       −0.15

a The neutrality index (NI) was calculated from the ratio of the number of polymorphisms to the number of substitutions as follows: NI = (Pn/Ps)/(Dn/Ds), where P is polymorphic within the population, D is divergence or fixed difference between populations, n is nonsynonymous, and s is synonymous. Shaded lines indicate statistically significant results that are indicative of positive selection. NA, not applicable.
b Group 1, m1 non-Asian; group 2, m1 Asian; group 3, m2.
c The proportion of adaptive substitutions that ranges from −∞ to 1 and is estimated as 1 − NI.
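As a worked check of the definitions in footnotes a and c, the following Python sketch recomputes NI and the α-value for the p55 rows of Table 2. The function names are hypothetical, and treating any comparison with a zero count as undefined is an assumption of this sketch rather than the convention used in the table.

```python
def neutrality_index(dn, ds, pn, ps):
    """NI = (Pn/Ps)/(Dn/Ds); returns None if a zero count makes it undefined."""
    if ps == 0 or dn == 0 or ds == 0:
        return None  # assumption of this sketch: zero counts are not evaluated
    return (pn / ps) / (dn / ds)


def alpha(ni):
    """Proportion of adaptive substitutions, estimated as 1 - NI."""
    return None if ni is None else 1.0 - ni


# p55 rows of Table 2, given as (Dn, Ds, Pn, Ps).
p55_rows = {
    "group 2 vs group 3": (153.49, 75.68, 99, 92),
    "group 1 vs group 3": (130.93, 61.1, 143, 140),
    "group 2 vs group 1": (32.64, 17.62, 70, 110),
}

for label, (dn, ds, pn, ps) in p55_rows.items():
    ni = neutrality_index(dn, ds, pn, ps)
    print(f"{label}: NI = {ni:.3f}, alpha = {alpha(ni):.3f}")
```

The recomputed values agree with the NI and α-value columns of Table 2 to within rounding, which also confirms the column order of the table.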

FIG. 5. Divergent amino acids within the VacA p55 domain. The three-dimensional structure of the VacA p55 domain (amino acids 355 to 811) from H. pylori 60190 (classified as group 1 in Fig. 1 and 2) was used as a reference for mapping divergent amino acids. (A) VacA sequences classified in group 2 (m1 Asian) were aligned, and a consensus sequence was determined. Differences between the sequence of VacA from strain 60190 and the m1 Asian consensus sequence are highlighted in blue. (B) VacA sequences classified in group 3 (m2) were aligned, and a consensus sequence was determined. Differences between the sequence of VacA from strain 60190 and the m2 consensus sequence are highlighted in blue.
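The comparison described in this legend (build a consensus for a group, then list the sites at which the reference differs from it) can be sketched in Python as follows. The majority-rule consensus, the function names, and the toy peptide sequences are all assumptions made for illustration; the source does not state how the consensus sequences were derived.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences.

    Ties are resolved in favor of the residue seen first in the column.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned)
    )


def divergent_sites(reference, other):
    """1-based alignment positions at which two aligned sequences differ."""
    return [
        pos
        for pos, (a, b) in enumerate(zip(reference, other), start=1)
        if a != b
    ]


# Toy data (hypothetical), standing in for the reference and one group.
reference = "MKTLSAGV"
group = ["MKSLSAGI", "MKSLTAGI", "MRSLSAGI"]

group_consensus = consensus(group)
print(group_consensus)
print(divergent_sites(reference, group_consensus))
```

Mapping the resulting positions onto a structure would additionally require converting alignment positions to residue numbers in the crystal structure, which this sketch does not attempt.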

sified as m2 contained a set of divergent amino acids corresponding to a longitudinal contiguous patch within the crystal structure (Fig. 5B). A surface-exposed location of divergent amino acids is consistent with the hypothesis that these residues may be subject to antibody recognition. Moreover, alterations in amino acids found on the surface of VacA could potentially lead to alterations in the interactions of VacA with host cells.

CagA phylogeography and adaptive evolution. In H. pylori strains 26695 and J99, vacA and cagA (encoding the secreted effector protein CagA) are located ~350 kb apart in the genomes. Recently, it has been shown that VacA can downregulate CagA’s effects on epithelial cells and that CagA can protect cells against the apoptotic effects of VacA (35, 46). Furthermore, VacA can counteract the ability of CagA to activate nuclear factor of activated T cells (NFAT) in gastric epithelial cells (56). We thus hypothesized that these two genes share an evolutionary history characterized by co- or counteradaptations in response to a common selective pressure. We identified 46 H. pylori strains for which both VacA and CagA sequences were available. Phylogenetic analysis of the full-length CagA sequences revealed three groups (Fig. 6). Clustering of CagA sequences from strains of East Asian origin in a distinct group is consistent with results of previous studies (50, 55). Notably, the overall appearance of the CagA tree is very similar to the phylogeny of VacA (Fig. 1). Group 1 in the CagA tree consists of seven sequences that are predominantly from non-Asian strains (Fig. 6). The corresponding VacA sequences from most of these strains are characterized as m1 non-Asian and are found in group 1 in the full-length VacA tree (Fig. 1). CagA group 2 consists of 25 exclusively Asian

FIG. 6. Analysis of CagA phylogeography. Neighbor-joining phylogenetic tree of 46 CagA amino acid sequences. Three major groups are evident: group 1 consists predominantly of sequences from non-Asian strains, group 2 consists of Asian sequences, and group 3 consists of Asian sequences. The chart shows the number of strains analyzed and characteristics of VacA sequences in each group of the tree. CagA sequences shown in groups 1 and 2 correspond to H. pylori strains containing type m1 VacA (groups 1 and 2 of Fig. 1), whereas CagA sequences shown in group 3 correspond to strains containing type m2 VacA (group 3 of Fig. 1). The nomenclature for the primary CagA groups (groups 1, 2, and 3) is consistent with the nomenclature of groups in the full-length VacA tree (Fig. 1).

sequences; most of the corresponding VacA sequences are characterized as m1 Asian and are found in group 2 in the full-length VacA tree. Finally, CagA group 3 consists of 11 exclusively Asian sequences; most of the corresponding VacA sequences are characterized as m2 and are found in group 3 in the full-length VacA tree. The observed similarities in the CagA and VacA phylogenetic trees suggest that CagA and VacA might be coevolving due to similar selective pressures. In support of this hypothesis, a McDonald-Kreitman test indicates that positive selection has shaped CagA divergence of group 2 from both group 3 (P < 0.005; NI of 0.58) and group 1 (P < 0.018; NI of 0.66). Two recent publications also detected positive selection when cagA sequences were analyzed (34, 49), and a recent paper reported an association between particular cagA motifs and specific vacA types in a different group of strains (23). Thus, there are striking similarities between the topologies of VacA and CagA trees, and positive selection has shaped the phylogenetic structures of both of these virulence determinants.

Phylogenetic incongruence between the trees of vacA and cagA. We next sought to investigate more rigorously the evolutionary relationships between vacA and cagA. Generation of a rooted tree of CagA sequences is not possible because cagA is not present in H. acinonychis and is not currently known to be present in any species other than H. pylori. Therefore, we statistically tested the topological similarity between vacA and cagA phylogenies. For this analysis, we selected 28 H. pylori strains containing VacA and CagA sequences that were representative of the three different groups. We compared the ML phylogenies

based on nucleotide sequences of the 28 taxa common to both data sets using the SH test. We used vacA nucleotide sequences in this analysis instead of protein sequences because the nucleotide differences provide increased resolution for the topology comparisons, and the trees are unrooted, which obviates the need to use protein sequences that are more conserved. Nonetheless, the vacA and cagA nucleotide trees of the ingroup recapitulate the same three-group phylogenies (see Fig. S6 in the supplemental material). Despite the grouping similarities in the vacA and cagA trees, the topologies are significantly different based on the SH test (P < 0.001) (Table 3; see also Fig. S6), indicating that the vacA toxin gene has not coevolved in strict concert with cagA. This result is not unexpected. First, there are strain differences (OK111, J99, F37, and Shi470) in the two phylogenies that can account for this statistical result (see Fig. S6). Second, there are fine-scale differences in evolutionary relationships within the groups that are not congruent when the vacA and cagA trees are compared. Repeating the SH test after removal of the four major outliers again yielded a significant difference between the topologies (data not shown). The most parsimonious explanation for the fine-scale differences between the genes, and yet the broad similarities in phylogeographic patterns and patterns of adaptive evolution, is that historical bouts of adaptation drove the parallel divergence of both cagA and vacA. However, more recent evolutionary changes at the tips of the three groups have scrambled any support for statistical concordance. These recent changes within groups could now be occurring by either drift or selection unrelated to the ancestral changes that drove the three groups’ common divergence. Thus, we hypothesize that VacA and CagA functionally interact most effectively when they are from the same group (i.e., group 1 VacA interacts most efficiently with group 1 CagA, etc.).
Conclusions. In summary, our key findings indicate, first, that VacA sequences can be classified into three distinct groups on the basis of amino acid sequences and that different VacA domains exhibit different evolutionary histories. Second, VacA has undergone strong divergence and positive selection in the p55 cell-binding domain, which is consistent with humoral immune recognition of this domain; a result may be optimized binding of VacA to different receptors or targets in

TABLE 3. Results of Shimodaira-Hasegawa test of alternative tree topologies for cagA and vacA genes(a)

                 Likelihood score for data set
Topology         cagA              vacA

cagA tree        −14,893.50        −18,862.81*
vacA tree        −16,922.57*       −17,537.20

a Data set denotes the alignments of the vacA and cagA genes. Topology denotes the maximum-likelihood trees shown in Fig. S6 in the supplemental material. The likelihood scores (−lnL) are shown in the table and are based on comparing each data set across its own ML tree topology and the alternative topology. The lowest (best) likelihood scores are indicated in boldface for each data set. Significance of the likelihood differences from the comparisons of a common data set across different topologies was measured using a bootstrap approach with RELL sampling and full optimization for 1,000 replicates. For example, the score from the comparison of the cagA data set against the cagA topology (−14,893.50) is significantly better than the score from the alternative comparison of the cagA data set against the vacA topology (−16,922.57). *, P < 0.001.
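The size of the likelihood differences evaluated by the SH test can be read directly from Table 3. The short Python sketch below only performs that subtraction; it does not reproduce the RELL bootstrap, and the dictionary layout and function name are hypothetical. Following the example in footnote a, the score closer to zero is treated as the better fit.

```python
# Likelihood scores from Table 3, indexed as scores[data_set][topology].
scores = {
    "cagA": {"cagA tree": -14893.50, "vacA tree": -16922.57},
    "vacA": {"cagA tree": -18862.81, "vacA tree": -17537.20},
}


def likelihood_gap(data_set):
    """Return (best topology, score difference between best and alternative)."""
    by_topology = scores[data_set]
    best = max(by_topology, key=by_topology.get)
    worst = min(by_topology, key=by_topology.get)
    return best, round(by_topology[best] - by_topology[worst], 2)


for data_set in scores:
    best, gap = likelihood_gap(data_set)
    print(f"{data_set} data set: best topology = {best}, difference = {gap}")
```

Each data set fits its own tree better than the alternative tree, by roughly 2,029 and 1,326 log-likelihood units for cagA and vacA, respectively.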

host cells. Third, divergent amino acids map to surface-exposed sites in the p55 domain. Fourth, the phylogeographic features of VacA and CagA are surprisingly similar yet markedly different from the phylogeographic features of housekeeping genes, which reflect a global spread of H. pylori out of Africa. We speculate that there is likely a related selective pressure on both VacA and CagA. Since there is substantial physical distance between vacA and cag genes within the bacterial genome, this selection could arise by a form of pseudolinkage of functionally interacting genes that perhaps balances proinflammatory and anti-inflammatory characteristics of strains to facilitate long-term colonization of the human gastric mucosa.

ACKNOWLEDGMENTS

This work was supported by National Institutes of Health grants R01 AI39657, R01 AI068009, and P01 CA116087 (to T.L.C.) and R01 GM085163 (to S.R.B.), by German Research Foundation grant SFB 900/A1 (to S.S.), and by funding from the Department of Veterans Affairs (to T.L.C.) and the Burroughs Wellcome Fund (to D.B.L.).

REFERENCES

1. Argent, R. H., R. J. Thomas, D. P. Letley, M. G. Rittig, K. R. Hardie, and J. C. Atherton. 2008. Functional association between the Helicobacter pylori virulence factors VacA and CagA. J. Med. Microbiol. 57:145–150.
2. Atherton, J. C., and M. J. Blaser. 2009. Coadaptation of Helicobacter pylori and humans: ancient history, modern implications. J. Clin. Invest. 119:2475–2487.
3. Atherton, J. C., P. Cao, R. M. Peek, Jr., M. K. Tummuru, M. J. Blaser, and T. L. Cover. 1995. Mosaicism in vacuolating cytotoxin alleles of Helicobacter pylori. Association of specific vacA types with cytotoxin production and peptic ulceration. J. Biol. Chem. 270:17771–17777.
4. Atherton, J. C., P. M. Sharp, T. L. Cover, G. Gonzalez-Valencia, R. M. Peek, Jr., S. A. Thompson, C. J. Hawkey, and M. J. Blaser. 1999. Vacuolating cytotoxin (vacA) alleles of Helicobacter pylori comprise two geographically widespread types, m1 and m2, and have evolved through limited recombination. Curr. Microbiol. 39:211–218.
5. Blaser, M. J., and D. E. Berg. 2001. Helicobacter pylori genetic diversity and risk of human disease. J. Clin. Invest. 107:767–773.
6. Blaser, M. J., G. I. Perez-Perez, H. Kleanthous, T. L. Cover, R. M. Peek, P. H. Chyou, G. N. Stemmermann, and A. Nomura. 1995. Infection with Helicobacter pylori strains possessing cagA is associated with an increased risk of developing adenocarcinoma of the stomach. Cancer Res. 55:2111–2115.
7. Bumann, D., S. Aksu, M. Wendland, K. Janek, U. Zimny-Arndt, N. Sabarth, T. F. Meyer, and P. R. Jungblut. 2002. Proteome analysis of secreted proteins of the gastric pathogen Helicobacter pylori. Infect. Immun. 70:3396–3403.
8. Cover, T. L., and S. R. Blanke. 2005. Helicobacter pylori VacA, a paradigm for toxin multifunctionality. Nat. Rev. Microbiol. 3:320–332.
9. Dailidiene, D., G. Dailide, K. Ogura, M. Zhang, A. K. Mukhopadhyay, K. A. Eaton, G. Cattoli, J. G. Kusters, and D. E. Berg. 2004. Helicobacter acinonychis: genetic and rodent infection studies of a Helicobacter pylori-like gastric pathogen of cheetahs and other big cats. J. Bacteriol. 186:356–365.
10. de Bernard, M., A. Cappon, G. Del Giudice, R. Rappuoli, and C. Montecucco. 2004. The multiple cellular activities of the VacA cytotoxin of Helicobacter pylori. Int. J. Med. Microbiol. 293:589–597.
11. Eppinger, M., C. Baar, B. Linz, G. Raddatz, C. Lanz, H. Keller, G. Morelli, H. Gressmann, M. Achtman, and S. C. Schuster. 2006. Who ate whom? Adaptive Helicobacter genomic changes that accompanied a host jump from early humans to large felines. PLoS Genet. 2:e120.
12. Falush, D., T. Wirth, B. Linz, J. K. Pritchard, M. Stephens, M. Kidd, M. J. Blaser, D. Y. Graham, S. Vacher, G. I. Perez-Perez, Y. Yamaoka, F. Megraud, K. Otto, U. Reichard, E. Katzowitsch, X. Wang, M. Achtman, and S. Suerbaum. 2003. Traces of human migrations in Helicobacter pylori populations. Science 299:1582–1585.
13. Figueiredo, C., J. C. Machado, P. Pharoah, R. Seruca, S. Sousa, R. Carvalho, A. F. Capelinha, W. Quint, C. Caldas, L. J. van Doorn, F. Carneiro, and M. Sobrinho-Simoes. 2002. Helicobacter pylori and interleukin 1 genotyping: an opportunity to identify high-risk individuals for gastric carcinoma. J. Natl. Cancer Inst. 94:1680–1687.
14. Gangwer, K. A., D. J. Mushrush, D. L. Stauff, B. Spiller, M. S. McClain, T. L. Cover, and D. B. Lacy. 2007. Crystal structure of the Helicobacter pylori vacuolating toxin p55 domain. Proc. Natl. Acad. Sci. U. S. A. 104:16293–16298.
15. Gebert, B., W. Fischer, and R. Haas. 2004. The Helicobacter pylori vacuolating cytotoxin: from cellular vacuolation to immunosuppressive activities. Rev. Physiol. Biochem. Pharmacol. 152:205–220.
16. Ghose, C., G. I. Perez-Perez, V. J. Torres, M. Crosatti, A. Nomura, R. M. Peek, Jr., T. L. Cover, F. Francois, and M. J. Blaser. 2007. Serological assays for identification of human gastric colonization by Helicobacter pylori strains expressing VacA m1 or m2. Clin. Vaccine Immunol. 14:442–450.
17. Go, M. F., V. Kapur, D. Y. Graham, and J. M. Musser. 1996. Population genetic analysis of Helicobacter pylori by multilocus enzyme electrophoresis: extensive allelic diversity and recombinational population structure. J. Bacteriol. 178:3934–3938.
18. Gonzalez-Rivera, C., K. A. Gangwer, M. S. McClain, I. M. Eli, M. G. Chambers, M. D. Ohi, D. B. Lacy, and T. L. Cover. 2010. Reconstitution of Helicobacter pylori VacA toxin from purified components. Biochemistry 49:5743–5752.
19. Gottke, M. U., C. A. Fallone, A. N. Barkun, K. Vogt, V. Loo, M. Trautmann, J. Z. Tong, T. N. Nguyen, T. Fainsilber, H. H. Hahn, J. Korber, A. Lowe, and R. N. Beech. 2000. Genetic variability determinants of Helicobacter pylori: influence of clinical background and geographic origin of isolates. J. Infect. Dis. 181:1674–1681.
20. Hatakeyama, M. 2004. Oncogenic mechanisms of the Helicobacter pylori CagA protein. Nat. Rev. Cancer 4:688–694.
21. Ito, Y., T. Azuma, S. Ito, H. Miyaji, M. Hirai, Y. Yamazaki, F. Sato, T. Kato, Y. Kohli, and M. Kuriyama. 1997. Analysis and typing of the vacA gene from cagA-positive strains of Helicobacter pylori isolated in Japan. J. Clin. Microbiol. 35:1710–1714.
22. Ito, Y., T. Azuma, S. Ito, H. Suto, H. Miyaji, Y. Yamazaki, Y. Kohli, and M. Kuriyama. 1998. Full-length sequence analysis of the vacA gene from cytotoxic and noncytotoxic Helicobacter pylori. J. Infect. Dis. 178:1391–1398.
23. Jang, S., K. R. Jones, C. H. Olsen, Y. M. Joo, Y.-J. Yoo, I.-S. Chung, J.-H. Cha, and D. S. Merrell. 2010. Epidemiological link between gastric disease and polymorphisms in VacA and CagA. J. Clin. Microbiol. 48:559–567.
24. Ji, X., F. Frati, S. Barone, C. Pagliaccia, D. Burroni, G. Xu, R. Rappuoli, J. M. Reyrat, and J. L. Telford. 2002. Evolution of functional polymorphism in the gene coding for the Helicobacter pylori cytotoxin. FEMS Microbiol. Lett. 206:253–258.
25. Kersulyte, D., A. K. Mukhopadhyay, B. Velapatino, W. Su, Z. Pan, C. Garcia, V. Hernandez, Y. Valdez, R. S. Mistry, R. H. Gilman, Y. Yuan, H. Gao, T. Alarcon, M. Lopez-Brea, G. Balakrish Nair, A. Chowdhury, S. Datta, M. Shirai, T. Nakazawa, R. Ally, I. Segal, B. C. Wong, S. K. Lam, F. O. Olfat, T. Boren, L. Engstrand, O. Torres, R. Schneider, J. E. Thomas, S. Czinn, and D. E. Berg. 2000. Differences in genotypes of Helicobacter pylori from different human populations. J. Bacteriol. 182:3210–3218.
26. Lartillot, N., H. Brinkmann, and H. Philippe. 2007. Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7(Suppl. 1):S4.
27. Linz, B., F. Balloux, Y. Moodley, A. Manica, H. Liu, P. Roumagnac, D. Falush, C. Stamer, F. Prugnolle, S. W. van der Merwe, Y. Yamaoka, D. Y. Graham, E. Perez-Trallero, T. Wadstrom, S. Suerbaum, and M. Achtman. 2007. An African origin for the intimate association between humans and Helicobacter pylori. Nature 445:915–918.
28. Maddison, D. R., and W. P. Maddison. 2005. MacClade 4: analysis of phylogeny and character evolution. Sinauer Associates, Inc., Sunderland, MA.
29. Mane, S. P., M. G. Dominguez-Bello, M. J. Blaser, B. W. Sobral, R. Hontecillas, J. Skoneczka, S. K. Mohapatra, O. R. Crasta, C. Evans, T. Modise, S. Shallom, M. Shukla, C. Varon, F. Megraud, A. L. Maldonado-Contreras, K. P. Williams, and J. Bassaganya-Riera. 2010. Host-interactive genes in Amerindian Helicobacter pylori diverge from their Old World homologs and mediate inflammatory responses. J. Bacteriol. 192:3078–3092.
30. McDonald, J. H., and M. Kreitman. 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652–654.
31. Moodley, Y., and B. Linz. 2009. Helicobacter pylori sequences reflect past human migrations. Genome Dyn. 6:62–74.
32. Odenbreit, S., J. Puls, B. Sedlmaier, E. Gerland, W. Fischer, and R. Haas. 2000. Translocation of Helicobacter pylori CagA into gastric epithelial cells by type IV secretion. Science 287:1497–1500.
33. Ogura, M., J. C. Perez, P. R. Mittl, H. K. Lee, G. Dailide, S. Tan, Y. Ito, O. Secka, D. Dailidiene, K. Putty, D. E. Berg, and A. Kalia. 2007. Helicobacter pylori evolution: lineage-specific adaptations in homologs of eukaryotic Sel1-like genes. PLoS Comput. Biol. 3:e151.
34. Olbermann, P., C. Josenhans, Y. Moodley, M. Uhr, C. Stamer, M. Vauterin, S. Suerbaum, M. Achtman, and B. Linz. 2010. A global overview of the genetic and functional diversity in the Helicobacter pylori cag pathogenicity island. PLoS Genet. 6:e1001069.
35. Oldani, A., M. Cormont, V. Hofman, V. Chiozzi, O. Oregioni, A. Canonici, A. Sciullo, P. Sommi, A. Fabbri, V. Ricci, and P. Boquet. 2009. Helicobacter pylori counteracts the apoptotic action of its VacA toxin by injecting the CagA protein into gastric epithelial cells. PLoS Pathog. 5:e1000603.
36. Pagliaccia, C., M. de Bernard, P. Lupetti, X. Ji, D. Burroni, T. L. Cover, E. Papini, R. Rappuoli, J. L. Telford, and J. M. Reyrat. 1998. The m2 form of the Helicobacter pylori cytotoxin has cell type-specific vacuolating activity. Proc. Natl. Acad. Sci. U. S. A. 95:10212–10217.
37. Pan, Z. J., D. E. Berg, R. W. van der Hulst, W. W. Su, A. Raudonikiene, S. D. Xiao, J. Dankert, G. N. Tytgat, and A. van der Ende. 1998. Prevalence of vacuolating cytotoxin production and distribution of distinct vacA alleles in Helicobacter pylori from China. J. Infect. Dis. 178:220–226.
38. Posada, D. 2003. Using MODELTEST and PAUP* to select a model of nucleotide substitution. Curr. Protoc. Bioinformatics, chapter 6, unit 6.5. doi:10.1002/0471250953.bi0605s00.
39. Rhead, J. L., D. P. Letley, M. Mohammadi, N. Hussein, M. A. Mohagheghi, M. Eshagh Hosseini, and J. C. Atherton. 2007. A new Helicobacter pylori vacuolating cytotoxin determinant, the intermediate region, is associated with gastric cancer. Gastroenterology 133:926–936.
40. Rozas, J., J. C. Sanchez-DelBarrio, X. Messeguer, and R. Rozas. 2003. DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496–2497.
41. Schmitt, W., and R. Haas. 1994. Genetic analysis of the Helicobacter pylori vacuolating cytotoxin: structural similarities with the IgA protease type of exported protein. Mol. Microbiol. 12:307–319.
42. Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116.
43. Suerbaum, S., and P. Michetti. 2002. Helicobacter pylori infection. N. Engl. J. Med. 347:1175–1186.
44. Suerbaum, S., J. M. Smith, K. Bapumia, G. Morelli, N. H. Smith, E. Kunstmann, I. Dyrek, and M. Achtman. 1998. Free recombination within Helicobacter pylori. Proc. Natl. Acad. Sci. U. S. A. 95:12619–12624.
45. Swofford, D. L. 2002. PAUP*: phylogenetic analysis using parsimony (*and other methods), version 4. Sinauer Associates, Sunderland, MA.
46. Tegtmeyer, N., D. Zabler, D. Schmidt, R. Hartig, S. Brandt, and S. Backert. 2009. Importance of EGF receptor, HER2/Neu and Erk1/2 kinase signalling for host cell elongation and scattering induced by the Helicobacter pylori CagA protein: antagonistic effects of the vacuolating cytotoxin VacA. Cell Microbiol. 11:488–505.
47. Telford, J. L., P. Ghiara, M. Dell’Orco, M. Comanducci, D. Burroni, M. Bugnoli, M. F. Tecce, S. Censini, A. Covacci, Z. Xiang, et al. 1994. Gene structure of the Helicobacter pylori cytotoxin and evidence of its key role in gastric disease. J. Exp. Med. 179:1653–1658.
48. Torres, V. J., S. E. Ivie, M. S. McClain, and T. L. Cover. 2005. Functional properties of the p33 and p55 domains of the Helicobacter pylori vacuolating cytotoxin. J. Biol. Chem. 280:21107–21114.
49. Torres-Morquecho, A., S. Giono-Cerezo, M. Camorlinga-Ponce, C. F. Vargas-Mendoza, and J. Torres. 2010. Evolution of bacterial genes: evidences of positive Darwinian selection and fixation of base substitutions in virulence genes of Helicobacter pylori. Infect. Genet. Evol. 10:764–776.
50. Truong, B. X., V. T. Mai, H. Tanaka, T. Ly le, T. M. Thong, H. H. Hai, D. Van Long, K. Furumatsu, M. Yoshida, H. Kutsumi, and T. Azuma. 2009. Diverse characteristics of the CagA gene of Helicobacter pylori strains collected from patients from southern Vietnam with gastric cancer and peptic ulcer. J. Clin. Microbiol. 47:4021–4028.
51. Van Doorn, L. J., C. Figueiredo, F. Megraud, S. Pena, P. Midolo, D. M. Queiroz, F. Carneiro, B. Vanderborght, M. D. Pegado, R. Sanna, W. De Boer, P. M. Schneeberger, P. Correa, E. K. Ng, J. Atherton, M. J. Blaser, and W. G. Quint. 1999. Geographic distribution of vacA allelic types of Helicobacter pylori. Gastroenterology 116:823–830.
52. van Doorn, L.-J., C. Figueiredo, R. Sanna, S. Pena, P. Midolo, E. K. Ng, J. C. Atherton, M. J. Blaser, and W. G. Quint. 1998. Expanding allelic diversity of Helicobacter pylori vacA. J. Clin. Microbiol. 36:2597–2603.
53. Wang, H. J., C. H. Kuo, A. A. Yeh, P. C. Chang, and W. C. Wang. 1998. Vacuolating toxin production in clinical isolates of Helicobacter pylori with different vacA genotypes. J. Infect. Dis. 178:207–212.
54. Wang, W. C., H. J. Wang, and C. H. Kuo. 2001. Two distinctive cell binding patterns by vacuolating toxin fused with glutathione S-transferase: one high-affinity m1-specific binding and the other lower-affinity binding for variant m forms. Biochemistry 40:11887–11896.
55. Yamazaki, S., A. Yamakawa, T. Okuda, M. Ohtani, H. Suto, Y. Ito, Y. Yamazaki, Y. Keida, H. Higashi, M. Hatakeyama, and T. Azuma. 2005. Distinct diversity of vacA, cagA, and cagE genes of Helicobacter pylori associated with peptic ulcer in Japan. J. Clin. Microbiol. 43:3906–3916.
56. Yokoyama, K., H. Higashi, S. Ishikawa, Y. Fujii, S. Kondo, H. Kato, T. Azuma, A. Wada, T. Hirayama, H. Aburatani, and M. Hatakeyama. 2005. Functional antagonism between Helicobacter pylori CagA and vacuolating toxin VacA in control of the NFAT signaling pathway in gastric epithelial cells. Proc. Natl. Acad. Sci. U. S. A. 102:9661–9666.