Domain Organization and Evolution of Multifunctional Autoprocessing ...

1 downloads 355 Views 2MB Size Report
domain organization within the aquatic species Vibrio vulnificus as well as to study the evolution of the rtxA1 gene. ..... This new domain corresponds to Mcf-.
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Jan. 2011, p. 657–668 0099-2240/11/$12.00 doi:10.1128/AEM.01806-10 Copyright © 2011, American Society for Microbiology. All Rights Reserved.

Vol. 77, No. 2

Domain Organization and Evolution of Multifunctional Autoprocessing Repeats-in-Toxin (MARTX) Toxin in Vibrio vulnificus䌤† Francisco J. Roig,1 Fernando Gonza´lez-Candelas,2,3 and Carmen Amaro1* Departamento de Microbiología, Facultad de Biología, Universidad de Valencia, Valencia, Spain1; Unidad Mixta Geno ´mica y Salud Centro Superior de Investigacio ´n en Salud Pu ´blica-Universidad de Valencia/Instituto Cavanilles de Biodiversidad y Biología Evolutiva, Valencia, Spain2; and CIBER de Epidemiología y Salud Pu ´blica, Valencia, Spain3 Received 30 July 2010/Accepted 2 November 2010

The objective of this study was to analyze multifunctional autoprocessing repeats-in-toxin (MARTX) toxin domain organization within the aquatic species Vibrio vulnificus as well as to study the evolution of the rtxA1 gene. The species is subdivided into three biotypes that differ in host range and geographical distribution. We have found three different types (I, II, and III) of V. vulnificus MARTX (MARTXVv) toxins with common domains (an autocatalytic cysteine protease domain [CPD], an ␣/␤-hydrolase domain, and a domain resembling that of the LifA protein of Escherichia coli O127:H6 E2348/69 [Efa/LifA]) and specific domains (a Rho-GTPase inactivation domain [RID], a domain of unknown function [DUF],a domain resembling that of the rtxA protein of Photorhabdus asymbiotica [rtxAPA], and an actin cross-linking domain [ACD]). Biotype 1 isolates harbor MARTXVv toxin types I and II, biotype 2 isolates carry MARTXVv toxin type III, and biotype 3 isolates have MARTXVv toxin type II. The analyzed biotype 2 isolates harbor two identical copies of rtxA1, one chromosomal and the other plasmidic. The evolutionary history of the gene demonstrates that MARTXVv toxins are mosaics, comprising pieces with different evolutionary histories, some of which have been acquired by intra- or interspecific horizontal gene transfer. Finally, we have found evidence that the evolutionary history of the rtxA1 gene for biotype 2 differs totally from the gene history of biotypes 1 and 3. to elucidate the genetic basis of V. vulnificus virulence for humans. These isolates produce a wide range of putative virulence factors, among which is a repeats-in-toxin (RTX) toxin, which belongs to the subfamily of multifunctional autoprocessing RTX (V. vulnificus MARTX [MARTXVv]toxin, or RtxA1) toxins. rtxA1 mutants are less virulent for mice and have lower cytotoxicity for different cell lines (17). The rtxA1 gene is practically identical in the two sequenced isolates of biotype 1 (7, 15) (98.9% and 99.5% nucleotide and amino acid identity, respectively) and codes for a protein of 5,206 amino acids (aa), which shows extensive regions of similarity with the MARTX toxin of Vibrio cholerae (MARTXVc toxin), the most extensively studied MARTX toxin to date (28). Both toxins share the N- and C-terminal repetitive regions, which seem to be common to all MARTX toxins, a Rho-GTPase inactivation domain (RID), an autocatalytic cysteine protease domain (CPD), and an ␣/␤-hydrolase domain. Meanwhile, they differ in that the biotype 1 MARTXVv (MARTXVvbt1) toxin lacks an actin cross-linking domain (ACD) and presents three additional putative domains, DUF3 (domain of unknown function), Mcf (a domain similar to an Mcf toxin of Photorhabdus luminescens), and PMT C1/C2 (a domain similar to a portion of the Pasteurella mitogenic toxin PMT) (27). The genetic basis underlying fish virulence is related to a recently described plasmid of 68 to 70 kb (18). This plasmid codes for resistance to the innate immune system of fishes and is currently under study. The plasmid also carries an rtxA1 gene, which codes for a toxin that seems to be a hybrid between the MARTXVvbt1 toxin and the MARTXVc toxin because it lacks the RID domain and presents an ACD domain (18). Interestingly, the chromosome of the only biotype 2 isolate

Vibrio vulnificus is an “accidental” pathogen, inhabiting brackish water ecosystems in temperate and tropical regions (22). The species has been subdivided into three biotypes on the basis of genetic, phenotypic, and host range differences (4, 35). Biotypes 1 and 2 are distributed worldwide while biotype 3 is geographically restricted to Israel (1, 4, 11, 35). Three O serovars have been described in biotype 2 (serovar A, serovar E, and serovar I), and one has been identified in biotype 3 (serovar O) while biotype 1 has not been fully serotyped (2, 5, 11). The three biotypes of V. vulnificus can cause disease in humans after ingestion of contaminated seafood or immersion of an open wound in brackish waters (22). In healthy people, the ingestion of V. vulnificus leads to vomiting and diarrhea while external exposure produces wound infection or severe ulcerations; however, in immunocompromised people, particularly those with chronic liver disease, the pathogen can invade the bloodstream and cause septicemia, leading to death in almost 50% of the cases (22). Interestingly, biotype 2 can also infect fish (mainly eels) and shrimps and cause death by hemorrhagic septicemia, a disease called warm-water vibriosis (35). Human isolates of biotype 2 belong to the most virulent fish serovar, serovar E (1). Several studies have been performed with biotype 1 isolates

* Corresponding author. Mailing address: Departamento de Microbiología, Facultad de Biología, Universidad de Valencia, Valencia, Spain. Phone: 34 96 354 31 04. Fax: 34 96 354 45 70. E-mail: carmen [email protected]. † Supplemental material for this article may be found at http://aem .asm.org/. 䌤 Published ahead of print on 12 November 2010. 657

658

ROIG ET AL.

APPL. ENVIRON. MICROBIOL. TABLE 1. V. vulnificus strains used in the study: source, biotype, and country of isolation Strain(s)

Biotype/serovara

ATCC 33816, CECT 5164, CECT 5168, CECT 5169, CECT 529Tb,c,d CECT 5166 CECT 5165 E4, JE MLT364, MLT404, MLT406, MLT362, VV1003, VV352, VV425 V4 CS9133 CECT 5167, N87, KH03, YN03 L49 YJ106d CG118 CG100d, CG106 CG110, CG111 CECT 4867 94-9-118

t1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT

94-9-130 94-9-119c A2, A4, A5, A6, A7, PD-1, PD-3, PD-5, PD-12, PD-2-66, V1 CECT 4606d Riu-1c, Riu-3 94385 CECT 4608 CECT 5768, A13, A21, A22, CECT 5689d, CECT 5769, A15, A16, A17, A18, CECT 5198, CECT 5343, A10, A11, A14c,d 4/7/17, 4/7/19, 21B, 22, 26, 27, 21A CECT 4864, CECT 4999d, CECT 4605, CECT 4607, CECT 4604d, CECT 4602, CECT 4603, CECT 4601 PD-2-47, PD-2-50, PD-2-51, PD-2-55, PD-2-56, CECT 5763 C1, CECT 5762 Riu-2d CIP8190 CECT 4868 90-2-11 4-8-112 94-9-123 CCUG38521 ¨ 122 O CECT 4866 CECT 4865 UE516 CECT 898, CECT 897, CECT 4174d, CECT 4862 95-8-161c,d, 95-8-162 CECT 4869 95-8-7 95-8-6d 97, 162, 11028c,d, vv12, vv32 CT218c,d,e

Origin or reference

Country of isolation

Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt1/NT Bt2/SerA

Human blood Wound infection Seawater Seafood Environment Human blood Human blood Human blood Brackish water Human blood Seawater Oyster Seawater Diseased eel Human expectoration Water Human wound Eel tank water Healthy eel Seawater Leg wound Eel farm water Diseased eel

USA USA USA USA USA Australia South Korea Japan Japan Taiwan Taiwan Taiwan Taiwan Sweden Denmark Denmark Denmark Spain Spain Spain Spain Spain Spain

Bt2/SerA Bt2/SerE

Diseased eel Diseased eel

Denmark Spain

Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerE Bt2/SerI Bt2/SerI Bt2/SerI Bt2/SerI Bt3/SerO Bt2/SerE

Eel tank water Healthy eel Seawater Human blood Diseased eel Diseased eel Human wound Seawater Human blood Diseased eel Human blood Diseased shrimp Diseased eel Diseased eel Diseased eel Diseased eel Diseased eel Diseased eel Diseased human 18

Spain Spain Spain France Norway Denmark Denmark Denmark Sweden Sweden Australia Taiwan Taiwan Japan Denmark Belgium Denmark Denmark Israel

a

Bt, biotype; NT, nontypeable; SerE, serovar E; SerA, serovar A; SerI, serovar I; SerO, serovar O. Type strain. c Strain used to sequence the whole rtxA gene. d Strain used to locate the rtxA gene. e Cured derivative biotype 2 isolate. b

studied in detail so far also harbors a copy of rtxA1 although it is as yet unknown whether this copy is similar to the plasmidic or the MARTXVvbt1 toxin one (18). The relevance of both copies for virulence in fishes and mice is currently under study. Considering the aforementioned results, we hypothesized that the rtxA1 gene of V. vulnificus is a mosaic gene, which varies within and among biotypes because it has acquired, probably through horizontal gene transfer (HGT) and subsequent integration into the original allele, some DNA sequences from different bacterial species. To test this idea, we studied the domain struc-

ture of MARTXVv toxin as well as the evolution of the rtxA1 gene in V. vulnificus using different approaches. First, we sequenced the complete rtxA1 gene in seven isolates and a 828-bp fragment from the 5⬘ region of the gene in a collection of 115 strains of different biotypes and serovars (Table 1). Second, we analyzed the mosaic domain organization of the predicted protein by comparing it with the sequences previously published in V. vulnificus (strains YJ016, CMCP6, CECT 4999, and CECT 4602), V. cholerae (strain N16961), V. splendidus (12B01), and Aeromonas hydrophila (strain ATCC 7966) (27). The sequences of the MARTX

VOL. 77, 2011

MARTX TOXIN EVOLUTION IN VIBRIO VULNIFICUS

toxin of V. cholerae M66-2, AM19226, N16961, MO10, and O395 as well as of Listonella anguillarum M93 (20) were also included although the domain organization of these proteins has not been published previously. Finally, we analyzed the phylogeny of the rtxA1 gene from whole gene sequences, from the sequences of the selected common fragment, and from each separate domain using the previously published homologous sequences from V. cholerae, V. splendidus, A. hydrophila, and L. anguillarum as outgroups.

described by Satchell (27) by direct alignment by using InterProScan Sequence Search, Superfamily, Psi-BLAST, and SBase Web programs. Phylogeny reconstruction. Phylogenetic trees were reconstructed using the neighbor-joining algorithm (25) and the maximum-likelihood method. Neighborjoining trees were obtained using MEGA, version 4 (33), with the maximumcomposite-likelihood distance (34) in the case of nucleotide sequences and the Poisson correction for amino acid alignments (37). Support for the groupings derived in these reconstructions was evaluated by bootstrapping using 1,000 replicates (10). Maximum-likelihood trees were obtained using PhyML (13). The best evolutionary model for the sequences according to jModelTest (23) and considering the Akaike information criterion (AIC) turned out to be the generalized time-reversible model of substitution with a gamma distribution (GTR⫹G) accounting for heterogeneity in evolutionary rates among sites. This model was used to derive the maximum-likelihood trees obtained in this work. The whole or partial rtxA1 genes of V. cholerae (strains M66-2, MO10, N16961, O395, and AM19226), V. splendidus (strain 12B01), L. anguillarum (strain M93), and A. hydrophila (strain ATCC 7966) were used as outgroups. The congruence among phylogenetic reconstructions obtained with different alignments was checked using Shimodaira-Hasegawa (SH) (31) and expected-likelihood weight (ELW) tests as implemented in TreePuzzle, version 5.2 (29, 32). A maximum-parsimony network using the Wagner method (9, 16) was constructed using the program MOVE in the PHYLIP package (PHYLIP, Phylogeny Inference Package, version 3.6[http://www.evolution.genetics.washington .edu]) from the presence/absence of the main domains in the rtx gene to show the relationships among the above genes. Nucleotide sequence accession numbers. Sequences of the whole rtxA1 genes determined in this study were deposited in the GenBank under accession numbers FJ002578 to FJ002581 and GU452642 to GU452644. Sequences of the common fragment of the rtxA1 gene were deposited under accession numbers GU452542 to GU452641 and GU452645 to GU452660.

MATERIALS AND METHODS Bacterial strains and growth conditions. The V. vulnificus strains used in this study are listed in Table 1. Strains were routinely grown in Trypticase soy broth or agar plus 5 g liter⫺1 NaCl (TSB-1 or TSA-1, respectively; Pronadisa, Spain) at 28°C for 24 h. For genomic DNA extraction, bacteria were grown overnight with shaking at 28°C. The strains were maintained both as lyophilized and as frozen stocks at ⫺80°C in marine broth (Difco) plus 20% (vol/vol) glycerol. DNA extractions. Genomic DNA was extracted according to the mini-prep protocol for genomic DNA extraction (3). Plasmid DNA was extracted following the TENS protocol (36) with the modifications of Roig and Amaro (24). Location of the rtxA1 gene. The rtxA1 gene was located in the chromosome or in the plasmids of the strains CECT 529T, YJ106, CG100, CECT 4606, CECT 5689, A14, CECT 4999, CECT 4604, Riu-2, CECT 4174, 95-8-161, 95-8-6, 11028, and CT218 (Table 1) by Southern hybridization, essentially as described by Roig and Amaro (24). Briefly, DNA samples were subjected to electrophoresis in 0.7% (wt/vol) agarose gels (molecular grade; Roche) and transferred to the membranes (zeta-probe blotting membrane; Bio-Rad) with a vacuum blotter (model 785; Bio-Rad) following the manufacturer’s instructions. The samples were genomic DNA extractions (3) and plasmid extractions (24). As molecular weight markers (MWM) Generuler DNA ladder mix and plasmids of the E. coli strain 39R861 were used. Probes of 1,087 nucleotides (nt) were generated from PCR products obtained with primers 2F and 2R (Table 2) labeled with digoxigenin (DIG)-11-dUTP (Roche, Switzerland), and the probes against the MWM were derived from the MWM using a DIG High-Prime kit (Roche, Switzerland. Hybridization was performed using a DIG Easy Hyb Granules kit (Roche, Switzerland), and color was developed with anti-digoxigenin-alkaline phosphatase (AP) Fab fragments together with blocking reagent (Roche, Switzerland), which were used following the manufacturer’s instructions. PCR and DNA sequencing. To sequence the whole rtxA1 gene in the strains CECT 529T, 94-9-119, Riu-1, A14, 95-8-161, 11028, and CT218 and the common fragment (an 828-bp fragment from the 5⬘ region of the gene obtained with the primers 12F and 12R) in the V. vulnificus collection of strains, PCRs were performed using the primers listed in Table 2. These primers were designed from the published genome sequences of two human biotype 1 strains (YJ016 and CMCP6) (7, 15) and of the plasmids of two biotype 2 strains (CECT 4999 and CECT 4602) (18). Reaction mixtures for PCRs (120 ␮l) contained 0.48 mM forward and reverse primer, 2.5 units of Taq DNA polymerase (5 U/␮l GoTaq; Promega), 18 ␮l of 5⫻ Taq reaction buffer (Gotaq Green; Promega), 1.3 mM MgCl2, 0.75 mM deoxynucleoside triphosphate (dNTP) mix (Promega), and 1 ␮l of DNA. PCRs were performed in a Techne thermocycler (TC-412). The reaction started with 10 min of denaturation at 94°C and was followed by 45 cycles of 40 s of denaturation at 94°C, 45 s of annealing at 52°C, and 90 s of extension at 72°C. An additional extension at 72°C for 10 min completed the reaction. A negative (no template DNA) and a positive (purified DNA of positive isolates) control were included in each PCR batch. Amplified products were separated by electrophoresis at 100 V for 30 min in a 2% (wt/vol) agarose (Low EEO; Pronadisa) gel with Tris-acetate-EDTA (TAE) buffer (Real, Spain) and were visualized by staining with ethidium bromide. Highly conserved regions, repetitive regions, and protein domain analysis. The highly conserved terminal regions of the rtxA1 genes sequenced in this work were identified by alignment of the amino acid sequences of the following: MARTXVc toxin of strains M66-2, AM19226, MO10, O395, and N16961 (YP_002810174.1, YP_002175744.1, ZP_05237921.1, NC_009457.1, NC_ 002505.1, respectively); MARTX toxin of L. anguillarum (MARTXLa) of isolate M93 (EU_155486.1); MARTX toxin of V. splendidus (MARTXVs) of strain 12B01 (ZP_00989505.1); and MARTXVv toxin of YJ016, CMCP6, CECT 4999, and CECT 4602 (YP_001393222.1, YP_001393065.1, NP_937086.1, and NP_762440.1, respectively) using the program AlignX from Invitrogen. Repetitive sequences were identified by searching the consensus sequences in the alignments using the MEGA program as sequence editor (33) and the AlignX (Invitrogen) and Expasy programs (12). Finally, the domains were determined as

659

RESULTS Location of the gene rtxA1. All V. vulnificus isolates tested of the three biotypes carried a copy of the rtxA1 gene in the chromosome (described in the literature as chromosome II or “small chromosome” [7, 15]), and those of biotype 2 had an additional copy in the plasmid of 68 to 70 kb (see Fig. S1 in the supplemental material). MARTXVv toxin domain organization. Figure 1 shows the comparison of conserved and nonconserved domains of the predicted MARTX toxin for V. vulnificus (four sequences previously published and seven obtained in this work) compared with those of MARTXVc toxin (strains M66-2, AM19226, MO10, O395, and N16961), MARTXLa toxin (L. anguillarum) (strain M93), MARTXVs toxin (strain 12B01), and the MARTX toxin of A. hydrophila (MARTXAh) (strain ATCC 7966). Tables 3 and 4 summarize the main structural characteristics of these proteins in V. vulnificus. The predicted proteins vary in length, ranging from 4,596 aa for biotype 2 isolates to 5,207 aa for three biotype 1 isolates (Fig. 1). The multiple alignment of MARTXVv toxins showed that they could be divided in three regions, two corresponding to conserved sequences among biotypes, located at the N- and C-terminal ends, respectively, and another of variable sequence, located in the central portion of the protein (Fig. 1). Within MARTXVv toxin, the N- and C-terminal ends were very similar (93 to 100% and 90.5 to 100%, respectively) (see Table S1 in the supplemental material). The similarity of MARTXVv toxin to the MARTXVc, MARTXVs, and MARTXLa toxins was also high (61.1 to 100%) (with the exception of the strain O395, which lacks nearly all of the Nterminal region although similarity was even lower with MARTXAh toxin [46.5 to 57.2%]) (see Table S1 in the supplemental material). According to Satchel (27), the MARTX

6R_Bt1

4F_Bt2

3R_Bt2

3F_Bt1

2R

2F

1R

1F

CAACAACCCTACCGTGG CCGCAGGTATGTTGGGC

CACCGCCCATCACCGCAA

CCTACGGCAATGGCA CACT CACCGCTTGAACATCG AAGTGGCGATGGCAA TGTG

10R

11F 11R

12F

12R

13F 13R

12R

12F

11F 11R

10R

Sequence (5⬘–3⬘)

94-9-119

CCTACGGCAATGGCA CACT CACCGCTTGAACATCG AAGTGGCGATGGCAA TGTG

CACCGCCCATCACCGCAA

CAACAACCCTACCGTGG CCGCAGGTATGTTGGGC

CCTTCACCATTCACGC CATA GGTGTTGTGGCGGCAGC

13F 13R

12R

12F

11F 11R

10R

10F

GAAGAAGTGGGCTCGGAC 9R

8F1_Bt1

8R_Bt1

8F_Bt1

7R_Bt1

7F_Bt1

6R_Bt1

CCTACGGCAATGGCA CACT CACCGCTTGAACATCG AAGTGGCGATGGCAA TGTG

CACCGCCCATCACCGCAA

CAACAACCCTACCGTGG CCGCAGGTATGTTGGGC

CCTTCACCATTCACGC CATA GGTGTTGTGGCGGCAGC

11F 11R

10R

10F

9F 9R

8R2_Bt1

8F2_Bt1

GAAGAAGTGGGCTCGGAC 8R1_Bt1

8R1_Bt1 CGATTATGACTGGGACAA ACGC 8F2_Bt1 CCAGACCAGAAGGAAACA ATGC 8R2_Bt1 GAAATCTGAAGCGGGT GTGG 9F GTGTGCGTCAGATTCGG

GGATAACAAGGTGCTC TCGC 8F1_Bt1 GATACTCCGTCAGGGCGT

8R_Bt1

8F_Bt1

6F_Riu1

5R_Riu1

5F_Riu1

4R_Bt2

4F_Bt2

3R_Bt2

3F_Bt2

2R

2F

1R

1F

0R

Sequence (5⬘–3⬘)

CTTCATCTGCTTGGCTGGC

GACAAGGCATCAAGGAG

CGGTCAAGCAATAAGC CAGA GTATCTGGACGGAAACGG

GGTGGATGCGAAAGC GACT CTGTCTGAGTAAAGCG TTGG CAACCTGCGTGGCTATGG

CGTCTACAGCCAGTTCAG

CCACACGGTTGAAGTT GCCA ACGCCAGAAGATTTGCC TCGT GACGAGGCAAATCTTCT GGCG GATGGATGCGAATGGTCTT

CAGTCGTTTGTCTTTGGTG

GTGACGCTCTTTGGTGCTC

13R

13F

12F 12R

11R

11F

10R

10F

9R

9F

CAACAACCCTACCGTGG 14F CCGCAGGTATGTTGGGC 14R

CTTCATCTGCTTGGC TGGC GGATAACAAGGTGCTC TCGC GATACTCCGTCAGG GCGT CGATTATGACTGGGACA AACGC CCAGACCAGAAGGAAA CAATGC GAAATCTGAAGCGGGT GTGG GTGTGCGTCAGATTCGG GAAGAAGTGGGCTC GGAC CCTTCACCATTCACGC CATA GGTGTTGTGGCGGCAGC

AAGTGGCGATGGCAA TGTG CCTTGGTTGGTTTCAT CTACTACGAAGATGAGCAC

CACCGCTTGAACATCG

CACCGCCCATCACCGCAA AGTGTGCCATTGCCGTAGG

CCGCAGGTATGTTGGGC

CAACAACCCTACCGTGG

CCTTCACCATTCACGC CATA GGTGTTGTGGCGGCAGC

GAAGAAGTGGGCTCGGAC

GTGTGCGTCAGATTCGG

CGTTAGCAGCAGAATCG 6R_Riu1 GTGGTATGTTCCTTGAC GCGT TCTGAAGGCTATGCGAG 6R2_Riu1 GCTGGTGGTTTGTTAGA

GCGGTGTCTGGCTTATT

Primer 0F

Riu-1

ROIG ET AL.

13F 13R

CCTTCACCATTCACGC CATA GGTGTTGTGGCGGCAGC

10F

10F

GAAGAAGTGGGCTCGGAC 9R

8R1_Bt1 CGATTATGACTGGGACAA ACGC 8F2_Bt1 CCAGACCAGAAGGAAACA ATGC 8R2_Bt1 GAAATCTGAAGCGGGT GTGG 9F GTGTGCGTCAGATTCGG

9R

8R_Bt2 GTCCGTTGCGAAGTCTG AAGC 9F GTGTGCGTCAGATTCGG

7R_Bt2 CCAGTCAAACCACCA AAGC 8F_Bt2 GCCCATCATCACTCACC

7F_Bt2 GTCTTCCTGTAGTCCTTTC

Primer

GTGACGCTCTTTGGTGCTC 0F

Sequence (5⬘–3⬘)

11028

GTGACGCTCTTTGGT GCTC CAGTCGTTTGTCTTTGGTG 0R CAGTCGTTTGTCTTTGGTG 0R CAGTCGTTTGTCTTT GGTG CCACACGGTTGAAGTT 1F CCACACGGTTGAAGTT 1F CCACACGGTTGAAGTT GCCA GCCA GCCA ACGCCAGAAGATTTGCC 1R ACGCCAGAAGATTTGCC 1R ACGCCAGAAGATTTGCC TCGT TCGT TCGT GACGAGGCAAATCTTCT 2F GACGAGGCAAATCTTCT 2F GACGAGGCAAATCTTCT GGCG GGCG GGCG GATGGATGCGAATGG 2R GATGGATGCGAATGG 2R GATGGATGCGAATGG TCTT TCTT TCTT GATGTCGCCTTCCGCAA 3F_Bt1 GATGTCGCCTTCCGCAA 3F_Bt1 GATGTCGCCTTCCGCAA TACC TACC TACC GGTGGATGCGAAAGC 5R_Bt1 CTGAGCAATCCAACCCT 3R_Bt1 GGAGTCACAGTCAT GACT TTAC TGGC CTGTCTGAGTAAAGCG 6F_Bt1 CCTGCCACACTTTCATC 4F2_Bt1 AATGTTGTTCGCAGCA TTGG CCAG GCGGTGTCTGGCTTATT 6R_Bt1 GCGGTGTCTGGCTTATT 4R2_Bt1 CAGGGCGTAATCA AGCG CGTTAGCAGCAGAATCG 7F_Bt1 CGTTAGCAGCAGAATCG 5F_Bt1 CGCATCAAAACCAAGG GCGT GCGT TCAC TCTGAAGGCTATGCGAG 7R_Bt1 TCTGAAGGCTATGCGAG 5R_Bt1 CTGAGCAATCCAACCCT TTAC CTTCATCTGCTTGGCTGGC 6F_Riu1 GACAAGGCATCAAGGAG 6F_Bt1 CCTGCCACACTTTCATC

Primer

GTGACGCTCTTTGGTGCTC 0F

Sequence (5⬘–3⬘)

CECT 529T

TABLE 2. Primer pairs for sequencing rtxA1 in various strains

GGATAACAAGGTGCTC TCGC 8F1_Bt1 GATACTCCGTCAGGGCGT

8F_Bt1

6F_Bt2 GGTATCAAGAACATC AGTG 6R_Bt2 TGCTCGTCTCATTGCTTTC

8R_Bt1

7R_Bt1

5R_Bt2 ATGCGAAGAACCCAATG

5F_Bt2 CGCTTTGGTCATACTTGGC 7F_Bt1

3R_Bt2 GGTGGATGCGAAAGC GACT 4F_Bt2 CTGTCTGAGTAAAGCG TTGG 4R_Bt2 CAACCTGCGTGGCTATGG

CCACACGGTTGAAGTT GCCA 1R ACGCCAGAAGATTTGCC TCGT 2F GACGAGGCAAATCTTCT GGCG 2R GATGGATGCGAATGG TCTT 3F_Bt2 CGTCTACAGCCAGTTCAG

1F

CAGTCGTTTGTCTTTGGTG 0R

Primer

0R

Sequence (5⬘–3⬘)

GTGACGCTCTTTGGTGCTC 0F

0F

Primer

CECT 4999 (C), A14 (P), and 95-8-161 (P)a

660 APPL. ENVIRON. MICROBIOL.

MARTX TOXIN EVOLUTION IN VIBRIO VULNIFICUS

14F 14R

13F 13R

12R CTACTACGAAGATGA GCAC

a

C, rtxA1 copy located on a chromosome; P, rtxA1 copy located on a plasmid.

14R CTACTACGAAGATGA GCAC CTACTACGAAGATGA GCAC 14R

14R

CCTTGGTTGGTTTCAT 14F

14F

CCTTGGTTGGTTTCAT

14F

CCTTGGTTGGTTTCAT

12F

CACCGCCCATCACC GCAA CCTACGGCAATGGCA CACT CACCGCTTGAACATCG AAGTGGCGATGGCAA TGTG CCTTGGTTGGTTTCAT CTACTACGAAGATGA GCAC

VOL. 77, 2011

661

toxin subfamily carries three different types of repetitive sequences in the N- and C-terminal ends, which share a G-7XGXXN central motif. We were unable to find these repeats in the same number and positions described by Satchell (27) in the strains we analyzed, which showed the following repetitive sequences: A-type repeat, GXXGXXXXXG; B-type repeat B.1, TXVGXGXX; B-type repeat B.2, GXANXXTXX; and C-type repeat, GGXGXDXXX. The results obtained for the redefined repeat sequences are shown in Table 3 and Fig. 1. The new repeats, their positions as well as the entire N and C termini, were highly conserved across the species and even within the genus (Fig. 1; see also Tables S2 to S5 in the supplemental material). The internal region of the MARTXVv toxin was very variable, showing a global identity and similarity of 18% and 44.5%, respectively (see Table S1). However, these values rose to over 95% for the group formed by the human biotype 1 isolates (YJ016, CMCP6, and 94-9-119) and to 100% for biotype 2 isolates (see Table S1). The functional domains and their positions are presented in Fig. 1 and Table 4. Three different modular types of MARTXVv toxins (called types I, II, and III) (Fig. 1) were found, all of which possessed a CPD domain, an ␣/␤-hydrolase domain, and a newly reassigned domain, denoted Efa/LifA (common domains). This new domain corresponds to McfDUF (27) and was renamed because of its very high similarity to the Efa/LifA-like protein of Escherichia coli O127:H6 E2348/69 (YP_002328637.1) (E-values between 2e⫺64 and 3e⫺65 for the domains of the different MARTXVv toxins). MARTXVv toxin type I was similar to the MARTXVvbt1 toxin described by Satchell (27). This type was present in human isolates of biotype 1, strains YJ016, CMCP6, and 94-9-119, and contained the common domains plus a DUF domain, with no similarity to any known protein (V. vulnificus DUF [DUFVv]), a RID domain, and a newly reassigned domain called rtxA of Photorhabdus asymbiotica (rtxAPA). This new domain corresponds to PMT C1/C2 (27) and was renamed because of its similarity to the rtxA protein of P. asymbiotica (access number YP_003040934.1) (E-values of 3e⫺158 to 1e⫺170 for the domains of the different MARTXVv toxins). MARTXVv toxin type II was represented by the type strain of the species, isolate CECT 529T of biotype 1, an environmental isolate of biotype 1, Riu-1, and the isolate of biotype 3, 11028. Type II differed from type I in that it lacked the rtxAPA domain, and, consequently, it was smaller (around 4,704 aa). MARTXVv toxin type III was identical to MARTXVvbt2 toxin, a plasmidic variant described by Lee et al. (18), which, in addition to the common domains, had an ACD domain and two copies of the Efa/LifA domain flanking the common ␣/␤-hydrolase domain. MARTXVc toxin of the five strains analyzed showed the same domain organization although MARTXVc toxin of strain O395 lacked around 1,910 aa in the N-terminal region (Fig. 1). MARTXLa toxin showed an organization similar to that of MARTXVv toxin type II except that it lacked the DUFVv domain, and MARTXVs toxin showed an organization similar to that of MARTXAh toxin and different from that of the MARTXVv and MARTXVc toxins with a replacement of the ACD domain by a domain of unknown function different from that of V. vulnificus (V. splendidus DUF [DUFVs]).

662

ROIG ET AL.

APPL. ENVIRON. MICROBIOL.

FIG. 1. Domain organization of the MARTX toxin structures from the analyzed genomes. The external regions, the repeats (vertical lines), and the internal domains for each toxin are color coded as indicated at the bottom of the figure. Diagrams are drawn to scale.

Phylogenetic reconstruction. The different phylogenetic trees derived from the complete nucleotide sequences from V. vulnificus and the outgroup species were coincident in revealing a topology in which V. cholerae strains formed a wellsupported, monophyletic group quite distinct from its sister group, which included V. vulnificus and two other outgroups, V. splendidus and L. anguillarum, which were associated with V. vulnificus strain Riu-1 (Fig. 2). As a result of the variable patterns of modular organization among the different strains and species analyzed (Fig. 1), the phylogenetic reconstruction derived from the multiple alignment of the complete gene may not be an accurate representation of the evolutionary relationships among the different sequences considered. In fact, it was not possible to obtain a reliable alignment including the out-

group species A. hydrophila. Hence, alternative strategies were used to decipher the evolutionary history of rtxA1 in these taxa. To this end, the phylogenetic trees from the external and internal regions of rtxA1 were analyzed separately. As can be seen in Fig. 3, they were not congruent with one another or with the sequence from the whole gene (Fig. 2 and 3). Thus, all V. vulnificus isolates grouped together with L. anguillarum and separately from V. cholerae and V. splendidus when only the external 5⬘-terminal regions were considered (Fig. 3A); they formed a well-supported monophyletic group highly related to the V. cholerae group, with V. splendidus and L. anguillarum occupying external positions, when only the external 3⬘-terminal regions were considered. In contrast, V. vulnificus strains were divided into two groups, one formed by biotype 1 and 3

VOL. 77, 2011

MARTX TOXIN EVOLUTION IN VIBRIO VULNIFICUS

663

TABLE 3. Repeat sequences in MARTXVv toxins Profile of repeat sequences by type Strain (location of rtxA1 gene copy)a

V. vulnificus YJ016 CECT 529T Riu-1 CMCP6 94-9-119 A14 (P) 95-8-161 (P) CECT 4999 (C) CECT 4602 (P) CECT 4999 (P) 11028 V. cholerae M66-2 A. hydrophila ATCC 7966 L. anguillarum M93 a

A type A.1

B type B.1

B type B.2

C type C.1

No. of repeats

Location (aa)

No. of repeats

Location (aa)

No. of repeats

Location (aa)

No. of repeats

Location (aa)

9 9 9 9 9 9 9 9 9 9 9

30–1483 30–1480 30–1483 30–1483 30–1483 30–1485 30–1485 30–1485 30–1485 30–1485 30–1483

29 29 30 29 29 29 29 29 29 29 29

729–1330 726–1328 724–1325 729–1330 729–1330 731–1332 731–1332 731–1332 731–1332 731–1332 729–1330

18 18 18 18 18 18 18 18 18 18 18

742–1287 739–1284 756–1301 742–1287 742–1287 744–1289 744–1289 744–1289 744–1289 744–1289 742–1287

9 9 9 9 9 9 9 9 9 9 9

4584–5063 4078–4557 4096–4572 4584–5063 4584–5063 3973–4452 3973–4452 3973–4452 3973–4452 3973–4452 4081–4560

9 10 9

30–1478 30–3012 30–1483

29 27 29

726–1328 704–1286 729–1330

18 17 18

739–1284 717–1224 742–1287

9 9 9

3923–4402 4072–4542 3783–4256

C, chromosome; P, plasmid.

isolates together with L. anguillarum and V. splendidus as a basal taxon to this group and the other formed by biotype 2 isolates when only the internal region was considered (Fig. 3C). The phylogenetic reconstruction from the common fragment of 828 nt corresponding to the 5⬘ end of the rtxA1 gene is shown in Fig. S2 in the supplemental material. The fragment was highly similar (95 to 100%) although there were differences between strains that were not related to biotype, with isolate Riu-1 presenting the largest number of substitutions (see Fig. S2). Nevertheless, the similarity between fragments was too high to infer any conclusive results from the evolutionary history of this portion of the gene. A maximum-parsimony network showing the relationships among the genes from the presence/absence of the domains in the rtxA1 gene is shown in Fig. 4. In this analysis, we included A. hydrophila, which was too divergent at the sequence level of the complete gene to be incorporated in the previous maximum-likelihood tree, and excluded V. splendidus because of its different modular organization (Fig. 1). The network suggested a very plastic modular organization in this gene, with a minimum of seven independent gains/losses of domains. Of these, only one homoplasy (the independent gain of domain rtxAPA in A. hydrophila and three strains of biotype 1) was found. Furthermore, V. vulnificus represented the most divergent gene organizations (with four or five gains/losses of domains) among the sequences included in the analysis. The next issue we pursued was an individual analysis of each of these domains, for which we used the corresponding multiple nucleotide sequence alignments (Fig. 3D). The only domains common to all the strains and species considered were the ␣/␤-hydrolase and CPD, and their phylogenetic histories were not congruent, again, to one another or with that derived with the complete sequences for this gene (Fig. 2 and 3). The most relevant discrepancies were for the ␣/␤-hydrolase domain, for which the sequence from Riu-1 was identical to sequences from biotype 2 strains, and for the CPD domain, for which the three biotype 1 strains 94-9-119, CMCP6, and YJ016

grouped externally to the rest with V. splendidus, placing V. cholerae and L. anguillarum as members of the ingroup (Fig. 3D). The differences found among biotype 1 sequences for the ␣/␤-hydrolase domain could be explained by an HGT event of this domain from one biotype 2 strain to Riu-1. The simplest explanation for the placement of the three biotype 1 sequences in the CPD domain phylogenetic tree is also their acquisition from an unidentified donor by an HGT event. To further verify these hypotheses, we performed reciprocal tests of congruence for the ␣/␤-hydrolase and CPD domains and the complete gene sequences using the 13 strains common to all three data sets. The results of the Shimodaira-Hasegawa (SH) and expected-likelihood weight (ELW) tests are summarized in Table 5. All the comparisons were highly significant for both tests with only one exception, corresponding to the results of the SH test of the topology obtained with the complete gene sequences versus the CPD domain alignment and its maximumlikelihood tree. This indicates that the phylogenetic reconstructions obtained from each alignment were not congruent with one another, thus providing statistical support for the HGT hypothesis for both domains. The remaining domains considered in this gene had a less constant distribution in the different strains and species considered. Biotype 2 strains presented two copies of the Efa1/LifA-like domain. The phylogenetic tree derived for the sequences in this domain suggests that the duplication probably occurred before the divergence of biotype 2 strains (Fig. 3D). This was not the only salient feature in the evolution of this domain. Here, the biotype 3 sequence (from strain 11028) grouped with one of the paralogous copies from biotype 2, the one that was not present in biotype 1 strains. Furthermore, two biotype 1 strains (CECT 529T and 94-9-119) did not group with the remaining sequences from this biotype and diverged closer to the root of the V. vulnificus group than they did in the maximum-likelihood tree for the whole-gene alignment. Interestingly, one outgroup sequence, corresponding to L. anguillarum, was very similar to, and grouped with, sequences from V. vulnificus biotype 1 strains. This resem-

664

ROIG ET AL.

APPL. ENVIRON. MICROBIOL. TABLE 4. Putative functional domains in MARTXVv toxins

Strain (location of rtxA1 gene copy)a

Protein

Domain nameb

Gene

Position (aa)

% Identity

c

Position (aa)

% Identityc

YJ016

DUF RID ␣/␤-Hydrolase Efa1/LifA rtxAPA CPD

1957–2345 2368–2915 2918–3277 3276–3519 3604–4080 4082–4188

ND 87.6 85.8 54.1 57 75.7

5871–7035 7081–8719 8709–9609 9844–10569 10789–12201 12202–12525

ND 82.4 89.7 55.5 61.4 70.6

CMCP6

DUF RID ␣/␤-Hydrolase Efa1/LifA rtxAPA CPD

1957–2345 2368–2915 2918–3277 3276–3519 3604–4080 4082–4188

ND 87.6 85.8 54.1 57 75.7

5871–7035 7081–8719 8709–9609 9844–10569 10789–12201 12202–12525

ND 82.2 89.1 55.6 61.5 70

94-9-119

DUF RID ␣/␤-Hydrolase Efa1/LifA rtxAPA CPD

1957–2345 2368–2915 2918–3277 3276–3519 3604–4080 4082–4188

ND 86.5 85.2 53.4 57 76.6

5871–7035 7081–8719 8709–9609 9844–10569 10789–12201 12202–12525

ND 81.1 89.5 55.6 61.4 70.3

CECT 529T

DUF RID ␣/␤-Hydrolase Efa1/LifA CPD

1954–2342 2365–2912 2915–3224 3273–3516 3576–3682

ND 86.3 86.1 53.4 86.9

5862–7026 7072–8710 8700–9600 9839–10548 10687–11007

ND 80,9 89.6 56.2 75.5

Riu-1

DUF RID ␣/␤-Hydrolase Efa1/LifA CPD

1971–2359 2382–2929 2932–3241 3290–3533 3593–3699

ND 80.1 86.8 54.3 86

5913–7077 7123–8761 8751–9651 9870–10599 10738–11058

ND 78.9 90 57.0 75.8

CECT 4999 (C) CECT 4999 (P) CECT 4602 (P) A14 (P) 94-9-161 (P)

ACD Efa1/LifA (copy 1) ␣/␤-Hydrolase Efa1/LifA (copy 2) CPD

1958–2421 2492–2735 2810–3119 3168–3411 3471–3577

89 53.9 86.5 54.3 86

5856–7225 7495–8205 8385–9285 9523–10233 10372–10692

82.5 56.7 97.1 56.7 76.1

11028

DUF RID ␣/␤-Hydrolase Efa1/LifA CPD

1957–2345 2368–2915 2918–3227 3276–3519 3579–3685

ND 83.6 86.1 53.9 86

5871–7035 7081–8719 8709–9609 9844–10569 10695–11016

ND 81.5 90.2 56.3 76.1

a

C, chromosome; P, plasmid. CPD, autocatalytic cysteine protease domain; Efa/LifA, LifA-like protein of Escherichia coli O127:H6 E2348/69; RID, Rho-GTPase inactivation domain; DUFVv, domain with an unknown function; rtxPA, rtxA of P. asymbiotica; ACD, actin cross-linking domain. c Percent identity with the V. cholerae N16961 rtxA gene for ACD, RID, CPD, and ␣/␤-hydrolase domains, with the E. coli Efa/LifA gene for the Efa1/LifA domain, and with the P. asymbiotica rtxA gene for rtxAPA domain. ND, no similarity detected. b

blance was also revealed in the previous analysis of the distribution of domains (Fig. 4) and by the phylogenetic tree from the internal part of the gene (Fig. 3C). Domains ACD, DUFVv, RID, and rtxAPA were present in only a few strains (Fig. 3D), and their corresponding phylogenetic trees were not too informative on the evolution of the rtxA1 gene. ACD was present in only biotype 2 strains and in the outgroups V. cholerae and A. hydrophila, whereas RID was present in biotype 1 and 3 strains and in the outgroups V. cholerae and L. anguillarum (Fig. 1 and 3D). In both cases, the within-biotype distribution of evolutionary distances was very similar to that observed in the analysis of the complete gene (Fig. 2). Domain rtxAPA was present in only three biotype 1 strains and in three outgroups (P. asym-

biotica, V. splendidus, and A. hydrophila) but not in V. cholerae. Finally, DUFVv was present in only V. vulnificus biotypes 1 and 3. In fact, this domain did not show similarity to known proteins by BLAST search. Similar to other portions in this gene, the strain Riu-1 presented a remarkable accumulation of mutations in comparison to other strains of the same subtype. DISCUSSION In the first part of this work, we have analyzed the structural diversity of MARTX toxins in V. vulnificus biotypes by comparing them to other MARTX toxins. We have found three different modular types according to the domains identified in

VOL. 77, 2011

MARTX TOXIN EVOLUTION IN VIBRIO VULNIFICUS

665

FIG. 2. Maximum-likelihood tree derived from the aligned regions from complete rtxA1 gene sequences (20,394 nt) using the GTR⫹G model of evolution. Bootstrap support values higher than 80% are indicated in the corresponding nodes.

the internal part of the protein. In contrast, the external regions are highly conserved within the species and even the genus, which correlates with the proposed function for these regions, that is, to interact with the membrane of eukaryotic cells allowing the translocation of effector domains to the cytoplasm. Upon translocation, CPD is activated to process the MARTX toxin, thereby releasing the internal domains, which alter cell function (30). We have redefined two of the internal domains, Mcf-DUF and PMT C1/C2 (originally described by Satchell [27]) as Efa/LifA and rtxPA, respectively, because of the new similarities revealed by BLAST searches. Lymphostatin (Efa/LifA protein) is a common toxin in Gram-negative bacteria with multiple functions, cell adhesion, immunosuppression, and disruption of epithelial barrier function via the modulation of Rho GTPase activities, while no function has been reported for the rtxA of P. asymbiotya. Biotype 1 isolates feature MARTXVv toxin type I, a type already described by Satchell (27), and type II, which is simpler than the former because it lacks the rtxPA domain. In addition, all V. vulnificus biotype 2 isolates, irrespective of their serovar, have MARTXVv toxin type III described by Lee et al. (18) and harbor two copies of the gene, one plasmidic and the other chromosomal. It has been proposed that MARTXVv toxin type I is the main factor for human virulence in V. vulnificus since mutations in this gene practically abolish virulence in mice (19). However, we have found that other human isolates (the biotype 3 isolate) can produce a different type of toxin that lacks the rtxPA domain, which would suggest that this domain is not necessary for virulence. In addition, we have found that fish-pathogenic isolates (biotype 2) show a different type, which could be related to specific virulence for fish. The role of the new types of MARTXVv toxins in virulence as well as the reason for gene duplication in the biotype 2 remains to be elucidated.

In the second part of this work, we have analyzed the rtxA1 gene using different data sets demonstrating that MARTXVv toxins are mosaics of segments with different evolutionary histories. Previous studies of the evolution of V. vulnificus based on sequence-based typing of housekeeping genes suggest that the species can be subdivided into two recent and rapidly expanding phylogenetic clusters (8). One lineage would correspond to biotype 1 isolates, mostly from human sources, including the strains YJO16 and CMCP6 used in this study. The other lineage would be formed by biotype 1 and 2 isolates from various environments, including diseased fish. The phylogenetic position of biotype 3 isolates, which are genetically identical, would be variable depending on the gene and the model and method of analysis (6). An ongoing phylogenetic study of in lab using sequence-based typing of housekeeping and virulence-related genes (26) with the same strains included in the present work suggests that V. vulnificus species are subdivided in three lineages, one including biotype 1 and 2 from the fish farming environment, another including biotype 3 isolates, and a third including biotype 1 strains from human blood and the environment. Interestingly, none of the evolutionary histories inferred from the rtxA1 gene matched the previously hypothesized evolutionary histories of V. vulnificus (6, 8, 26). The phylogenetic tree derived from the complete rtxA1 gene (Fig. 2) presented an unlikely evolutionary scenario: V. cholerae isolates conformed to a well-supported, monophyletic group, but this was not the case for V. vulnificus, which included two outgroups, V. splendidus and L. anguillarum, both related to strain Riu-1. The inclusion of outgroup species within the ingroup can be due to several reasons, which we tried to analyze next by dividing the gene in three different portions. These fragments showed very different evolutionary relationships, which also differed from the previous result with the complete gene. The 5⬘ end (Fig. 3A) and the internal portion (Fig. 3C)

666

VOL. 77, 2011

FIG. 4. Maximum-parsimony reconstruction of the presence/absence of modules in the central portion of rtxA1 genes in the genomes analyzed. Gains or losses of domains are indicated next to each branch.

were partially coincident, but the placement of V. splendidus differed in the two trees, being an outgroup to all the sequences in the N-terminal coding portion of the gene and basal to a subgroup of V. vulnificus (including biotypes 1 and 3 and L. anguillarum) in the central portion. The 3⬘-end coding portion of the gene provided yet another history, but in this case the outgroup species appeared where they were supposed to (Fig. 3B). These results indicated that rtxA1 of V. vulnificus has a complex evolutionary history, and this is also probably true for other closely related species. The modular structure of the gene may be partially responsible for this complexity (Fig. 1) and, in consequence, our next step was to investigate the evolutionary history of these modules. The maximum-parsimony network for the module composition helped to disentangle the complex history of this protein (Fig. 4). The analysis of gains/losses of modules revealed that V. vulnificus biotype 2 isolates are closer to A. hydrophila than to the other biotypes of the species and that there have been several gain/loss events between biotype 2 and biotypes 1 and 3. The most likely explanation for this complex evolutionary history involves HGT events from other bacteria and subsequent recombination, yielding new combinations, which might have been favored by selection. The rtxA1 gene belongs to the segmentally variable gene family, previously defined as mosaics of one or more rapidly evolving, variable regions interrupted by conserved regions. Many of the mosaic genes identified

MARTX TOXIN EVOLUTION IN VIBRIO VULNIFICUS

667

encode proteins involved in host-pathogen interactions, defense mechanisms, and intracellular responses to external changes, and this is the case for MARTXVv toxin. We have detected by the bioinformatic analysis evidence of at least two HGT events affecting the ␣/␤-hydrolase and CPD domains and probably also to the Efa/LifA domain a fragment duplicated in biotype 2 isolates. In the cases of the ␣/␤-hydrolase and Efa/ LifA domains, the transferred DNA was probably from the same species, whereas in case of the CPD domain, it probably came from a different, unknown donor. With regard to the domains ACD, DUFVv, RID, and rtxAPA, we can suggest that (i) the DUFVv domain is exclusive of V. vulnificus biotype 1 and does not show similarity to known proteins, (ii) the rtxAPA domain of V. vulnificus is too different from rtxA of P. asymbiotica to have been received directly from this donor by HGT, and (iii) the ACD and RID domains have not been received from the species considered (V. cholerae or V. cholerae/L. anguillarum, respectively) because the distance for these domains is similar to that for the rest of the studied common domains. In the light of these results, the hypothesis of the origin of the ACD domain of MARTXVv toxin type III from the ACD domain of MARTXVc toxin has to be discarded. V. vulnificus expresses a natural competence system induced by chitin (14) that has also been described in V. cholerae (21), which could work under natural conditions and favor the exchange of genetic material. In addition, the species possesses conjugative plasmids, which can be transmitted between strains and can even mobilize other resident plasmids (such as the virulence one) by forming cointegrates (18). Both genetic exchange mechanisms can result in coinfection with two different but related rtx genes, which could recombine, resulting in a new, different rtx gene. This process is exemplified in the biotype 2 of the species, which has a duplicated rtxA1 gene in the chromosome and in the plasmid and produces a MARTX toxin that is completely different from the toxins of biotypes 1 and 3. The presence of identical duplicate genes in the genome of the biotype 2 isolates suggests that either the acquisition has been recent or a strong purifying selection is acting against mutations that modify gene function. The phylogenetic study performed in our lab (26) demonstrates that biotype 2 is polyphyletic and suggests that it has emerged by acquisition of a virulence plasmid from V. vulnificus isolates from fish farms. According to this hypothesis, the acquisition of the plasmid could have occurred before or after divergence from the common ancestor for biotype 2 strains. The phylogenetic analysis shown here suggests that the duplication of the Efa/LifA domain likely occurred after the divergence between biotypes 1 and 2 and before the expansion of the biotype 2 clonal complex and, in consequence, more or less contemporaneously to the acquisition of the virulence plasmid mentioned above. This acquisition might have favored recombination events between the chromosomal and plasmidic rtxA1 genes, and the selection of a new combination of the mosaic gene would probably have been advantageous in the

FIG. 3. Maximum-likelihood trees derived from 5⬘-terminal (A), 3⬘-terminal (B), and internal (C) regions of the rtxA1 gene using the GTR⫹G model of evolution. The numbers of nucleotide positions considered in the multiple alignments were 5,844, 3,039, and 7,988, respectively. Bootstrap support values higher than 80% are indicated in the corresponding nodes. (D) Maximum-likelihood tree derived from the different modules of the rtxA1 gene using the GTR⫹G model of evolution. Bootstrap support values higher than 80% are indicated in the corresponding nodes.

668

ROIG ET AL.

APPL. ENVIRON. MICROBIOL.

TABLE 5. Summary of SH and ELW tests for the complete rtxA1 gene sequence and those obtained for the ␣/␤-hydrolase and CPD modules using the 13 common strains and speciesa Alignment

Complete

␣/␤-Hydrolase module

CPD module

Topology

lnL

SH test

ELW test

Complete ␣/␤-Hydrolase domain CPD domain

⫺70351.24 ⫺72657.44

1.000 0.000

1.000 0.000

⫺76609.58

0.000

0.000

␣/␤-Hydrolase domain Complete CPD domain

⫺2165.08

1.000

1.000

⫺2287.23 ⫺2235.99

0.000 0.000

0.000 0.000

CPD domain Complete ␣/␤-Hydrolase domain

⫺1150.62 ⫺1159.13 ⫺1252.89

1.000 0.273 0.000

0.971 0.029 0.000

a Each alignment was used to evaluate the log likelihood (lnL) of the ML tree obtained with each of the three data sets. The probability values for each topology and test are shown. Complete, complete rtxA1 gene sequence.

fish farming environment and, in consequence, would have spread rapidly throughout suitable habitats throughout the seas, appearing as highly conserved among the different clonal complexes. The reasons for such an advantage will have to be elucidated by obtaining mutants deficient for the different domains and their combinations and then testing these mutants in selected in vitro models and in vivo. ACKNOWLEDGMENTS This work has been financed by grants AGL2008-03977/ACU and BFU2008-03000 and by Programa Consolider-Ingenio CSD200900006 from MICINN and project ACOMP/2009/240 from Conselleria d’Educacio ´ (Generalitat Valenciana). We thank the SCSIE of the University of Valencia for technical support in determining the sequences. REFERENCES 1. Amaro, C., and E. G. Biosca. 1996. Vibrio vulnificus biotype 2, pathogenic for eels, is also an opportunistic pathogen for humans. Appl. Environ. Microbiol. 62:1454–1457. 2. Amaro, C., E. G. Biosca, B. Fouz, and E. Garay. 1992. Electrophoretic analysis of heterogeneous lipopolysaccharides from various strains of Vibrio vulnificus biotypes 1 and 2 by silver staining and immunoblotting. Curr. Microbiol. 25:99–104. 3. Ausubel, F. M., R. Brent, R. E. Kingston, D. D. Moore, J. G. Seidman, J. A. Smith, and K. Struhl. 2007. Current protocols in molecular biology. John Wiley, New York, NY. 4. Bisharat, N., V. Agmon, R. Finkelstein, R. Raz, G. Ben-Dror, L. Lerner, S. Soboh, R. Colodner, D. N. Cameron, D. L. Wykstra, D. L. Swerdlow, and J. J. Farmer III. 1999. Clinical, epidemiological, and microbiological features of Vibrio vulnificus biogroup 3 causing outbreaks of wound infection and bacteraemia in Israel. Lancet 354:1421–1424. 5. Bisharat, N., C. Amaro, B. Fouz, A. Llorens, and D. I. Cohen. 2007. Serological and molecular characteristics of Vibrio vulnificus biotype 3: evidence for high clonality. Microbiology 153:846–856. 6. Bisharat, N., D. I. Cohen, M. C. Maiden, D. W. Crook, T. Peto, and R. M. Harding. 2007. The evolution of genetic structure in the marine pathogen, Vibrio vulnificus. Infect. Genet. Evol. 7:685–693. 7. Chen, C. Y., K. M. Wu, Y. C. Chang, C. H. Chang, H. C. Tsai, T. L. Liao, Y. M. Liu, H. J. Chen, A. B. Shen, J. C. Li, T. L. Su, C. P. Shao, C. T. Lee, L. I. Hor, and S. F. Tsai. 2003. Comparative genome analysis of Vibrio vulnificus, a marine pathogen. Genome Res. 13:2577–2587. 8. Cohen, A. L., J. D. Oliver, A. DePaola, E. J. Feil, and E. F. Boyd. 2007. Emergence of a virulent clade of Vibrio vulnificus and correlation with the presence of a 33-kilobase genomic island. Appl. Environ. Microbiol. 73: 5553–5565. 9. Eck, R. V., and M. O. Dayhoff. 1966. Evolution of the structure of ferredoxin based on living relics of primitive amino acid sequences. Science 152:363–366.

10. Felsenstein, J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791. 11. Fouz, B., F. J. Roig, and C. Amaro. 2007. Phenotypic and genotypic characterization of a new fish-virulent Vibrio vulnificus serovar that lacks potential to infect humans. Microbiology 153:1926–1934. 12. Gasteiger, E., A. Gattiker, C. Hoogland, I. Ivanyi, R. D. Appel, and A. Bairoch. 2003. ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res. 31:3784–3788. 13. Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696–704. 14. Gulig, P. A., M. S. Tucker, P. C. Thiaville, J. L. Joseph, and R. N. Brown. 2009. User friendly cloning coupled with chitin-based natural transformation enables rapid mutagenesis of Vibrio vulnificus. Appl. Environ. Microbiol. 75:4936–4949. 15. Kim, Y. R., S. E. Lee, C. M. Kim, S. Y. Kim, E. K. Shin, D. H. Shin, S. S. Chung, H. E. Choy, A. Progulske-Fox, J. D. Hillman, M. Handfield, and J. H. Rhee. 2003. Characterization and pathogenic significance of Vibrio vulnificus antigens preferentially expressed in septicemic patients. Infect. Immun. 71: 5461–5471. 16. Kluge, A. G., and J. S. Farris. 1969. Quantitative phyletics and the evolution of anurans. Syst. Zool. 18:1–32. 17. Lee, B. C., S. H. Choi, and T. S. Kim. 2008. Vibrio vulnificus RTX toxin plays an important role in the apoptotic death of human intestinal epithelial cells exposed to Vibrio vulnificus. Microbes Infect. 10:1504–1513. 18. Lee, C. T., C. Amaro, K. M. Wu, E. Valiente, Y. F. Chang, S. F. Tsai, C. H. Chang, and L. I. Hor. 2008. A common virulence plasmid in biotype 2 Vibrio vulnificus and its dissemination aided by a conjugal plasmid. J. Bacteriol. 190:1638–1648. 19. Lee, J. H., M. W. Kim, B. S. Kim, S. M. Kim, B. C. Lee, T. S. Kim, and S. H. Choi. 2007. Identification and characterization of the Vibrio vulnificus rtxA essential for cytotoxicity in vitro and virulence in mice. J. Microbiol. 45:146–152. 20. Li, L., J. L. Rock, and D. R. Nelson. 2008. Identification and characterization of a repeat-in-toxin gene cluster in Vibrio anguillarum. Infect. Immun. 76: 2620–2632. 21. Meibom, K. L., M. Blokesch, N. A. Dolganov, C. Y. Wu, and G. K. Schoolnik. 2005. Chitin induces natural competence in Vibrio cholerae. Science 310: 1824–1827. 22. Oliver, J. D. 2006. Vibrio vulnificus, p. 349–366. In F. L. Thompson, B. Austin, and J. Swings (ed.), Biology of vibrios. ASM Press, Washington, DC. 23. Posada, D. 2008. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25:1253–1256. 24. Roig, F. J., and C. Amaro. 2009. Plasmid diversity in Vibrio vulnificus biotypes. Microbiology 155:489–497. 25. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425. 26. Sanjuan, E., F. Gonzalez-Candelas, and C. Amaro. 2011. Polyphyletic origin of Vibrio vulnificus biotype 2 as revealed by sequence-based analysis. Appl. Environ. Microbiol. 77:688–695. 27. Satchell, K. J. F. 2007. MARTX, multifunctional autoprocessing repeats-intoxin toxins. Infect. Immun. 75:5079–5084. 28. Satchell, K. J. F., and B. Geissler. 2009. The multifunctional-autoprocessing RTX toxins of vibrios, p. 113–126. In Thomas Proft (ed.), Microbial toxins: current research and future trends. Caister Academic Press, Auckland, New Zealand. 29. Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. Tree-Puzzle: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504. 30. Sheahan, K. L., C. L. Cordero, and K. J. Satchell. 2007. Autoprocessing of the Vibrio cholerae RTX toxin by the cysteine protease domain. EMBO J. 26:2552–2561. 31. Shimodaira, H., and M. Hasegawa. 1999. Multiple comparisons of loglikelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16:1114–1116. 32. Strimmer, K., and A. Rambaut. 2002. Inferring confidence sets of possibly misspecified gene trees. Proc. Biol. Sci. 269:137–142. 33. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596–1599. 34. Tamura, K., M. Nei, and S. Kumar. 2004. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc. Natl. Acad. Sci. U. S. A. 101:11030–11035. 35. Tison, D. L., M. Nishibuchi, J. D. Greenwood, and R. J. Seidler. 1982. Vibrio vulnificus biogroup 2: new biogroup pathogenic for eels. Appl. Environ. Microbiol. 44:640–646. 36. Zhou, C., Y. Yang, and A. Y. Jong. 1990. Mini-prep in ten minutes. Biotechniques 8:172–173. 37. Zuckerkandl, E., and L. Pauling. 1965. Evolutionary divergence and convergence in proteins, p. 97–166. In V. Bryson and H. J. Vogel (ed.), Evolving genes and proteins. Academic Press, New York, NY.

B C D E F G H I K K L M N O

B C D E F G H I J K L M N O

B C D E F G H I J K L M N O

A B 95,9 95,6 93 99,6 95,9 98,1 96,7 98 96,7 98,1 96,5 98 96,6 98 96,6 98 96,6 97,9 94,5 90,5 92,3 90,1 87,9 48 49,5 62,4 64,0

A B 94,6 92,7 90,5 99,5 94,4 97,7 95,5 97,3 95,7 97,3 95,7 96,9 96,3 96,9 96,3 96,9 96,3 98,1 94,7 100 94,6 83,1 79,1 57,2 54,8 85,1 81,8

A B 94,8 90,8 91,3 99,5 94,8 96,7 97,7 44,5 45,6 44,5 45,6 44,5 45,6 44,5 45,6 44,5 45,6 95,1 94,4 65,8 65,5 95 93,6 8 7,7 79,4 79,8

Supplementary Table 1: Similarity among the different strains on the basis of the regions N-terminal, C-terminal and internal regions. N-terminal Bt1/Bt3 A B C D E K L BT1/BT2/Bt3 B 95,9 C D E F G H I J K L M N C 95,6 93 D 99,6 95,9 95,6 E 98,1 96,7 95,2 98,1 95,6 K 97,9 94,5 94,7 97,8 96,8 95,2 98,1 L 90,5 92,3 87,7 90,5 89,6 90,9 95,5 98 98,8 M 90,1 87,9 88,4 90,1 90 90 85,2 95,6 98,1 98,5 99,6 N 48 49,5 47 48,1 48,3 48,3 48,6 95,5 98 98,7 99,9 99,7 O 62,4 64,0 61,1 62,6 62,1 62,3 61,6 95,5 98 98,7 99,9 99,7 100 Bt2 95,5 98 98,7 99,9 99,7 100 100 F G H I J L M 94,7 97,8 96,8 96,8 97,2 96,9 96,9 96,9 G 99,6 87,7 90,5 89,6 89,8 89,9 89,7 89,7 89,7 90,9 H 99,9 99,7 88,4 90,1 90 90 90 90 90 90 90 85,2 I 99,9 99,7 100 47 48,1 48,3 48,7 48,5 48,6 48,6 48,6 48,3 48,6 46,5 J 99,9 99,7 100 100 61,1 62,6 62,1 62,2 62,4 62,1 62,1 62,1 62,3 61,6 57,0 32,9 L 89,8 89,9 89,7 89,7 89,7 M 90 90 90 90 90 85,2 N 48,7 48,5 48,6 48,6 48,6 48,6 46,5 O 62,2 62,4 62,1 62,1 62,1 61,6 57,0 C-terminal Bt1/Bt3 A B C D E K L BT1/BT2/Bt3 B 94,6 C D E F G H I J K L M N C 92,7 90,5 D 99,5 94,4 92,7 E 97,7 95,5 93,5 97,2 92,7 K 98,1 94,7 93,9 97,8 98,6 93,5 97,2 L 100 94,6 92,7 99,5 97,7 98,1 93,2 97,1 98,2 M 83,1 79,1 80,3 82,8 81,4 81,5 83,1 93,2 97,1 98,2 100 N 57,2 54,8 56,6 56,8 56,5 56,9 57,2 93,2 96,7 98,4 99 99 O 85,1 81,8 83,9 85,4 84,6 84,3 81,3 93,2 96,7 98,4 99 99 100 93,2 96,7 98,4 99 99 100 100 Bt2 93,9 97,8 98,6 98,8 98,8 98 98 98 F G H I J L M 92,7 99,5 97,7 97,3 97,3 96,9 96,9 96,9 98,1 G 100 80,3 82,8 81,4 81,5 81,5 81,6 81,6 81,6 81,5 83,1 H 99 99 56,6 56,8 56,5 56,6 56,6 56,8 56,8 56,8 56,9 57,2 56,3 I 99 99 100 83,9 85,4 84,6 84,0 84,0 83,7 83,7 83,7 84,3 81,3 74,0 53,9 J 99 99 100 100 L 97,3 97,3 96,9 96,9 96,9 M 81,5 81,5 81,6 81,6 81,6 83,1 N 56,6 56,6 56,8 56,8 56,8 57,2 56,3 O 84,0 84,0 83,7 83,7 83,7 81,3 74,0 Internal zone Bt1/Bt3 A B C D E K L BT1/BT2/Bt3 B 94,8 C D E F G H I J K L M N C 90,8 91,3 D 99,5 94,8 90,8 E 96,7 97,7 91,4 96,8 90,8 K 95,1 94,4 89,6 94,8 94,2 91,4 96,8 L 65,8 65,5 62,1 65,7 64,8 64,5 46 44,5 45,1 M 95 93,6 90,4 95 93,5 92,4 84,2 46 44,5 45,1 100 N 8 7,7 6,4 8 6,8 8,7 21,4 46 44,5 45,1 100 100 O 79,4 79,8 80,2 79,8 79,4 79,8 79,3 46 44,5 45,1 100 100 100 46 44,5 45,1 100 100 100 100 Bt2 89,6 94,8 94,2 46,5 46,5 46,5 46,5 46,5 F G H I J L M 62,1 65,7 64,8 41,1 41,1 41,1 41,1 41,1 64,5 G 100 90,4 95 93,5 57,4 57,4 57,4 57,4 57,4 92,4 84,2 H 100 100 6,4 8 6,8 46,8 46,8 46,8 46,8 46,8 8,7 21,4 22,3 I 100 100 100 80,2 79,8 79,4 80,3 80,3 80,3 80,3 80,3 79,8 79,3 31,1 72,5 J 100 100 100 100 L 41,1 41,1 41,1 41,1 41,1 M 57,4 57,4 57,4 57,4 57,4 84,2 N 46,8 46,8 46,8 46,8 46,8 21,4 22,3 O 80,3 80,3 80,3 80,3 80,3 79,3 31,1

M

N

46,5 57,0

32,9

N

32,9

M

N

56,3 74,0

53,9

N

53,9

M

N

22,3 31,1

72,5

N

72,5

A: YJ016 B: CECT529 C: Riu1 D: CMCP6 E: 94-9-119 F: A14 G: 95-8-161 H: CECT 4999 (C) I: CECT 4602 (P) J: CECT 4999 (P) K: 11028 L: V.cholerae M: L.anguillarum. N:A.hydrophila. O: V.splendidus

Supplementary Table 2: A type repeats A.1, positions and sequences of each specific repeat sequence Repeat number

2

3

4

5

6

7

8

9

10

Position

Position

Position

Position

Position

Position

Position

Position

Position

Position

Position

Sequence 30 - 39

Sequence 47 - 56

Sequence 112 - 121

Sequence 132 - 141

Sequence 178 - 187

Sequence 218 - 227

Sequence 286-295

Sequence 632 - 641

Sequence 1469 - 1478

Sequence

Sequence

GfgGqihayG 30 - 39 GfgGeihayG 30 - 39 GfgGeihayG

GsiGatvytG 47 - 56 GsiGakvytG 47 - 56 GsiGakvytG

GnhGdvsygG 112 - 121 GnhGdvnygG 112 - 121 GnhGdvnygG

GlsGnvtfaG 132 - 141 GlsGnvtfkG 132 - 141 GlsGnvtfkG

GshGdvtfdG 178 - 187 dsrGdvtfdG 178 - 187 dsrGdvtfdG

GkvGditlqG 218 - 227 GkvGdvtlqG 218 - 227 GkvGdvtlqG

GsgndfakeG 286-295 Gsgsgssfda 286-295 Gsgsgssfda

GeeGdltfrG 637 - 646 GeeGdltfrG 637 - 646 GeeGdltfrG

GehGwqsthG 1476 - 1485 GdqGwksthG 1476 - 1485 GdqGwksthG

30 - 39 GfgGeihayG 30 - 39 GfgGeihayG 30 - 39 GfgGeihayG

47 - 56 GsiGatvytG 47 - 56 GsiGakvytG 47 - 56 GsiGatvytG

112 - 121 GnhGdvnygG 112 - 121 GnhGdvnygG 112 - 121 GnhGdvsygG

132 - 141 GlsGnvtfkG 132 - 141 GlsGnvtfkG 132 - 141 GlsGnvtfkG

178 - 187 dsrGdvtfdG 178 - 187 GsrGdvtfdG 178 - 187 GsrGdvtfdG

218 - 227 GkvGdvtlqG 218 - 227 GkvGditlqG 218 - 227 GkvGditlqG

286-295 Gsgsgssfda 286-295 GsgssfdaqG 286-295 GsgssfdaqG

637 - 646 GeeGdltfrG 635 - 644 GeeGdltfrG 635 - 644 GeeGdltfrG

1476 - 1485 GdqGwksthG 1474 - 1483 GdqGwksthG 1474 - 1483 GdrGwksthG

30 - 39 GfgGeihayG 30 - 39 GfgGeihayG

47 - 56 GsiGatvytG 47 - 56 GsiGatvytG

112 - 121 GnhGdvnygG 112 - 121 GnhGdvsygG

132 - 141 GlsGnvtfkG 132 - 141 GlsGnvtfkG

178 - 187 dsrGdvtfdG 178 - 187 GsrGdvtfdG

218 - 227 GkvGdvtlqG 218 - 227 GkvGditlqG

286-295 GsgndfakeG 286-295 GsgndfakeG

632 - 641 GeeGdltfrG 649 - 658 GeeGaltfrG

1471 - 1480 GdqGwksthG 1488 - 1497 GdqGwksthG

30 - 39 GlgGvihayG 30 - 39 GmgGkihayG 30 - 39 GfgGeiharG 30 - 39 GfgGeihayG 30 - 39 GfgGeihayG

47 - 56 GsiGatvytG 47 - 56 GslGatvytG 47 - 56 GsiGatvytG 47 - 56 GsiGatvytG 47 - 56 GsiGatvytG

112 - 121 GrdGdiryqG 112 - 121 GqhGnisyaG 112 - 121 GnhGdvngG 112 - 121 GnhGdvsygG 112 - 121 GnhGdvsygG

132 - 141 GisGnvsfeG 132 - 141 GssGhvsfqG 132 - 141 GlsGnvtfgG 132 - 141 GlsGnvtfkG 132 - 141 GlsGnvtfkG

178 - 187 GsqGnihfaG 178 - 187 GsqGdvrfvG 178 - 187 GsrGnvtfdG 178 - 187 GsrGdvtfdG 178 - 187 dsrGdvtfdG

218 - 227 GreGdvnfnG 218 - 227 GkvGditlaG 218 - 227 GkvGditlqG 218 - 227 GkvGditlqG 218 - 227 GkvGdvtlqG

286-295 Gadsdsspfa 286 - 295 GsgGdfdaqG 286 - 295 GsgsdfkgeG 286 - 295 GsgssfdaqG 286-295 GsgssfdaqG

374 - 383 GlaGrqidaG 379-388 dlahsdvsss 635 - 644 GdeGdltfrG 635 - 644 GeeGdltfrG 635 - 644 GeeGdltfrG

610 - 619 GqrGdlvfrG 635 - 644 GeeGdltfrG 1471 - 1480 GgdGwesthG 1474 - 1483 GdrGwksthG 1474 - 1483 GdqGwksthG

30 - 39 GfgGeihayG

47 - 56 GsiGakvytG

112 - 121 GnhGdvnygG

132 - 141 GlsGnvtfkG

178 - 187 dsrGdvtfdG

218 - 227 GkvGdvtlqG

286-295 Gsgsgssfda

637 - 646 GeeGdltfrG

1476 - 1485 GdqGwksthG

30 - 39 47 - 56 112 - 121 132 - 141 178 - 187 218 - 227 GkvGdvtlqG GfgGeihayG GsiGakvytG GnhGdvnygG GlsGnvtfkG dsrGdvtfdG In capital, the letters shared with the consensus sequence. Underlined the sequences that differs from the consensus sequence, C (chromosomal) or ( P) plasmid in biotype 2 strains

286-295 Gsgsgssfda

637 - 646 GeeGdltfrG

1476 - 1485 GdqGwksthG

Strains V. cholerae M66-2, N16961, MO10, AM19226, RC395 CECT4999 (C ) V. vulnificus A14 (P) V. vulnificus 94-8-161 (P) V. vulnificus 11028 YJ016 V. vulnificus CECT529T V. vulnificus Riu-1 A. hydrophila ATCC 7966 L. anguillarum M93 V. splendidus12B01 V. vulnificus CMCP6 V. vulnificus 94-9-119 V. vulnificus CECT4999 (P) V. vulnificus CECT4602 (P)

1

11

1447-1456 3003 - 3012 aqgwqhagaG GkeGldmanG 1474 - 1483 GdqGwksthG

Supplementary Table 3: B type repeats B.1 positions and sequences of each specific repeat sequence Repeat Number Strains

1 Position Sequence

2 Position Sequence

V. cholerae M66-2, N16961, MO10, AM19226, RC395 V. vulnificus CECT4999 (C) V. vulnificus A14 (P) V. vulnificus 94-8-161 (P) V. vulnificus 11028 V. vulnificus YJ016 V. vulnificus CECT529T V. vulnificus Riu-1 A. hydrophila ATCC 7966

724-731 TqiGsGng 704 - 711 TqVGnGrl

723-730 TklGaGql

L. anguillarum M93 V. splendidus12B01 V. vulnificus CMCP6 V. vulnificus 94-9-119 V. vulnificus CECT4999 (P) V. vulnificus CECT4602 (P)

Repeat Number Strains V. cholerae M66-2, N16961, MO10, AM19226, RC395 V. vulnificus CECT4999 (C) V. vulnificus A14 (P) V. vulnificus 94-8-161 (P)

17

3 Position Sequence 726 - 733

4 Position Sequence 745-752

5 Position Sequence 806 - 813

6 Position Sequence 825 - 832

7 Position Sequence 844 - 851

8 Position Sequence 863 - 870

9 Position Sequence 882 - 889

10 Position Sequence 901 - 908

11

12

Position Sequence 920-927

Position Sequence 939 - 946

13 Position Sequence 958-965

14 Position Sequence 977-984

15 Position Sequence 996 - 1003

TqVGkGdv

TkmGeGel

ThVGdGtt

TkVGnGdt

ThVGdGqt

TkVGdGts

ThVGeGna

TkVGnGda

ThiGdGms

TkVGnGtt

ThiGhGst

TkVGndlt

ThVGdGts

731 - 738 TqVGkGnv 731 - 738 TqVGkGnv

750 - 757 TkVGdGdl 750 - 757 TkVGdGdl

811 - 818 ThVGdGtt 811 - 818 ThVGdGtt

830 - 837 TkVGnGdt 830 - 837 TkVGnGdt

849 - 856 ThVGdGqt 849 - 856 ThVGdGqt

868 - 875 TkVGdGts 868 - 875 TkVGdGts

887 - 894 ThVGeGna 887 - 894 ThVGeGna

906 - 913 TkVGnGda 906 - 913 TkVGnGda

925-932 ThiGdGms 925-932 ThiGdGms

944 - 951 TkVGnGtt 944 - 951 TkVGnGtt

963 - 970 ThVGsGst 963 - 970 ThVGsGst

982-989 TkVGddlt 982-989 TkVGddlt

1001 - 1008 ThVGdGts 1001 - 1008 ThVGdGts

731 - 738 TqVGkGnv 729 - 736 TqVGkGnv 729 - 736 TqVGkGdv

750 - 757 TkVGdGdl 748 - 755 TkVGdGdl 748 - 755 TkVGdGdl

811 - 818 ThVGdGtt 809-816 ThVGdGtt 809 - 816 ThVGdGtt

830 - 837 TkVGnGdt 828 - 835 TkVGnGdt 828 - 835 TkVGnGdt

849 - 856 ThVGdGqt 849 - 856 ThVGdGqt 847 - 854 ThVGdGqt

868 - 875 TkVGdGts 866-873 TkVGdGts 866 - 873 TkVGdGts

887 - 894 ThVGeGna 885 - 892 ThVGeGna 885 - 892 ThVGeGna

906 - 913 TkVGnGda 904-911 TkVGnGda 904 - 911 TkVGnGda

925-932 ThiGdGms 923-930 ThiGdGms 923-930 ThiGdGms

944 - 951 TkVGnGtt 944 - 951 TkVGnGtt 942 - 949 TkVGnGtt

963 - 970 ThVGsGst 961 - 968 ThlGsGst 961 - 968 ThVGsGst

982-989 TkVGddlt 980-987 TkVGddlt 980-987 TkVGndlt

1001 - 1008 ThVGdGts 999 - 1006 ThVGdGts 999 - 1006 ThVGdGts

726 - 733 TqVGkGdv 743 - 750 TqVGkGdv

745 - 752 TkVGdGdl 762 - 769 TkVGdGdl

806 - 813 ThVGdGtt 804-811 TkkGkGn

825 - 832 TkVGnGdt 823 - 830 ThVGdGtt

844 - 851 ThVGdGqt 842 - 849 TkVGnGdt

863 - 870 TkVGdGts 861 - 868 ThVGdGqt

882 - 889 ThVGeGna 880 - 887 TkVGdGts

901 - 908 TkVGnGda 899 - 906 ThVGeGna

920-927 ThiGdGms 918 - 925 TkVGnGda

939 - 946 TkVGnGtt 937-944 ThiGdGms

958 - 965 ThVGsGst 956 - 963 TkVGnGtt

977-984 TkVGddlt 975 - 982 ThVGsGst

996 - 1003 ThVGdGts 994-1001 TkVGndlt

784-791 TqiGqGdt 729 - 736 TqVGkGdv 729 - 736 TqVGkGdi 729 - 736 TqVGkGdv 729 - 736 TqVGkGdv

803 - 810 TrVGdGdl 748 - 755 TkVGdGdl 748 - 755 TkVGdGdl 748 - 755 TkVGdGdl 748 - 755 TkVGdGdl

822 - 829 ThVGnGdt 809 - 816 ThVGdGtt 809 - 816 ThiGdGst 809 - 816 ThVGdGtt 809 - 816 ThVGdGtt

841 - 848 TkVGdGta 828 - 835 TkVGnGdt 828 - 835 TkVGdGdt 828 - 835 TkVGnGdt 828 - 835 TkVGnGdt

860-867 TqiGqGda 847 - 854 ThVGdGqt 847 - 854 ThVGdGqt 847 - 854 ThVGdGqt 847 - 854 ThVGdGqt

879 - 886 TkVGdGna 866 - 873 TkVGdGts 866 - 873 TkVGdGts 866 - 873 TkVGdGts 866 - 873 TkVGdGts

917 - 924 TkVGnGmt 885 - 892 ThVGeGna 885 - 892 ThVGeGna 885 - 892 ThVGeGna 885 - 892 ThVGeGna

ThiGdGms 904 - 911 TkVGnGda 904 - 911 TkVGnGda 904 - 911 TkVGnGda 904 - 911 TkVGnGda

936 - 943 ThVGaGet 923-930 ThiGdGms 923-930 ThiGdGms 923-930 ThiGdGms 923-930 ThiGdGms

955 - 962 TkVGdGlt 942 - 949 TkVGnGtt 942 - 949 TkVGnGts 942 - 949 TkVGnGtt 942 - 949 TkVGnGtt

974-981 shiGnGts 961 - 968 ThVGsGst 961 - 968 TqVGnGst 961 - 968 ThVGsGst 961 - 968 ThVGsGst

993 - 1000 TkVGnGtt 980-987 TkVGddlt 980-987 TkVGddlt 980-987 TkVGndlt 980-987 TkVGndlt

1012 - 1019 ThVGgGlt 999 - 1006 TkVrnGtt 999 - 1006 TkVrnGtt 999 - 1006 ThVGdGts 999 - 1006 ThVGdGts

731 - 738 TqVGkGnv

750 - 757 TkVGdGdl

811 - 818 ThVGdGtt

830 - 837 TkVGnGdt

849 - 856 ThVGdGqt

868 - 875 TkVGdGts

887 - 894 ThVGeGna

906 - 913 TkVGnGda

925-932 ThiGdGms

944 - 951 TkVGnGtt

963 - 970 ThVGsGst

982-989 TkVGddlt

1001 - 1008 ThVGdGts

731 - 738 TqVGkGnv

750 - 757 TkVGdGdl

811 - 818 ThVGdGtt

830 - 837 TkVGnGdt

849 - 856 ThVGdGqt

868 - 875 TkVGdGts

887 - 894 ThVGeGna

906 - 913 TkVGnGda

925-932 ThiGdGms

944 - 951 TkVGnGtt

963 - 970 ThVGsGst

982-989 TkVGddlt

1001 - 1008 ThVGdGts

23 Position Sequence 1168 - 1175

24 Position Sequence 1187 - 1194

25 Position Sequence 1206 - 1213

26 Position Sequence 1225 - 1232

27 Position Sequence 1244 - 1251

28 Position Sequence 1263 - 1270

29 Position Sequence 1282 - 1289

30 Position Sequence 1301 - 1308

31 Position Sequence 1320-1327

Position Sequence 1034 - 1041

18 Position Sequence 1072-1079

19 Position Sequence 1091 - 1098

20 Position Sequence 1111 - 1118

21 Position Sequence 1130 - 1137

22 Position Sequence 1149 - 1156

ThVGdGlt

ThVGdatt

TkVGeGtt

ThVGdGtt

TkVGdGlg

TqVGdGdr

TkVGdGqe

ThVGnGdd

TkVGhGqn

TqVGdGds

TkVGdGmq

TtVGnGln

1039 - 1046 ThVGdGlt 1039 - 1046 ThVGdGlt

1077-1084 ThVGdatt 1077-1084 ThVGdatt

1096 - 1103 TkVGeGtt 1096 - 1103 TkVGeGtt

1116 - 1123 ThVGdGtt 1116 - 1123 ThVGdGtt

1135 - 1142 TkVGdGlg 1135 - 1142 TkVGdGlg

1154 - 1161 TqVGdGdr 1154 - 1161 TqVGdGdr

1173 - 1180 TkVGdGqe 1173 - 1180 TkVGdGqe

1192 - 1199 ThVGnGdd 1192 - 1199 ThVGnGdd

1211 - 1218 TkVGnGrn 1211 - 1218 TkVGnGrn

1230 - 1237 TqVGdGds 1230 - 1237 TqVGdGds

1249 - 1256 TkVGdGmq 1249 - 1256 TkVGdGmq

1268 - 1275 TtVGdGls 1268 - 1275 TtVGdGls

1039 - 1046

1077-1084

1096 - 1103

1116 - 1123

1135 - 1142

1154 - 1161

1173 - 1180

1192 - 1199

1211 - 1218

1230 - 1237

1249 - 1256

1268 - 1275

TkVGdGvs

TkVGdGln

ihiGdGlg

1287-1294 TkiGdGvs 1287-1294 TkiGdGvs

1306 - 1313 TkVGdGln 1306 - 1313 TkVGdGln

1325-1332 ihiGdGlg 1325-1332 ihiGdGlg

1287-1294

1306 - 1313

1325-1332

ThVGdGlt 1037 - 1044 ThVGdGlt 1037 - 1044 ThVGdGlt

ThVGdatt 1075-1082 ThVGdatt 1075-1082 ThVGdatt

TkVGeGtt 1094 - 1101 TkVGeGtt 1094 - 1101 TkVGeGtt

ThVGdGtt 1114 - 1121 ThVGdGtt 1114 - 1121 ThVGdGtt

TkVGdGlg 1133 - 1140 TkVGdGlg 1133 - 1140 TkVGdGlg

TqVGdGdr 1152 - 1159 TqVGdGdr 1152 - 1159 TqVGdGdr

TkVGdGqe 1171 - 1178 TkVGdGqe 1171 - 1178 TkVGdGqe

ThVGnGdd 1190 - 1197 ThVGnGdd 1190 - 1197 ThVGnGdd

TkVGnGrn 1209 - 1216 TkVGnGrn 1209 - 1216 TkVGnGrn

TqVGdGds 1228 - 1235 TqVGdGds 1228 - 1235 TqVGdGds

TkVGdGmq 1247 - 1254 TkVGdGmq 1247 - 1254 TkVGdGmq

TtVGdGls 1266 - 1273 TtVGdGls 1266 - 1273 TtVGdGls

TkiGdGvs 1285 - 1292 TkiGdGvs 1285 - 1292 TkVGdGvs

TkVGdGln 1304 - 1311 TkVGdGln 1304 - 1311 TkVGdGln

ihiGdGlg 1323-1330 ihVGgGln 1323-1330 ihVGgGln

1034 - 1041 ThVGdGlt 1032 - 1039 TkVGnGtt

1072-1079 ThVGdatt 1051 - 1058 ThVGdGlt

1091 - 1098 TkVGeGtt 1089-1096 ThVGdatt

1111 - 1118 ThVGdGtt 1108 - 1115 TkVGeGtt

1130 - 1137 TkVGdGlg 1128 - 1135 ThVGdGtt

1149 - 1156 TqVGdGdr 1147 - 1154 TkVGdGlg

1168 - 1175 TkVGdGqe 1166 - 1173 TqVGdGdr

1187 - 1194 ThVGnGdd 1185 - 1192 TkVGdGqe

1206 - 1213 TkVGnGrn 1204 - 1211 ThVGnGdd

1225 - 1232 TqVGdGds 1223 - 1230 TkVGnGrn

1244-1251 TkVGdsmq 1242 - 1249 TqVGdGds

1263 - 1270 TtVGdGls 1261 - 1268 TkVGdGmq

1282 - 1289 TkVGdGvs 1280 - 1287 TtVGdGls

1301 - 1308 TkVGdGln 1299 - 1306 TkVGdGvs

1320-1327 ihiGdGlg 1318 - 1325 TkVGdGln

1069-1076 TkiGdGtt 1037 - 1044 ThVGdGlt 1037 - 1044 ThVGdGlt 1037 - 1044 ThVGdGlt 1037 - 1044 ThVGdGlt

1089 - 1096 ThVGaGst 1075-1082 ThVGdatt 1075-1082 ThVGdatt 1075-1082 ThVGdatt 1075-1082 ThVGdatt

1108 - 1115 TkVGdGlv 1094 - 1101 TkVGeGtt 1094 - 1101 TkVGeGtt 1094 - 1101 TkVGeGtt 1094 - 1101 TkVGeGtt

1127 - 1134 TqVGnGdr 1114 - 1121 ThVGdGtt 1114 - 1121 ThVGdGtt 1114 - 1121 ThVGdGtt 1114 - 1121 ThVGdGtt

1146 - 1153 TkVGdGre 1133 - 1140 TkVGdGlg 1133 - 1140 TkVGdGlg 1133 - 1140 TkVGdGlg 1133 - 1140 TkVGdGlg

1165 - 1172 TqVGnGdg 1152 - 1159 ThVGdGdr 1152 - 1159 ThVGdGdr 1152 - 1159 TqVGdGdr 1152 - 1159 TqVGdGdr

1184 - 1191 TkVGdGrq 1171-1178 skVGdGqe 1171 - 1178 TkVGdGqe 1171 - 1178 TkVGdGqe 1171 - 1178 TkVGdGqe

1203 - 1210 TqVGeGdg 1190 - 1197 ThVGdGdd 1190 - 1197 ThVGnGdd 1190 - 1197 ThVGnGdd 1190 - 1197 ThVGnGdd

1222 - 1229 TrVGdGiq 1209 - 1216 TkVGnGrn 1209 - 1216 TkVGnGrn 1209 - 1216 TkVGnGrn 1209 - 1216 TkVGnGrn

1241 - 1248 TtVGdGls 1228 - 1235 TqVGdGds 1228 - 1235 TqVGdGds 1228 - 1235 TqVGdGds 1228 - 1235 TqVGdGds

1260-1267 vkvGdGtt 1247 - 1254 TkVGdGmq 1247 - 1254 TkVGdGmq 1247 - 1254 TkVGdGmq 1247 - 1254 TkVGdGmq

1279 - 1286 TkVGdGln 1266 - 1273 TrVGdGln 1266 - 1273 TtVGdGls 1266 - 1273 TtVGdGls 1266 - 1273 TtVGdGls

1285 - 1292 TvVGdGis 1285 - 1292 TkiGdGis 1285 - 1292 TkVGdGvs 1285 - 1292 TkiGdGvs

1304-1311 TkiGdGln 1304 - 1311 TkVGdGln 1304 - 1311 TkVGdGln 1304 - 1311 TkVGdGln

1323-1330 ihVGgGlg 1323-1330 ihVGgGln 1323-1330 ihVGgGln 1323-1330 ihVGgGln

1039 - 1046 ThVGdGlt

1077-1084 ThVGdatt

1096 - 1103 TkVGeGtt

1116 - 1123 ThVGdGtt

1135 - 1142 TkVGdGlg

1154 - 1161 TqVGdGdr

1173 - 1180 TkVGdGqe

1192 - 1199 ThVGnGdd

1211 - 1218 TkVGnGrn

1230 - 1237 TqVGdGds

1249 - 1256 TkVGdGmq

1268 - 1275 TtVGdGls

1287-1294 TkiGdGvs

1306 - 1313 TkVGdGln

1325-1332 ihiGdGlg

1039 - 1046 1077-1084 1096 - 1103 1116 - 1123 1135 - 1142 1154 - 1161 ThVGdGlt ThVGdatt TkVGeGtt ThVGdGtt TkVGdGlg TqVGdGdr In capital letters the letters shared with the consensus sequence. Underlined the sequences that differs from the consensus sequence, C (chromosomal) or ( P) plasmid in biotype 2 strains

1173 - 1180 TkVGdGqe

1192 - 1199 ThVGnGdd

1211 - 1218 TkVGnGrn

1230 - 1237 TqVGdGds

1249 - 1256 TkVGdGmq

1268 - 1275 TtVGdGls

1287-1294 TkiGdGvs

1306 - 1313 TkVGdGln

1325-1332 ihiGdGlg

V. vulnificus 11028 V. vulnificus YJ016 V. vulnificus CECT529T V. vulnificus Riu-1 A. hydrophila ATCC 7966 L.anguillarum M93 V. splendidus12B01 V. vulnificus CMCP6 V. vulnificus 94-9-119 V. vulnificus CECT4999 (P) V. vulnificus CECT4602 (P)

Supplementary Table 4: B type repeats B.2, positions and sequences of each specific repeat sequence . Repeat Number Strains V. cholerae M66-2, N16961, MO10 Vibrio cholerae AM19226, RC395 V. vulnificus CECT4999 (C) V. vulnificus A14 (P) V. vulnificus 94-8-161 (P) V. vulnificus 11028 V. vulnificus YJ016 V. vulnificus CECT529T V. vulnificus Riu-1 A. hydrophila ATCC 7966 L.anguillarum M93 V. splendidus12B01 V. vulnificus CMCP6 V. vulnificus 94-9-119 V. vulnificus CECT4999 (P) V. vulnificus CECT4602 ( P)

Repeat Number Strains V. cholerae M66-2, N16961, MO10, AM19226, RC395 CECT4999 (C) V. vulnificus A14 (P) V. vulnificus 94-8-161 (P)

1 Position Sequence 739 - 747 GgANvlTkm 739 - 747 GgANvlTkm

2 Position Sequence 758 - 766 GgANviThi 758 - 766 GgANviThi

3 Position Sequence 781 - 789 GgANilTkk 781 - 789 GgANilTkk

4 Position Sequence 800 - 808 GgANvlThv 800 - 808 GgANvlThv

5 Position Sequence 819 - 827 GgANilTkv 819 - 827 GgANilTkv

6 Position Sequence 952-960 GNvNifThi 952-960 GNANifThi

7 Position Sequence 971-979 GqANimTkv 971-979 GqANimTkv

8 Position Sequence 990 - 998 GkANimThv 990 - 998 GkANimThv

9 Position Sequence 1028 - 1036 GkANimThv 1028 - 1036 GkANimThv

744 - 752 GgANvlTkv 744 - 752 GgANvlTkv

763 - 771 GgANviThi 763 - 771 GgANviThi

786 - 794 GgANilTkk 786 - 794 GgANilTkk

805 - 813 GgANvlThv 805 - 813 GgANvlThv

824 - 832 GgANilTkv 824 - 832 GgANilTkv

957 - 965 GnANifThv 957 - 965 GnANifThv

976 - 984 GqANimTkv 976 - 984 GqANimTkv

995 - 1003 GkANiyThv 995 - 1003 GkANiyThv

1033 - 1041 GkANimThv 1033 - 1041 GkANimThv

744 - 752 GgANvlTkv 742 - 750 GgANvlTkv 742 - 750 GgANvlTkv 739 - 747 GgANvlTkv 756 - 764 GgANvlTkv

763 - 771 GgANviThi 761 - 769 GgANviThi 761 - 769 GgANviThi 758 - 766 GgANviThi 775 - 783 GgANviThi

786 - 794 GgANilTkk 784 - 792 GgANilTkk 784 - 792 GgANilTkk 781 - 789 GgANilTkk 798 - 806 GgANilTkk

805 - 813 GgANvlThv 803 - 811 GgANvlThv 803 - 811 GgANvlThv 800 - 808 GgANvlThv 817 - 825 GgANvlThv

824 - 832 GgANilTkv 822 - 830 GgANilTkv 822 - 830 GgANilTkv 819 - 827 GgANilTkv 836 - 844 GgANilTkv

957 - 965 GnANifThv 955 - 963 GnANifThv 955 - 963 GnANifThv 952 - 960 GnANifThv 969 - 977 GnANifThv

976 - 984 GqANimTkv 974 - 982 GqANimTkv 974 - 982 GqANimTkv 971 - 979 GqANimTkv 988 - 996 GqANvmTkv

995 - 1003 GkANiyThv 993 - 1001 GkANiyThv 993 - 1001 GkANiyThv 990 - 998 GkANiyThv 1007 - 1015 GkANiyThv

1033 - 1041 GkANimThv 1031 - 1039 GkANimThv 1031 - 1039 GkANimThv 1028 - 1036 GkANimThv 1045 - 1053 GkANimThv

717 - 725 GgANvlTkl 742 - 750 GgANvlTkv 742 - 750 GgANvmTkv 742 - 750 GgANvlTkv 742 - 750 GgANvlTkv

736 - 744 GgANvlThv 761 - 769 GgANviThi 761 - 769 GgANiiThi 761 - 769 GgANviThi 761 - 769 GgANviThi

759 - 767 GgANvlTkk 784 - 792 GgANilTkk 784 - 792 GgANilTkk 784 - 792 GgANilTkk 784 - 792 GgANilTkk

778 - 786 GgANvlTqi 803 - 811 GgANvlThv 803 - 811 GgANvlThi 803 - 811 GgANvlThv 803 - 811 GgANvlThv

797 - 805 GaANvlTrv 822 - 830 GgANilTkv 822 - 830 GgANilTkv 822 - 830 GgANilTkv 822 - 830 GgANilTkv

816 - 824 GgANvlThv 955 - 963 GnANifThv 955 - 963 GnANifTqv 955 - 963 GnANifThv 955 - 963 GnANifThv

873 - 881 GvANifTkv 974 - 982 GqANimTkv 974 - 982 GqANimTkv 974 - 982 GqANimTkv 974 - 982 GqANimTkv

930 - 938 GnANvfThv 993 - 1001 GkANiyThv 993 - 1001 GkANiyThv 993 - 1001 GkANiyThv 993 - 1001 GkANiyThv

949 - 957 GqANlfTkv 1031 - 1039 GkANimThv 1031 - 1039 GkANimThv 1031 - 1039 GkANimThv 1031 - 1039 GkANimThv

744 - 752 GgANvlTkv

763 - 771 GgANviThi

786 - 794 GgANilTkk

805 - 813 GgANvlThv

824 - 832 GgANilTkv

957 - 965 GnANifThv

976 - 984 GqANimTkv

995 - 1003 GkANiyThv

1033 - 1041 GkANimThv

744 - 752 GgANvlTkv

763 - 771 GgANviThi

786 - 794 GgANilTkk

805 - 813 GgANvlThv

824 - 832 GgANilTkv

957 - 965 GnANifThv

976 - 984 GqANimTkv

995 - 1003 GkANiyThv

1033 - 1041 GkANimThv

10 Position Sequence 1047 - 1055

11 Position Sequence 1124 - 1132

12 Position Sequence 1143 - 1151

13 Position Sequence 1162 - 1170

14 Position Sequence 1181 - 1189

15 Position Sequence 1200 - 1208

16 Position Sequence 1219 - 1227

17 Position Sequence 1257 - 1265

18 Position Sequence 1276 - 1284

GeANivTkl 1052 - 1060 GeANivTkv 1052 - 1060 GeANivTkv

GkANliTkv 1129 - 1137 GkANiiTkv 1129 - 1137 GkANiiTkv

GqANvfTqv 1148 - 1156 GqANlfTqv 1148 - 1156 GqANlfTqv

GeANliTkv 1167 - 1175 GeANviTkv 1167 - 1175 GeANviTkv

GeANiiThv 1186 - 1194 GkANiiThv 1186 - 1194 GkANiiThv

GkANviTkv 1205 - 1213 GkANviTkv 1205 - 1213 GkANviTkv

GeANivTqv 1224 - 1232 GeANivTqv 1224 - 1232 GeANivTqv

GqANitTtv 1262 - 1270 GkANitTtv 1262 - 1270 GkANitTtv

GdANinTkv 1281 - 1289 GdANinTki 1281 - 1289 GdANinTki

1052 - 1060 GeANivTkv

1129 - 1137 GkANiiTkv

1148 - 1156 GqANlfTqv

1167 - 1175 GeANviTkv

1186 - 1194 GkANiiThv

1205 - 1213 GkANviTkv

1224 - 1232 GeANivTqv

1262 - 1270 GkANitTtv

1281 - 1289 GdANinTki

V. vulnificus 11028

1222 - 1230 GeANivTqv 1222 - 1230 GeANivTqv 1219 - 1227 GeANivTqv 1236 - 1244 GeANivTqv

1260 - 1268 GkANitTtv 1260 - 1268 GkANitTtv 1257 - 1265 GkANitTtv 1274 - 1282 GkANitTtv

1279 - 1287 GdANinTki 1279 - 1287 GdANinTkv 1276 - 1284 GdANinTkv 1293 - 1301 GdANinTkv

1140 - 1148 1159 - 1167 GrANvlTkv GeANlvTqv 1184 - 1192 1203 - 1211 GqANiiThv GkANviTkv 1184 - 1192 1203 - 1211 GkANviThv GkANiiTkv 1184 - 1192 1203 - 1211 GkANiiThv GkANviTkv 1184 - 1192 1203 - 1211 GkANiiThv GkANviTkv

1178 - 1186 GkANviTkv 1222 - 1230 GeANivTqv 1222 - 1230 GeANiiTqv 1222 - 1230 GeANivTqv 1222 - 1230 GeANivTqv

1197 - 1205 GdANivTqv 1260 - 1268 GkANvtTrv 1260 - 1268 GkANitTtv 1260 - 1268 GkANitTtv 1260 - 1268 GkANitTtv

1216 - 1224 GkANivTrv 1279 - 1287 GdANinTvv 1279 - 1287 GdANinTki 1279 - 1287 GdANinTkv 1279 - 1287 GdANinTki

1186 - 1194 GkANiiThv

1205 - 1213 GkANviTkv

1224 - 1232 GeANivTqv

1262 - 1270 GkANitTtv

1281 - 1289 GdANinTki

1052 - 1060 1129 - 1137 1148 - 1156 1167 - 1175 1186 - 1194 1205 - 1213 GeANivTkv GkANiiTkv GqANlfTqv GeANviTkv GkANiiThv GkANviTkv In capital letters the letters shared with the consensus sequence. Underlined the sequences that differs from the consensus sequence, C (chromosomal) or ( P) plasmid in biotype 2 strains

1224 - 1232 GeANivTqv

1262 - 1270 GkANitTtv

1281 - 1289 GdANinTki

V. vulnificus YJ016 V. vulnificus CECT529T V. vulnificus Riu-1 A. hydrophila ATCC 7966 L. anguillarum M93 V. splendidus12B01 V. vulnificus CMCP6 V. vulnificus 94-9-119 V. vulnificus CECT4999 (P) V. vulnificus CECT4602 (P)

1050 - 1058 GeANivTkv 1050 - 1058 GeANivTkv 1047 - 1055 GeANivTkv 1064 - 1072 GeANivTkv

1127 - 1135 GkANiiTkv 1127 - 1135 GkANiiTkv 1124 - 1132 GkANiiTkv 1141 - 1149 GkANiiTkv

1146 - 1154 GqANlfTqv 1146 - 1154 GqANvfTqv 1143 - 1151 GqANvfTqv 1160 - 1168 GqANvfTqv

1165 - 1173 GeANviTkv 1165 - 1173 GeANiiTkv 1162 - 1170 GeANviTkv 1179 - 1187 GeANiiTkv

1006 - 1014 GkANivThv 1050 - 1058 GeANivTkv 1050 - 1058 GeANivTkv 1050 - 1058 GeANivTkv 1050 - 1058 GeANivTkv

1127 - 1135 GkANiiTkv 1127 - 1135 GkANiiTkv 1127 - 1135 GkANiiTkv 1127 - 1135 GkANiiTkv

1025 - 1033 1102 - 1110 GkANiiTkv GrANliTkv 1146 - 1154 1164-1172 GqANvfThv GeANviskv 1146 - 1154 1165 - 1173 GqANifThv GeANiiTkv 1146 - 1154 1165 - 1173 GqANvfTqv GeANiiTkv 1146 - 1154 1165 - 1173 GqANvfTqv GeANiiTkv

1052 - 1060 GeANivTkv

1129 - 1137 GkANiiTkv

1148 - 1156 GqANlfTqv

1167 - 1175 GeANviTkv

1184 - 1192 GkANiiThv 1184 - 1192 GkANiiThv 1181 - 1189 GkANiiThv 1198 - 1206 GkANiiThv

1203 - 1211 GkANviTkv 1203 - 1211 GkANviTkv 1200 - 1208 GkANviTkv 1217 - 1225 GkANviTkv

Supplementary Table 5: C type repeats C.1, positions and sequences of each specific repeat sequence . Repeat number

1 2 3 4 5 6 Strains Position Position Position Position Position Position Sequence Sequence Sequence Sequence Sequence Sequence 3923 - 3931 4194 - 4202 4212 - 4220 4230 - 4238 4248 - 4256 4284-4292 V. cholerae M66-2, N16961, MO10, AM19226, GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. cholerae RC395 4144-4152 3783-3791 4054-4062 4072-4080 4090-4098 4108-4116 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. cholerae O395 2026-2034 2297-2302 2315-2323 2333-2341 2351-2359 2387-2395 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus CECT4999 (C) 3973 - 3981 4244 - 4252 4262 - 4270 4280 - 4288 4298 - 4306 4334-4342 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus A14 3973 - 3981 4244 - 4252 4262 - 4270 4280 - 4288 4298 - 4306 4334-4342 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus 94-8-161 3973 - 3981 4244 - 4252 4262 - 4270 4280 - 4288 4298 - 4306 4334-4342 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus 11028 4081-4089 4352 - 4360 4370 - 4378 4388 - 4396 4406 - 4414 4442-4450 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus YJ016 4584 - 4592 4855 - 4863 4873 - 4881 4891 - 4899 4909 - 4917 4945-4953 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus CECT529T 4078 - 4086 4349 - 4357 4367 - 4375 4385 - 4393 4403 - 4411 4439-4447 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus Riu-1 4096 - 4104 4364 - 4372 4382 - 4390 4400 - 4408 4418 - 4426 4454-4462 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv A. hydrophila ATCC 7966 4072-4080 4334 - 4342 4352 - 4360 4370 - 4378 4388 - 4396 4424 - 4432 GGdGkDlga GGrGdDvfy GGaGqDtgi GGqGdDvav GGaGqDylv aGqGaDiqv L. anguillarum M93 3783 - 3791 4048 - 4056 4066 - 4074 4084 - 4092 4102 - 4110 4139-4147 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGhDmgv GGdGnDtav mGsGnDyav V. splendidus 12B01 4253 - 44261 4523 - 4531 4541- 4549 4459-4567 4577-4585 4613-4621 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGtGrDyvv V. vulnificus CMCP6 4584 - 4592 4855 - 4863 4873 - 4881 4891 - 4899 4909 - 4917 4945-4953 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus 94-9-119 4584 - 4592 4855 - 4863 4873 - 4881 4891 - 4899 4909 - 4917 4945-4953 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus CECT4999 (P) 3973 - 3981 4244 - 4252 4262 - 4270 4280 - 4288 4298 - 4306 4334-4342 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv V. vulnificus CECT4602 (P) 3973 - 3981 4244 - 4252 4262 - 4270 4280 - 4288 4298 - 4306 4334-4342 GGqGaDiqv GGdGkDlga GGrGdDvfy GGeGnDmgv GGdGnDtav tGsGrDyvv In capital letters the letters shared with the consensus sequence. Underlined the sequences that differs from the consensus sequence, C (chromosomal) or ( P) plasmid in biotype 2 strains

7 Position Sequence 4356 - 4364 GGdGdDhli 4216-4224 GGdGeDhli 2459-2467 GGdGdDhli 4406 - 4414 GGeGeDhli 4406 - 4414 GGeGeDhli 4406 - 4414 GGeGeDhli 4514 - 4522 GGeGeDhli 5017 - 5025 GGeGeDhli 4511 - 4519 GGeGeDhli 4526 - 4534 GGeGdDhli 4496 - 4504 GGdGqDrli 4210-4218 aGeGeDyli 4685-4693 GGeGeDhli 5017 - 5025 GGeGeDhli 5017 - 5025 GGeGeDhli 4406 - 4414 GGeGeDhli 4406 - 4414 GGeGeDhli

8 Position Sequence 4376 - 4384 GGeGrDlmv 4236-4244 GGeGrDlmv 2479-2487 GGeGrDlmv 4426 - 4434 GGeGrDlmv 4426 - 4434 GGeGrDlmv 4426 - 4434 GGeGrDlmv 4534 - 4542 GGeGrDlmv 5037 - 5045 GGeGrDlmv 4531 - 4539 GGeGrDlmv 4546 - 4554 GGeGrDlmv 4516 - 4524 GGaGdDllv 4230 - 4238 GGeGsDvmv 4705-4713 GGeGrDvmv 5037 - 5045 GGeGhDlmv 5037 - 5045 GGeGrDlmv 4426 - 4434 GGeGrDlmv 4426 - 4434 GGeGrDlmv

9 Position Sequence 4394-4402 GGtdvDsfv 4254-4262 GGtdvDsfv 2497-2505 GGtdvDsfv 4444-4452 GGtdvDsfv 4444-4452 GGtdvDsfv 4444-4452 GGtdvDsfv 4552 - 4560 GGtdvDsfv 5055-5063 GGtdvDsfv 4549-4557 GGtdvDsfv 4564-4572 GGtdvDsfv 4534 - 4542 GGtGvDsfv 4248 - 4256 GGaGvDsfv 4723-4731 GGtgvDsfv 5055-5063 GGtdvDsfv 5055-5063 GGtdvDsfv 4444-4452 GGtdvDsfv 4444-4452 GGtdvDsfv

Supplementary Fig. 1 Determination of the plasmidic and/or cromosomal location of the rtxA1 gene in the strains CECT4602 (A), CECT4999 (B), A14 (C), 95-8-161 (D), CT218 (E), YJ016 (F), CECT529 (G) and 11028 (H) by southern blot with a probe (a PCR product of 1087 nt obtained with the primers 2F&R, labelled with dUTPdigoxigenin). A, genomic DNA and B, plasmid DNA. M. GeneRuler DNA ladder mix. P. Plasmid extraction of the E.coli strain 39R861

Supplementary Fig. 2. Maximum likelihood tree derived from the common fragment of 828 bp, corresponding to the N-terminal end of the protein, in a collection of 115 strains belonging to the different biotypes and serovars using the GTR+G mode of evolution.