Nucleotide Sequence of the Gene Encoding the Major Subunit of CS3 ...

Vol. 56, No. 12

INFECTION AND IMMUNITY, Dec. 1988, p. 3297-3300 0019-9567/88/123297-04$02.00/0 Copyright © 1988, American Society for Microbiology

Nucleotide Sequence of the Gene Encoding the Major Subunit of CS3 Fimbriae of Enterotoxigenic Escherichia coli

MAIRE BOYLAN,1,2 CYRIL J. SMYTH,2 AND JUNE R. SCOTT1*

Department of Microbiology and Immunology, Emory University School of Medicine, Atlanta, Georgia 30322,1 and Department of Microbiology, Moyne Institute, Trinity College, Dublin 2, Republic of Ireland2

Received 28 July 1988/Accepted 13 September 1988

The complete nucleotide sequence of a 612-base-pair DNA fragment containing the gene for the major fimbrial subunit of CS3 of enterotoxigenic Escherichia coli is presented. A possible promoter region, a ribosome-binding site, and two potential signal peptidase cleavage sites are indicated. Unlike the best-studied fimbrial proteins, the predicted CS3 sequence has no Cys residues.

with a calculated molecular mass of 18.4 kDa (Fig. 1). A putative ribosome-binding site (33) is located 6 nucleotides upstream of the translation initiation codon. Further upstream is a potential promoter consisting of a -10 region separated by 17 nucleotides from a -35 region, both of which match in four of six bases with the consensus E. coli promoter sequences (10, 11). While the genetic code has been found to be almost universal, there are differences in codon preference between genes encoding highly expressed proteins and genes coding for proteins that are less abundant (12). In the gene encoding the CS3 polypeptide, the optimal codons for E. coli are not always used preferentially. The codon adaptation index value (32) was calculated to be 0.202, indicating a low codon usage bias. This finding is in agreement with the general observation that plasmid- and transposon-encoded genes tend to have less codon bias than do chromosomally encoded genes (32). Since the CS3 antigen is present on the bacterial cell surface, it can be expected that it is produced in a precursor form with an N-terminal signal sequence. The 17-kDa polypeptide previously identified in minicell analyses (2) appears to correspond to this precursor. Although the signal peptidase cleavage site cannot be assigned unambiguously, cleavage between residues 15 and 16 (between the Ala and Met in the sequence Leu-Ser-Ala-Met [Fig. 1]) appears to be a good candidate for three reasons. First, the majority of signal sequences of E. coli proteins (and all signal sequences of E. coli fimbrial subunits reported) end with an alanine residue (3, 17, 36). Second, the processed form of the CS3 subunit was labeled in the presence of [35S]methionine. No cysteine residues are present in the derived amino acid sequence (Fig. 1), and the only methionine, apart from that specified by the initiation codon, occurs at position 16. 
Therefore, for the processed form of the CS3 subunit to be [35S]methionine labeled, the signal peptidase would have to cleave at a site before residue 16. Third, a hydropathy profile of the predicted CS3 polypeptide shows that the N-terminal 15-amino-acid putative signal peptide is highly hydrophobic (Fig. 2). The N-terminal portion of the amino acid sequence deduced from the nucleotide sequence has properties of a typical procaryotic signal sequence (Fig. 1; 36). The 15-residue peptide has two positively charged residues (lysines) at positions 3 and 5, followed by a core of hydrophobic residues (six of nine amino acids from residues 6 to 15 are hydrophobic). The putative CS3 signal sequence begins with

Enterotoxigenic Escherichia coli (ETEC) strains are important causes of diarrheal disease in humans and animals. These strains express colonization factor antigens, usually associated with fimbriae, which permit their adherence to host intestinal epithelial cells. CS3 fimbriae, which are wiry fibrillar structures 2 to 4 nm in diameter (23), are produced by human ETEC isolates belonging to serovars O6, O8, O80, and O85 (3, 19). An ETEC strain whose only fimbriae were of the CS3 type was found to adhere to human small intestine enterocytes (19), and human volunteers who developed diarrhea after challenge with an ETEC strain that had CS3 fimbriae showed a significant serological response to that antigen (23). This suggests that CS3 is clinically important.

In vitro cloning analysis (2) showed that a 5.1-kilobase DNA fragment derived from a plasmid in an ETEC strain encodes the determinant required for surface expression of the CS3 antigen. Methionine-containing proteins with molecular masses of 94, 26, 24, 17, and 15 kilodaltons (kDa) are encoded by this chimeric plasmid. On the basis of reaction with antibody in a Western blot (immunoblot), inhibition of protein processing, and the release of surface antigen in heat shock experiments, the 17- and 15-kDa polypeptides appeared to be the precursor and processed forms, respectively, of the CS3 fimbrial subunit. Subcloning and analysis of Tn5 insertions within the cloned 5.1-kilobase fragment revealed both the location of the gene encoding the 17-kDa polypeptide and the direction of its transcription (2).

In this communication, the complete nucleotide sequence of the CS3 subunit gene, determined by the dideoxy method of sequencing plasmid DNA (30), is presented (Fig. 1). The CS3 gene is located between the Tn5 insertion in plasmid pCS119 and the HindIII cloning site of the vector plasmid pBR322 (2). Plasmid pCS119 was used as the template for sequencing in one direction.
The other strand was sequenced from plasmid pCS190, which contains a 1.8-kilobase HindIII fragment from pCS119 in the HindIII site of pUC9. All sequence analysis was performed with double-stranded plasmid DNA by the dideoxy chain termination method using synthetic oligonucleotide primers, and the entire sequence was obtained on both strands. Analysis of the nucleotide sequence revealed a single open reading frame, read in the direction previously deduced for CS3 (2), which encodes a polypeptide of 168 amino acids

* Corresponding author.


NOTES


62
AGCAGTACAGTTCCAGGTACGTATACTGTTGGTCTTAACGTAACCAGTAATGTTATTTAAAG
[-35] [-10]

114
TGAATGTATGAGGGATTCG ATG TTA AAA ATA AAA TAC TTA TTA ATA GGT CTT
[rbs]               Met Leu Lys Ile Lys Tyr Leu Leu Ile Gly Leu
11

162
TCA CTG TCA GCT ATG AGT TCA TAC TCA CTA GCT GCA GCG GGG CCC ACT
Ser Leu Ser Ala Met Ser Ser Tyr Ser Leu Ala Ala Ala Gly Pro Thr
27
[vertical arrow]

210
CTA ACC AAA GAA CTG GCA TTA AAT GTG CTT TCT CCT GCA GCT CTG GAT
Leu Thr Lys Glu Leu Ala Leu Asn Val Leu Ser Pro Ala Ala Leu Asp
43

258
GCA ACT TGG GCT CCT CAG GAT AAT TTA ACA TTA TCC AAT ACT GGC GTT
Ala Thr Trp Ala Pro Gln Asp Asn Leu Thr Leu Ser Asn Thr Gly Val
59

306
TCT AAT ACT TTG GTG GGT GTT TTG ACT CTT TCA AAT ACC AGT ATT GAT
Ser Asn Thr Leu Val Gly Val Leu Thr Leu Ser Asn Thr Ser Ile Asp
75

354
ACA GTT AGC ATT GCG AGT ACA AGT GTT TCT GAT ACA TCT AAG AAT GGT
Thr Val Ser Ile Ala Ser Thr Ser Val Ser Asp Thr Ser Lys Asn Gly
91

402
ACA GTA ACT TTT GCA CAT GAG ACA AAT AAC TCT GCT AGC TTT GCC ACC
Thr Val Thr Phe Ala His Glu Thr Asn Asn Ser Ala Ser Phe Ala Thr
107

450
ACC ATT TCA ACA GAT AAT GCC AAC ATT ACG TTG GAT AAA AAT GCT GGA
Thr Ile Ser Thr Asp Asn Ala Asn Ile Thr Leu Asp Lys Asn Ala Gly
123

498
AAT ACG ATT GTT AAA ACT ACA AAT GGG AGT CAG TTG CCA ACT AAT TTA
Asn Thr Ile Val Lys Thr Thr Asn Gly Ser Gln Leu Pro Thr Asn Leu
139

546
CCA CTT AAG TTT ATT ACC ACT GAA GGT AAC GAA CAT TTA GTT TCA GGT
Pro Leu Lys Phe Ile Thr Thr Glu Gly Asn Glu His Leu Val Ser Gly
155

595
AAT TAC CGT GCA AAT ATA ACA ATT ACT TCG ACA ATT AAA TAA TTATATA
Asn Tyr Arg Ala Asn Ile Thr Ile Thr Ser Thr Ile Lys ---
168

612
ATAGACGTAGCCTTCGA

FIG. 1. Nucleotide sequence and deduced amino acid sequence of the CS3 fimbrial subunit gene. The numbers above each line refer to nucleotide position, and those below each line refer to amino acid position. The ATG start codon is indicated by a horizontal arrow, and the potential signal peptidase-splicing site is indicated by a vertical arrow. The proposed ribosome-binding site (rbs) and the promoter consensus sequences (-35 and -10) are underlined.
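
The reading frame in Fig. 1 can be checked mechanically. The following Python sketch is not part of the original analysis. It assumes the standard genetic code, and the coding region is transcribed by hand from Fig. 1; the names CODING_REGION and translate are illustrative. It translates the open reading frame and reports its length, the number of Cys residues, and the positions of the Met residues.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Coding region transcribed from Fig. 1, from the ATG start to the TAA stop.
CODING_REGION = """
ATG TTA AAA ATA AAA TAC TTA TTA ATA GGT CTT
TCA CTG TCA GCT ATG AGT TCA TAC TCA CTA GCT GCA GCG GGG CCC ACT
CTA ACC AAA GAA CTG GCA TTA AAT GTG CTT TCT CCT GCA GCT CTG GAT
GCA ACT TGG GCT CCT CAG GAT AAT TTA ACA TTA TCC AAT ACT GGC GTT
TCT AAT ACT TTG GTG GGT GTT TTG ACT CTT TCA AAT ACC AGT ATT GAT
ACA GTT AGC ATT GCG AGT ACA AGT GTT TCT GAT ACA TCT AAG AAT GGT
ACA GTA ACT TTT GCA CAT GAG ACA AAT AAC TCT GCT AGC TTT GCC ACC
ACC ATT TCA ACA GAT AAT GCC AAC ATT ACG TTG GAT AAA AAT GCT GGA
AAT ACG ATT GTT AAA ACT ACA AAT GGG AGT CAG TTG CCA ACT AAT TTA
CCA CTT AAG TTT ATT ACC ACT GAA GGT AAC GAA CAT TTA GTT TCA GGT
AAT TAC CGT GCA AAT ATA ACA ATT ACT TCG ACA ATT AAA TAA
""".split()


def translate(codons):
    """Translate a list of codons, stopping at the first stop codon."""
    residues = []
    for codon in codons:
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


protein = translate(CODING_REGION)
# 1-based positions of every methionine in the deduced polypeptide.
met_positions = [i + 1 for i, aa in enumerate(protein) if aa == "M"]

print("length:", len(protein))
print("Cys residues:", protein.count("C"))
print("Met positions:", met_positions)
```

With the sequence as transcribed here, the sketch should report 168 residues, no Cys, and Met only at positions 1 and 16, which is what the labeling argument in the text relies on.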

Leu-Leu-Ile (residues 7 to 9), which is typical of the start for core sequences. Another feature common to signal sequences is the occurrence of a helix-breaking residue (glycine or proline) or a large polar residue (notably glutamine) four to eight residues before the cleavage site. The deduced CS3 protein has glycine at position 10 (Fig. 1), five residues before the proposed cleavage site.

A second possible signal peptidase cleavage site is the Ala-Ala-Ala tripeptide at residues 21 to 23. In the amino acid sequence of FimA (PilA), the major subunit of type 1 fimbriae, an identical triplet at positions 20 to 22 is cleaved between the first two alanine residues (16, 28). If CS3 were cleaved after the Ala at residue 21, the signal peptide generated would still be highly hydrophobic. Moreover, the length of the signal peptide would be 21 amino acids, which is within the range (21 to 23 amino acids) deduced for other E. coli fimbrial subunits (1, 3, 5, 7, 8, 13-16, 18, 22, 25, 27-29, 34). Such a processed form of CS3 would not be 35S labeled. The results of recent N-terminal amino acid sequence

[Hydropathy plot; horizontal axis: residue number]

FIG. 2. Hydropathy plot for the CS3 fimbrial subunit. Hydropathy values were calculated by the method of Kyte and Doolittle (21). Positive values indicate hydrophobicity, and negative values indicate hydrophilicity. The arrow indicates the proposed site of cleavage by the signal peptidase.
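
A profile of the kind shown in Fig. 2 can be approximated with a sliding-window average of Kyte-Doolittle hydropathy values (21). The Python sketch below is illustrative only: the window length of 7 is an assumption (the text does not state the window used), the function name hydropathy_profile is hypothetical, and only the first 30 residues deduced in Fig. 1 are included.

```python
# Kyte-Doolittle hydropathy values (21); positive values are hydrophobic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues 1 to 30 of the deduced CS3 polypeptide (Fig. 1), one-letter code.
N_TERMINUS = "MLKIKYLLIGLSLSAMSSYSLAAAGPTLTK"


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of each full window along the sequence.

    The window length is an assumed value, not one taken from the paper.
    """
    values = [KD[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


profile = hydropathy_profile(N_TERMINUS)

# Mean hydropathy of the 15-residue putative signal peptide.
signal_mean = sum(KD[aa] for aa in N_TERMINUS[:15]) / 15

print("windows:", len(profile))
print("mean hydropathy, residues 1-15:", round(signal_mean, 2))
```

A positive mean over residues 1 to 15 is consistent with the statement that the putative signal peptide is hydrophobic; a different window length would change the shape of the profile but not that mean.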

analysis of purified CS3 polypeptide are consistent with the Ala at residue 22 being at the N terminus of the mature polypeptide (R. H. Hall, J. H. Collins, and M. M. Levine, personal communication). These investigators found that the first 27 amino acid residues of the protein agree with the prediction based on the DNA sequence (Fig. 1); however, they have some evidence suggesting the presence of a second isoform of this protein as well.

On the basis of sodium dodecyl sulfate-polyacrylamide gel electrophoresis, the molecular mass of the cytoplasmic precursor was estimated to be 17 kDa and that of the mature CS3 antigen was estimated to be 15 kDa (2). Although these values are slightly lower than those predicted from the DNA sequence for a polypeptide cleaved at either of the sites shown in Fig. 1 (see above), they are in reasonable agreement, since many proteins migrate anomalously in sodium dodecyl sulfate-polyacrylamide gels.

The C-terminal portion of the CS3 polypeptide, like that of other fimbrial subunits, is hydrophobic. It has been proposed that such segments may be involved in intersubunit interactions which maintain the integrity of fimbrial superstructure (3, 17).

By using the FASTP algorithm (24), the predicted CS3 polypeptide amino acid sequence was compared with the amino acid sequences of the following fimbrial proteins: E. coli K88ab, K88ac, K88ad, and K88ab(FaeC) (5, 7, 8, 13, 14, 27), K99 (29), CFA/I (15), PapA, PapE, PapF, and PapG (1, 25), FimA, FimF, FimG, and FimH (16, 18, 20, 28), F7₁, F7₂, and F11 (34, 35), and AfaI (22); two different type 1 fimbriae of Klebsiella pneumoniae (9); and those of Bacteroides gingivalis (4), Bacteroides nodosus (6), Neisseria gonorrhoeae (26), and Pseudomonas aeruginosa (31). There was


homology among the leader sequences of E. coli fimbrial proteins and that of CS3, but no other significant homology was found. Although CS3 fimbriae are morphologically similar to K88 and K99 fimbriae, the low level of homology between the amino acid sequences of their major subunits may indicate independent evolution in different animal species, i.e., in humans and domestic animals, respectively.

CS3 is unusual among fimbrial proteins in having no Cys residues. The other fimbrial proteins without Cys are CFA/I (15), CS2 (17), CS1 (H. Matthews and C. J. Smyth, unpublished data), and K88ab (8). Since disulfide bonds are thought to be important for protein-protein interaction, which is of critical importance in a structural protein, the morphogenesis of CS3 fimbriae may be significantly different from that of PAP and type 1 fimbriae. Therefore, a molecular analysis of the morphogenesis of CS3 fimbriae may be of considerable value in elucidating a possible novel biological mechanism for organelle assembly.

We thank Mike Caparon for valuable advice and encouragement with the DNA sequence analysis, Judy Caron for performing computer analyses of the sequence, Des Higgins for performing sequence comparisons and searches, and Paul Sharp for calculating the codon adaptation index. We are also grateful to Des Higgins for facilitating electronic mail communications between the authors. Sequence comparisons were carried out in the facilities of the Irish National Centre for Bioinformatics.

This work was supported by Public Health Service grant AI28870 from the National Institutes of Health to J.R.S. and a grant to C.J.S. from the Health Research Board of Ireland. M.B. held a studentship from the Health Research Board of Ireland during part of this study.

LITERATURE CITED

1. Baga, M., S. Normark, J. Hardy, P. O'Hanley, D. Lark, O. Olsson, G. Schoolnik, and S. Falkow. 1984. Nucleotide sequence of the papA gene encoding the Pap pilus subunit of human uropathogenic Escherichia coli. J. Bacteriol. 157:330-333.
2. Boylan, M., D. C. Coleman, and C. J. Smyth. 1987. Molecular cloning and characterization of the genetic determinant encoding CS3 fimbriae of enterotoxigenic Escherichia coli. Microb. Pathog. 2:195-209.
3. de Graaf, F. K., and F. R. Mooi. 1986. The fimbrial adhesins of Escherichia coli. Adv. Microb. Physiol. 28:65-143.
4. Dickinson, D. P., M. A. Kubiniec, F. Yoshimura, and R. J. Genco. 1988. Molecular cloning and sequencing of the gene encoding the fimbrial subunit protein of Bacteroides gingivalis. J. Bacteriol. 170:1658-1665.
5. Dykes, C. W., I. J. Halliday, M. J. Read, A. N. Hobden, and S. Harford. 1985. Nucleotide sequences of four variants of the K88 gene of porcine enterotoxigenic Escherichia coli. Infect. Immun. 50:279-283.
6. Elleman, T. C., and P. A. Hoyne. 1984. Nucleotide sequence of the gene encoding pilin of Bacteroides nodosus, the causal organism of ovine footrot. J. Bacteriol. 160:1184-1187.
7. Gaastra, W., P. Klemm, and F. K. de Graaf. 1983. The nucleotide sequence of the K88ad protein subunit of porcine enterotoxigenic Escherichia coli. FEMS Microbiol. Lett. 18:177-183.
8. Gaastra, W., F. R. Mooi, A. R. Stuitje, and F. K. de Graaf. 1981. The nucleotide sequence of the gene encoding the K88ab protein subunit of porcine enterotoxigenic Escherichia coli. FEMS Microbiol. Lett. 12:41-46.
9. Gerlach, G.-F., and S. Clegg. 1988. Characterization of two genes encoding antigenically distinct type-1 fimbriae of Klebsiella pneumoniae. Gene 64:231-240.
10. Gold, L., D. Pribnow, T. Schneider, S. Shinedling, B. S. Singer, and G. Stormo. 1981. Translational initiation in prokaryotes. Annu. Rev. Microbiol. 35:365-403.
11. Hawley, D. K., and W. McClure. 1983. Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res. 11:2237-2255.


12. Ikemura, T. 1981. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J. Mol. Biol. 151:389-409.
13. Josephsen, J., F. Hansen, F. K. de Graaf, and W. Gaastra. 1984. The nucleotide sequence of the protein subunit of the K88ac fimbriae of porcine enterotoxigenic Escherichia coli. FEMS Microbiol. Lett. 25:301-306.
14. Klemm, P. 1981. The complete amino-acid sequence of the K88 antigen, a fimbrial protein from Escherichia coli. Eur. J. Biochem. 117:617-627.
15. Klemm, P. 1982. Primary structure of the CFA1 fimbrial protein from human enterotoxigenic Escherichia coli strains. Eur. J. Biochem. 124:339-348.
16. Klemm, P. 1984. The fimA gene encoding the type-1 fimbrial subunit of Escherichia coli: nucleotide sequence and primary structure of the protein. Eur. J. Biochem. 143:395-399.
17. Klemm, P. 1985. Fimbrial adhesins of Escherichia coli. Rev. Infect. Dis. 7:321-340.
18. Klemm, P., and G. Christiansen. 1987. Three fim genes required for the regulation of length and mediation of adhesion of Escherichia coli type 1 fimbriae. Mol. Gen. Genet. 208:439-445.
19. Knutton, S., D. R. Lloyd, D. C. A. Candy, and A. S. McNeish. 1985. Adhesion of enterotoxigenic Escherichia coli to human small intestinal enterocytes. Infect. Immun. 48:824-831.
20. Krogfelt, K. A., and P. Klemm. 1988. Investigation of minor components of Escherichia coli type 1 fimbriae: protein chemical and immunological aspects. Microb. Pathog. 4:231-238.
21. Kyte, J., and R. F. Doolittle. 1982. A simple method of displaying the hydropathic character of a protein. J. Mol. Biol. 157:105-132.
22. Labigne-Roussel, A., M. A. Schmidt, W. Walz, and S. Falkow. 1985. Genetic organization of the afimbrial adhesin operon and nucleotide sequence from a uropathogenic Escherichia coli gene encoding an afimbrial adhesin. J. Bacteriol. 162:1285-1292.
23. Levine, M. M., P. Ristaino, G. Marley, C. Smyth, S. Knutton, E. Boedeker, R. Black, C. Young, M. L. Clements, C. Cheney, and R. Patnaik. 1984. Coli surface antigens 1 and 3 of colonization factor antigen II-positive enterotoxigenic Escherichia coli: morphology, purification, and immune responses in humans. Infect. Immun. 44:409-420.
24. Lipman, D. J., and W. R. Pearson. 1985. Rapid and sensitive protein similarity searches. Science 227:1435-1441.
25. Lund, B., F. Lindberg, and S. Normark. 1988. Structure and antigenic properties of the tip-located P pilus proteins of uropathogenic Escherichia coli. J. Bacteriol. 170:1887-1894.
26. Meyer, T. F., E. Billyard, R. Haas, S. Storzbach, and M. So. 1984. Pilus genes of Neisseria gonorrhoeae: chromosomal organization and DNA sequence. Proc. Natl. Acad. Sci. USA 81:6110-6114.
27. Mooi, F. R., M. van Buuren, G. Koopman, B. Roosendaal, and F. K. de Graaf. 1984. K88ab gene of Escherichia coli encodes a fimbria-like protein distinct from the K88ab fimbrial adhesin. J. Bacteriol. 159:482-487.
28. Orndorff, P. E., and S. Falkow. 1985. Nucleotide sequence of pilA, the gene encoding the structural component of type 1 pili in Escherichia coli. J. Bacteriol. 162:454-457.
29. Roosendaal, B., W. Gaastra, and F. K. de Graaf. 1984. The nucleotide sequence of the gene encoding the K99 subunit of enterotoxigenic Escherichia coli. FEMS Microbiol. Lett. 22:253-258.
30. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74:5463-5467.
31. Sastry, P. A., J. R. Pearlstone, L. B. Smillie, and W. Paranchych. 1983. Amino acid sequence of pilin isolated from Pseudomonas aeruginosa. FEBS Lett. 151:253-256.
32. Sharp, P. M., and W.-H. Li. 1987. The codon adaptation index-a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15:1281-1295.
33. Shine, J., and L. Dalgarno. 1974. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementary to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA 71:1342-1346.
34. van Die, I., and H. Bergmans. 1984. Nucleotide sequence of the gene encoding the F7₂ fimbrial subunit of a uropathogenic Escherichia coli strain. Gene 32:83-90.
35. van Die, I., M. Diksterhuis, H. de Cock, W. Hoekstra, and H. Bergmans. 1986. Structural variation of P-fimbriae from uropathogenic Escherichia coli, p. 39-46. In D. L. Lark, S. Normark, B.-E. Uhlin, and H. Wolf-Watz (ed.), Protein-carbohydrate interactions in biological systems. FEMS Symposium no. 31. Academic Press, Inc. (London), Ltd., London.
36. Watson, M. E. E. 1984. Compilation of published signal sequences. Nucleic Acids Res. 12:5145-5164.
