A Member of the Immunoglobulin Superfamily in Bacteriophage T4

Virus Genes 14:2, 163–165, 1997 © 1997 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

ALEX BATEMAN,1 SEAN R. EDDY,2 & VADIM V. MESYANZHINOV3

1 MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England; 2 Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, 63110, USA; 3 Howard Hughes Medical Institute and Bakh Institute of Biochemistry, Moscow 117071, Russia

Received September 27, 1996; Accepted January 1, 1997

Abstract. We report a prediction that the highly immunogenic outer capsid (Hoc) protein of the prokaryotic phage T4 contains three tandem immunoglobulin-like domains. Immunoglobulin-like folds have previously been identified in prokaryotic proteins but these share no recognizable sequence similarity with eukaryotic immunoglobulin superfamily (IgSF) folds, and may represent products of convergent evolution. In contrast, the Hoc immunoglobulin-like folds are proposed on the basis of immunoglobulin-like sequence consensus matches detected by hidden Markov modeling. We propose that the Hoc immunoglobulin-like domains and eukaryotic immunoglobulin-like domains are likely to be related by divergence from a common ancestor.

Key words: Bacteriophage T4, hidden Markov model, Hoc, immunoglobulin superfamily, immunoglobulin domain

Members of the immunoglobulin superfamily (IgSF) are of fundamental importance in the vertebrate immune system and include antibodies, T cell receptors, major histocompatibility antigens and cell adhesion molecules (1–4). Immunoglobulin-like domains have also been found in proteins with non-immune functions, such as the giant muscle protein titin. There has been some speculation as to how the immunoglobulin-like domain arose (2,5), and so it would be interesting if prokaryotic examples could be identified. Here we report that a hidden Markov model (HMM) of the IgSF protein fold (6,7) predicts that the highly immunogenic outer capsid protein (Hoc) of the prokaryotic phage T4 (8) contains an IgSF domain. A dot plot of Hoc shows that there are three tandem repeats of about 100 amino acids in the protein, suggesting that Hoc may in fact be composed of three tandem IgSF domains.

An HMM was built from a multiple alignment of 219 IgSF domains close in sequence to that of telokin. These domains were chosen because it has been argued that they are members of the I set, which represents the extant members of an ancestral immunoglobulin-like domain, and a model built from them may therefore be more likely to find distantly related sequences. The HMM was used to search the SWISS-PROT release 31 database (9). Residues 191 to 286 of the 376-amino-acid HOC_BPT4 entry matched the model with a log-likelihood score of 17.8 bits. A score of 16 bits would be considered significant when searching a database of the size of SWISS-PROT. A second, weak match was detected by the model between residues 1 and 102. A dot plot of the Hoc sequence against itself shows three repeats within the first 300 amino acids. This suggests that, although the HMM does not detect any significant matches outside residues 191 to 286, Hoc may be composed of at least three tandem IgSF domains. An alignment of these repeats to the sequence of telokin is given in Table 1.

Table 1. An alignment of the immunoglobulin-like Hoc domains from prokaryotic phage T4 with the I set molecule telokin

TELOKIN   VKPYFSKTIRDLEVVEGSAARFDCKIEGY...PDPEVVWFKDDQSIRESR..
Sec.Str   .EEEEEE...EEEEEEE..EEEEEEEEEE......EEEEEE..EEE......
Hoc 1     MTFTVDITPKTPTGVIDETKQFTATPSGQtgggtiTYAWSVDNVPQdgaea.
Hoc 97    TTLAVTPA.SPAAGVIGTPVQFTAALASQpdgasaTYQWYVDDSQVggetns
Hoc 191   MNPQVTLTPPSINVQQDASATFTANVTGApeepqiTYSWKKDSSPVegstn.
Phd       ..EEEE......EEEEE...EEEEEE.......EEEEEEEE.........EE

TELOKIN   HFQIDYDEDGNCSLIISDVCGDDDAKYTCKAVNSLGEATCTAELIVETM
Sec.Str   .EEEEEEE...EEEEEEE.......EEEEEEEE....EEEEEEEEEEE.
Hoc 1     TFSYVlkg.paGQKTIKVVATNTLSEGGpetae......aTTTITVKNk   93
Hoc 97    TFSYTpt..tsGVKRIKCVAQVTATDYDALsvts.....nEVSLTVNKk  189
Hoc 191   VYTVDts..svGSQTIEVTATVTAADYNpvtvtkt..gnvTVTAKVAPe  286
Phd       EEEEEE......EEEEEEEEEEEE.....EEEEE.....EEEEEEEEE.

Regions expected to share the same structure as telokin are shown in upper case. Those regions expected to differ in conformation from telokin are shown in lower case. The secondary structure (Sec.Str) of telokin is shown, as is the PHD secondary structure prediction (Phd) of the three Hoc domains. Key residues important for the structure of the Ig fold of telokin are shown in bold letters (5). The chemical nature of these side chains is mainly conserved, implying structural similarity of the Hoc domains to telokin.

Although HMM scores of this magnitude are highly reliable in our experience, we attempted to check the HMM result by two other methods: inspection of key residues and PHD secondary structure prediction. Key residues in the core of the telokin IgSF domain are conservatively substituted, though there is significant sequence divergence. The conserved tryptophan of the C strand is present in all three Hoc domains. The cysteine in the B strand of telokin is conservatively substituted by alanine in the Hoc domains.

The alignment of the three Hoc domains in Table 1 was submitted to the PHD secondary structure prediction server. PHD predicts an eight-stranded β sheet, with all but one of telokin's β strands present in Hoc domain 3 in approximately the same position. The Hoc domains lack the disulphide bridge found in many IgSF domains; this is not unique, as the second domain of CD4 also lacks this feature (10,11). One feature of the Hoc domains that is not wholly consistent with eukaryotic IgSF structures is the replacement of C in the F strand with P in two of the three domains. The β-strand structure of the Hoc domains would have to be non-canonical in this region. An HMM built from the alignment of the three Hoc domains failed to find any significant matches to other sequences in SWISS-PROT release 31.

The highly immunogenic outer capsid protein of bacteriophage T4 is exposed on the surface of the mature T4 head. Hoc is added late in head maturation to the centre of hexagonal capsomers of gp23, the major phage head structural protein (12). Hoc is nonessential for T4 structure or viability. The closely related bacteriophage T2 does not express a Hoc homologue (13). The exact function of Hoc is unknown. IgSF domains are known to be involved in a wide range of binding interactions. Perhaps Hoc's IgSF domains are responsible for its binding affinity to gp23 capsomers.

Bacteriophage T4 has recently been developed as a surface display vector using the T4 minor fibrous protein fibritin (14). The nonessential nature of Hoc makes it a good candidate for a new phage display system using T4. The Hoc protein could be engineered to express foreign peptides or even proteins on the surface of the virion. Antibody IgSF domains have been used as a scaffold for phage display experiments. An approximate knowledge of the tertiary structure of Hoc could be used to define sites at which peptide insertions can be made.

Is this an example of a prokaryotic IgSF domain, indicating that the major protein component of the vertebrate immune system is an ancient structure with prokaryotic homologues? Unfortunately, the origin of bacteriophages and their evolution are open questions. It has been noted that many T4 proteins are more similar in sequence and enzymatic mechanism to eukaryotic than to prokaryotic homologues (15), which raises the interesting possibility that T4 is a late evolutionary invention constructed in part from eukaryotic genes. We argue from sequence similarity that Hoc and eukaryotic IgSF domains share a common ancestor, but we cannot tell whether Hoc is a recent horizontal acquisition or an ancestral gene of T4; nor can we determine whether it originated in a eukaryotic or prokaryotic genome. Regardless, one cannot help but appreciate the irony of a virus exploiting an immunoglobulin-like structure against its hapless host.

Notes

1 We have recently demonstrated that two bacterial proteins contain immunoglobulin-like domains (Bateman A., Eddy S.R. and Chothia C., Prot Sci 5: 1939–1941, 1996).
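The self-comparison dot plot used in the text to reveal tandem repeats can be illustrated with a minimal sketch. This is not the program used in the paper, which the text does not name. The function names, the window length, the match threshold and the 36-residue toy sequence are all assumptions made for illustration; the toy sequence is not the Hoc sequence. The idea is that a sequence containing tandem repeats of period p produces lines of dots parallel to the main diagonal at offsets p, 2p and so on.

```python
from collections import Counter


def self_dot_plot(seq, window=5, min_matches=4):
    """Return the (i, j) pairs, i < j, whose length-`window` segments
    share at least `min_matches` identical positions."""
    n = len(seq) - window + 1
    dots = set()
    for i in range(n):
        for j in range(i + 1, n):
            matches = sum(a == b for a, b in zip(seq[i:i + window],
                                                 seq[j:j + window]))
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def diagonal_counts(dots):
    """Count the dots lying on each off-diagonal, keyed by offset j - i."""
    return Counter(j - i for i, j in dots)


# Hypothetical toy sequence: three imperfect copies of a 12-residue unit.
# It stands in for a real protein purely to show the repeat signal.
toy = "MTFKVDGSAWYE" + "MTFRVDGSAWYE" + "MTFKVDGTAWYE"

counts = diagonal_counts(self_dot_plot(toy))
top_offsets = sorted(offset for offset, _ in counts.most_common(2))
print(top_offsets)  # the two most populated off-diagonals
```

For this toy input the two most populated off-diagonals sit at offsets 12 and 24, one and two repeat lengths away from the main diagonal. A real analysis would normally score windows with an amino acid substitution matrix rather than exact identity, since diverged repeats such as those proposed for Hoc share few identical residues.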

References

1. Williams A.F., Immunology Today 8, 298–303, 1987.
2. Williams A.F., and Barclay A.N., Ann Rev Immunol 6, 381–405, 1988.
3. Kuma K., Iwabe N., and Miyata T., Curr Biol 1, 384–393, 1991.
4. Bork P., Holm L., and Sander C., J Mol Biol 242, 309–320, 1994.
5. Harpaz Y., and Chothia C., J Mol Biol 238, 528–539, 1994.
6. Krogh A., Brown M., Mian I.S., Sjolander K., and Haussler D., J Mol Biol 235, 1501–1531, 1994.
7. Eddy S.R., Mitchison G., and Durbin R., J Comput Biol 2, 9–23, 1995.
8. Kaliman A.V., Khasanova M.A., Kryukov V.M., Tanyashin V.I., and Bayev A.A., Nucleic Acids Res 18, 4277, 1990.
9. Bairoch A., and Boeckmann B., Nucleic Acids Res 19, 2247–2249, 1991.
10. Wang J., Yan Y., Garret T.P.J., Liu J., Rodgers D.W., Garlick R.L., Tarr G.E., Husain Y., Reinharz E.L., and Harrison S.C., Nature 348, 411–418, 1990.
11. Garret T.P.J., J Mol Biol 234, 763–778, 1993.
12. Yanagida M., J Mol Biol 109, 515–537, 1977.
13. Yanagida M., Suzuki Y., and Toda T., Adv Biophys 17, 97–146, 1984.
14. Efimov V.P., Nepluev I.V., and Mesyanzhinov V.V., Virus Genes 10, 173–177, 1995.
15. Bernstein H., and Bernstein C., J Bacteriol 171, 2265–2270, 1989.