
242–247

© 1998 Oxford University Press

Nucleic Acids Research, 1998, Vol. 26, No. 1

BTKbase, mutation database for X-linked agammaglobulinemia (XLA)

Mauno Vihinen*, Oliver Brandau1, Lars J. Brandén2,3, Sau-Ping Kwan4, Ilkka Lappalainen, Tracy Lester5, Jeroen G. Noordzij6, Hans D. Ochs7, Juha Ollila, Sandy M. Pienaar8, Pentti Riikonen, Bratin K. Saha9 and C. I. Edvard Smith1,2

Department of Biosciences, Division of Biochemistry, PO Box 56, FIN-00014 University of Helsinki, Finland, 1Abteilung für Pädiatrische Genetik, Kinderpoliklinik, Klinikum Innenstadt der Universität München, Goethestrasse 29, D-80336 München, Germany, 2Center for BioTechnology, Department of Biosciences at Novum, Karolinska Institute, S-14157 Huddinge, Sweden, 3Department of Immunology, Microbiology, Pathology and Infectious Diseases (IMPI), Karolinska Institute, Huddinge University Hospital, S-14186 Huddinge, Sweden, 4Department of Immunology, Rush Medical School, Chicago, IL 60612, USA, 5Unit of Clinical Genetics, Institute of Child Health, 30 Guilford Street, London WC1N 1EH, UK, 6Department of Immunology, Erasmus University Rotterdam, PO Box 1738, 3000 DR Rotterdam, The Netherlands, 7Department of Pediatrics, University of Washington, Seattle, WA 98195, USA, 8ICH Laboratory, Red Cross Children’s Hospital, Cape Town 7700, South Africa and 9Department of Pathology, Emory University/GHS, Atlanta, GA 30303, USA

Received October 3, 1997; Accepted October 6, 1997

*To whom correspondence should be addressed. Tel: +358 9 708 59081; Fax: +358 9 708 59068; Email [email protected]

ABSTRACT

X-linked agammaglobulinemia (XLA) is an immunodeficiency caused by mutations in the gene coding for Bruton’s agammaglobulinemia tyrosine kinase (BTK). A database (BTKbase) of BTK mutations has been compiled and the recent update lists 463 mutation entries from 406 unrelated families showing 303 unique molecular events. In addition to mutations, the database also lists variants or polymorphisms. Each patient is given a unique patient identity number (PIN). Information is included regarding the phenotype including symptoms. Mutations in all five domains of BTK have been found to cause the disease, the most common event being missense mutations. The mutations appear almost uniformly throughout the molecule and frequently affect CpG sites that code for arginine residues. The putative structural implications of all the missense mutations are given in the database. The improved version of the registry, which has a number of new features, is available at http://www.helsinki.fi/science/signal/btkbase.html

INTRODUCTION

X-linked agammaglobulinemia (XLA) is a hereditary immunodeficiency caused by mutations in the gene coding for Bruton’s agammaglobulinemia tyrosine kinase (BTK) (1,2). Patients with XLA have decreased numbers of mature B cells in their peripheral blood and lack all immunoglobulin isotypes, which makes them susceptible to severe bacterial infections (3). Patients are treated with both antibiotics and immunoglobulin replacement therapy.

The BTK gene was mapped to the midportion of the long arm of the X-chromosome at Xq21.3–Xq22 (4–7) and the 37.5 kb gene contains 19 exons, 18 of which code for a 77 kDa protein (8–12). BTK is expressed in all hematopoietic lineages except for T lymphocytes and plasma cells (13,14). The murine gene for Btk has also been cloned and sequenced (10,12).

BTK is crucial for signaling in B cells (15,16). It belongs to a group of related cytoplasmic protein tyrosine kinases (PTKs) formed by TEC (17), ITK/TSK/EMT (18,19) and BMX (20), known as the Tec family. The Tec family proteins consist of five distinct structural domains (1,2,21,22), which are, from the N-terminus: the pleckstrin homology (PH) domain of ∼120 amino acids, the Tec homology (TH) domain (∼60–80 residues), the Src homology 3 (SH3) domain of ∼60 residues, the SH2 domain (∼100 amino acids) and the catalytic kinase domain of ∼280 residues. The BTK protein is 659 residues long. Mutations in all five domains have been found to cause XLA (1,8,9,23–26). The structural consequences of the mutations in all the domains have been addressed with computer-aided molecular modeling (23–35). BTK has been shown to interact with several partners (for review see refs 15,36).

BTK is the only protein where PH domain mutations are known to cause a disease. The C-terminal portion of the PH domain and the first half of the adjacent TH domain are responsible for Gβγ binding. Recently, the three dimensional structure of the BTK PH domain and the Btk motif of the TH domain has been determined (37). The TH domain comprises two regions, a Btk motif and a proline rich segment (22,34). The Btk motif binds Zn2+ ions (34,37). The whole TH domain is present only in TEC family members. The proline rich segment is bound by the SH3 domains of FYN, HCK and LYN (38). Recently, the Gq-protein α-subunit has been shown to stimulate BTK (39).

The SH2 and SH3 domains recognize short peptide motifs bearing phosphotyrosine (pTyr) residues or polyprolines, respectively. These domains link BTK to partner molecules. The BTK SH3 domain has been shown to bind the c-cbl protooncogene (40), Vav, Sam68 and EWS (41).

The kinase domain is the only catalytic region in the TEC family kinases. A conserved ATP-binding site is located between two structural lobes (28). The upper lobe, which is formed mainly of β-strands, has turned relative to the lower α-helical lobe in the inactive form of the enzyme. All the known kinases contain several highly conserved residues, which are involved in substrate and cofactor binding as well as in some structurally crucial sites.

BTKbase

BTK mutation data, both published and directly submitted information, have been collected into a database called BTKbase (23–26,33–36,42–45) (Fig. 1). The database contains information about the mutations and XLA patients. The study group gives each patient an individual patient identity number (PIN). The PIN consists of the type of mutation and a running number indicating mutations affecting the same amino acid or the same non-coding region. A more detailed description of the formation of PINs is given in Vihinen et al. (46). The PIN is given as soon as a mutation is available to the study group. Data can be kept confidential until published.

Figure 1. The BTKbase front page.

The database contains the following information for each patient if available: identification of the entry (PIN, accession number, etc.), plain English description of the mutation, literature reference(s), formal characterisation of the mutation and various characteristics of the patient.

NEW FEATURES IN BTKbase

Some of the new features are introduced in Figure 2.

Entries

The Web version of the database contains MEDLINE links for all the references in each entry as well as links to OMIM and the ESID registry, when available. Mutations are linked to the reference sequences at the genomic, mRNA and amino acid level.

Submission

It is now possible to submit mutation data to the BTKbase just by filling in a questionnaire on the Web pages. The program converts the data directly into a format suitable for inclusion into the database. However, the entries are added only after curator inspection.
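The entry contents and the curator-gated submission step described above can be pictured as a small record type. The following Python sketch is only an illustration: the class name, every field name and the `release` helper are hypothetical and do not reflect the actual BTKbase flat-file format or the PIN syntax, which is described in ref. 46.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """One patient entry; all field names are hypothetical illustrations."""
    pin: str                    # patient identity number given by the study group
    accession: Optional[str]    # accession number, if one exists
    description: str            # plain English description of the mutation
    references: List[str] = field(default_factory=list)  # literature reference(s)
    characterisation: str = ""  # formal characterisation of the mutation
    patient: dict = field(default_factory=dict)          # patient characteristics
    confidential: bool = False  # data can be kept confidential until published
    curated: bool = False       # set only after curator inspection


def release(entries):
    """Return the entries that may appear in the distributed database:
    inspected by a curator and not marked confidential."""
    return [e for e in entries if e.curated and not e.confidential]
```

The PIN values used with such a record would come from the study group; the sketch treats them as opaque strings and makes no assumption about their internal layout.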


Figure 2. Montage of some of the key features of BTKbase. Top left, page displaying mutations affecting one residue. Top right, list of mutation types. Right, restriction enzyme cleavage sites in the genomic sequence of BTK and below, restriction enzyme sites in the BTK exons and intron boundaries. Also alterations to restriction pattern caused by mutations are shown. Below left, addition and removal of restriction enzyme recognition sites due to mutations. Below, inframe deletions. The three dimensional structure indicates locations of missense mutations in the kinase domain. Colour coding for numbers of mutations is as in Figure 3.

Distribution of the mutations

The localization of the mutations in the BTK gene and protein can be analyzed on pages where the sequences are given. Clicking the figures lists the mutation(s) affecting a certain position. This can be performed at the DNA, mRNA or amino acid level.

Mutation statistics

Several tables provide information about the distribution of mutations. These tables are automatically updated when the distribution version of the registry is generated. Therefore the tables always reflect the latest version of the registry.


Figure 3. Mutations causing X-linked agammaglobulinemia (XLA). The mutations indicated above the sequence cause either severe (classical) or moderate XLA, whereas those denoted below the sequence are for clinically mild disease. The number of affected families is colour coded: from black (one family), blue, green, magenta, to red (five or more families). Insertions are indicated with @, deletions with #, and stop codons with *. The amino acid substitutions in multiple mutations are in parentheses. The arginine residues having CpG dinucleotide are shown in red in the sequence.

Restriction enzyme digestion

Several clickable pages are related to restriction enzymes and to changes in the restriction pattern caused by mutations. Both additions and deletions of restriction sites due to mutations can be picked directly from the sequences and tables. Lists of restriction enzymes that do not cut the genomic sequence or the cDNA are also provided. These analyses include all the restriction enzymes in the latest version of REBASE (47).
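How a point mutation adds or removes a restriction site can be sketched in a few lines of Python. This is a minimal illustration and not the program behind the BTKbase pages: the function names are hypothetical, only four well-known enzymes are listed instead of the full REBASE set, and the test sequences are toy strings rather than BTK sequence.

```python
# Recognition sequences of four common restriction enzymes (all palindromic,
# so scanning one strand is sufficient for these particular enzymes).
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "TaqI": "TCGA",
}


def find_sites(seq, site):
    """Return the set of 0-based start positions where `site` occurs in `seq`."""
    n = len(site)
    return {i for i in range(len(seq) - n + 1) if seq[i:i + n] == site}


def site_changes(wild_type, position, new_base, enzymes=ENZYMES):
    """Compare recognition sites before and after a single-base substitution.

    Returns (gained, lost): sorted names of enzymes whose sites appear or
    disappear. Positions stay comparable because a substitution does not
    shift the sequence.
    """
    mutant = wild_type[:position] + new_base + wild_type[position + 1:]
    gained, lost = [], []
    for name, site in enzymes.items():
        before = find_sites(wild_type, site)
        after = find_sites(mutant, site)
        if after - before:
            gained.append(name)
        if before - after:
            lost.append(name)
    return sorted(gained), sorted(lost)


# Toy example: a C->T change inside the TaqI site TCGA removes that site.
print(site_changes("ACGTCGAACT", 4, "T"))  # ([], ['TaqI'])
```

A lost or gained site of this kind is what makes a mutation detectable by a simple digestion assay, which is presumably why the database tabulates such changes.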

SRS analysis at EBI

BTKbase provides a direct link to the SRS search engine (48) at EBI’s MutRes mutation database registry for further analysis.

MAJOR FINDINGS FROM THE ANALYSIS OF THE BTKbase

There are altogether 463 patients in the database with XLA mutations that are scattered all along the BTK gene (Fig. 1). The patients represent 406 unrelated families. There are 303 unique mutations (65%). The distribution of the mutations in the five structural domains is approximately proportional to the length of the domains. Four double mutations and one triple mutation have been detected. The first promoter region mutation has recently been described (35). The gene defects in nine gross deletions have not been characterized in detail. The major alterations may be underrepresented due to the mutation detection methods used. The figures are calculated from the number of mutations, i.e., all the alterations in the families having multiple mutations are taken into account.


This version of the BTKbase also includes information about BTK polymorphisms and variants. Variations have not usually been systematically analyzed. The most common variation identified is at nucleotide 2013 in exon 18, changing a C to a T. This variation has been reported in seven studies and there are data for more than 100 patients. Ethnic variation is remarkable at this position. All the variations in the exons, including 1086 C→T, 1626 C→G and 1815 C→G, are silent and in the third (wobble) codon position. In addition, there are variations in the introns as well as in the 3′ flanking region.

The distribution of the mutations is shown in Figure 3. 184 patients have missense mutations. No missense mutations have been detected in exons 8 and 9 from residue 186 to 288, which could indicate higher tolerance for mutations in the TH and SH3 domains or redundant functions. The missense mutations appear mainly in the first two positions within the codon. 131 (72%) were transitions and 53 (28%) transversions. The most common alterations were G→A (64 cases) and C→T (26 cases).

The nonsense mutations were mainly transitions (59) whereas there were 31 transversions. Fifty-three of the nonsense mutations appear at position 1. The most common change was C to T at position 1 altering CGA to TGA (28 occurrences). Substitutions in nine patients alter the start codon and prevent expression of the protein.

Altogether, there were 190 transitions and 84 transversions in the missense and nonsense mutations, corresponding to 71 and 29% of the single amino acid substitutions. The most frequently affected sites are CpG dinucleotides. The 33 CpG dinucleotides in the coding region form only 3.3% of the gene, but still CG to TG or CA mutations constitute 33% of the single base substitutions. The most mutated sites generally have pyrimidines 5′ and purines 3′ to the mutated 5-methylcytosine (49).
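The two classifications used in this analysis, transition versus transversion and the effect of CpG mutations on arginine codons, can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the analysis code used for BTKbase: the function names are hypothetical, the standard genetic code is assumed, and only a CpG lying at codon positions 1 and 2 (the CGN arginine codons) is considered.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-base change as 'transition' (purine<->purine or
    pyrimidine<->pyrimidine) or 'transversion' (purine<->pyrimidine)."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def cpg_arginine_outcomes():
    """For each arginine codon beginning with CG, return the residues encoded
    after the two CpG changes named in the text: CG->TG and CG->CA."""
    outcomes = {}
    for codon, residue in CODON_TABLE.items():
        if residue == "R" and codon.startswith("CG"):
            after_tg = CODON_TABLE["TG" + codon[2]]
            after_ca = CODON_TABLE["CA" + codon[2]]
            outcomes[codon] = (after_tg, after_ca)
    return outcomes


print(substitution_class("C", "T"))    # transition
print(substitution_class("C", "G"))    # transversion
print(cpg_arginine_outcomes()["CGA"])  # ('*', 'Q')
```

Under these assumptions the CGA codon is the only CGN arginine codon whose CG to TG change yields a stop codon (TGA), which matches the most common nonsense change reported above.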
Eight of 18 CpG containing arginine residues were affected, whereas none of the remaining 15 CpG sites encoding non-arginine residues was mutated (Fig. 3).

Intron mutations in 58 families cause aberrant splicing. These mutations are concentrated mainly at locations +1 (25 occurrences) and 11 (4). Skipping of exon 9 in three families causes an inframe deletion in the C-terminus of the SH3 domain.

Insertions and deletions have been characterized from 30 and 83 families, respectively. The larger deletions encompass whole exons. Direct repeats appear in the immediate vicinity of all these mutations. In addition to the splice site mutation resulting in the inframe skipping of exon 9, 12 other families with inframe deletions have been found. All these mutations delete substantial parts of the protein.

The severity of XLA can vary even among family members carrying the same mutation (reviewed in ref. 3). Most of the data in the BTKbase are associated with severe (classical) XLA patients (∼97%). In many instances mutations that cause mild disease also lead to classical XLA, even in the same family. A more detailed discussion of the mild diseases can be found in ref. 23.

Many of the mutations affect functionally significant, conserved residues. The majority of the missense mutations in the PH domain are in the putative binding region. In the TH domain the missense mutations affect Zn2+ binding (34). There are no missense mutations in the SH3 domain. Most of the amino acid substitutions in the SH2 domain impair pTyr binding. In the kinase domain, the mutations are mainly on one side of the molecule. This face of the enzyme is involved in binding of the ATP, Mg2+ and substrate. Although a large number of these mutations affect the cofactor or putative substrate binding residues, many alterations appear in structurally crucial sites. Some of the mutations causing less severe XLA are not in the immediate vicinity of the binding sites or structurally crucial positions.

DISTRIBUTION OF THE DATABASE

The primary distribution medium is the World Wide Web at http://www.helsinki.fi/science/signal/btkbase.html . A temporarily updated version of the database is also available via anonymous ftp at csb.ki.se in the directory pub/btkbase. Use anonymous as username and your e-mail address as password. Inquiries and new data can be sent to [email protected]. New mutations should preferably be submitted using the form available on the Web pages.

ACKNOWLEDGEMENTS

This work was supported by Finnish Academy, Biocentrum Helsinki, Instrumentariumin tiedesäätiö, the Swedish Medical Research Council, the Swedish Cancer Society and the Åke Wiberg Foundation.

REFERENCES

1 Vetrie,D., Vorechovsky,I., Sideras,P., Holland,J., Davies,A., Flinter,F., Hammarström,L., Kinnon,C., Levinsky,R., Bobrow,M., et al. (1993) Nature, 361, 226–233.
2 Tsukada,S., Saffran,D.C., Rawlings,D.J., Parolini,O., Allen,R.C., Klisak,I., Sparkes,R.S., Kubagawa,H., Mohandas,T., Quan,S., et al. (1993) Cell, 72, 279–290.
3 Sideras,P. and Smith,C.I.E. (1995) Adv. Immunol., 59, 135–223.
4 Kwan,S.-P., Kunkel,L., Bruns,G., Wedgewood,R.I., Latt,S. and Rosen,F.S. (1986) J. Clin. Invest., 77, 649–652.
5 Ott,J., Mensink,E.J.B.M., Thompson,A., Schot,I.D. and Schuurman,R.K.B. (1986) Hum. Genet., 74, 280–283.
6 Kwan,S.-P., Terwillinger,J., Parmley,R., Raghu,G., Sandkuyl,L.A., Ott,L.A., Ochs,H., Wedgewood,R. and Rosen,F. (1990) Genomics, 6, 238–242.
7 Vorechovsky,I., Vetrie,D., Holland,J., Bentley,D., Thomas,K., Zhou,J.-N., Notarangelo,L.D., Plebani,A., Fontan,G., Ochs,H.D., et al. (1994) Genomics, 21, 517–524.
8 Ohta,Y., Haire,R.N., Litman,R.T., Fu,S.M., Nelson,R.P., Kratz,J., Kornfeld,S.J., de la Morena,M., Good,R.A. and Litman,G.W. (1994) Proc. Natl. Acad. Sci. USA, 91, 9062–9066.
9 Hagemann,T.L., Chen,Y., Rosen,F.S. and Kwan,S.-P. (1994) Hum. Mol. Genet., 3, 1743–1749.
10 Sideras,P., Müller,S., Shiels,H., Jin,H., Khan,W.N., Nilsson,L., Parkinson,E., Thomas,J.D., Brandén,L., Larsson,I., et al. (1994) J. Immunol., 153, 5607–5617.
11 Rohrer,J., Parolini,O., Belmont,J.W. and Conley,M.E. (1994) Immunogenetics, 40, 319–324.
12 Oeltjen,J.C., Malley,T.M., Muzny,D.M., Miller,W., Gibbs,R.A. and Belmont,J.W. (1997) Genome Res., 7, 315–329.
13 Smith,C.I.E., Baskin,B., Humire-Greiff,P., Zhou,J.-N., Olsson,P.G., Maniar,H.S., Kjellén,P., Lambris,J.D., Christensson,B., Hammarström,L., et al. (1994) J. Immunol., 152, 557–565.
14 de Weers,M., Verschuren,M.C.M., Kraakman,M.E.M., Mensink,R.G.J., Schuurman,R.K.B., van Dongen,J.J.M. and Hendriks,R.W. (1993) Eur. J. Immunol., 23, 3109–3114.
15 Vihinen,M. and Smith,C.I.E. (1996) Crit. Rev. Immunol., 17, 495–496.
16 Mattsson,P., Vihinen,M. and Smith,C.I.E. (1996) BioEssays, 10, 825–834.
17 Mano,H., Mano,K., Tang,B., Koehler,M., Yi,T., Gilbert,D.J., Jenkins,N.A., Copeland,N.G. and Ihle,J.N. (1993) Oncogene, 8, 417–424.
18 Siliciano,J.D., Morrow,T.A. and Desiderio,S.V. (1992) Proc. Natl. Acad. Sci. USA, 89, 11194–11198.
19 Heyeck,S.D. and Berg,L.J. (1993) Proc. Natl. Acad. Sci. USA, 90, 669–673.
20 Tamagnone,L., Lahtinen,I., Mustonen,T., Virtaneva,K., Francis,F., Muscatelli,F., Alitalo,R., Smith,C.I.E., Larsson,C. and Alitalo,K. (1994) Oncogene, 9, 3683–3688.
21 Smith,C.I.E., Islam,K.B., Vorechovsky,I., Olerup,O., Wallin,E., Rabbani,H., Baskin,B. and Hammarström,L. (1994) Immunol. Rev., 138, 159–183.
22 Vihinen,M., Nilsson,L. and Smith,C.I.E. (1994) FEBS Lett., 350, 263–265.
23 Vihinen,M., Cooper,M.D., de Saint Basile,G., Fischer,A., Good,R.A., Hendriks,R.W., Kinnon,C., Kwan,S.-P., Litman,G.W., Notarangelo,L.D. et al. (1995) Immunol. Today, 16, 460–465.
24 Vihinen,M., Iwata,T., Kinnon,C., Kwan,S.-P., Ochs,H.D., Vorechovsky,I. and Smith,C.I.E. (1996) Nucleic Acids Res., 24, 160–165.
25 Vihinen,M., Brooimans,R.A., Kwan,S.-P., Lehväslaiho,H., Litman,G.W., Resnick,I., Ochs,H.D., Schwaber,J.H., Vorechovsky,I. and Smith,C.I.E. (1996) Immunol. Today, 17, 502–506.
26 Vihinen,M., Belohradsky,B.H., Haire,R.N., Holinski-Feder,E., Kwan,S.-P., Lappalainen,I., Lehväslaiho,H., Lester,T., Meindl,A., Ochs,H.D. et al. (1997) Nucleic Acids Res., 25, 166–171.
27 Zhu,Q., Zhang,M., Rawlings,D.J., Vihinen,M., Hagemann,T., Saffran,D.C., Kwan,S.-P., Nilsson,L., Smith,C.I.E., Witte,O.N., Chen,S.-H. and Ochs,H.D. (1994) J. Exp. Med., 180, 461–470.
28 Vihinen,M., Vetrie,D., Maniar,H.S., Ochs,H.D., Zhu,Q., Vorechovsky,I., Webster,A.D.B., Notarangelo,L.D., Nilsson,L., Sowadski,J.M. and Smith,C.I.E. (1994) Proc. Natl. Acad. Sci. USA, 91, 12803–12807.
29 Vihinen,M., Nilsson,L. and Smith,C.I.E. (1994) Biochem. Biophys. Res. Commun., 205, 1270–1277.
30 Vihinen,M., Zvelebil,M.J.J.M., Zhu,Q., Brooimans,R.A., Ochs,H.D., Zegers,B.J.M., Nilsson,L., Waterfield,M.D. and Smith,C.I.E. (1995) Biochemistry, 34, 1475–1481.
31 Vorechovsky,I., Vihinen,M., de Saint Basile,G., Honsová,S., Hammarström,L., Müller,S., Nilsson,L., Fischer,A. and Smith,C.I.E. (1995) Hum. Mol. Genet., 4, 51–58.
32 Jin,H., Webster,A.D.B., Vihinen,M., Sideras,P., Vorechovsky,I., Hammarström,L., Bernatowska-Matuszkiewicz,E., Smith,C.I.E., Bobrow,M. and Vetrie,D. (1995) Hum. Mol. Genet., 4, 693–700.
33 Saha,B.K., Curtis,S.K., Vogler,L.B. and Vihinen,M. (1997) Mol. Med., 3, 477–485.
34 Vihinen,M., Nore,B., Mattsson,P.T., Bäckesjö,C.-M., Nars,M., Koutaniemi,S., Watanabe,C., Lester,T., Jones,A., Ochs,H.D. and Smith,C.I.E. (1997) FEBS Lett., 413, 205–210.
35 Holinski-Feder,E., Weiss,M., Brandau,O., Jedele,K.B., Nore,B., Bäckesjö,C.-M., Vihinen,M., Götz,G., Hubbard,S.R., Belohradsky,B.H., Smith,C.I.E. and Meindl,A. (1998) Pediatrics, in press.
36 Vihinen,M., Mattsson,P. and Smith,C.I.E. (1997) Frontiers Biosci., 2, 384–399.
37 Hyvönen,M. and Saraste,M. (1997) EMBO J., 16, 3396–3404.
38 Cheng,C., Ye,Z.-S. and Baltimore,D. (1994) Proc. Natl. Acad. Sci. USA, 91, 8152–8155.
39 Bence,K., Ma,W., Kozasa,T. and Huang,X.-Y. (1997) Nature, 389, 296–299.
40 Cory,G.O.C., Lovering,R.C., Hinshelwood,S., MacCarthy-Morrogh,L., Levinsky,R.J. and Kinnon,C. (1995) J. Exp. Med., 182, 611–615.
41 Guinamard,R., Fougereau,M. and Seckinger,P. (1997) Scand. J. Immunol., 45, 587–595.
42 Kornfeld,S.J., Haire,R.N., Strong,S.J., Tang,H.Y., Sung,S.S.J., Fu,S.M. and Litman,G.W. (1996) Mol. Med., 2, 619–623.
43 Vorechovsky,I., Luo,L., Hertz,J.M., Froland,S.S., Fiorini,N., Quinti,I., Paganelli,R., Segers,R., Hammarström,L., Webster,A.D.B. and Smith,C.I.E. (1997) Hum. Mutat., 9, 418–425.
44 Haire,R.N., Ohta,Y., Strong,S.J., Litman,R.T., Liu,Y.Y., Prchal,J.T., Cooper,M.D. and Litman,G.W. (1997) Am. J. Hum. Genet., 60, 798–807.
45 Brooimans,R.A., van den Berg,J.A.M., Rijkers,G.T., Sanders,L.A.M., van Amstel,J.K.P., Tilanus,M.G.J., Grubben,M.J.A.L. and Zegers,B.J.M. (1997) J. Med. Genet., 34, 484–488.
46 Vihinen,M., Lehväslaiho,H. and Cotton,R.D. (1998) In Ochs,H.D., Smith,C.I.E. and Puck,J. (eds) Primary Immunodeficiency Diseases. A Molecular and Genetic Approach (in press).
47 Roberts,R.J. and Macelis,D. (1997) Nucleic Acids Res., 25, 248–262. [See also this issue Nucleic Acids Res. (1998) 26, 338–350.]
48 Etzold,T. and Argos,P. (1993) Comput. Appl. Biosci., 9, 49–57.
49 Ollila,J., Lappalainen,I. and Vihinen,M. (1996) FEBS Lett., 396, 119–122.
