© 1998 Oxford University Press
Human Molecular Genetics, 1998, Vol. 7, No. 11 1703–1712
Characterization of the myotubularin dual specificity phosphatase gene family from yeast to human

Jocelyn Laporte+, François Blondeau+, Anna Buj-Bello+, Dimtry Tentler1, Christine Kretz, Niklas Dahl1 and Jean-Louis Mandel*

Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/ULP, 1 rue Laurent Fries, BP 163, 67404 Illkirch Cedex, France and 1Unit of Clinical Genetics, Department of Genetics and Pathology, University Hospital, 75185 Uppsala, Sweden

Received April 20, 1998; Revised and Accepted July 17, 1998
DDBJ/EMBL/GenBank accession nos U58032–U58034, AA317931, AF031519, AF072928, AF072929, AF073482, AF073879–AF073883, AF073996, AF073997, AF076432
X-linked myotubular myopathy (XLMTM) is a severe congenital muscle disorder due to mutations in the MTM1 gene. The corresponding protein, myotubularin, contains the consensus active site of tyrosine phosphatases (PTP) but otherwise shows no homology to other phosphatases. Myotubularin is able to hydrolyze a synthetic analogue of tyrosine phosphate, in a reaction inhibited by orthovanadate, and was recently shown to act on both phosphotyrosine and phosphoserine. This gene is conserved down to yeast and strong homologies were found with human ESTs, thus defining a new dual specificity phosphatase (DSP) family. We report the presence of novel members of the MTM gene family in Schizosaccharomyces pombe, Caenorhabditis elegans, zebrafish, Drosophila, mouse and man. This represents the largest family of DSPs described to date. Eight MTM-related genes were found in the human genome and we determined the chromosomal localization and expression pattern for most of them. A subclass of the myotubularin homologues lacks a functional PTP active site. Missense mutations found in XLMTM patients affect residues conserved in a Drosophila homologue. Comparison of the various genes allowed construction of a phylogenetic tree and reveals conserved residues which may be essential for function. These genes may be good candidates for other genetic diseases.

INTRODUCTION

X-linked recessive myotubular myopathy (XLMTM; OMIM 310400) is a very severe congenital muscle disorder. Affected boys present severe hypotonia and respiratory insufficiency at birth, leading to high neonatal mortality. Since skeletal muscle biopsies show small rounded muscle fibres with centrally located nuclei resembling myotubes, it has been suggested that the disorder results
from an arrest in late myogenesis (1). Autosomal recessive and dominant forms have also been described and share with XLMTM the typical muscle biopsy features, but with milder phenotypes and a delayed muscle weakness (2).

The MTM1 gene was isolated by a positional cloning strategy from the Xq28 region (3) and 81 mutations have been found to date in unrelated patients (3–6). The encoded protein, named myotubularin, contains a 12 amino acid consensus sequence for the active site of tyrosine phosphatases, but shows no significant similarity, outside this small segment, to other known tyrosine phosphatases.

Protein tyrosine phosphatases (PTPs) and dual specificity phosphatases (DSPs) form a wide class of proteins involved in many physiological processes, such as cell growth and differentiation. They regulate components of diverse signal transduction pathways (7). PTPs are characterized by the presence of a conserved catalytic domain of ∼240 amino acids, which contains the active site motif (I/V)HCxAGxxR(S/T)G (8). Tyrosine phosphatases may contain other protein domains, for example SH2 domains (9). Their great diversity is one characteristic of this class of proteins, which includes transmembrane and intracellular PTPs found in various subcellular compartments. It has been estimated that the human genome may encode ∼500 PTPs (7).

The enzymatic activity of myotubularin had not been demonstrated initially, but several missense mutations were found to affect the putative active site. We show here that myotubularin indeed displays a phosphatase activity on the synthetic substrate p-nitrophenyl phosphate (p-NPP) that can be inhibited by tyrosine phosphatase inhibitors. Independently, Cui et al. have recently shown that myotubularin has a dual tyrosine and serine phosphatase activity in vitro (10).
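As an illustration of how the consensus active site motif (I/V)HCxAGxxR(S/T)G quoted above can be read as a search pattern, the following minimal Python sketch scans a protein sequence for strict matches. It is not the procedure used in this study. The function name and the two toy sequences are hypothetical, and a strict match to the general consensus is an assumption: individual family members may deviate at single positions, so a real scan would probably need a looser pattern.

```python
import re

# Consensus (I/V)HCxAGxxR(S/T)G as a regular expression; "x" means any residue.
PTP_MOTIF = re.compile(r"[IV]HC.AG..R[ST]G")


def find_active_site(protein):
    """Return (1-based start position, matched segment) for each strict motif hit."""
    return [(m.start() + 1, m.group()) for m in PTP_MOTIF.finditer(protein.upper())]


# Hypothetical toy sequences for illustration only (not real myotubularin sequence).
toy_active = "MKTAVHCSAGKDRTGLLA"
toy_variant = "MKTAVHSSAGKDRTGLLA"  # catalytic Cys replaced by Ser

print(find_active_site(toy_active))
print(find_active_site(toy_variant))
```

In this sketch the second toy sequence, in which the cysteine of the motif is replaced by serine, no longer matches. This only mirrors the idea of a variant signature at the sequence level and says nothing about enzymatic activity.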
Myotubularin was found to be highly conserved in Saccharomyces cerevisiae and Caenorhabditis elegans (3), which appeared very surprising for a protein implicated in a muscle-specific human disorder. We also reported contigs of human ESTs corresponding to three homologous genes that, together with MTM1, define a new family of putative tyrosine phosphatases in man (3). Over the last few years, the number of random cDNA expressed sequence tags (ESTs) has grown exponentially (11). In order to characterize the MTM family further from yeast to
*To whom correspondence should be addressed. Tel: +33 3 88 65 32 44; Fax: +33 3 88 65 32 46; Email: [email protected]
+These authors contributed equally to this work
human, we screened EST and genomic DNA databases and conventional cDNA libraries. We report the partial or complete coding sequences for 20 different genes from Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, zebrafish, mouse and man. We identified a total of eight human genes and report for most of them the chromosomal localization and data on their expression pattern. Protein sequence comparisons define the PTP/DSP signature specific to this large family and other conserved domains of probable functional importance, such as the recently described SET interacting domain (SID) (10). In particular, missense mutations found in XLMTM patients affect amino acids that are conserved in Drosophila. However, a subclass of myotubularin homologues appears to lack a functional DSP active site. It includes the Sbf1 gene (hMTMR5 in the present study), the product of which was recently proposed to act as an antiphosphatase that could prevent the action of active phosphatases of the myotubularin family (10).

RESULTS

Myotubularin exhibits a phosphatase activity

In order to test the enzymatic activity of myotubularin, we used a baculovirus vector to express the complete protein coding sequence (histidine-tagged), in either a wild-type or mutated (C375S) form. The C375S mutation is located in the consensus active site of PTP/DSP and has been shown to abolish tyrosine phosphatase activity in PTPs (12,13). The purified wild-type myotubularin hydrolyzes p-NPP, a synthetic phosphatase substrate, in a reaction inhibited by sodium phosphate. The enzymatic activity of the mutated C375S myotubularin was reduced by ∼95% in comparison with the wild-type (Fig. 1). The phosphatase activity of myotubularin was also inhibited by addition of sodium orthovanadate, a tyrosine phosphatase inhibitor that interacts with the enzymatic active site (14,15).
In contrast, hydrolysis of p-NPP was unaffected by the presence of okadaic acid, a specific inhibitor of serine/threonine phosphatases (7). Myotubularin was recently shown to exhibit tyrosine and serine phosphatase activity in vitro and is thus a DSP (10).

Identification of MTM homologues in humans and other species

Potential MTM homologous genes were initially identified by searching the EST database (16) using the nucleotide and protein sequences of the hMTM1 gene. The BLAST program (17) was used to search in all six reading frames. After alignment of the identified sequences into contigs of overlapping clones, the sequence data were verified and completed by sequencing the largest clone of each contig (see Materials and Methods for identification of sequenced clones). For hMTMR1, hMTMR2 and hMTMR3, the cDNA sequences were completed with respect to our previous report (accession nos U58032–U58034) (3). Genomic sequences showing strong similarity to MTM1 were also retrieved from GenBank and various genomic databases. Sixteen contigs were identified. Some of the EST contigs have been reported in the TIGR database (see legend to Fig. 2). In addition, cDNA clones that correspond to the mMTM1, mMTMR1 and zMTMR2 genes were obtained by screening mouse and zebrafish cDNA libraries, respectively. A chromosome 22 genomic sequence (GenBank accession no. AC003071) defines part of the structure for hMTMR3. There is
Figure 1. Myotubularin exhibits a tyrosine phosphatase activity. Bars show the activity of either wild-type or mutated myotubularin in the presence or absence of several inhibitors. p-NPP was used as substrate. Values are percentages of the activity of wild-type myotubularin at 6 ng/µl and represent the means ± SD of two independent experiments (each performed in duplicate). CON, control, i.e. wild-type baculovirus; WT, wild-type myotubularin; CS, mutated myotubularin; 1, 0.1 mM sodium orthovanadate; 2, 1 mM sodium orthovanadate; 3, 10 mM sodium orthovanadate; 4, 5 µM okadaic acid; 5, 10 mM sodium phosphate.
no conservation of the genomic structure between this gene and hMTM1 in the compared region.

The C.elegans cDNA clone yk21d9 was sequenced entirely and the corresponding protein sequence (ceMTMH1) differs at its C-terminus from that predicted from genomic sequencing (see Materials and Methods). A second gene (ceMTMH2) is predicted from available genomic sequences of C.elegans and matches four ESTs. The ceMTMH1 gene has 10 exons for an open reading frame of 931 amino acids. As for the human MTM1 gene, the first exon is non-coding and there are two functional polyadenylation sites, since two different kinds of cDNA clones are present in the EST database which differ only in the 3′-untranslated region. However, the position of exons is not conserved between this C.elegans gene and human MTM1 (18). RT–PCR experiments allowed us to identify an intron and correct a frameshift error in a genomic sequence from Drosophila, defining the dMTMH1 gene. Sequences recently deposited in the databases indicate the existence of at least one additional MTM homologue gene in Drosophila (see Materials and Methods).

As a result, sequence information was obtained on a total of 20 genes belonging to the MTM family, with the entire coding sequence being known for eight of them (see Fig. 2 for the length of sequenced coding regions). A total of eight different human genes and six mouse genes were identified. The available mMTMH2 sequence does not overlap with the mMTMH1 and mMTMH3 sequences and RT–PCR experiments on myoblast mRNA suggest that these sequences correspond to three separate genes (data not shown). Three of these mammalian genes (including hMTMR5/Sbf1) contain a variant, non-functional PTP/DSP signature (Fig. 2; 10) and a recent human genomic sequence (accession no. AL021155) indicates the presence of another such gene or pseudogene. Human genes homologous to MTM1 were named hMTMR1–hMTMR7 (R for related). We used
Figure 2. Representation of the partial or complete coding sequences of members of the MTM family. The horizontal lines represent the span of the sequenced coding region of each gene. We have designated the human genes MTMR1–MTMR7 (R for related) and used the same R designation for the corresponding mouse and zebrafish orthologues. When the correspondence with a human gene cannot be unambiguously determined, we have designated the genes MTMH (H for homologue). M is the methionine start codon, and stars represent stop codons. The first methionine (M) is not conserved between hMTMR1 and mMTMR1. The hMTMR1 cDNA clone appears chimeric, as several cDNA clones cover the N-terminal part of mMTMR1, and this is confirmed by 209 kb of genomic sequence (AA002223) available for hMTMR1. A black filled square denotes a PTP/DSP signature, and an empty square stands for a variant signature. The comparison area used in Figure 3 is shown by a double arrowed line. The GenBank accession nos related to the different genes are as follows: hMTM1, U46024; hMTMR1, U58032; hMTMR2, U58033; hMTMR3, U58034; hMTMR4, EST AA317931; hMTMR5, AF072929; hMTMR6, AF072928; hMTMR7, AF073482; mMTM1, AF073996; mMTMR1, AF073997; mMTMR7, AF073882; mMTMH1, AF073879; mMTMH2, AF073880; mMTMH3, AF073881; zMTMR2, AF073883; dMTMH1, AF076432, and AC002594 for cosmid DS05973; ceMTMH1, AF031519, in cosmid T24A11; ceMTMH2, Z81546 and ORF F53A2.8, in cosmid F53A2; scMTMH, Z49610; spMTMH, Z98974. One additional gene is present in Drosophila, defined by recently deposited sequences (see Materials and Methods). EST contigs found in the TIGR database (http://www.tigr.org): hMTM1, THC2108331, HT48648; hMTMR1, THC211120, HT48723; hMTMR2, THC211460, HT48724; hMTMR3, THC214662, HT48725; hMTMR5, THC207058.
the same nomenclature for mouse (m) and zebrafish (z) genes that could be identified as clear orthologues of the human genes. The other genes were named MTMH (H for homologue).

Sequence diversity and phylogenetic analysis

In order to analyse the extent of evolutionary relatedness between the different homologues and to emphasize conserved residues which may be essential for function, the predicted protein sequences were aligned and a phylogenetic analysis was performed using CLUSTALW (19). hMTMR4, mMTMH2 and mMTMH3 were not used in the comparison as they do not encompass the whole 202 amino acid region of comparison.
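Pairwise comparisons of aligned homologues are commonly summarized as a percentage of identical residues over the compared region. The following minimal Python sketch shows one way such a value could be computed from two rows of an alignment. It is an illustration only: the function name and the toy aligned sequences are hypothetical, and ignoring columns that contain a gap is an assumed convention, since the text does not state how identities were calculated.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues between two aligned sequences of equal length.

    Columns in which either sequence has a gap ("-") are skipped
    (an assumed convention, not taken from the article).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # gap column: not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment rows for illustration only.
row_a = "VHCSAG-DRT"
row_b = "VHCTAGWDKT"
print(round(percent_identity(row_a, row_b), 1))
```

With this convention, sequences that cover only part of the comparison region would be scored over fewer columns, which is one reason partial sequences are usually excluded from a comparison over a fixed region.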
hMTMR4 (defined only by the dbEST sequence information) seems to be closer to hMTMR3, with 76% identity over 122 amino acids and