Oncogene (1997) 15, 1583-1586 © 1997 Stockton Press All rights reserved 0950-9232/97 $12.00
Gene structure of the human MET proto-oncogene

Fuh-Mei Duh1, Stephen W Scherer3, Lap-Chee Tsui3, Michael I Lerman2, Berton Zbar2 and Laura Schmidt1

1Intramural Research Support Program, SAIC Frederick, and 2Laboratory of Immunobiology, National Cancer Institute-Frederick Cancer Research and Development Center, Frederick, Maryland 21702, USA; 3Department of Genetics, the Hospital for Sick Children, Toronto, Ontario, M5G 1X8, Canada
By direct sequencing of cosmids using primers designed from the known cDNA sequence, we identified 19 exons in the human MET proto-oncogene, and sequenced the corresponding 5' and 3' exon-intron junctions. By homology search in the database of the Washington University Genome Sequence Center (GSC), we identified one additional exon. These 20 exons, together with a previously reported exon, bring the total exon number of MET to 21. Oligonucleotide primers were designed to amplify each exon and adjacent intronic sequences to permit examination of each exon for mutations. By restriction mapping, we assembled a 110 kb genomic contig that covered almost the entire MET proto-oncogene. This information is relevant for the screening of recently reported mutations of the MET gene which cause hereditary papillary renal carcinomas and for the search for additional mutations of the same gene which may play a role in the pathogenesis of common human carcinomas including carcinomas of the breast, ovary and pancreas.

Keywords: MET proto-oncogene; HGF/SF receptor; renal carcinoma
Introduction

The MET receptor tyrosine kinase transduces motility, proliferation and morphogenic signals of the ligand hepatocyte growth factor/scatter factor (HGF/SF) in epithelial cells (Weidner et al., 1993). During embryogenesis, the MET receptor-HGF/SF pathway is required for normal muscle and liver development (Bladt et al., 1995). The MET proto-oncogene has been implicated in human neoplasia. Increased MET protein expression has been found in a number of common human carcinomas (Di Renzo et al., 1992, 1994). The human MET cDNA encodes a protein of 1408 amino acids (aa) consisting of a large extracellular domain containing 950 residues, a transmembrane domain (aa 951-973), and an intracellular tyrosine kinase domain (aa 1102-1351) (Cooper et al., 1984; Park et al., 1987). Alternative splicing at the 5' end of the gene has been reported; this leads to a protein of 1390 aa (Ponzetto et al., 1991). The processed form of the MET protein consists of an alpha and a beta chain; the alpha chain consists of aa 1-307; the beta chain consists of aa 308-1408, or aa 308-1390 in the alternatively spliced form.

Correspondence: L Schmidt
Received 10 May 1997; revised 11 June 1997; accepted 11 June 1997
Hereditary papillary renal carcinoma (HPRC) is a recently described, inherited form of human cancer characterized by a predisposition to develop bilateral papillary renal tumors (Zbar et al., 1994, 1995). Recent studies demonstrated missense mutations in the tyrosine kinase domain of MET in patients with HPRC (Schmidt et al., 1997). The cosegregation of these mutations with the disease clearly indicates that they predispose to the development of papillary renal carcinomas. Of particular interest, three of the mutations were located in codons that are homologous to codons in c-KIT and the RET proto-oncogene that are targets of naturally occurring mutations (Hofstra et al., 1994; Nagata et al., 1995).

The availability of a full-length cDNA sequence of the MET gene, as well as genomic cosmids of the MET gene, made it possible for us to use a direct sequencing approach for the identification of exon-intron boundaries. Primers were designed every 200-300 base pairs based on the known MET cDNA sequence. These primers were used to sequence genomic cosmids that contained portions of the MET gene. Discrepancies between the genomic and cDNA sequences pointed to exon-intron boundaries. The strategies used in the present work led to the definition of 20 exon-intron boundaries defining 20 coding exons. These exons are in addition to a 5' noncoding exon identified by Gambarotta et al. (1994). In addition, we report 5' and 3' flanking intronic sequences at each of the 20 newly identified boundaries, and a panel of primer pairs suitable for amplifying and scanning the exons for mutations.

Results and discussion

We defined the exon-intron boundaries and flanking intronic sequences of the MET proto-oncogene by directly sequencing three overlapping cosmids that cover a 70 kb genomic region. We identified 19 exons in the genomic region that corresponds to the MET cDNA sequence from nucleotide 1395 to nucleotide 4626.
We were unable to obtain any cosmids that covered the region that corresponded to MET cDNA nucleotides 181-1395. However, by performing homology searches, we found the genomic sequence that corresponded to MET cDNA sequence 8-2458 in the GSC database (Genome Sequencing Center, personal communication). We identified an additional exon at position 181-1394 and confirmed the published exon at position 8-180 (Gambarotta et al., 1994). The 3' and 5' intronic sequences adjacent to each exon were defined (Table 1). Presented in Table 1 are the 20 nucleotides adjacent to each of the exon-intron
Exon structure of the human MET proto-oncogene F-M Duh et al
boundaries. Exon 1 and the beginning of exon 2 contained the 5' UTR sequence; exons 2-21 contained the coding sequence; exon 21 contained the 3' UTR. The coding exons were small, ranging from 81 to 231 bp, with the exception of exon 2, which was 1214 bp. Other receptor tyrosine kinase genes such as c-KIT and the RET proto-oncogene also consist of small exons (Kwok et al., 1993; Andre et al., 1992). The extracellular domain of MET was located in exons 2-13. The transmembrane domain was located in exon 13 together with the end of the extracellular domain and the beginning of the intracellular domain. The tyrosine kinase domain was located in exons 15-21 (Figure 1).

We determined the size of the six smaller introns; introns 4, 7, 8, 9, 13 and 20 were 769, 98, 680, 716, 194 and 95 bp, respectively. By analysing the available sequence in the GSC database, we also determined the sizes of introns 1, 2, 3, 5 and 6; they were 26 490, 31 384, 8091, 14 329 and 1920 bp, respectively. Our results indicate that the minimum size of the MET proto-oncogene is 110 kb.

We constructed a genomic contig of 110 kb with three overlapping cosmids, Y63e3, Y169h6 and Y182b3, and a GSC sequence contig, H_RG253B13.SEQ. The contig, which is shown in Figure 2b, contains almost the entire MET gene. The SacI restriction map of the MET genomic region (Figure 2a) showed it contained various sizes of

Table 1 Exons and the intron-exon boundary sequences of the MET proto-oncogene
Columns: No.; Exon 5' position*; Size (bp); 3' of the previous intron; Intron-exon boundary (5' exon ... 3' exon); 5' of the next intron
[Table body not recovered]
*Position defined according to GenBank # J02958. †ND=Not determined
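The intron sizes reported in the text can be tallied as a quick consistency check. The sketch below is illustrative only: the dictionary simply transcribes the sizes given above, and the exonic span (cDNA nucleotides 8 to 4626) is an assumption read off the cDNA coordinates quoted in the text, not a figure stated by the authors. The resulting lower bound falls short of the 110 kb contig estimate, as expected while nine intron sizes remain undetermined.

```python
# Intron sizes (bp) transcribed from the text; introns absent here were not determined.
INTRON_SIZES_BP = {
    1: 26490, 2: 31384, 3: 8091, 4: 769, 5: 14329, 6: 1920,
    7: 98, 8: 680, 9: 716, 13: 194, 20: 95,
}

# Assumption: exons 1-21 span cDNA nucleotides 8..4626 (GenBank J02958 numbering).
CDNA_FIRST_NT = 8
CDNA_LAST_NT = 4626


def known_intron_total(sizes):
    """Sum of all intron sizes that were actually measured."""
    return sum(sizes.values())


def undetermined_introns(sizes, n_exons=21):
    """Introns (numbered 1..n_exons-1) for which no size is available."""
    return [i for i in range(1, n_exons) if i not in sizes]


def lower_bound_bp(sizes):
    """Measured introns plus the exonic span; a floor, not an estimate of gene size."""
    exonic = CDNA_LAST_NT - CDNA_FIRST_NT + 1
    return known_intron_total(sizes) + exonic


if __name__ == "__main__":
    print("measured introns (bp):", known_intron_total(INTRON_SIZES_BP))
    print("undetermined introns:", undetermined_introns(INTRON_SIZES_BP))
    print("lower bound (bp):", lower_bound_bp(INTRON_SIZES_BP))
```

The list of undetermined introns returned here (10-12 and 14-19) matches the statement that, apart from intron 13, the sizes of introns 10-19 were not determined.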
Figure 1 Exon/intron boundaries and domains of the MET proto-oncogene. The position of the cDNA sequence is defined according to GenBank # J02958. The extracellular domain of MET is located in nucleotide numbers 181-3044, the transmembrane domain is located in nucleotide numbers 3045-3113, and the tyrosine kinase domain is located in nucleotide numbers 3498-4247. The four-digit numbers indicate the first nucleotide of the exon. For example, '1395' indicates that nucleotide 1395 is the first nucleotide within exon 3. The positions of the ATP binding site, autophosphorylation site and SH2 docking sites are indicated by arrows
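The nucleotide ranges in the Figure 1 caption can be related to the amino acid ranges given in the Introduction with simple codon arithmetic. The sketch below assumes that codon 1 begins at cDNA nucleotide 195; this value is not stated in the paper and is inferred here only because it reproduces the caption's transmembrane and tyrosine kinase ranges. Under that assumption the first 14 nucleotides of exon 2 (181-194) would be 5' UTR, in line with the statement that the beginning of exon 2 is untranslated.

```python
# Assumption (inferred, not stated in the paper): codon 1 starts at cDNA nt 195
# in GenBank J02958 numbering.
CDS_START_NT = 195


def codon_start_nt(aa_position):
    """First cDNA nucleotide of the codon for a 1-based amino acid position."""
    return CDS_START_NT + 3 * (aa_position - 1)


def aa_range_to_nt(first_aa, last_aa):
    """Inclusive cDNA nucleotide range encoding amino acids first_aa..last_aa."""
    return codon_start_nt(first_aa), codon_start_nt(last_aa) + 2


if __name__ == "__main__":
    # Amino acid ranges are those quoted in the Introduction.
    for name, first, last in [
        ("extracellular", 1, 950),
        ("transmembrane", 951, 973),
        ("tyrosine kinase", 1102, 1351),
    ]:
        print(name, aa_range_to_nt(first, last))
```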
Figure 2 The MET proto-oncogene genomic maps. (a) The genomic restriction map. The SacI (S) restriction sites were determined in the three cosmids by restriction enzyme digestion with SacI and in the GSC sequence contig H_RG253B13.SEQ by sequence analysis with the GCG Findpatterns program (Devereux et al., 1984). (b) The genomic contig was assembled with cosmids Y63e3, Y169h6 and Y182b3 and the sequence contig H_RG253B13.SEQ by restriction mapping. (c) The arrows point to the beginning of MET exons on the genomic contig. Exons 11-19 are not shown because, except for intron 13, the sizes of introns 10-19 were not determined
restriction fragments, ranging from 0.4 to 20 kb. We also positioned the beginning of exons 1 to 10, 20 and 21 on the genomic contig (Figure 2c). We were unable to locate exons 11-19 because, except for intron 13, the sizes of introns 10-19 were not determined.

We sequenced the corresponding 3' UTR region in cosmid Y182b3 and extended the sequence 745 bp beyond the published sequence. No polyadenylation signal was found in the sequenced 3' UTR. However, we found four putative polyadenylation sites in the GSC database (Genome Sequencing Center, personal communication). Polyadenylation signals ATTAAA and AGTAAA were found about 2 kb downstream of the published sequence. Polyadenylation signals AATAAA and ATTAAA were also found about 3.5 and 3.6 kb downstream of the published sequence. The multiple polyadenylation sites predict MET proto-oncogene messages of 6.6, 8.1 and 8.2 kb.

We designed oligonucleotide primer pairs to examine exons for disease-producing mutations (Table 2). Primers were designed in the introns to amplify each exon and adjacent splice junctions. These primers have been tested and used for SSCP and for sequencing each exon. The utility of this information for diagnostic purposes is exemplified in our recent work describing the identification of nine distinct mutations in the tyrosine kinase domain of the MET proto-oncogene and four polymorphisms (Schmidt et al., 1997).
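The polyadenylation analysis above amounts to a motif scan followed by simple arithmetic, which can be sketched as follows. The hexamers are the ones named in the text. The test sequence is synthetic and purely illustrative, and the published cDNA length of 4626 nt is an assumption taken from the cDNA coordinates quoted earlier. Poly(A) tail length is ignored.

```python
# Polyadenylation signal hexamers named in the text.
POLYA_SIGNALS = ("AATAAA", "ATTAAA", "AGTAAA")

# Assumption: the published cDNA ends at nt 4626 (from the coordinates quoted in the text).
PUBLISHED_CDNA_LENGTH_NT = 4626


def find_signals(sequence, signals=POLYA_SIGNALS):
    """Return sorted (0-based position, hexamer) pairs, including overlapping hits."""
    seq = sequence.upper()
    hits = []
    for signal in signals:
        start = seq.find(signal)
        while start != -1:
            hits.append((start, signal))
            start = seq.find(signal, start + 1)
    return sorted(hits)


def predicted_message_kb(downstream_offset_nt, cdna_length_nt=PUBLISHED_CDNA_LENGTH_NT):
    """Predicted message size (kb) for a signal lying downstream of the published cDNA end."""
    return round((cdna_length_nt + downstream_offset_nt) / 1000, 1)


if __name__ == "__main__":
    toy = "GGATTAAACCAGTAAATTTAATAAAGG"  # synthetic sequence, not MET
    print(find_signals(toy))
    # Approximate offsets from the text: about 2, 3.5 and 3.6 kb downstream.
    print([predicted_message_kb(offset) for offset in (2000, 3500, 3600)])
```

With the approximate offsets given in the text, this arithmetic reproduces the 6.6, 8.1 and 8.2 kb message sizes.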
These results will facilitate the search for mutations in the MET proto-oncogene in patients with hereditary and sporadic papillary renal carcinoma, and common human carcinomas. They will be invaluable in defining the disease spectrum produced by germline and somatic mutations of the MET proto-oncogene.

Materials and methods

The MET cDNA pMOG was obtained from Dr George Vande Woude. The position of cDNA sequences was defined according to GenBank # J02958. Cosmids Y182b3, Y169h6 and Y63e3 were isolated from the Lawrence Livermore National Laboratory chromosome 7 cosmid library LL07NC01 'Y'. Direct sequencing by primer walking was the strategy used to determine the intron-exon boundaries of the MET proto-oncogene. Primers designed from the cDNA sequence were used to sequence the cDNA and cosmid DNAs. The cDNA plasmid was sequenced to verify the validity of the primers and to confirm the published sequence. The positions of the intron-exon boundaries and the flanking intron sequences were deduced by comparing the genomic sequences with the cDNA sequences (Altschul et al., 1990). Sequencing reactions of plasmid DNA and cosmid DNA were performed with the AmpliTaq DNA polymerase (FS) Dye Terminator Cycle Sequencing Kit (Perkin-Elmer, Foster City, CA) following the standard protocols. One to 2 µg of cosmid DNA was used in each reaction with the addition of 10% DMSO. All sequencing reactions were run on an ABI 373
Table 2 Primer pairs for amplification of the coding exons of the MET proto-oncogene
Columns: Exon; cDNA (nt); Exon size (bp); Forward primer (5'-3'); Reverse primer (5'-3'); Size of PCR product (bp)
[Table body not recovered]
ND=Not determined
Stretch Automated DNA Sequencer (Applied Biosystems, Foster City, CA). Database searching was performed with the BLAST servers at NCBI (http://www.ncbi.nlm.nih.gov/) and the GSC (http://www.genome.wustl.edu/gsc/gschmpg.html).
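The boundary-finding idea described in these methods, reading genomic sequence outward from a cDNA-derived primer and noting where it stops matching the cDNA, can be illustrated with a minimal sketch. This is not the authors' procedure, which relied on sequence comparison of real reads; the function names and sequences below are hypothetical, and the GT check reflects only the canonical splice donor dinucleotide. When the first bases of the intron happen to match the next exon, the divergence point lands a few bases past the true boundary, so a real analysis would need to resolve that ambiguity.

```python
def first_divergence(cdna, genomic):
    """Index of the first position at which the two sequences differ.

    Both sequences are assumed to start at the same base (for example, just
    downstream of a shared sequencing primer). Returns the shorter length if
    no difference is found.
    """
    limit = min(len(cdna), len(genomic))
    for i in range(limit):
        if cdna[i] != genomic[i]:
            return i
    return limit


def candidate_donor_boundary(cdna, genomic):
    """Return (exon_end_index, has_gt): the divergence point and whether the
    genomic sequence continues with the canonical GT donor dinucleotide."""
    i = first_divergence(cdna.upper(), genomic.upper())
    has_gt = genomic.upper()[i:i + 2] == "GT"
    return i, has_gt


if __name__ == "__main__":
    # Hypothetical sequences for illustration only (not MET).
    exon_a, exon_b = "ATGGCCAAG", "CTGACC"
    intron_start = "GTAAGTCCTT"
    cdna_read = exon_a + exon_b            # spliced message
    genomic_read = exon_a + intron_start   # cosmid read runs into the intron
    print(candidate_donor_boundary(cdna_read, genomic_read))
```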
Acknowledgements

The authors wish to thank the Genome Sequencing Centre, Washington University, St Louis for communication of DNA sequence data prior to publication. We acknowledge the National Cancer Institute for allocation of computing time and staff support at the Frederick Biomedical Supercomputer Center. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US government. The flanking intronic sequences of exons 3-21 were deposited in GenBank under accession numbers U96969-U96999.
References

Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ. (1990). J. Mol. Biol., 215, 403-410.
Andre K, Martin E, Cornu F, Hu W-X, Wang X-P and Galibert F. (1992). Oncogene, 7, 685-691.
Bladt F, Riethmacher D, Isenmann S, Aguzzi A and Birchmeier C. (1995). Nature, 376, 768-771.
Cooper CS, Park M, Blair DG, Tainsky MA, Huebner K, Croce CM and Vande Woude GF. (1984). Nature, 311, 29-33.
Devereux J, Haeberli P and Smithies O. (1984). Nucleic Acids Res., 12, 387-395.
Di Renzo MF, Olivero M, Prat M, Bongarzone I, Pilotti S, Belfiore A, Costantino A, Vigneri R, Pierotti MA and Comoglio PM. (1992). Oncogene, 7, 2549-2553.
Di Renzo MF, Olivero M, Katsaros D, Crepaldi T, Gaglia P, Zola P, Sismondi P and Comoglio PM. (1994). Int. J. Cancer, 58, 658-662.
Gambarotta G, Pistoi S, Giordano S, Comoglio PM and Santoro C. (1994). J. Biol. Chem., 269, 12852-12857.
Hofstra RM, Landsvater RM, Ceccherini I, Stulp RP, Stelwagen T, Luo T, Pasini B, Hoppener JWM, van Amstel HKP, Romeo G, Lips CJM and Buys CHCM. (1994). Nature, 367, 375-376.
Kwok JBJ, Gardner E, Warner JP, Ponder BAJ and Mulligan LM. (1993). Oncogene, 8, 2575-2582.
Nagata H, Worobec AS, Oh CK, Chowdhury BA, Tannenbaum S, Suzuki Y and Metcalfe DD. (1995). Proc. Natl. Acad. Sci. USA, 92, 10560-10564.
Park M, Dean M, Kaul K, Braun MJ, Gonda MA and Vande Woude GF. (1987). Proc. Natl. Acad. Sci. USA, 84, 6379-6383.
Ponzetto C, Giordano S, Peverali F, Della Valle G, Abate ML, Vaula G and Comoglio PM. (1991). Oncogene, 6, 553-559.
Schmidt L, Duh F-M, Chen F, Kishida T, Glenn G, Choyke P, Scherer SW, Zhuang Z, Lubensky I, Dean M, Allikmets R, Chidambaram A, Bergerheim UR, Feltis JT, Zamarron A, Richard S, Lips CJM, Walther MM, Tsui L-C, Geil L, Orcutt ML, Stackhouse T, Lipan J, Slife L, Brauch H, Decker J, Storkel S, Niehans G, Hughson MD, Moch H, Lerman MI, Linehan WM and Zbar B. (1997). Nature Genet., 16, 68-73.
Weidner KM, Sachs M and Birchmeier W. (1993). J. Cell Biol., 121, 145-154.
Zbar B, Glenn G, Lubensky I, Choyke P, Walther MM, Magnuson G, Bergerheim USR, Pettersson S, Amin M, Hurley K and Linehan WM. (1995). J. Urol., 153, 907-912.
Zbar B, Tory K, Merino M, Schmidt L, Glenn G, Choyke P, Walther MM, Lerman M and Linehan WM. (1994). J. Urol., 151, 561-566.