THE JOURNAL OF BIOLOGICAL CHEMISTRY
© 1985 by The American Society of Biological Chemists, Inc.

Vol. 260, No. 8, Issue of April 25, pp. 5026-5032, 1985
Printed in U.S.A.

Complete Nucleotide Sequence of a Methylcholanthrene-inducible Cytochrome P-450 (P-450d) Gene in the Rat*

(Received for publication, October 10, 1984)

Kazuhiro Sogawa, Osamu Gotoh‡, Kaname Kawajiri‡, Tadashi Harada, and Yoshiaki Fujii-Kuriyama

From the Department of Biochemistry, Cancer Institute, Japanese Foundation for Cancer Research, Toshima-ku 170, Japan and the ‡Department of Biochemistry, Saitama Cancer Center Research Institute, Ina-machi, Saitama 362, Japan

The rat cytochrome P-450d gene, which is inducibly expressed by the administration of 3-methylcholanthrene (MC), has been cloned and its complete nucleotide sequence determined. The gene is 6.9 kilobases long and is separated into 7 exons by 6 introns. The insertion sites of the introns in this gene are well conserved as compared with those of another MC-inducible cytochrome P-450c gene, but are completely different from those of a phenobarbital-inducible cytochrome P-450e gene. The overall homologies in the coding nucleotide and deduced amino acid sequences were 75% and 68%, respectively, between the two MC-inducible cytochrome P-450 genes. The similarity of the gene organization between cytochrome P-450d and P-450c, as well as their homology in the deduced amino acid and nucleotide sequences, suggests that these two genes of MC-inducible cytochromes P-450 constitute a different subfamily than those of the phenobarbital-inducible one in the cytochrome P-450 gene family. In contrast with the notable sequence homology in the coding region of the two MC-inducible cytochromes P-450, all the introns and the 5'- and 3'-flanking regions of the two genes showed virtually no sequence homology between them except for several short DNA segments that are located in the promoter region and the first intron. The nucleotide sequences and the locations of these conserved short DNA segments in the two genes suggest that they may affect the expression of the genes. Middle repetitive sequences reported as ID or identifier sequences were found in and in the vicinity of the cytochrome P-450d gene.

Administration of 3-methylcholanthrene into rats markedly induces the synthesis of two different forms of cytochrome P-450, P-450c and P-450d, in the liver microsomes (1). The accumulation of these hemoproteins, which act as the terminal oxidase in the NADPH-dependent electron pathway, enhances the activities of drug metabolism and biotransformation of chemical carcinogens into their ultimate forms (2, 3). It has been shown that the two cytochromes have some immunological and spectral properties in common, but they have clearly different substrate specificities; cytochrome P-450c preferably hydroxylates, for example, benzo[a]pyrene and 7-ethoxycoumarin, while cytochrome P-450d shows higher activity toward 3-amino-1-methyl-5H-pyrido[4,3-b]indole (Trp-P-2) and 2-amino-6-methyldipyrido[1,2-a:3',2'-d]imidazole (Glu-P-1) (4, 5). In addition to 3-methylcholanthrene, various compounds that are not apparently related in chemical structure, such as β-naphthoflavone, 2,3,7,8-tetrachlorodibenzo-p-dioxin, and polychlorinated biphenyls, are known to induce both forms of the cytochrome to a similar extent. On the other hand, isosafrole induces cytochrome P-450d rather more specifically than P-450c (1, 6).

Recently, the cDNAs of cytochrome P-450c (7) and P-450d (8) have been cloned and the respective primary structures were deduced from their nucleotide sequences. Moreover, the gene structure for cytochrome P-450c has been elucidated by sequence analysis of cloned genomic DNAs (9). From these studies, localization of similarity and diversity in the amino acid sequence is observed between cytochrome P-450c and P-450d. Furthermore, these two forms of the cytochrome show sequence homologies with phenobarbital-inducible cytochrome P-450b and P-450e, and also with other forms of cytochromes P-450 whose sequences have been determined to date, suggesting that all these forms of cytochrome P-450 constitute a unique family of hemoproteins, which is probably derived from a common ancestor (10).

In order to understand the molecular evolution as well as the mechanisms underlying the induction of these forms of cytochrome P-450 by drug administration, it will be of great significance to elucidate and compare the gene structures of various forms of cytochrome P-450. In this paper, we present the complete nucleotide sequence of the cytochrome P-450d gene as compared with those of cytochrome P-450 genes so far determined. The cytochrome P-450d gene was about 6.9 kilobases long and was separated into 7 exons by 6 introns. The intron-exon arrangement of the cytochrome P-450d gene was very homologous to that of the cytochrome P-450c gene, but it differed greatly from that of the cytochrome P-450e gene.

* This work was partly supported by grants-in-aid for Scientific Research from the Ministry of Education, Science and Culture of Japan, funds obtained under the Life Science Project from the Institute of Physical and Chemical Research, Japan, and funds from the Princess Takamatsu Cancer Research Fund. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

EXPERIMENTAL PROCEDURES¹

¹ Portions of this paper (including "Experimental Procedures," part of "Results," and Figs. 1 and 2) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are available from the Journal of Biological Chemistry, 9650 Rockville Pike, Bethesda, MD 20814. Request Document No. 84M-3140, cite the authors, and include a check or money order for $4.00 per set of photocopies. Full size photocopies are also included in the microfilm edition of the Journal that is available from Waverly Press.

² The abbreviations used are: MC, 3-methylcholanthrene; PB, phenobarbital; kb, kilobase(s); bp, base pair(s).

RESULTS

Nucleotide Sequence of Cytochrome P-450d Gene and the Deduced Amino Acid Sequence-As shown in Fig. 2, we determined the complete nucleotide sequence of the cytochrome P-450d gene. Coding nucleotide sequences were determined in reference to the cDNA sequence and the consensus sequence of the exon-intron boundary (18). The gene encodes 513 amino acids in 6 separate regions. By precise comparison of the exonic sequences with the cDNA sequence, five nucleotide substitutions were found (Ado 1523 in the genomic sequence for Guo in the cDNA sequence, Cyd 1704 for Thd, Cyd 1722 for Thd, Thd 4758 for Cyd, and Cyd 4804 for Thd), all of which were transitions. Of these point mutations, two caused amino acid changes: His (CAC in the genomic sequence) 137 to Arg (CGC in the cDNA sequence) and Arg (CGC) 403 to Cys (TGC).³

Comparison of the leader sequence of the cytochrome P-450d mRNA in the cDNA sequence with the corresponding part of the gene shows that the exon sequence was interrupted eight base pairs upstream from the initiation codon. On the 5'-end of this sequence, a typical acceptor sequence for splicing, CAG, was present. This result suggested that the sequence coding for the 5'-untranslated sequence was split by one or more intron(s). We characterized the leader sequence of cytochrome P-450d mRNA in detail by primer extension. Fig. 3 shows the length and nucleotide sequence of the leader sequence that was produced by the primer extension method. The sequence was 60 bases long. In order to locate this exon in the cloned DNA, a synthetic heptadecanucleotide, AGGGGCTGGAGACTGGC, which was complementary to the sequence determined by the primer extension, was used as the hybridization probe. The exon was found in a 2.7-kb BamHI/HindIII fragment of λP-450d-1 (data not shown). Sequence analysis of this fragment showed that the exon encoding all the nucleotide sequence determined by the primer extension was located about 280 base pairs upstream from the HindIII site.
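The substitution analysis above can be checked mechanically. The following is a minimal Python sketch and not part of the original analysis: the function names `is_transition` and `translate_codon` are hypothetical, and the codon table is the standard genetic code. It tests whether the five listed genomic/cDNA differences are all transitions and translates the two codon pairs given in the text.

```python
# Standard genetic code, laid out with each codon position in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """Return True for a purine<->purine or pyrimidine<->pyrimidine change."""
    if a == b:
        return False
    pair = {a, b}
    return pair <= PURINES or pair <= PYRIMIDINES


def translate_codon(codon):
    """Translate one DNA codon into its one-letter amino acid code."""
    return CODON_TABLE[codon.upper()]


# (genomic position, genomic base, cDNA base), as listed in the text.
SUBSTITUTIONS = [
    (1523, "A", "G"),
    (1704, "C", "T"),
    (1722, "C", "T"),
    (4758, "T", "C"),
    (4804, "C", "T"),
]
# (residue number, genomic codon, cDNA codon), as listed in the text.
CODON_CHANGES = [(137, "CAC", "CGC"), (403, "CGC", "TGC")]

all_transitions = all(is_transition(g, c) for _, g, c in SUBSTITUTIONS)
amino_acid_changes = [
    (residue, translate_codon(g), translate_codon(c))
    for residue, g, c in CODON_CHANGES
]
print(all_transitions)
print(amino_acid_changes)
```

The same two helpers can be reused for any other genomic/cDNA discrepancy, since only the input lists are specific to this gene.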
Taken together with the fact that the transcription of most eukaryotic genes begins with Ado, these results indicate that the transcription-initiation site is most probably the Ado marked by a vertical arrow at nucleotide number 1 (Fig. 2). A putative TATA box was present 28 nucleotides upstream from the transcription-initiation site. From the comparison of the cytochrome P-450d gene sequence with the cytochrome P-450d cDNA sequence (Fig. 4), the poly(A) attachment site of the gene was located about 250 base pairs downstream from the termination codon, TGA. A possible and atypical poly(A) addition signal, AATAGA, was recognized 26 base pairs upstream from the poly(A) addition site. This polyadenylation signal was also reported by Affolter and Anderson (22) from the sequence analysis of an independently cloned cytochrome P-450d cDNA. There exist three copies of a middle repetitive sequence which has been reported as the ID or identifier sequence by Milner et al. (23). One copy was located in the 5'-flanking region of the gene and the other two copies were tandemly arranged in an inverse orientation in the third intron (Fig. 5). Other sequences which are directly or invertedly repeated are also found in the 5'-flanking region and introns. They are marked by horizontal arrows which are numbered in parentheses (Fig. 2). Many simple repeated sequences also exist near or within the gene. These are (AAC)9 starting at nucleotide -727, (TTCC)11 at -514, (GT)27 at 159 (first intron), and (AG)2s at 3108.

³ Cyd 802 in the cDNA sequence is substituted for Thd 1898 in the genomic sequence. Accordingly, Ser (TCT) 262 was changed to Phe (TTT). This change was due to a sequencing error in the cDNA sequence.

FIG. 3. Estimation of the length of the cytochrome P-450d mRNA leader sequence (A) and its sequence analysis (B). A, the terminally labeled anticoding strand was prepared from the PstI/AvaII fragment (100 bp) by strand separation and used as the primer (Ref. 8; Fig. 1). Extension of the primer was carried out using 10 µg of poly(A)+ RNA as the template, and the products were analyzed as described under "Experimental Procedures." The size was estimated using appropriate sequencing ladders as size markers. The arrow indicates the position of the extended DNA fragment. B, the extended fragment shown in A was eluted from the gel and used for sequencing. Lanes 1, 2, 3, and 4 are G, G+A, T+C, and C degradation products, respectively. The arrowhead shows the insertion site of the first intron.

Structural Comparison of Cytochrome P-450d Gene with Other Cytochrome P-450 Genes-As shown in Table I, sequence homologies in amino acid sequence of cytochrome P-450d with those of other cytochromes were calculated. From this calculation, cytochrome P-450d shows statistically significant relatedness, with S values above 3, to any of the cytochromes, including bacterial cytochrome P-450cam and mitochondrial cytochrome P-450(scc). Among the cytochromes P-450 listed in Table I, cytochrome P-450d is much more homologous to cytochrome P-450c and to their counterparts in mouse (P1-450 and P3-450) than to cytochrome P-450b and other forms of the cytochrome. The overall homologies to cytochrome P-450c in the amino acid and the coding nucleotide sequences are 68% and 75%, respectively. The gene organization of cytochrome P-450d is also very similar to that of cytochrome P-450c (Fig. 1 and Fig. 2 in Ref. 9). The insertion sites of all the introns in relation to the coding nucleotide sequence or the amino acid sequence are well conserved between the two genes and all the splicing junctions satisfy the

FIG. 4. Comparison of the nucleotide sequences around the polyadenylation signal of various cytochrome P-450 genes and cDNAs. Putative polyadenylation signals are underlined. 1, cytochrome P-450d gene; 2, cytochrome P-450d cDNA (this sequence was derived from a cDNA clone, D-4-5 (19)); 3, murine P3-450 cDNA (data from Ref. 20); 4, cytochrome P-450c cDNA (data from Ref. 7); 5, cytochrome P-450e cDNA (data from Ref. 21).

1. GCTGTGCCACGTGCTAATCTAGTTTTTGACTCAATAGATTCCAGTGAGTTATGGT
2. GCTGTGCCACGTGCTAATCTAGTTTTTGACTC~TTTGCCAACTCTGGCTGTTTCATAT(A)n
3. GCCGCACCTCATGCTAATCTAGTTTTTGACTCAATAGATTTGCCTACTCTGGCTGTCTCA~TCGAATGAATTATG(A)n
4. TAATAGAGAARAATCTAACTCAAGTATCCAGAAATATATAGG~CGTACCTGAGCT~~-TATTACCTGG(A)n
5. ATGGACTCTGTATATGGTCTCAGTGCTATGTCTACAGACTTACATAGTATGTATGGTTCA~CAG~TCACAGAGTGTGTG(A)n

FIG. 5. Nucleotide sequence of three copies of a middle repetitive sequence in the cytochrome P-450d gene and the identifier sequence. Nucleotides were numbered from Ado of the cap site of the P-450d gene. The upper sequence (1) was found in the 5'-flanking sequence and the other two sequences (2 and 3) were found in the third intron. The direction of two of the repetitive sequences (1 and 3) was reversed relative to that of the cytochrome P-450d gene and the other one (2) had the same orientation as the gene. The sequencing of the upper sequence (1) was not completed, and the first 51 bp are represented here. Two sets of ID sequence (p1B308 and p1B337, see Ref. 23) are also represented (4 and 5). The nucleotides in the repetitive sequences of the cytochrome P-450d gene that differ from those in p1B308 or p1B337 are marked by asterisks.

1. (-1129 to -1179) GGGGTTGGGGAT~TAGCTCAGTGGTAGAGCGCTTGCCTAGCAAGTGCAAGG
2. (3198 to 3297) ~GGGTTGGGGATTTAGCTCAGTGGT~GCG~TGCCTAGG~GCGCAAGGCCCTGGGTTC~GTCCCCAGCTCC~
3. (3478 to 3379) ~GGGTTGGGGATTTAG~CAGTGGT~GCGCTTGCCTAGGAAGCGCAAGGCCCTGGGTTCGGTCCCCAGCTCCAA~~GG~
4. GGGGCTGGGGATTTAGCTCAGTGGTAGAGCGCTTGCCTAGGAAGCGCAAGGCCCTGGGTTCGGTCCCCAGCTCCCCCCAAAAAAAAAAAA
5. GGGGTTGGGGATTTCGCTCAGTGGTAGAGCGCTTGCCTAGGAAGCACAAGGCACAAGGCCCTGGGTTCGGTCCCCAGTTCCAAAAAAAAAAAAAAAA
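Comparisons of the kind marked by asterisks in Fig. 5 amount to listing the positions at which two aligned copies differ. The sketch below is illustrative only and assumes two ungapped sequences of equal length; the helper name `mismatch_positions` is hypothetical. The two inputs are the first 37 nucleotides of rows 4 and 5 as they appear in this transcription of the figure, so they may carry transcription errors and should not be treated as reference sequence data.

```python
def mismatch_positions(seq_a, seq_b):
    """Return 1-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        index
        for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1)
        if a != b
    ]


# First 37 nucleotides of rows 4 and 5 of Fig. 5, as transcribed here
# (assumption: no alignment gaps fall within this window).
ROW_4_PREFIX = "GGGGCTGGGGATTTAGCTCAGTGGTAGAGCGCTTGCC"
ROW_5_PREFIX = "GGGGTTGGGGATTTCGCTCAGTGGTAGAGCGCTTGCC"

differences = mismatch_positions(ROW_4_PREFIX, ROW_5_PREFIX)
identity_percent = 100.0 * (len(ROW_4_PREFIX) - len(differences)) / len(ROW_4_PREFIX)
print(differences)
print(round(identity_percent, 1))
```

For copies that contain insertions or deletions, a gapped alignment would be needed before positions can be compared in this way.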

TABLE I
Homology in amino acid sequence between cytochrome P-450d and various cytochromes P-450

Alignment scores (S) are calculated as described previously (10). Percentage of matched residues (%) is defined as 100 × (the number of matched sites)/(the sum of the numbers of matched, replaced, and unpaired sites). Sequence data of cytochrome P-450c, P-450b,e, P-450LM2, P-450cam, P-450(scc), P1-450, and P3-450 were obtained from Ref. 9, 21, 24-27, and 20, respectively.

Cytochrome       S        %
P-450c          64.5     67.9
P-450b          28.7     24.7
P-450e          28.5     24.6
P-450LM2        27.8     24.0
P-450cam         3.8     13.8
P-450(scc)       6.9     18.0
P1-450          66.1     68.8
P3-450         100.1     93.2
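The percentage column of Table I follows directly from the definition in the table legend. As a minimal sketch under stated assumptions, the Python below counts matched, replaced, and unpaired sites in a gapped pairwise alignment and applies that formula. The function names and the toy alignment are hypothetical, and the alignment scores (S) of Ref. 10 are not reproduced here.

```python
def alignment_counts(aligned_a, aligned_b, gap="-"):
    """Count (matched, replaced, unpaired) columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matched = replaced = unpaired = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column with gaps in both rows carries no information
        if a == gap or b == gap:
            unpaired += 1
        elif a == b:
            matched += 1
        else:
            replaced += 1
    return matched, replaced, unpaired


def percent_matched(matched, replaced, unpaired):
    """100 x matched / (matched + replaced + unpaired), as in the Table I legend."""
    total = matched + replaced + unpaired
    if total <= 0:
        raise ValueError("alignment has no informative sites")
    return 100.0 * matched / total


# Toy alignment for illustration only; these are not P-450 sequences.
counts = alignment_counts("MAFSQ-YIS", "MALSQSYI-")
percentage = percent_matched(*counts)
print(counts, round(percentage, 1))
```

Because unpaired sites are included in the denominator, insertions and deletions lower the percentage just as replacements do.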

canonical GT/AG rule (18). However, all the introns in the cytochrome P-450d gene are somewhat longer than those of the cytochrome P-450c gene except for the first intron; the first intron in the cytochrome P-450c gene is much longer than that of cytochrome P-450d. In contrast with the homology in the amino acid and the coding nucleotide sequences described above, no marked sequence homology between the two MC-inducible P-450 genes was observed in the sequences encoding the 5'- and 3'-untranslated regions of the mRNA or in the intron sequences. The sequence diversity appears to extend to the promoter regions of the cytochrome P-450d and P-450c genes, in spite of the fact that these two genes have many inducer agents in common. No noticeable long stretch of homologous DNA sequence was observed in the 5'-flanking region of the two genes. Closer examination of the sequences in the promoter region, however, showed that several short DNA segments were conserved between the two MC-inducible cytochrome P-450 genes (Fig. 6). These are 1) GTGGGAAAG for the cytochrome P-450d gene (nucleotide -340 from the cap site) and GTGGAAAG for the cytochrome P-450c gene (-380 from the cap site); 2) AGTGTATTCT for P-450d (-270) and AGTGGGATTCT for P-450c (-280); 3) CAGAGGAGGTGG for P-450d (-40) and CAGAGAAGGAGG for P-450c (-60); and 4) (GT)27 for P-450d (160) and (GT)n for P-450c (233).

DISCUSSION

Mapping and subsequent sequence analysis of the cloned DNA showed that this insert DNA represents a bona fide gene for cytochrome P-450d. Although five substitutions in approximately 1900 nucleotides are found between the gene and the cDNA sequences, these differences may originate from polymorphism in rat strains rather than from sequencing errors: these parts of the genomic and cDNA fragments were sequenced several times in both directions, and the same nucleotide differences were found each time.

When the two MC-inducible cytochrome P-450 genes were compared, strong similarities were observed in the gene organization as well as in the protein-coding sequences. On the other hand, this gene organization is totally different from the ones for PB-inducible P-450, P-450e (21, 28) and P-450b,⁴ as pointed out previously (9). From the rate of amino acid replacements, the divergence of the ancestor leading to PB-inducible and MC-inducible cytochrome P-450, and then to the two different forms of MC-inducible cytochrome P-450, P-450c and P-450d, are estimated to have occurred 400 and 120 million years ago (9), respectively. Accordingly, the gene organization of the two forms of MC-inducible cytochrome P-450 was preserved during the evolutionary period of 120 million years. If MC- and PB-inducible cytochrome P-450 genes indeed arose from a common ancestor as described previously (9), drastic reorganization of the gene structure must have occurred after duplication of the gene. At present, however, too little information concerning the origin of introns is available to clarify the evolutionary mechanism for the reorganization of the gene structure. Elucidation of the gene structures for other types of cytochrome P-450 may help us to understand the evolutionary process of the cytochrome P-450 gene family.

Taken together with the relatedness in optical and immunological properties and the presence of common inducers, the two MC-inducible cytochromes P-450, P-450d and P-450c, may well be classified into a subfamily of the P-450 family which is different from the one for PB-inducible cytochromes P-450. The number of bands detected by Southern blot analysis of total DNA is much smaller in MC-inducible cytochrome P-450 genes than in PB-inducible cytochrome P-450 genes (29). All the DNA fragments which hybridized to the cytochrome P-450d cDNA were found to be included in either of the two cloned genes of MC-inducible cytochrome P-450, P-450c and P-450d, suggesting that the subfamily of MC-inducible cytochrome P-450 consists of two members, although the possibility cannot be excluded that the number of constituents will increase if less stringent conditions of hybridization are used. In any event, it can be concluded that the number of constituents is much smaller in the MC-inducible cytochrome P-450 subfamily than in the PB-inducible one.

FIG. 6. Comparison of the nucleotide sequences around the transcription-initiation sites of the cytochrome P-450d and P-450c genes. The nucleotides are numbered from Ado of the transcription-initiation site. The TATA box is indicated by an enclosure. Core sequences of the enhancer element are underlined by a solid line. Short conserved sequences in the promoter region and the first intron are underlined with wavy lines. A palindromic structure is marked by horizontal arrows.

Cytochrome P-450d and P-450c genes are regulated for their expression in a similar fashion in the hepatocyte. They have many inducers in common except for isosafrole, which induces rather specifically the synthesis of cytochrome P-450d. Thus it is reasonable to expect that DNA sequences responsible for regulation of the genes may be conserved between the two genes. However, no long stretch of homologous DNA sequence was observed in the promoter regions of the cytochrome P-450d and P-450c genes. Instead, several short segments were found to be conserved at the corresponding positions between the two genes. TATA equivalent sequences are located about 30 bp upstream from the transcription-initiation site in both genes. A purine-rich sequence underlined by wavy lines in Fig. 6 is conserved at positions -43 and -61 of the cytochrome P-450d and P-450c genes, respectively. Interestingly, a sequence analogous to this conserved segment is also observed in the genes for PB-inducible cytochrome P-450, P-450b and P-450e.⁴ The third homologous elements were located at positions -272 and -278 in the upstream flanking region of the cytochrome P-450d and P-450c genes, respectively. They are AGTGTATTCT for the cytochrome P-450d gene and AGTGGGATTCT for the cytochrome P-450c gene. It is of interest to note that the sequence between nucleotides -255 and -283 of the cytochrome P-450d gene, which contains the homologous sequence AGTGTATTCT in the middle, has the potential to form a stem and loop structure. A core sequence of viral or cellular enhancer elements (30), GTGGAAAG, which has been found at position -379 of the cytochrome P-450c gene, is also recognized at position -339 of the cytochrome P-450d gene, although one residue of Guo was added in the element of the cytochrome P-450d gene. Besides these segments, a potential Z-DNA-forming simple sequence (31), (GT)n, was found to be conserved in the upstream region of the first intron of the two genes. It would be interesting to study how these short homologous DNA sequences are involved in the gene regulation of the two forms of MC-inducible cytochrome P-450, and they will provide us with the means of an experimental approach to the elucidation of the mechanisms of the gene regulation.

The sequence of the 3'-untranslated region is very variable between cytochrome P-450c and P-450d, but shows very close homology between cytochrome P-450d and P3-450, which is thought to be the equivalent molecule to cytochrome P-450d in mice. A modified poly(A) addition signal, AATAGA, which is present 25 bases upstream from the polyadenylation site, is presumably utilized as such in the cytochrome P-450d gene. On the other hand, in spite of the presence of the same sequence at the equivalent position in cytochrome P3-450 cDNA, another modified poly(A) addition signal, TATAAA, which is 22 bases downstream of the AATAGA sequence, may be used in the cytochrome P3-450 gene (20). It is not known whether the different usage of poly(A) addition signals is due to species specificity or to a small number of nucleotide substitutions occurring around the signal sequence. Interestingly, atypical poly(A) addition sequences are frequently utilized in the members of the cytochrome P-450 gene family whose sequences have so far been determined. These are summarized in Fig. 4.

Other interesting points concerning the cytochrome P-450d gene sequence are as follows. There are many repeated sequences in the introns, which are shown by numbered arrows (Fig. 2). Some are direct repeats in tandem arrangement and others are inverted repeats. Of these repeated sequences, the sequence known as the ID or identifier sequence is found in or in the vicinity of the cytochrome P-450d gene. It has been reported (23) that this sequence is present in the rat genome in 1 to 1.5 × 10^5 copies and expressed specifically in neural tissues. Biological roles of this sequence, such as a regulatory element of tissue-specific expression or some kind of mobile element, have been proposed, but only speculatively. The present results show clearly that this sequence is also expressed in rat liver as a part of an intervening sequence, and is not limited to the neural tissues. Its functional role is still to be seen.

⁴ Y. Suwa, Y. Mizukami, K. Sogawa, and Y. Fujii-Kuriyama, manuscript in preparation.

Acknowledgments-We thank Drs. L. L. Jagodzinsky and J. Bonner for a kind gift of a rat gene library. We are indebted to Dr. M. Nobuhara (Mochida Pharmaceutical Co. Ltd., Tokyo) for a kind supply of the synthetic oligonucleotide.

REFERENCES

1. Thomas, P. E., Reik, L. M., Ryan, D. E., and Levin, W. (1983) J. Biol. Chem. 258, 4590-4598
2. Sato, R., and Omura, T. (1978) Cytochrome P-450, Kodansha, Tokyo, and Academic Press, New York
3. Lu, A. Y. H., and West, S. B. (1980) Pharmacol. Rev. 31, 277-295
4. Ryan, D. E., Thomas, P. E., and Levin, W. (1980) J. Biol. Chem. 255, 7941-7955
5. Kamataki, T., Maeda, K., Yamazoe, Y., Matsuda, N., Ishii, K., and Kato, R. (1983) Mol. Pharmacol. 24, 146-155
6. Kawajiri, K., Gotoh, O., Tagashira, Y., Sogawa, K., and Fujii-Kuriyama, Y. (1984) J. Biol. Chem. 259, 10145-10149
7. Yabusaki, Y., Shimizu, M., Murakami, H., Nakamura, K., Oeda, K., and Ohkawa, H. (1984) Nucleic Acids Res. 12, 2929-2938
8. Kawajiri, K., Sogawa, K., Gotoh, O., Tagashira, Y., Muramatsu, M., and Fujii-Kuriyama, Y. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 1649-1653
9. Sogawa, K., Gotoh, O., Kawajiri, K., and Fujii-Kuriyama, Y. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 5066-5071
10. Gotoh, O., Tagashira, Y., Iizuka, T., and Fujii-Kuriyama, Y. (1983) J. Biochem. (Tokyo) 93, 807-817
11. Benton, W. D., and Davis, R. W. (1977) Science 196, 180-182
12. Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
13. Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K., and Efstratiadis, A. (1978) Cell 15, 687-701
14. Messing, J., Crea, R., and Seeburg, P. H. (1981) Nucleic Acids Res. 9, 309-321
15. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
16. Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
17. Nagata, S., Mantei, N., and Weissmann, C. (1980) Nature 287, 401-408
18. Sharp, P. A. (1981) Cell 23, 643-646
19. Kawajiri, K., Sogawa, K., Gotoh, O., Tagashira, Y., Muramatsu, M., and Fujii-Kuriyama, Y. (1983) J. Biochem. (Tokyo) 94, 1465-1473
20. Kimura, S., Gonzalez, F. J., and Nebert, D. W. (1984) Nucleic Acids Res. 12, 2917-2928
21. Mizukami, Y., Sogawa, K., Suwa, Y., Muramatsu, M., and Fujii-Kuriyama, Y. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 3958-3962
22. Affolter, M., and Anderson, A. (1984) Biochem. Biophys. Res. Commun. 118, 655-662
23. Milner, R. J., Bloom, F. E., Lai, C., Lerner, R. A., and Sutcliffe, J. G. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 713-717
24. Heinemann, F. S., and Ozols, J. (1983) J. Biol. Chem. 258, 4195-4201
25. Tarr, G. E., Black, S. D., Fujita, V. S., and Coon, M. J. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 6552-6556
26. Haniu, M., Armes, L. G., Tanaka, M., Yasunobu, K. T., Shastry, B. S., Wagner, G. C., and Gunsalus, I. C. (1982) Biochem. Biophys. Res. Commun. 105, 889-894
27. Kimura, S., Gonzalez, F. J., and Nebert, D. W. (1984) J. Biol. Chem. 259, 10705-1071
28. Atchison, M., and Adesnik, M. (1983) J. Biol. Chem. 258, 11285-11295
29. Mizukami, Y., Fujii-Kuriyama, Y., and Muramatsu, M. (1983) Biochemistry 22, 1223-1229
30. Gluzman, Y., and Shenk, T. (eds) (1983) Enhancers and Eukaryotic Gene Expression, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
31. Wang, A. H.-J., Quigley, G. J., Kolpak, F. J., Crawford, J. L., Van Boom, J. M., van der Marel, G., and Rich, A. (1979) Nature 282, 680-689


Experimental Procedures

[...] Shuzo Co. (Kyoto, Japan), New England Biolabs (Beverly, MA) and Bethesda Research Laboratories (Rockville, MD). Escherichia coli DNA polymerase I (large fragment) and T4 DNA ligase were obtained from Takara Shuzo Co. Reverse transcriptase was purchased from Life Sciences Inc. (St. Petersburg, FL). [α-35S]dATP (650 Ci/mmol), [α-32P]dCTP (5,000 Ci/mmol), and [γ-32P]ATP (3,000 Ci/mmol) were purchased from The Radiochemical Centre, Amersham, England.

Plaque Screening

A rat gene library of Charon 4A bacteriophage, which was made from partial HaeIII digests of embryonic DNA, was generously provided by Drs. L. L. Jagodzinsky and J. Bonner. Approximately 1 × 10^[exponent illegible] phage plaques were screened by the procedure of Benton and Davis (11) with the cytochrome P-450d cDNA insert from pcP-450mc-3 (see Ref. 8) as the hybridization probe. Filter hybridization was performed overnight at 65 °C in 50 mM Tris-HCl buffer, pH 7.5, containing 1 M NaCl, 10 mM EDTA, 0.1% sodium dodecyl sarcosinate, 0.2% polyvinylpyrrolidone, 0.2% Ficoll, and 0.2% bovine serum albumin. The filters were washed twice with 0.1× SSC (SSC: 0.15 M NaCl containing 0.015 M sodium citrate) containing 0.1% sodium dodecyl sulfate, at 65 °C for 30 min. In some cases where the genomic library appeared to lack special DNA regions for the cytochrome P-450d gene, total DNA of rat livers was digested either by EcoRI or by HindIII and then subjected to size fractionation by sucrose density gradient centrifugation (10-40% sucrose in 1 M NaCl, 10 mM EDTA, and 50 mM Tris-HCl, pH 7.5). Enriched fractions for appropriate DNA fragments, which had been determined by DNA blot analysis (12), were cloned into λgtWES for the EcoRI digestion or Charon 28 for the HindIII digestion, using E. coli ED8767 as host cells. Recombinant phages carrying the cytochrome P-450d gene were selected as described above. Phage DNAs were prepared essentially as described previously (13). Most of the genomic fragments were subcloned into plasmid pBR322 to facilitate sequence analysis.
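The stringency of the filter washes is set by the dilution of SSC. As a small worked example, the sketch below computes the salt concentrations of a k-fold SSC solution from the 1× definition given in the text; the function name `ssc_concentrations` is hypothetical, and the values are rounded to six decimal places only to suppress floating-point noise.

```python
# 1x SSC as defined in the text: 0.15 M NaCl containing 0.015 M sodium citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_concentrations(fold):
    """Return the molar concentration of each SSC component at `fold`x strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {name: round(molar * fold, 6) for name, molar in SSC_1X_MOLAR.items()}


# Wash used for library filters and cloned DNA blots (0.1x SSC).
stringent_wash = ssc_concentrations(0.1)
# Wash used for total genomic DNA blots (1x SSC).
relaxed_wash = ssc_concentrations(1.0)
print(stringent_wash)
print(relaxed_wash)
```

The tenfold lower salt of the 0.1× wash destabilizes imperfect hybrids more than the 1× wash does, which is consistent with the remark in the legend to Fig. 1 that one band was missed under more stringent washing.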

DNA Blot Hybridization

High molecular weight rat genomic DNA (10 µg) or cloned phage DNA (1 µg) was digested with restriction enzymes, and the products were then subjected to agarose gel electrophoresis for Southern blot analysis. The filters were hybridized to the nick-translated cDNA probe at 65 °C overnight and washed with 1× SSC (for total DNA) or with 0.1× SSC (for cloned DNA) as described in the preceding section.

Cloning of Rat Cytochrome P-450d Gene

From the rat gene library, a clone (λP-450d-1, Fig. 1) that hybridized with the cDNA probe was obtained. [...] DNA by using cytochrome P-450d cDNA as a probe (Fig. 1b, panel C, and as to the HindIII fragment, see Ref. 9). All these cloned genomic DNAs showed partially overlapping regions with each other and altogether they spanned approximately 21.8 kb in continuum. Since all the hybridization bands detected in the blot analysis were contained in these cloned DNAs, the stretch of DNA which was represented by them is expected to span the entire length of the cytochrome P-450d gene. Restriction maps of these overlapping clones are shown in Fig. 1a. A 0.8-kb EcoRI fragment seen in the total DNA blot (Fig. 1b, panel C) was present in the 7.0-kb HindIII fragment. (The 5.5-kb EcoRI fragment shown in Fig. 1b, panel C was a doublet. One of the doublet bands originated from the cytochrome P-450c gene as reported previously (9).) These results also indicate that the organization of the cloned cytochrome P-450d gene reflects its native structure in the chromosomal DNA.


---

DNA Sequence Analysis. Restriction fragments containing the cytochrome P-450d gene and the flanking regions were subcloned into M13 mp10 or mp11 (14) for sequencing by the chain-termination method (15) using M13 single-stranded templates and [α-35S]dATP. The chemical degradation method (16) was used as an auxiliary method for some fragments. All the fragments that contained exon sequences and most of the other fragments were sequenced in both directions.
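Sequencing a fragment in both directions allows the two reads to be cross-checked: the read from one strand should equal the reverse complement of the read from the other. The following is a minimal sketch of that consistency check; the sequences are hypothetical placeholders and the helper names are assumptions for illustration, not data or software from this work.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strands_agree(forward_read, reverse_read):
    """True if the reverse-strand read confirms the forward-strand read."""
    return forward_read.upper() == reverse_complement(reverse_read).upper()


# Hypothetical reads of the same fragment obtained from opposite strands.
forward = "ATGGCGTTCTCC"
reverse = reverse_complement(forward)
print(strands_agree(forward, reverse))
```

A mismatch between the two reads would flag a position that needs to be re-examined on the gel or resequenced.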

---

Primer Extension. The PstI/AluI fragment (100 bp) was kinated with polynucleotide kinase using [γ-32P]ATP after treatment with alkaline phosphatase. The terminally labeled anti-coding strand was prepared by strand separation on polyacrylamide gel electrophoresis, and was used as the primer. The extension of the primer was carried out using 10 µg of the poly(A)+ RNA according to the procedure of Nagata et al. (17). The extended DNA fragment was eluted from the gel and used for sequencing by the chemical degradation method.

---

Preparation of RNA. Total RNA was prepared from MC-treated rat livers, and poly(A)+ RNA was isolated therefrom by oligo(dT)-cellulose chromatography as described (6).

Fig. 1. Restriction maps of overlapping genomic clones and organization of the rat cytochrome P-450d gene (a) and blot-hybridization analysis of the cloned genomic DNA and total DNA (b). a. A clone (λP-450d-1) was screened from about 1x10 recombinant phages with the cDNA insert from plasmid pcP-450mc-3 (8) as a probe. The other three clones were obtained by cloning approximately 6 kb fragments fractionated from the completely digested EcoRI (λP-450d-2 and -4) and HindIII (λP-450d-3) fragments of total DNA, using λgtWES and Charon 28, respectively, as the cloning vector. E. coli ED8767 was used as host bacteria. 1, λP-450d-1; 2, λP-450d-2; 3, λP-450d-3; 4, λP-450d-4. The linkage map of the overlapping cloned genomic DNAs is represented by a bar below the cloned phage DNAs. The exons of the cytochrome P-450d gene are shown by closed boxes on the magnified linkage map. E, EcoRI; B, BamHI; and H, HindIII. b. Phage DNAs (1 µg) (A and B) were digested with EcoRI (λP-450d-1, -2, -4) or HindIII (λP-450d-3) for electrophoresis in agarose and then transferred to nitrocellulose filters. Total DNA (C) was digested with EcoRI for DNA blot hybridization. The filters were incubated with the 32P-labeled cytochrome P-450d cDNA probe as described under "Experimental Procedures". A, the ethidium bromide-stained agarose gel; B, an autoradiogram of the digested cloned DNAs; C, an autoradiogram of the digested total DNA. Arrowheads show the bands detected. The smallest fragment (800 bp), containing the 4th exon of the cytochrome P-450d gene, was not observed in the autoradiogram of the previous paper (Fig. 1 in Ref. 9), where more stringent washing conditions were used. The lengths (in kb) of size markers are indicated at the left of the panel. The 2.5 kb EcoRI fragment seen in panel A (lane 4) was probably cloned adventitiously due to artificial ligation.

Gene Structure for Cytochrome P-450d

Fig. 2. Complete nucleotide sequence of the rat cytochrome P-450d gene and its flanking sequence. The nucleotides are numbered from Ado of the cap site, which is indicated by a vertical arrow. The predicted amino acid sequence is shown below the nucleotide sequence. The nucleotide sequence which encodes the leader sequence of cytochrome P-450d mRNA is underlined by a solid line. The sequences TAT and AATAGA in the 5' and 3' noncoding regions, respectively, are indicated by enclosures. Various kinds of repeated sequences are indicated by horizontal arrows which are numbered in parentheses. The direction and the length of the arrows indicate the 5' to 3' orientation and the size of the sequence, respectively. Sequencing was carried out by the chain-termination method (15) using M13 single-stranded templates.