THE JOURNAL OF BIOLOGICAL CHEMISTRY

Vol. 261, No. 30, Issue of October 25, pp. 14112-14117, 1986 Printed in U.S.A.

Rat Liver UDP-glucuronosyltransferase

cDNA SEQUENCE AND EXPRESSION OF A FORM GLUCURONIDATING 3-HYDROXYANDROGENS*

(Received for publication, May 5, 1986)

Peter I. Mackenzie

From the Laboratory of Developmental Pharmacology, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892

A cDNA clone, pUDPGTr-4, encoding a form of rat UDP-glucuronosyltransferase has been isolated from an SV40 expression library. Sequence analysis revealed that the cDNA is 1970 base pairs in length and encodes a protein of 530 amino acids, which has amino- and carboxyl-terminal sequences characteristic of signal peptide and transmembrane segments, respectively. There is one potential asparagine-linked glycosylation site. Transfection of UDPGTr-4 cDNA into COS cells resulted in the glucuronidation of etiocholanolone, androsterone, and lithocholic acid in a transient expression assay. Several other common substrates of UDP-glucuronosyltransferase were not conjugated by the UDPGTr-4 enzyme. UDPGTr-4 cDNA is identical in sequence over a common 1.7-kilobase region of overlap to UDPGTr-1, a cDNA previously isolated in this laboratory (Mackenzie, P. I., Gonzalez, F. J., and Owens, I. S. (1984) J. Biol. Chem. 259, 12153-12160). UDPGTr-4 cDNA, however, contains a shorter 3'-untranslated region. Northern analysis showed that the poly(A) RNA counterparts of UDPGTr-4 and UDPGTr-1 cDNAs are approximately 2.3 and 3.0 kilobases in length, respectively. The steady-state level of UDPGTr-4 poly(A) RNA in the liver is 20-fold higher than that of UDPGTr-1 poly(A) RNA. These data indicate that the UDPGTr-4 enzyme is a 3-hydroxyandrogen UDP-glucuronosyltransferase which is encoded by two distinct species of mRNA transcribed from the same gene.

The androgenic steroids (testosterone, 5α-testosterone, and 5α-androstane-3β,17β-diol) regulate the development and metabolic processes of several tissues and organs, in addition to their critical role in the maintenance and function of the prostate and seminal vesicles (1). Their androgenic potency is greatly decreased by conversion to various 17-oxosteroids such as androsterone and etiocholanolone, which are conjugated and rapidly excreted in the bile and urine. Although lacking in androgenic activity, etiocholanolone is a potent pyrogen and has been shown to induce porphyrin biosynthesis in cultured liver cells. The glucuronide of this androgen does not possess these properties (2). Conjugation with glucuronic acid is therefore important in regulating the levels of free steroids and, hence, their biological effects. An understanding of the molecular mechanisms controlling glucuronidation of androgens is dependent upon identifying the forms of transferase¹ specific for these steroids and the development of suitable cDNA and genomic probes.

Several forms of transferase have been purified from rat liver (3-6). One form (Mr ≈ 52,000) is capable of glucuronidating the 3-hydroxyl moiety of androsterone and etiocholanolone and was therefore designated 3-hydroxyandrogen UDP-glucuronosyltransferase. This enzyme also glucuronidates the 3-hydroxyl group of bile acids (7). This form is inactive towards the 3-hydroxyl position of the aromatic A-ring of estrogenic steroids, such as β-estradiol and estrone, and the 17-hydroxyl position of testosterone and β-estradiol (7). Glucuronidation of the 17-hydroxyl moiety of the latter steroids is catalyzed by a second purified form (Mr 50,000) (7).

In earlier work from this laboratory, various cDNA clones hybridizing to three distinct forms of transferase mRNA were isolated (8). One clone, pUDPGTr-2, was sequenced and demonstrated by transfection experiments to encode a form of transferase active in the glucuronidation of 4-methylumbelliferone (9). The level of its complementary mRNA is increased [illegible]-fold by phenobarbital (8). Two other clones, pUDPGTr-1 and pUDPGTr-3, were also isolated (8). Their mRNA counterparts share extensive homology, are not affected by phenobarbital, and appeared to be transcribed from closely related genes.

In the present study, the insert of pUDPGTr-1 was used to isolate a clone, pUDPGTr-4, from a cDNA expression library. The relationship between UDPGTr-1 and UDPGTr-4 cDNAs was investigated by sequencing and Northern analysis. UDPGTr-4 cDNA under regulation by SV40 transcription signals was transfected into COS cells, which were subsequently assayed for their capacity to glucuronidate 11 substrates.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number J02589.

¹ The abbreviations used are: transferase, UDP-glucuronosyltransferase; bp, base pair; kb, kilobase.

MATERIALS AND METHODS

Isolation of pUDPGTr-4: The PstI insert of pUDPGTr-1 (8) was labeled by nick translation following the protocol of the supplier (Amersham Corp.) and used to screen an Okayama and Berg cDNA expression library constructed from transferase-enriched mRNA (8, 9). Clones complementary to UDPGTr-1 cDNA were isolated, digested with BamHI, and subjected to electrophoresis in 1% agarose gels (10). The clone with the largest insert was designated pUDPGTr-4.

Expression of UDPGTr-4 cDNA: pUDPGTr-4 was digested with BamHI and the larger vector fragment was isolated and blunt-ended with T4 DNA polymerase (10). The smaller cDNA fragment was digested with PstI followed by partial digestion with SacI, and the fragment containing the complete coding region, but which lacks the 3'-untranslated and poly(A) sequences, was isolated and blunt-end-ligated into the vector fragment. After transformation of Escherichia coli HB101, plasmids containing the cDNA in the correct and opposite orientations with respect to SV40 transcription signals were isolated and transfected into COS cells as described previously (11). The cells were harvested 70 h after transfection and assayed for their capacity to glucuronidate etiocholanolone, androsterone (5α-androstan-3α-ol-17-one), lithocholic acid (5β-cholanic acid-3α-ol), estrone, β-estradiol, testosterone, 1-naphthol, 4-methylumbelliferone, 4-nitrophenol, phenolphthalein, and chloramphenicol (Sigma). The radiometric assay conditions were as detailed elsewhere (9, 12). All aglycones were added to a final concentration of 10 μM. Glucuronides labeled with [14C]glucuronic acid were analyzed by thin-layer chromatography and β-glucuronidase treatment (12). Protein was determined by the BCA method (Pierce Chemical Co.).

Other Procedures: DNA sequencing was carried out as described previously (9) using M13 cloning protocols (13) and the dideoxy sequencing procedure (14). Sequence data were analyzed using standard computer programs (15). Northern analysis, subcloning, and the preparation of microsomes from adult male Sprague-Dawley rats are described elsewhere (8, 10, 16). Slot-blotting of RNA was done following the procedures of the manufacturer (Schleicher & Schuell).

RESULTS

Nucleotide Sequence of UDPGTr-4: The clone pUDPGTr-4 was selected from an expression library enriched in transferase-specific coding sequences by hybridization to the labeled insert of pUDPGTr-1. UDPGTr-4 cDNA is 1970 bp in length and contains a coding region of 1590 bp bounded by 32 and 348 bp of 5'- and 3'-untranslated regions, respectively (Fig. 1). A putative poly(A) addition signal (AATAAA) is present 23 bp upstream from a poly(A) tail of more than 30 residues.

Deduced Amino Acid Sequence of UDPGTr-4 cDNA: The open reading frame of UDPGTr-4 cDNA encodes a protein of 530 or 532 amino acids, depending on which ATG triplet is the initiator codon. Alignment of this sequence with sequences of two other transferase cDNAs would suggest that the second in-frame ATG triplet is the initiator codon (Fig. 2).² In common with most initiator codons (17), the second ATG triplet is preceded by a purine residue (A) three nucleotides upstream. The first in-frame ATG has a pyrimidine residue (T) in the corresponding position. However, the possibility that UDPGTr-4 encodes a precursor longer than that described here cannot be formally excluded because there is no in-frame stop codon 5'-ward to the presumptive initiator codon in pUDPGTr-4 cDNA. Potential signal and transmembrane regions are present at the amino and carboxyl termini, respectively (Fig. 1, underlined). The latter region is followed by a highly basic segment of 20 residues. A potential site for asparagine-linked glycosylation is indicated (Fig. 1, filled triangle). A comparison of the deduced amino acid sequences of UDPGTr-4 cDNA and UDPGTr-2 cDNA, which encodes another form of transferase active in the glucuronidation of 4-methylumbelliferone (9), is shown in Fig. 2. The two amino acid sequences are 62% similar but contain a region of lesser homology near their amino termini. Both proteins contain common potential amino-terminal signal and carboxyl-terminal transmembrane regions (Fig. 2, highlighted in black).
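The bookkeeping behind the reported cDNA layout and the initiator-codon argument can be sketched in a few lines of Python. This is only an illustration: the segment lengths are the values quoted in the text, while the helper names and the short `TOY` sequence are hypothetical and are not taken from the actual pUDPGTr-4 sequence.

```python
# Segment lengths (bp) quoted in the text for UDPGTr-4 cDNA.
FIVE_PRIME_UTR = 32
CODING_REGION = 1590
THREE_PRIME_UTR = 348

PURINES = {"A", "G"}


def cdna_length(five_utr: int, coding: int, three_utr: int) -> int:
    """Total cDNA length as the sum of its three segments."""
    return five_utr + coding + three_utr


def codon_count(coding_bp: int) -> int:
    """Number of codons in a coding region whose length is a multiple of 3."""
    if coding_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return coding_bp // 3


def has_purine_at_minus3(seq: str, atg_index: int) -> bool:
    """True if the base three nucleotides upstream of the ATG is A or G."""
    if seq[atg_index:atg_index + 3].upper() != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3:
        raise ValueError("fewer than three bases upstream of the ATG")
    return seq[atg_index - 3].upper() in PURINES


# Hypothetical toy sequence (an assumption for illustration only): two
# in-frame ATG triplets six bases apart, the first with T at -3 and the
# second with A at -3, mirroring the situation described in the text.
TOY = "GCTCCATGAAGATGGCC"
FIRST_ATG = TOY.find("ATG")
SECOND_ATG = TOY.find("ATG", FIRST_ATG + 3)

if __name__ == "__main__":
    print(cdna_length(FIVE_PRIME_UTR, CODING_REGION, THREE_PRIME_UTR))
    print(codon_count(CODING_REGION))
    print(has_purine_at_minus3(TOY, FIRST_ATG))
    print(has_purine_at_minus3(TOY, SECOND_ATG))
```

With the quoted lengths, the three segments sum to the reported 1970 bp and the 1590-bp coding region corresponds to 530 codons; starting from an in-frame ATG two codons upstream would give the alternative 532-residue product.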
The replacement of a glutamine for a serine residue in position 136 of UDPGTr-4 abolishes the site for potential glycosylation which is present in UDPGTr-2. However, the glycosylation site at Asn-316 is present in both sequences.

Comparison of UDPGTr-4 and UDPGTr-1 DNA Sequences: The insert of pUDPGTr-1 is 2444 bp long and has two putative poly(A) addition signals, the second of which is 17 bp from the terminating poly(A) tract (Fig. 3, highlighted in black). The first 1680 bp of UDPGTr-1 cDNA are identical in sequence to residues 290 to 1970 of UDPGTr-4 cDNA. However, the 3'-untranslated region of UDPGTr-1 cDNA with a total length of 1112 bp is considerably longer

² P. I. Mackenzie, unpublished data.


than that of UDPGTr-4 cDNA. Assuming that poly(A) tracts typically contain 250-300 residues, the mRNA corresponding to UDPGTr-4 would be approximately 2.3 kb long, while that corresponding to UDPGTr-1 would be about 3.0 kb in length. This prediction was confirmed by Northern analysis.

The Size of UDPGTr-1 and UDPGTr-4 mRNAs: In order to obtain a probe for UDPGTr-1 mRNA, the cDNA was digested with PvuII and HindIII (sites shown in Fig. 3) and the resultant fragments subcloned into pBR322 (Fig. 4C). Probe 1 (residues 2318-2734, Fig. 3), which contains a sequence unique to UDPGTr-1 cDNA (Fig. 4B), hybridized predominantly to a low-abundance mRNA with a length of approximately 3 kb (Fig. 4A, probe 1). Probe 2 (residues 1267-2292, Fig. 3), which contains sequence information common to both cDNAs, also hybridized to this mRNA (Fig. 4A, probe 2). In addition, it hybridized to three other mRNA species: an abundant mRNA approximately 2.3 kb long, and two less abundant mRNAs of about 3.5 and 4.5 kb in length (Fig. 4A, probe 2). The level of UDPGTr-4 mRNA in the liver was approximately 20-fold greater than that of UDPGTr-1 mRNA (Fig. 4A). This was also demonstrated by the slot-blotting technique (results not shown).

Catalytic Expression of UDPGTr-4 cDNA: The above sequence analysis suggested that only UDPGTr-4 contained a complete coding region. The cDNA was therefore selected for transfection into COS cells, in order to determine if it encoded a catalytically active transferase. The glucuronidation of etiocholanolone and androsterone was detected in cells transfected with vector containing UDPGTr-4 cDNA in the correct

[Figure 4. Panel A: Northern blot hybridized with probe 1 and probe 2; size markers at 4.4, 2.3, and 2.0 kb. Panel B: slot blot of pUDPGTr-1, pUDPGTr-2, pUDPGTr-4, and pBR322 hybridized with probe 1 and probe 2. Panel C: line diagram of pUDPGTr-1 showing PvuII, HindIII, and PstI sites and the positions of probe 1 and probe 2.]

FIG. 4. Northern analysis of UDPGTr-4 and UDPGTr-1 mRNAs. A, total liver polysomal poly(A) RNA was electrophoresed in 1% agarose. The separated mRNA was transferred to nitrocellulose and hybridized with 32P-labeled probe 1 and probe 2. B, the recombinant plasmids, pUDPGTr-1, pUDPGTr-2, pUDPGTr-4, and the control plasmid pBR322 (0.5 μg of each), were bound to nitrocellulose using the slot-blotting procedure of the vendor, and hybridized with 32P-labeled probe 1 and probe 2. C, line diagram illustrating the regions of UDPGTr-1 cDNA unique to probe 1 and probe 2. Hybridizations between probes and mRNAs or plasmids were detected by autoradiography.


3-Hydroxyandrogen UDP-glucuronosyltransferase Sequence

FIG. 5. Catalytic expression of UDPGTr-4 cDNA in COS cells. Flasks (175 cm²) of semiconfluent monolayers of COS cells were transfected with 20 μg of expression vector containing UDPGTr-4 cDNA insert in the correct orientation (lanes 1 and 2) and opposite orientation (lane 3) with respect to transcription from the SV40 promoter. The cells were harvested after 70 h and 20 μg of total cell homogenate protein or 5 μg of rat liver microsomal protein (lane 4) were assayed as described under "Materials and Methods." The assay mixture was incubated for 10 h at 25 °C in the presence or absence of aglycone as indicated. The reaction mixture was subjected to thin-layer chromatography and the silica gel plates exposed to x-ray film for 3 weeks. The glucuronides of etiocholanolone, androsterone, and lithocholic acid are shown as radioactive spots above other metabolites of [14C]UDP-glucuronic acid. Only radioautographs of the regions containing the glucuronides of other aglycones are shown in the upper part of the figure.

orientation with respect to SV40 transcription and polyadenylation signals (Fig. 5, lanes 1 and 2). Slight activity toward lithocholic acid was also observed. Treatment with β-glucuronidase in the presence and absence of D-saccharic acid 1,4-lactone and comparison to mobilities of authentic steroid glucuronides confirmed the authenticity of these glucuronides (data not shown). No activity was detected in cell homogenates when aglycone was omitted from the reaction mixture (Fig. 5) or after transfection with vector containing UDPGTr-4 cDNA in the opposite orientation (Fig. 5, lanes 3). UDPGTr-4-mediated glucuronidation of testosterone, estrone, chloramphenicol, phenolphthalein, 4-methylumbelliferone, 4-nitrophenol, and 1-naphthol was not detected in this system. The glucuronidation of these substrates was detected, however, in microsomes prepared from male Sprague-Dawley rat liver (Fig. 5, lanes 4).

DISCUSSION

In this work two clones, pUDPGTr-1 and pUDPGTr-4, have been sequenced. Both contain 3'-terminal poly(A) tracts and an identical 1680-bp region in common. Northern analysis of liver polysomal poly(A) RNA demonstrated that the mRNA counterpart of UDPGTr-1 is approximately 3 kb long. Thus, UDPGTr-1 cDNA is apparently a truncated version of its mRNA, rather than a full-length copy as previously proposed (8). The 5'-untranslated region and some coding sequence is missing. A complete coding region, however, is present in UDPGTr-4 cDNA which is complementary to a more abundant mRNA of approximately 2.3 kb.

A comparison of these two cDNA sequences strongly suggests that their complementary mRNAs are transcribed from the same gene. A single primary transcript encompassing the sequences of both cDNAs may be preferentially (20:1) cleaved and polyadenylated near the first poly(A) addition signal to form UDPGTr-4 mRNA. The longer UDPGTr-1 mRNA would be formed by a cleavage and polyadenylation process which is mediated by the second poly(A) addition signal. An alternative explanation may involve the synthesis of different length primary transcripts from the same gene. Transcription may terminate in most cases between the two poly(A) addition signals and, hence, only the first signal is utilized to form UDPGTr-4 mRNA. Run-on transcription beyond the second poly(A) addition signal may occur occasionally to yield UDPGTr-1 mRNA upon subsequent processing. In this respect it is interesting to observe that UDPGTr-1 cDNA contains an element (aTTTTTTTTcTGTa) 40 bp beyond the first poly(A) addition signal which is characteristic of eukaryotic transcription-termination signals (18).

The possibility that UDPGTr-1 cDNA is complementary to a precursor RNA containing an intron can be discounted for the following reasons. First, the clone was isolated from a cDNA library prepared from polysomal poly(A) RNA, immunoenriched for transferase-coding sequences. Contamination by nuclear precursor RNA would be minimal. Second, the putative intron (residues 1971-2734, Fig. 3) is not bounded by the consensus splice signals, GT/AG. The formation of polyadenylated mRNAs of different lengths but identical coding potential from the same gene has been observed before (19).

Restriction mapping and Southern analysis of genomic DNA has demonstrated that UDPGTr-1 and UDPGTr-3 RNAs are transcribed from separate genes which are closely related (8). Sequence analysis has also confirmed that the 3.5-kb RNA, which cross-hybridizes to probe 2 (Fig. 4A), is transcribed from a third, closely related gene.² These three genes, therefore, are members of a multigene subfamily of transferases.

The transferase form encoded by UDPGTr-4 cDNA catalyzes the transfer of glucuronic acid from UDP-glucuronic acid to the androgens, etiocholanolone and androsterone, and to the bile acid, lithocholic acid. This was demonstrated in a transient expression system after transfection of the cDNA into COS cells. The rate of glucuronidation in this system is dependent on many factors which include transfection efficiency, factors regulating the rates and fidelity of transcription and translation, and any post-translational modifications. Experiments are in progress to analyze these factors in order to increase the glucuronidation capacity of this system. Nevertheless, although the glucuronidating activities are low, UDPGTr-4 transferase, in comparison to the repertoire of transferases present in microsomes, has a clear preference for etiocholanolone and androsterone as substrates, when compared to estrone, β-estradiol, testosterone, and xenobiotics such as 1-naphthol, chloramphenicol, 4-methylumbelliferone, 4-nitrophenol, and phenolphthalein. This form also has some activity towards lithocholic acid. Glucuronidation of other bile acids, however, was difficult to assess in this system. These data demonstrate that the UDPGTr-4 enzyme is a 3-hydroxyandrogen UDP-glucuronosyltransferase as defined previously (7). Whether UDPGTr-4 cDNA, which was prepared from male Sprague-Dawley rats, encodes an enzyme identical to that previously purified from female rats of the same species (7) or to a closely related form will be resolved when the amino acid sequence of the purified form is determined.

During the preparation of this manuscript, Jackson and Burchell (20) published the sequence of a cDNA identified as encoding androsterone UDP-glucuronosyltransferase by Northern analysis of RNA from livers of normal and genetically deficient Wistar rats. Based on comparisons to the amino acid compositions of purified forms of transferase, these authors concluded that the sequence contained a full-length coding region. However, a comparison to the UDPGTr-4 deduced amino acid sequence in this report demonstrates that their sequence is missing the first 30 amino acids from the amino terminus, but otherwise is identical to the UDPGTr-4 amino acid sequence of Sprague-Dawley rats except for three conservative changes: Asp-159 → Glu, Ala-[position illegible] → Ser, and [residue illegible] → Ile.
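The mRNA sizes predicted in Results follow from simple arithmetic on the cDNA lengths, which can be sketched as below. The function name and its parameters are hypothetical; the sketch assumes that the quoted cDNA lengths exclude the cloned poly(A) tract, that UDPGTr-1 lacks the first 290 bp present in UDPGTr-4 (as implied by the stated region of identity), and that poly(A) tails carry 250-300 residues as stated in the text.

```python
from typing import Tuple


def predicted_mrna_length(
    cdna_bp: int,
    missing_5prime_bp: int = 0,
    poly_a_range: Tuple[int, int] = (250, 300),
) -> Tuple[int, int]:
    """Return (low, high) estimates of mRNA length in nucleotides.

    cdna_bp           -- length of the cloned cDNA, excluding poly(A)
    missing_5prime_bp -- 5' sequence absent from a truncated clone
    poly_a_range      -- assumed minimum and maximum poly(A) tail length
    """
    low_tail, high_tail = poly_a_range
    body = cdna_bp + missing_5prime_bp
    return body + low_tail, body + high_tail


# UDPGTr-4: 1970-bp cDNA, taken here as full length (assumption).
R4_RANGE = predicted_mrna_length(1970)

# UDPGTr-1: 2444-bp insert whose first 1680 bp match residues 290-1970
# of UDPGTr-4, so about 290 bp of 5' sequence is assumed to be missing.
R1_RANGE = predicted_mrna_length(2444, missing_5prime_bp=290)

if __name__ == "__main__":
    for name, (low, high) in (("UDPGTr-4", R4_RANGE), ("UDPGTr-1", R1_RANGE)):
        print(f"{name}: {low / 1000:.2f}-{high / 1000:.2f} kb")
```

Under these assumptions the estimates fall at roughly 2.2-2.3 kb for UDPGTr-4 and roughly 3.0 kb for UDPGTr-1, in line with the sizes observed by Northern analysis.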


Acknowledgments: I thank Daniel W. Nebert, Frank J. Gonzalez, and Ida S. Owens for helpful discussions and support during the course of this work. The expert secretarial assistance of Ingrid E. Jordan is gratefully acknowledged.

REFERENCES

1. Gower, D. B. (1984) in Biochemistry of Steroid Hormones (Makin, H. L. J., ed) 2nd Ed, pp. 170-206, Blackwell Scientific Publications, Oxford
2. Kappas, A., and Granick, S. (1968) J. Biol. Chem. 243, 346-351
3. Roy-Chowdhury, J., Roy-Chowdhury, N., and Arias, I. M. (1984) Biochem. Soc. Trans. 12, 81-83
4. Falany, C. N., and Tephly, T. R. (1983) Arch. Biochem. Biophys. 227, 248-258
5. Bock, K. W., Clausbruch, U. C. V., Kaufmann, R., Lilienblum, W., Oesch, F., Pfeil, H., and Platt, K. L. (1980) Biochem. Pharmacol. 29, 495-500
6. Burchell, B., and Blanckaert, N. (1984) Biochem. J. 223, 461-465
7. Kirkpatrick, R. B., Falany, C. N., and Tephly, T. R. (1984) J. Biol. Chem. 259, 6176-6180
8. Mackenzie, P. I., Gonzalez, F. J., and Owens, I. S. (1984) J. Biol. Chem. 259, 12153-12160
9. Mackenzie, P. I. (1986) J. Biol. Chem. 261, 6119-6125
10. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) in Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
11. Guan, J.-L., and Rose, J. K. (1984) Cell 37, 779-787
12. Bansal, S. K., and Gessner, T. (1980) Anal. Biochem. 109, 321-329
13. Messing, J., Crea, R., and Seeburg, P. H. (1981) Nucleic Acids Res. 9, 309-321
14. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
15. Staden, R. (1980) Nucleic Acids Res. 8, 3673-3694
16. Mackenzie, P. I., and Owens, I. S. (1983) Biochem. Pharmacol. 32, 3777-3781
17. Kozak, M. (1984) Nucleic Acids Res. 12, 857-872
18. Birnstiel, M. L., Busslinger, M., and Strub, K. (1985) Cell 41, 349-359
19. Setzer, D. R., McGrogan, M., and Schimke, R. T. (1982) J. Biol. Chem. 257, 5143-5147
20. Jackson, M. R., and Burchell, B. (1986) Nucleic Acids Res. 14, 779-795


APPENDIX

FIG. 1. Nucleotide and deduced amino acid sequence of pUDPGTr-4 cDNA. The initiator codon, stop codon, and the putative poly(A) addition signal are highlighted in black. The amino- and carboxyl-terminal hydrophobic segments are underlined and the asparagine residue which is a potential site for glycosylation is noted with a filled triangle.

FIG. 2. Comparison of UDPGTr-4 and UDPGTr-2 deduced amino acid sequences. The complete deduced amino acid sequence of UDPGTr-2 is presented. Residues in UDPGTr-4, which are different to the corresponding residue in UDPGTr-2, are shown below the UDPGTr-2 sequence. The amino- and carboxyl-terminal hydrophobic sequences are highlighted in black. The potential asparagine-linked glycosylation sites unique to UDPGTr-2 and common to both sequences are also highlighted in black.

FIG. 3. Comparison of UDPGTr-4 and UDPGTr-1 cDNA nucleotide sequences. The two sequences are aligned to illustrate their region of identity. The sequence of UDPGTr-4 cDNA is written above that for UDPGTr-1. Initiation and termination codons and putative poly(A) addition signals are highlighted in black. The PvuII and HindIII restriction sites, as depicted in Fig. 4, are indicated.