Vol. 262, No. 15, Issue of May 25, pp. 7321-7327, 1987. Printed in U.S.A.
THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1987 by The American Society of Biological Chemists, Inc.
Delineation of the Intronless Nature of the Genes for the Human and Hamster β2-Adrenergic Receptor and Their Putative Promoter Regions*
(Received for publication, October 17, 1986)
Brian K. Kobilka‡§, Thomas Frielle‡, Henrik G. Dohlman‡, Mark A. Bolanowski‡, Richard A. F. Dixon¶, Paul Keller¶, Marc G. Caron‡, and Robert J. Lefkowitz‡
From the ‡Howard Hughes Medical Institute, Departments of Medicine (Cardiology), Biochemistry, and Physiology, Duke University Medical Center, Durham, North Carolina 27710 and the ¶Department of Virus and Cell Biology Research, Merck Sharp and Dohme Research Laboratories, West Point, Pennsylvania 19486
The β2-adrenergic receptor is the first adenylate cyclase-coupled receptor to be cloned. We provide here a detailed characterization of its complete gene in both the human and hamster which reveals several unusual and provocative features. The genes are present in a single copy, are intronless, and are bounded by homologous 18-bp (base pair) direct repeats. These findings suggest that the β2-adrenergic receptor may have arisen as a processed gene for another related gene. Genomic Southern blots done at reduced stringency in fact reveal additional weak signals. The human and hamster gene sequences 5' to the principal site of transcription initiation are highly homologous and share many characteristics of promoters for housekeeping genes. Moreover, there is present in the human genome a long (777 bp) open reading frame which is in frame with the β2-adrenergic receptor coding block and which ends only 234 bp 5' to the initiator methionine of the receptor. An unusual cDNA has been found, transcribed from a putative second more 5' promoter, which contains the 5' half of the β2-adrenergic receptor as well as 1065 bp 5' to the receptor coding region, including the entire upstream long open reading frame (sufficient to encode a putative protein of Mr 28,000).
The techniques of molecular cloning have led to rapid progress over the past several years in elucidating the structures of several growth factor receptors (Ishii et al., 1985; Yarden et al., 1986a), and receptors involved in mediating endocytosis of specific ligands (Russell et al., 1984). However, until very recently, no comparable advances had been made in the study of that broad class of receptors coupled to adenylate cyclase. Of these, the most heavily studied has been the β-adrenergic receptor which mediates the stimulatory effects of catecholamines on the enzyme (Stiles et al., 1984). We have recently reported the sequence of cDNAs for the human (Kobilka et al., 1987) and hamster β2-adrenergic receptor (Dixon et al., 1986), as well as that portion of the hamster gene containing the β2-adrenergic receptor coding region and 3' untranslated regions. The sequence for the
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ To whom correspondence should be sent.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J02728.
avian β-adrenergic receptor has also been reported (Yarden et al., 1986b). The β2-adrenergic receptor is thus the first of the adenylate cyclase-coupled receptors to be cloned and the study of these clones has revealed several interesting features about the receptor protein and gene. The receptor consists of 7 hydrophobic domains analogous to the opsin family of visual pigments. Several of these domains share sequence homology with the comparable domains of select members of the opsin family. Unlike the opsin gene family (Nathans and Hogness, 1983, 1984; Nathans et al., 1986), however, the hamster β2-adrenergic receptor gene appears to be intronless throughout the coding sequence and 3' untranslated region.

To further our understanding of the structure of this gene and how its expression is regulated, we have now cloned and sequenced the entire human gene and the 5' region of the hamster gene. We have also mapped the site of initiation of transcription in the human gene and have done a detailed comparison of the putative promoter regions of the human and hamster receptor genes. The analysis of the promoter region in two different species provides our first look at potentially important regulatory elements required for transcription of this intronless gene. We describe as well an interesting and unusual cDNA clone for the human β2-adrenergic receptor which may be transcribed from a second, more 5' promoter. The structure of the gene also raises provocative questions about its evolutionary origin.

EXPERIMENTAL PROCEDURES
Isolation of Genomic and cDNA Clones-A human genomic library in EMBL3 was purchased from Clontech Laboratories (Palo Alto, CA). This library contains 12-20-kb¹ fragments of human genomic DNA partially digested with Sau3AI. A λgt11 cDNA library made from term placenta poly(A+) RNA and containing 5 × 10⁶ recombinants was provided by Dr. Evan Sadler (Washington University Medical Center, St. Louis, MO). These libraries were screened according to the method of Benton and Davis (1977). Duplicate filters were lifted, baked at 80 °C under vacuum, and prehybridized in 10 × Denhardt's solution, 5 × SSC (1 × SSC: 150 mM NaCl, 15 mM sodium citrate, pH 7.0), 10 mM EDTA, 0.1% sodium pyrophosphate, 0.1% sodium dodecyl sulfate, and 10 μg/ml denatured salmon sperm DNA for 2 h at 65 °C. Hybridization was done in the same solution using the 1.3-kb HindIII hamster genomic fragment labeled by the random hexamer priming method (Feinberg and Vogelstein, 1983). Hybridizations were carried out at 55 °C for 2-3 times the calculated Cot1/2 for the probe. Filters were washed in 2 × SSC at 65 °C.

¹ The abbreviations used are: kb, kilobase pair; bp, base pair; PIPES, 1,4-piperazinediethanesulfonic acid; ORF, open reading frame; pol, polymerase.

Genomic Southern Blot Analysis-High molecular weight genomic DNA was extracted from cultured human fibroblasts or from hamster lung using methods previously described (Maniatis, 1982). DNA was
The Gene for the Human and Hamster β2-Adrenergic Receptor
digested with appropriate restriction endonucleases, the fragments were size fractionated on 1% agarose gels, and transferred onto nitrocellulose paper. Filters were baked at 80 °C under vacuum then prehybridized for 2 h at 42 °C in 30% formamide, 5 × SSC, 0.1% SDS, 5 × Denhardt's, and 100 μg/ml denatured salmon sperm DNA. Hybridization was carried out overnight at 42 °C in the same solution except that 1 × Denhardt's was used and dextran sulfate was added to 7% (w/v). Clones pTF3 and pHD2 were labeled by the random hexamer priming method (Feinberg and Vogelstein, 1983) and added to the hybridization mixture at a concentration of 5 × 10⁶ cpm/ml. Filters were washed using conditions indicated in the figure legends.

DNA Sequence Analysis-Selected restriction fragments of genomic and cDNA clones were cloned into M13 mp18 and mp19 or further digested with HaeIII and AluI and cloned into M13 mp10. Sequencing was done using the dideoxy chain termination method (Sanger et al., 1977) primed with the universal primer or clone-specific primers made on the Applied Biosystems oligonucleotide synthesizer. Both strands of DNA were sequenced.

RNA Isolation-RNA was prepared by the guanidinium thiocyanate method (Chirgwin et al., 1979). Term human placenta was quick frozen and homogenized in liquid nitrogen then denatured in 4 M guanidinium thiocyanate. Lysates were layered over a 5.7 M CsCl cushion and centrifuged at 126,000 × g for 18 h. The RNA was fractionated by oligo(dT)-cellulose chromatography to obtain poly(A+) RNA.

Preparation of Single-stranded DNA Probes-Antisense DNA probes were prepared by cloning the appropriate restriction fragments from the genomic clone into an M13 vector. Clone-specific 17-bp oligonucleotides were used to prime the synthesis at the desired location on the insert in single-stranded M13 DNA. The probe was extended with the Klenow fragment of DNA polymerase in the presence of [α-³²P]dATP and unlabeled dTTP, dCTP, and dGTP. The DNA was then digested with the appropriate restriction enzyme and the labeled single-stranded fragment was isolated by preparative alkaline-agarose gel electrophoresis.

S1 Nuclease Mapping-Poly(A+) RNA (30 μg) was co-precipitated in ethanol with 5 × 10⁵ cpm of a uniformly labeled antisense, single-stranded DNA probe representing the SmaI(3)-SmaI(4) human genomic fragment (Fig. 1). Hybridization was performed by dissolving the precipitate in 10 μl of hybridization buffer (40 mM PIPES, pH 6.4, 1 mM EDTA, 400 mM NaCl, 80% deionized formamide) and incubating for 18 h at 51 °C. The DNA-RNA hybrid was diluted into 300 μl of S1 nuclease buffer (280 mM NaCl, 30 mM NaOAc, pH 4.4, 4.5 mM ZnCl2) containing 20 μg of denatured calf thymus DNA and 3000 units of S1 nuclease and incubated at 37 °C for 1 h. The protected fragments were recovered by ethanol precipitation, denatured, and analyzed on a 6% sequencing gel.

Primer Extension Analysis-Poly(A+) RNA (30 μg) was hybridized with 5 × 10⁶ cpm of a uniformly labeled, single-stranded, antisense DNA probe representing the genomic fragment 11-136 (Fig. 2) using the same procedure as described for S1 mapping above. The hybrid was recovered by ethanol precipitation and extended by dissolving the precipitate in 20 μl of reaction mixture containing 50 mM Tris, pH 8, 8 mM MgCl2, 2 mM dithiothreitol, 2 mM sodium pyrophosphate, 1 mM each dATP, dCTP, dGTP, and dTTP, and 1.5 units of avian myeloblastosis virus reverse transcriptase. The reactions were incubated at 42 °C for 90 min. Samples were denatured and analyzed on a 6% sequencing gel.

RESULTS

We used a 1.3-kb HindIII fragment from the hamster β2-adrenergic receptor genomic clone (Dixon et al., 1986), which encodes virtually the entire receptor protein, to screen a human placenta cDNA library in λgt11 and a human EMBL3 genomic library. We previously reported the isolation of clone pTF3 from the human placenta library (Kobilka et al., 1987). This clone contains 200 bp of 5' untranslated sequence, the entire coding region for the human β2-adrenergic receptor, and all of the 3' untranslated region. Five copies of this clone were obtained from 5 × 10⁶ recombinants. We now report the isolation of a 1.6-kb clone (pHD2) which overlaps with 606 bp of the 5' coding region for the β2-adrenergic receptor and extends 1065 bp 5' to the coding region for the β2-adrenergic receptor. The restriction map for this clone and the previously reported cDNA clone are shown in Fig. 1. From the genomic library a 15-kb clone (λRD1) was obtained and the restriction map is shown in Fig. 1. The cDNA clones and 3.5 kb of the genomic clone were sequenced in both directions. The sequence of the gene is shown in Fig. 2. The base pair numbering is relative to the initiator methionine for the β2-adrenergic receptor. The sequence of the cDNA clone pHD2 is colinear with the genomic clone and extends from position -1065 to +606 in the sequence of the human gene. Clone pHD2 therefore appears to be a partial cDNA clone for the human β2-adrenergic receptor containing a long 5' untranslated sequence. Two identical copies of this clone were obtained from 5 × 10⁶ recombinants in the placenta library. An attempt was made to obtain additional clones from the placenta library using the PvuII(1)-SmaI(2) fragment from pHD2 as a probe; however, no new clones were isolated.

FIG. 1. Restriction map of human genomic clone λRD1 and cDNA clones pHD2 and pTF3. Restriction enzymes: B, BamHI; H, HindIII; K, KpnI; L, BglII; P, PstI; R, EcoRI; S, SmaI; V, PvuII. Enzymes which cut more than once are given numerical subscripts. The heavy line on the cDNA clones denotes the β2-adrenergic receptor coding sequence.

Examination of the sequence of the gene reveals several features of interest. The sequence for pHD2 and the previously described human cDNA clone pTF3 are contained, without interruption, in the gene, suggesting that this gene is intronless. The 5' region of clone pHD2 contains a continuous open reading frame (ORF), extending from -1011 to -234, coding for 259 amino acids. The translated sequence of this putative protein is shown in Fig. 3. The likelihood of an ORF of this length occurring by chance is approximately 1 in 6 × 10⁵. Statistical analysis of the codon usage by this putative protein coding sequence reveals that it does not differ significantly from published human codon bias data (Lathe, 1984). Therefore, this ORF may represent all or a portion of a protein coding sequence. This ORF is in frame with the β2-adrenergic receptor coding block, but is separated by one termination codon. No homology is observed between this coding sequence and the protein sequences on file in GenBank. Northern blot analysis using the PvuII(1)-SmaI(3) genomic fragment as a probe and primer extension using the SmaI(2)-SmaI(3) genomic fragment as the primer failed to detect mRNA species for pHD2 in human placenta poly(A+) RNA.

Using clone pTF3 as a probe, a 2.2-kb message was detected by Northern blot analysis with human placenta poly(A+) RNA (Kobilka et al., 1987). S1 nuclease analysis using the genomic SmaI(3)-SmaI(4) fragment is shown in Fig. 4A. Several protected fragments are observed, the longest being 230 bp, putting the origin of the longest transcript at -219. This position was confirmed by primer extension analysis (Fig. 4B). These data also suggest several closely spaced sites for transcription initiation beginning at position -219. These experiments indicate that the major transcript begins around -219 and that the gene coding for this sequence is intronless in both the coding sequence and the 5' untranslated region.

To gain a better understanding of the significance of the long ORF 5' to the human β2-adrenergic receptor and to look for potential promoter elements 5' to this ORF (the distal
FIG. 2. Nucleotide sequence of the human β2-adrenergic receptor gene and deduced amino acid sequence of the β2-adrenergic receptor. Underlined sequence 1 indicates a potential steroid receptor binding hexamer. The boundaries of clone pHD2 are indicated by square brackets. The boundaries of the long open reading frame 5' to the β2-adrenergic receptor are indicated by round brackets. Underlined sequence 2 indicates the first ATG codon in the 5' long open reading frame. Potential sites for initiation of transcription as determined by S1 nuclease analysis are indicated by solid circles below the sequence, and those determined by primer extension analysis are indicated by solid squares above the sequence. Underlined sequence 3 is an ATG codon in the 5' untranslated region of the β2-adrenergic receptor transcript. It is followed by a 19-codon open reading frame and a termination codon 4. Three possible polyadenylation signals are underlined 5. The location of the poly(A) tail in clone pTF3 is indicated by the arrow.
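The text quotes a chance likelihood of roughly 1 in 6 × 10⁵ for the 259-codon open reading frame upstream of the receptor coding block, but does not state the model behind that figure. The sketch below is only a back-of-the-envelope illustration under an assumption that is not from the paper: all 64 codons are equally likely and independent, so each codon has a 61/64 chance of not being a stop. The function name is hypothetical. Under this simplified model the estimate comes out at the same order of magnitude as the published value, not at the same number.

```python
# Minimal sketch (assumption: uniform, independent codons; NOT the authors' model).
STOP_CODONS = 3      # TAA, TAG, TGA
TOTAL_CODONS = 64


def prob_open_frame(n_codons: int) -> float:
    """Probability that n_codons consecutive random codons contain no stop codon."""
    p_sense = (TOTAL_CODONS - STOP_CODONS) / TOTAL_CODONS
    return p_sense ** n_codons


# 259 codons is the ORF length reported in the text (-1011 to -234).
n_codons = 259
p = prob_open_frame(n_codons)
odds = 1.0 / p
print(f"P(no stop in {n_codons} codons) = {p:.2e} (about 1 in {odds:.1e})")
```

A base-composition-aware model (for example, one that accounts for the G + C richness of this region, which lowers the frequency of the A/T-rich stop codons) would give a different and probably smaller probability; that may account for part of the gap between this sketch and the quoted figure.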
promoter) and 5' to the major site of transcription initiation (the proximal promoter), we sequenced the analogous region of the hamster gene. Computer-assisted alignment of these sequences was done by the ALIGN program (Orcutt et al., 1984) and is shown in Fig. 5. Optimal alignment required 18 breaks over the 1200 bp analyzed; each break was given a penalty of 6. The final alignment score of 36 SD units indicates highly significant homology. The overall homology over
a 1.2-kb sequence 5' to the initiator methionine for the β2-adrenergic receptor is 65%. Examination of the human gene sequence 5' to the site of transcription initiation at -219 reveals several notable features (Fig. 5). The sequence extending from -220 to -295 shares 80% nucleotide homology with the analogous region of the hamster gene. Within this region the sequence CACATAA beginning at -238 and CCTAAA beginning at position -252
are the closest approximations to the consensus TATA box (Breathnach and Chambon, 1981) in this region. These sequences are found in the same relative position in the hamster gene at -247 and -261, respectively. At position -293 in the human gene and -298 in the hamster gene the sequence ATTGGC is found. This fits the consensus sequence for the reverse complement of the CAAT box.

Extending 5' to this region there are 20-60-bp stretches of the human gene which are highly homologous with the comparable regions of the hamster gene interspersed with areas of low homology or breaks. These homologous regions are noted in Fig. 5. Within this section of the gene there are two direct repeats which are found in both the human and hamster genes. The sequence GGGAGGG is seen 4 times in the human gene (-947, -700, -348, -343) and 2 times in the hamster gene (-445, -347). The sequence GTGTCT is found 3 times in the same relative positions in the human (-885, -692, -684) and hamster (-844, -694, -686) genes. There are 2 CCGCCC hexanucleotides at positions -407 and -382 in the human gene. This sequence and its reverse complement, GGGCGG, are the Sp1 binding site for the SV40 promoter (Dynan and Tjian, 1983). There is only one of these GC boxes in the hamster gene at position -384. The sequence GGGAGGGAAAGGGG found at position -343 in the human is also found in the same relative position in the promoter region of the human hypoxanthine-guanine phosphoribosyltransferase gene (Patel et al., 1986). While the significance of these sequences remains to be determined, those sequences shared by both the human and hamster genes are more likely to have functional significance in regulating β2-adrenergic receptor gene expression.

We have not yet been able to locate the origin of the transcript for clone pHD2 by primer extension or S1 nuclease analysis. While there is significant nucleotide homology between the region of the human gene coding for the long ORF 5' to the β2-adrenergic receptor coding sequence and the analogous region of the hamster gene, a homologous ORF of comparable length is not found in the hamster gene. The genomic sequence 5' to clone pHD2 contains potential promoter elements for pol II transcription. These elements can be found in Figs. 2 and 5. In the human gene there is a consensus TATA box at position -1202 and a consensus CAAT box at position -1289. Furthermore, there is a consensus steroid receptor binding hexamer (TGTTCT) at position -1438. This finding is of interest in light of the physiological effects of steroids on β2-adrenergic receptor expression in various tissues (Davies and Lefkowitz, 1984). The comparable region of the hamster gene also has a potential TATA box at position -1125; however, the steroid binding hexamer and CAAT box are not found. 5' to these elements in the human gene (-1590 to -1400) the 3' portion of an Alu repeat can be found. This is not found in the hamster gene.

To evaluate the possibility that the intronless β2-adrenergic receptor gene may represent a pseudogene, we looked for homologous sequences in both the human (Fig. 6) and hamster (Fig. 7) genomes by Southern blot analysis of genomic DNA digested with various restriction enzymes. Hybridization performed at high stringency using clones pTF3 and pHD2 as probes reveals only one hybridizing species with a restriction pattern identical to the genomic clone. A similar finding is observed when the hamster cDNA clone is hybridized to hamster genomic DNA. This suggests a single copy gene, and this observation is confirmed by gene dosage experiment, also shown in Figs. 6 and 7. When the Southern blot hybridization is done at reduced stringency on human genomic DNA, additional weak signals are seen (Fig. 8). We suspect this represents hybridization to a related gene such as that for one of the other adrenergic receptors. We are currently cloning this genomic fragment for further characterization.

SHLLCSCQMFEREYTGLPGVCWEGSIISARVRQVRSTQME
TSVSVSLWMPPSQRVFRFCVCHHVFVLLGASVFVSGRVSV
LDRGDFVPDGFCVRARASVHVGELFFCVSVSMAVVRYKSE
HVCQGVFVPVCACLGGHSRFLPNVGQCRCAALCLETSSRA
GAQGRQVAATEEPKAPGLAGKDTTSSFSPLGPARVAGKQW
WPALQGAVGPRPGQPQEKEGEGRGGKGEECLAPSRLPACH
WPKVPVRHGEPSSPKVLCT

FIG. 3. Translation of the long open reading frame (-1011 to -234) found 5' to the β2-adrenergic receptor coding region in the human gene.

FIG. 4. A, S1 nuclease analysis. S1 nuclease digestion was performed on DNA-RNA hybrids which were made by annealing the uniformly labeled antisense strand of the genomic restriction fragment SmaI(3)-SmaI(4) with either 30 μg of Escherichia coli ribosomal RNA (lane A) or 30 μg of human placenta poly(A+) RNA (lane B). A dideoxy sequencing ladder (lanes under C) is used to size the protected fragments. The 5' extent of the protected fragment relative to the initiator ATG for the β2-adrenergic receptor is shown at the right. B, primer extension analysis. A uniformly labeled antisense strand of the human genomic clone (base pairs +11 to +136 in Fig. 2) was hybridized with 30 μg of E. coli ribosomal RNA (lane A) or 30 μg of human placenta poly(A+) RNA (lane B) then extended with avian myeloblastosis virus reverse transcriptase. A dideoxy sequencing ladder from the human genomic clone primed at base pair +136 and extending 5' is shown in lanes under C (ATGC). The 5' position of the extended fragments relative to the initiator ATG for the β2-adrenergic receptor is shown at the right.

DISCUSSION

Intronless Gene-We previously reported the cloning of the hamster β2-adrenergic receptor gene and showed that there were no introns in the coding and 3' untranslated region. Here we present the sequence of the entire human gene, define the origin of transcription, and show that the human gene is intronless throughout. While intronless genes have been reported (Schaffner et al., 1978; Nagata et al., 1980; Lawn et al., 1981; Ninomiya et al., 1986), they are uncommon in eukaryotes. Studies of expression of eukaryotic genes indicate that introns may be required for optimal RNA processing (Volckaert et al., 1979; Hamer and Leder, 1979). This may in part explain the rarity of the β2-adrenergic receptor mRNA that
FIG. 6. Human genomic Southern blots. Gene dosage analysis: the 5 lanes at the left represent the gene dosage analysis. A Southern blot was made of varying amounts of a 4.7-kb linearized pUC18 plasmid containing clone pTF3. The numbers above the lanes indicate the represented gene dose for a 2.0-kb fragment in 20 μg of genomic DNA. Cellular DNA blot: the middle panel is a Southern blot of total cellular genomic DNA isolated from human leukocytes. Each lane represents 20 μg of genomic DNA digested with the indicated enzymes. Genomic clone blot: the panel at the right is a Southern blot of EMBL3 clone λRD1 digested with the indicated enzymes. Restriction enzymes: B, BamHI; H, HindIII; K, KpnI; L, BglII; P, PstI; R, EcoRI; S, SmaI; V, PvuII. Molecular weights in kilobase pairs are indicated at the right. All blots were hybridized with the labeled clone pTF3 as indicated under "Experimental Procedures" and washed in 0.2 × SSC at 65 °C.
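The gene dosage lanes described in this legend rely on loading an amount of cloned DNA that contains as many target copies as the genomic DNA in a lane. A rough sketch of that arithmetic follows. The haploid genome size of 3 × 10⁹ bp is an assumption introduced here for illustration; the paper does not state the value used, nor whether copies were counted per haploid or per diploid genome. The function name and parameters are likewise hypothetical.

```python
# Minimal sketch of a gene-dosage (copy-number reconstruction) calculation.
# Assumption (not from the paper): haploid genome size of 3.0e9 bp.

def copy_equivalent_pg(genomic_ug: float,
                       construct_bp: float,
                       genome_bp: float = 3.0e9,
                       copies: float = 1.0) -> float:
    """Picograms of cloned construct carrying as many target copies as
    `genomic_ug` micrograms of genomic DNA with `copies` per haploid genome."""
    grams = genomic_ug * 1e-6 * copies * construct_bp / genome_bp
    return grams * 1e12  # grams -> picograms


# 20 ug of genomic DNA per lane and a 4.7-kb plasmid, as in the legend.
one_copy_pg = copy_equivalent_pg(genomic_ug=20, construct_bp=4700)
print(f"1-copy equivalent: {one_copy_pg:.1f} pg of plasmid")
```

Under these assumptions, a single-copy lane corresponds to only a few tens of picograms of plasmid, and the other dosage lanes scale linearly with the `copies` argument.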
we have observed (Kobilka et al., 1987). We also show that the β2-adrenergic receptor gene is a single copy gene, indicating that we have not cloned a pseudogene for the β2-adrenergic receptor.

The absence of introns in the β2-adrenergic receptor gene is interesting considering the tendency for introns to be inserted into the genome between exons encoding functional domains. Examples of this type of gene structure include the low density lipoprotein receptor, rhodopsin, and 3-hydroxy-3-methylglutaryl-CoA reductase (Russell et al., 1984; Nathans and Hogness, 1983; Liscum et al., 1985). Interestingly, the human mas oncogene (Young et al., 1986) and the STE2 and STE3 genes (α factor and a factor receptors) (Nakayama et al., 1985) of yeast are intronless. Like the β2-adrenergic receptor, the products of these genes have 7 hydrophobic domains. At present the only other mammalian genes which are known to lack introns are those for histones (Stein, 1984), α and β interferon (Nagata et al., 1980; Lawn et al., 1981), and the type X collagen gene (Ninomiya et al., 1986).

The origin and functional significance of introns is a subject of debate (Gilbert et al., 1986). The primordial cell may have resembled prokaryotes and introns may represent an evolutionarily recent adaptation to facilitate genetic diversity. Thus, the β2-adrenergic receptor gene may represent one of a few eukaryotic genes which failed to acquire introns in the process of evolution. However, the study of the evolution of the gene for triose-phosphate isomerase (Gilbert et al., 1986), which is found in both eukaryotes and prokaryotes, suggests that introns may have been present in the first cells and were lost in the evolution of prokaryotes into organisms capable of rapid cell growth and division. Accordingly, the β2-adrenergic receptor gene may have evolved from a functionally related intron-containing gene, perhaps as a processed gene. In this context it is worth re-emphasizing that sequence homology
FIG. 5. Nucleotide alignment of human and hamster genes 5' to the initiator ATG for the β2-adrenergic receptor. Potentially important promoter regulatory elements are boxed (if present in both genes) or underlined. These elements are discussed in the text. The arrow indicates the 5' most site of initiation of transcription as determined in the human by S1 nuclease and primer extension analysis.
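The promoter comparison above depends on locating short elements such as the GC box (GGGCGG) and its reverse complement (CCGCCC) on one strand of a sequence. The following is a minimal sketch of such a scan. The helper names and the toy sequence are hypothetical and are not taken from the paper, and the reported indices are 0-based offsets into the toy string, not the gene coordinates used in the figures.

```python
# Minimal sketch: find a motif and its reverse complement on one strand.
# The toy sequence below is invented for illustration only.

def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq: str, motif: str) -> list[int]:
    """0-based start indices of every (possibly overlapping) motif occurrence."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def find_gc_boxes(seq: str, motif: str = "GGGCGG") -> dict[str, list[int]]:
    """Occurrences of the motif as written and as its reverse complement."""
    return {
        "forward": find_all(seq, motif),
        "reverse": find_all(seq, reverse_complement(motif)),
    }


toy = "TTCCGCCCAATGGGCGGTT"
gc_hits = find_gc_boxes(toy)
print(gc_hits)
```

Scanning for the reverse complement on the same strand is equivalent to scanning the opposite strand for the motif itself, which is why both orientations count as the same binding site in the text.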
between the mammalian (Dixon et al., 1986; Kobilka et al., 1987) and avian (Yarden et al., 1986b) β-adrenergic receptors, vertebrate (Nathans and Hogness, 1984; Nathans et al., 1986) and invertebrate opsins (Zuker et al., 1985), and the muscarinic cholinergic receptor (Kubo et al., 1986) suggests that a common ancestral gene may have given rise to all of these. Nonetheless the opsin genes contain introns.

Southern blot analysis of genomic DNA at low stringency reveals only one related gene sequence. This genomic fragment may represent a portion of the gene for a related protein such as the β1-adrenergic receptor which has been shown to have a closely related peptide map (Stiles et al., 1983) and to cross-react with β2-adrenergic receptor-specific antibodies (Dixon et al., 1986). This homologous gene may represent the parent gene from which the β2-adrenergic receptor has evolved. The long ORF seen 5' to the human β2-adrenergic receptor coding sequence might be a remnant of a functional coding sequence of the parent gene. We may gain further insight into this question as we characterize the homologous fragment that we have observed in human genomic Southern blots (Fig. 8).

The only evidence other than the lack of introns that the β2-adrenergic receptor might have arisen as a processed gene is a short direct repeat bordering both human and hamster genes. These elements are shown in Fig. 9. Processed genes can arise after transcripts from pol II or pol III promoters are processed to remove introns, then undergo reverse transcription to form a DNA-RNA hybrid which becomes inserted into the genome (Vanin, 1985). Direct repeats are frequently found bordering these processed genes and are thought to result from the filling in of the recessed ends of the disrupted gene at the site of the DNA-RNA hybrid insertion. There are no homologies with retroviral long terminal repeats.

Two Potential Promoters-Preliminary evidence suggests that the β2-adrenergic receptor may be transcribed from two promoters. The distal (or 5') promoter would presumably be responsible for producing the transcript represented by cDNA clone pHD2 which begins at position -1066. However, S1 nuclease mapping and primer extension indicate that most transcripts originate from a more proximal, or downstream, promoter lying just 5' to position -219. This promoter would be responsible for clone pTF3. Comparison of the human and hamster genes in this region reveals significant nucleotide homology.

Within this region of both human and hamster genes we find a CAAT box and approximations of TATA boxes. This region also contains several features common to the promoters for housekeeping genes such as the human (Patel et al., 1986) and mouse (Melton et al., 1986) hypoxanthine-guanine phosphoribosyltransferase genes, and the genes for the human epidermal growth factor receptor (Ishii et al., 1985), 3-hydroxy-3-methylglutaryl coenzyme A reductase (Reynolds et al., 1984), dihydrofolate reductase (Crouse et al., 1985), and adenosine deaminase (Valerio et al., 1985). The promoters for these genes are G + C rich; they commonly contain one or more copies of the hexanucleotide sequence GGGCGG or its reverse complement CCGCCC, the Sp1 binding site (Dynan and Tjian, 1983); they often lack CAAT boxes and if TATA boxes are present they frequently do not fit the published consensus sequence (Reynolds et al., 1984; Ishii et al., 1985; Valerio et al., 1985); transcription from these promoters frequently occurs at multiple sites which may span up to 150 bp (Ishii et al., 1985). Housekeeping genes are constitutively transcribed at low levels in all cells and regulation appears to be largely post-transcriptional (Leys et al., 1984; Piechaczyk et al., 1984). The mode of regulation of the β2-adrenergic receptor expression remains to be determined. As we have previously reported (Kobilka et al., 1987; Dixon et al., 1986), both human and hamster transcripts contain AUG codons 5' to the initiator methionine for the receptor. In some genes

FIG. 7. Hamster genomic Southern blots and gene dosage analysis. Gene dosage analysis: the 4 lanes at the left represent the gene dosage analysis. A Southern blot was made of varying amounts of a 5.2-kb EcoRI fragment from the EMBL hamster genomic clone λBAR5. The numbers above the lanes indicate the represented gene dose for a 5.2-kb fragment in 20 μg of genomic DNA. Cellular DNA blot: the middle panel is a Southern blot of total cellular genomic DNA isolated from hamster lung. Each lane represents 20 μg of genomic DNA digested with the indicated enzymes. Genomic clone blot: the panel at the right is a Southern blot of EMBL3 clone λBAR5 digested with the indicated enzymes. Restriction enzymes: B, BamHI; H, HindIII; K, KpnI; L, BglII; P, PstI; R, EcoRI; S, SmaI; V, PvuII. Molecular weights in kilobase pairs are indicated at the right. All blots were hybridized with the labeled 1.3-kb HindIII fragment from hamster genomic clone λBAR5 as indicated under "Experimental Procedures" and washed in 0.2 × SSC at 65 °C.

FIG. 8. Southern blot analysis of human genomic DNA performed at low stringency. Human genomic DNA (prepared from lymphocytes obtained from three different individuals) was digested with the indicated enzymes (R, EcoRI; L, BglII; H, HindIII; V, PvuII), the fragments were resolved on a 1% agarose gel and Southern blot analysis was performed. The blot was hybridized as described under "Experimental Procedures" and washed in 1 × SSC at 65 °C. The autoradiogram was exposed for 72 h. The expected β2-adrenergic receptor-specific genomic fragments are seen as very dark bands. The arrows indicate signals of much lower intensity which are removed by a more stringent wash in 0.2 × SSC at 65 °C.

HUMAN    -1279  ACTGAAGAAATTGTTTGA
HAMSTER  -1168  ATTGAAGAAATTGTTTGA
HUMAN    +1771  AGTAAATAAAATGTTTGA
HAMSTER  +1787  AGTAAATAAATTGTTTGA
COMMON          A T AA AAA TGTTTGA

FIG. 9. Direct repeats bordering both human and hamster genes. The 18-bp direct repeats found at the indicated positions in the human and hamster gene are shown. The 14 bp common to all four sequences are shown at the bottom.
The Gene for
Human the and Hamster &-Adrenergic
such as the yeast GCN4 gene, upstream AUG codons are felt tobe involved inpost-transcriptional regulation of gene expression (Thireos et al., 1984). The second, distal promoter would presumably be responsible for the transcription of clone pHD2. It should be stressed that the significance of this clone and the long ORF remain to be determined. Attempts to detect transcripts with the sequence of the long ORF by Northern blot analysis, S1 nuclease analysis, and primer extensionhave failed. Thus we are unable to define the origin of this rare transcript, and therefore cannot exclude the possibility that this is an aberrant transcript without biological significance. However,analysis of the gene 5 ’ . to this cDNA reveals several potential promoter elements in the human gene and in the analogous region of the hamster gene. These include a TATA box in both the human and hamster gene and a steroidbinding hexamer and CAAT box in the human gene. Transcription from a second, 5’, pol I1 promoter has also been observed in the mouse dihydrofolate reductase gene (Crouse et al., 1985). This clone may also represent a pol I11 transcript, possibly originating in the Alu repeat lying approximately 400 bp 5’ to this cDNA. Low level transcription of the human @-globin gene from a pol I11 promoter has been observed (Carlson and Ross, 1983). These pol I11 transcripts appeared to be polyadenylated and spliced. Perhaps transcription of the &adrenergic receptor from a distal promoter might be induced by certain developmental, metabolic, or hormonal signals. We are currently searching for tissues in which this transcript is produced at higher levels. Acknowledgments-We wish to thank Dr. Theresa Yang Feng and Dr. Uta Francke of Yale Medical School for performing some of the Southern blotting experiments, Mark Leader and Tong Sun Kobilka for excellent technical assistance, Dr. Russ Kaufman for advice and reading the manuscript, and Donna Addison for preparation of the manuscript. 
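The count of 14 shared bases among the four 18-bp direct repeats of Fig. 9, and the statement that CCGCCC is the reverse complement of the Sp1 hexanucleotide GGGCGG, can both be checked with a few lines of code. The following is only an illustrative sketch and not part of the original analysis: the sequences are transcribed from Fig. 9, while the dictionary keys and the function names `shared_positions` and `reverse_complement` are hypothetical labels chosen here for readability.

```python
# Illustrative sketch only: sequences are transcribed from Fig. 9;
# the key names and helper functions are assumptions made for this example.
REPEATS = {
    "human_upstream": "ACTGAAGAAATTGTTTGA",
    "hamster_upstream": "ATTGAAGAAATTGTTTGA",
    "human_downstream": "AGTAAATAAAATGTTTGA",
    "hamster_downstream": "AGTAAATAAATTGTTTGA",
}


def shared_positions(seqs):
    """Keep a base where every sequence agrees; write '-' where they differ."""
    seqs = list(seqs)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must have equal length")
    return "".join(col[0] if len(set(col)) == 1 else "-" for col in zip(*seqs))


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


common = shared_positions(REPEATS.values())
n_shared = sum(base != "-" for base in common)

print(common)                        # shared bases, '-' at mismatches
print(n_shared)                      # number of positions common to all four
print(reverse_complement("GGGCGG"))  # the Sp1 hexanucleotide on the other strand
```

Comparing the repeats column by column, rather than by pairwise alignment, is sufficient here because all four sequences have the same length and are presented in register in Fig. 9.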
REFERENCES
Benton, W. D., and Davis, R. W. (1973) Science 196, 180-182
Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
Carlson, D. P., and Ross, J. (1983) Cell 34, 857-864
Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
Crouse, G. F., Leys, E. J., McEwan, R. N., Frayne, E. G., and Kellems, R. E. (1985) Mol. Cell. Biol. 5, 1847-1858
Davies, A. O., and Lefkowitz, R. J. (1984) Annu. Rev. Physiol. 46, 119-130
Dixon, R. A. F., Kobilka, B. K., Strader, D. J., Benovic, J. L., Dohlman, H. G., Frielle, T., Bolanowski, M. A., Bennett, C. D., Rands, E., Diehl, R. E., Mumford, R. A., Slater, E. E., Sigal, I. S., Caron, M. G., Lefkowitz, R. J., and Strader, C. D. (1986) Nature 321, 75-79
Dynan, W. S., and Tjian, R. (1983) Cell 35, 79-87
Feinberg, A. P., and Vogelstein, B. (1983) Anal. Biochem. 132, 6-13
Gilbert, W., Marchionni, M., and McKnight, G. (1986) Cell 46, 151-154
Hamer, D. H., and Leder, P. (1979) Cell 18, 1299-1302
Ishii, S., Xu, Y.-H., Stratton, R. H., Roe, B. A., Merlino, G. T., and Pastan, I. (1985) Proc. Natl. Acad. Sci. U. S. A. 82, 4920-4924
Kobilka, B. K., Dixon, R. A. F., Frielle, T., Dohlman, H. G., Bolanowski, M. A., Sigal, I. S., Yang-Feng, T. L., Francke, U., Caron, M. G., and Lefkowitz, R. J. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 46-50
Kubo, T., Fukuda, K., Mikami, A., Maeda, A., Takahashi, H., Mishina, M., Haga, T., Haga, K., Ichiyama, A., Kangawa, K., Kojima, M., Matsuo, H., Hirose, T., and Numa, S. (1986) Nature 323, 411-416
Lathe, R. (1984) J. Mol. Biol. 183, 1-12
Lawn, R. M., Adelman, J., Franke, A. E., Houck, C. H., Gross, M., Najarian, R., and Goeddel, D. V. (1981) Nucleic Acids Res. 9, 1045-1052
Leys, E. J., Crouse, G. F., and Kellems, R. E. (1984) J. Cell Biol. 99, 180-187
Liscum, L., Finer-Moore, J., Stroud, R. M., Luskey, K. L., Brown, M. S., and Goldstein, J. L. (1985) J. Biol. Chem. 260, 522-530
Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Melton, D. W., McEwan, C., McKie, A. B., and Reid, A. M. (1986) Cell 44, 319-328
Nakayama, N., Miyajima, A., and Arai, K. (1985) EMBO J. 4, 2643-2648
Nagata, S., Mantei, N., and Weissmann, C. (1980) Nature 287, 401-408
Nathans, J., and Hogness, D. S. (1983) Cell 34, 807-814
Nathans, J., and Hogness, D. S. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 4851-4855
Nathans, J., Thomas, D., and Hogness, D. S. (1986) Science 232, 193-202
Ninomiya, Y., Gordon, M., van der Rest, M., Schmid, T., Linsenmayer, T., and Olsen, B. R. (1986) J. Biol. Chem. 261, 5041-5050
Orcutt, B. C., Dayhoff, M. O., George, D. G., and Baker, W. C. (1984) PIR Report ALI-1284, National Biomedical Research Foundation, Washington, D. C.
Patel, P. I., Framson, P. E., Caskey, C. T., and Chinault, A. C. (1986) Mol. Cell. Biol. 6, 393-403
Piechaczyk, M., Blanchard, J. M., Marty, L., Dani, C., Panabieres, F., El Sabouty, S., Fort, P., and Jeanteur, P. (1984) Nucleic Acids Res. 12, 6951-6963
Reynolds, G. A., Basu, S. K., Osborne, T. F., Chin, D. J., Gil, G., Brown, M. S., Goldstein, J. L., and Luskey, K. L. (1984) Cell 38, 275-285
Russell, D. W., Schneider, W. J., Yamamoto, T., Luskey, K. L., Brown, M. S., and Goldstein, J. L. (1984) Cell 37, 577-585
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Schaffner, W., Kunz, G., Daetwyler, H., Telford, J., Smith, H. O., and Birnstiel, M. L. (1978) Cell 14, 655-671
Stein, G. S. (1984) Histone Genes Structure, Organization, and Regulation, John Wiley and Sons, New York
Stiles, G. L., Strasser, R. H., Lavin, T. N., Jones, L. R., Caron, M. G., and Lefkowitz, R. J. (1983) J. Biol. Chem. 258, 8443-8449
Stiles, G. L., Caron, M. G., and Lefkowitz, R. J. (1984) Physiol. Rev. 64, 661-743
Thireos, G., Penn, M. D., and Greer, H. (1984) Proc. Natl. Acad. Sci. U. S. A. 81, 5096-5100
Valerio, D., Duyvesteyn, M. G. C., Dekker, B. M. M., Weeda, G., Berkvens, Th. M., van der Voorn, L., van Ormondt, H., and van der Eb, A. J. (1985) EMBO J. 4, 437-442
Vanin, E. F. (1985) Annu. Rev. Genet. 19, 253-272
Volckaert, G., Feunteun, J., Crawford, L. V., Berg, P., and Fiers, W. J. (1979) J. Virol. 30, 674-682
Yarden, Y., Escobedo, J. A., Kuang, W.-J., Yang-Feng, T. L., Daniel, T. O., Tremble, P. M., Chen, E. Y., Ando, M. E., Harkins, R. N., Francke, U., Fried, V. A., Ullrich, A., and Williams, L. T. (1986a) Nature 323, 226-232
Yarden, Y., Rodriguez, H., Wong, S. K.-F., Brandt, D. R., May, D. C., Burnier, J., Harkins, R. N., Chen, E. Y., Ramachandran, J., Ullrich, A., and Ross, E. M. (1986b) Proc. Natl. Acad. Sci. U. S. A. 83, 6795-6799
Young, D., Waitches, G., Birchmeier, C., Fasano, O., and Wigler, M. (1986) Cell 45, 711-719
Zuker, C. S., Cowman, A. F., and Rubin, G. M. (1985) Cell 40, 851-858