THE JOURNAL OF BIOLOGICAL CHEMISTRY
Vol. 256, No. 8, Issue of April 25, pp. 4007-4016, 1981
Printed in U.S.A.

Human Prolactin
cDNA STRUCTURAL ANALYSIS AND EVOLUTIONARY COMPARISONS*

(Received for publication, September 8, 1980, and in revised form, December 1, 1980)

Nancy E. Cooke‡, Doris Coit, John Shine§, John D. Baxter¶, and Joseph A. Martial‖

From the Howard Hughes Medical Institute Laboratories, Departments of Medicine, Biochemistry, and Biophysics and the Metabolic Research Unit, University of California, San Francisco, California 94143
Prolactin (Prl), growth hormone, and chorionic somatomammotropin form a set (the "Prl set") of hormones which is thought to have evolved from a common ancestral gene. This assumption is based on several lines of evidence: overlap in their biological and immunological properties, similarities in their amino acid sequences, and homologies in the nucleic acid sequences of their structural genes. In the current study we report the cloning, amplification in bacteria, and sequence analysis of DNA complementary to Prl mRNA isolated from human pituitary Prl-secreting adenomas. The cloned DNA contains 914 bases, which includes the entire coding sequence of human prePrl as well as portions of the 5'- and 3'-untranslated regions of the mRNA. The amino acid sequence predicted by our data differs from a previously reported amino acid sequence in 8 positions. With the results of this study we can now compare in one species the nucleotide sequences of the structural gene coding for each of the hormones of the Prl set. The sequence divergence at replacement sites is used to establish an evolutionary clock for the Prl set of genes. Using this clock, we postulate that the chromosomal segregation of human Prl and human growth hormone occurred about 392 million years ago and that growth hormone and chorionic somatomammotropin underwent an intrachromosomal recombination within the last 10 million years.

Prolactin, growth hormone, and chorionic somatomammotropin (placental lactogen) appear to have evolved from a single ancestral gene by duplication and sequence divergence. The common ancestry of these proteins was indicated initially by similarities in their amino acid sequences (Catt et al., 1967; Li et al., 1969; Sherwood, 1967; Niall et al., 1971) and by overlap in their immunological and biological properties (Niall et al., 1973). Comparisons of the mRNA sequences of rat GH¹ (Seeburg et al., 1977a) and rat Prl (Cooke et al., 1980; Gubbins et al., 1979), human GH (Martial et al., 1979; Roskam and Rougeon, 1979) and human CS (Shine et al., 1977), and bovine GH (Miller et al., 1980a) and bovine Prl (Miller et al., 1980b; Nilson et al., 1980) have given considerable additional support to the hypothesis of common ancestry.

In this paper we report the structure of the hPrl cDNA. This provides the first opportunity to compare the coding gene sequence for Prl, GH, and CS within one species. We have applied an improved method for the calculation of divergence (Perler et al., 1980) between coding sequences of the Prl family to obtain a more accurate description of evolutionary relationships within this family. The sequence data permit us to report the complete signal peptide structure and to revise the published amino acid sequence of hPrl. The revised amino acid structure has been partially substantiated by the amino acid sequence of hPrl from normal pituitaries.² This cloned hPrl cDNA should be useful as a probe to map and isolate the corresponding hPrl genomic sequences, to determine the chromosomal location of the Prl gene (Owerbach et al., 1980b), to synthesize hPrl in bacteria,³ and to study the regulation of Prl gene expression. The latter is of particular interest, since the mechanisms controlling Prl gene expression may differ from those for GH. Whereas prolactin is mainly under negative dopaminergic regulation (Takahara et al., 1974; MacLeod, 1976; Ben-Jonathan et al., 1977), GH may be predominantly under positive control (Krulich et al., 1968; Frohman et al., 1971).

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
§ Present address, Department of Genetics, Research School of Biological Sciences, Australian National University, Canberra, Australia.
¶ An Investigator of the Howard Hughes Medical Institute.
‖ Recipient of National Institutes of Health Grant GM 25549.
¹ The abbreviations used are: Prl, prolactin; GH, growth hormone; CS, chorionic somatomammotropin; h, human; r, rat.
² M. John, H. Friesen, and H. Niall, personal communication.
³ Manuscript in preparation.

EXPERIMENTAL PROCEDURES

mRNA Preparation - Human pituitary Prl-secreting adenomas were collected by transsphenoidal hypophysectomy, frozen immediately, and stored in liquid nitrogen. When six tumors had been collected, RNA was prepared from each according to the method of Chirgwin et al. (1979) with modification as described previously (Cooke et al., 1980). The RNA purification included enrichment by chromatography on oligo(dT)-cellulose (P-L Biochemicals, Type 7).

mRNA Translation and Immunoprecipitation - The polyadenylated RNA was translated in an in vitro rabbit reticulocyte lysate translation system in the presence of [35S]methionine (Amersham, 600-1300 Ci/mmol) as previously described (Pelham and Jackson, 1967). Human prePrl synthesized in the cell-free system was immunoprecipitated with rabbit antiserum to hPrl (a gift of H. Friesen, Department of Physiology, University of Manitoba). Control immunoprecipitation using rabbit antiserum to hGH (Antibodies, Inc., Davis, CA) or normal rabbit serum was also carried out. Immune complexes were precipitated with protein A of the Cowan strain of Staphylococcus aureus (Kessler, 1975; Martial et al., 1977). 35S-labeled proteins were electrophoresed on sodium dodecyl sulfate-10% polyacrylamide slab gels (Laemmli, 1970) at 20 mA/gel, fixed, washed, vacuum dried, and exposed to x-ray film (Kodak, NS-2T) by standard methods (Cooke et al., 1980).

cDNA Synthesis and Construction of Recombinant Plasmids - The mRNA from four of the tumors (about 19 µg) was pooled and used as template for synthesis of cDNA as described previously (Seeburg et al., 1977a). Avian myeloblastosis reverse transcriptase was received from Research Resources, National Cancer Institute,
Bethesda, MD, and S-1 nuclease from Miles Laboratories. The double-stranded cDNA was tailed with dCMP and the Pst I-cleaved pBR322 was tailed with dGMP (Roychoudhury et al., 1976) using deoxynucleotidyltransferase (P-L Biochemicals). The annealing of the tailed cDNA and plasmid was as described (Cooke et al., 1980). Restriction enzymes were obtained from New England Biolabs, Beverly, MA, and Bethesda Research Laboratories, Bethesda, MD.

Transformation and Colony Screening - An aliquot (20%) of the double-stranded cDNA was annealed to the Pst I-cleaved and dGMP-tailed pBR322. Transformation of Escherichia coli χ1776 with the recombinant plasmids was carried out in compliance with NIH guidelines. There were 232 colonies, of which 69 were ampicillin-resistant. All of the colonies were replica-plated onto Whatman 540 paper (Grunstein and Hogness, 1975) and screened with a 32P-labeled cDNA probe (Cooke et al., 1980), made using mRNA from tumor 5 as template. Sixteen prominently hybridizing colonies were selected, plasmids were isolated from them, cleaved with Pst I, and analyzed by agarose gel electrophoresis and Southern hybridization (Southern, 1975) using the same cDNA probe. The colony that contained the largest hybridizing insert and that was selected for nucleotide sequence analysis was both tetracycline- and ampicillin-resistant. After positive identification of the clone by sequence analysis, the hPrl-containing plasmid was isolated and transformed into E. coli RR1.

DNA Sequence Determination - After restriction enzyme mapping of the hPrl cDNA clone, appropriate restriction fragments were selected and labeled at their 5'-ends with T4 polynucleotide kinase (P-L Biochemicals) and [γ-32P]ATP (ICN, 4200-4500 Ci/mmol) after removal of the 5'-phosphate groups with bacterial alkaline phosphatase (Enzo Biochemicals, Inc., New York, NY), or at their 3'-ends with the Klenow fragment of DNA polymerase I (New England Biolabs) and the appropriate [α-32P]dNTP (Amersham, ~2500 Ci/mmol). The chemical cleavage technique of Maxam and Gilbert (1977) was used with one change. The adenine modification reaction contained 25 µl of formic acid (Mallinckrodt) and 10 µl of 32P-labeled DNA restriction fragments. Reaction was for 10 min at 19 °C. The reaction was stopped by addition of 200 µl of 0.3 M sodium acetate, 0.1 mM EDTA, and 25 µg/ml of tRNA. DNA fragments were separated on polyacrylamide thin gels (8 and 15%) containing 7 M urea (Sanger et al., 1977).
FIG. 1. Autoradiographs of 35S-labeled cell-free translation products after electrophoresis on 10% sodium dodecyl sulfate polyacrylamide gels. A, translation products of polyadenylated RNA isolated from six human pituitary prolactinomas (1-6) and the background with no added RNA (7) are shown. The approximate locations of radiolabeled molecular weight markers, expressed in kilodaltons (kd), are indicated by arrows (including 46 and 14.3 kd). B, immunoprecipitates of the translation products using antiserum to human Prl (1), antiserum to hGH (2), normal rabbit serum (3), and total translation products with no added antibody (4) are shown. Experiment B was performed with 6 times less material (cpm) than Experiment A.
RESULTS
Isolation and Identification of hPrl-enriched mRNA - RNA was isolated from six human pituitary Prl-secreting adenomas; 100 to 270 µg of RNA were obtained from each tumor. The total RNA from each tumor was passed through oligo(dT)-cellulose to enrich for polyadenylated species. This yield was not quantitated in order to minimize losses. The relative abundance of hPrl mRNA from each tumor was analyzed by cell-free translation, polyacrylamide gel electrophoresis, and autoradiography of the protein products. A prominent band at about 26,000 daltons was seen in each translation that contained tumor mRNA (Fig. 1A). Other prominent bands synthesized in the translation were endogenous to the reticulocyte lysate system. The band at 26,000 daltons was specifically immunoprecipitated by antibody to hPrl (Fig. 1B). This band was not precipitated by antisera to hGH or by normal rabbit serum, indicating that the predominant protein produced by the polyadenylated RNA from these pituitary tumors was human prePrl.

FIG. 2. Autoradiographs of 32P-labeled single- and double-stranded cDNA to hPrl mRNA after electrophoresis on 5% nondenaturing polyacrylamide gels. A, single-stranded cDNA intact (1) and cleaved with Hae III (2) are shown. The arrow indicates the location of the predominant single-stranded cDNA product. B, double-stranded cDNA was cleaved with Hae III without (1) and with S-1 nuclease digestion (2). Molecular size marker (3) is 32P-labeled pM2 digested with HindIII. From top to bottom the sizes of these marker bands are: 2.2, 1.0, 0.5, 0.45, 0.25, and 0.1 kilobases.

Double-stranded cDNA Synthesis and Cloning - The mRNA from four of the six tumors (1 through 4, Fig. 1A) was pooled and used as template for the synthesis of a single-stranded DNA. Polyacrylamide gel electrophoresis of the DNA indicated that a high molecular weight species had been produced (Fig. 2A). When the single-stranded cDNA was cleaved with Hae III as described previously (Seeburg et al., 1977b), three prominent bands were produced (Fig. 2A). This suggested that the synthesized single-stranded cDNA reflected an enrichment for a unique mRNA species in the tumors, presumably hPrl. The second strand of DNA was synthesized using reverse transcriptase. Following this, the resulting hairpin loop was digested by S-1 nuclease. Fig. 2B shows the Hae III digestion of this double-stranded DNA. Comparison of lanes 1 and 2 (before and after S-1 nuclease digestion) confirms that most of the DNA was double-stranded. The single-stranded background decreased with S-1 digestion, but the major bands were resistant to S-1 digestion and their sizes are consistent with the Hae III restriction fragments of the final cloned cDNA (Fig. 3B).
A strand of dCMPs was added to the 3'-ends of the cDNA, and it was then annealed to pBR322 which had been previously digested with Pst I and tailed with dGMPs. When a portion of this recombinant plasmid was transformed into the E. coli strain χ1776, 232 tetracycline-resistant colonies were produced. Of these, 16 colonies hybridized prominently with a cDNA probe made from the mRNA of tumor 5 (Fig. 1A). Plasmid DNA was isolated from each of these colonies, cleaved with Pst I, and analyzed by gel electrophoresis. The largest insert, which contained 959 base pairs (including synthetic dGMP/dCMP tails), was selected for nucleotide sequence analysis.

Nucleotide Sequence of hPrl cDNA - The sequencing strategy used is shown in Fig. 3A. Both message and anti-message
[Figure: restriction map of the hPrl cDNA insert, panels A and B, drawn against a scale of 0 to 900 base pairs. Sites labeled on the map include Pst I, Ava II, EcoR II, Taq I, EcoR I, Pvu II, and Xba I.]

FIG. 3. Sequencing strategy and restriction map of hPrl cDNA. A, restriction fragments were labeled at the site of the solid circles and DNA sequencing proceeded in the direction of the arrowheads for the distance indicated by the length of the arrow. The 5'-end of the cDNA was sequenced in the 5' to 3' direction after labeling at the Alu I site in pBR322 (indicated by dashed line). The dotted bar indicates the coding region of the secreted hormone, the white bar indicates the signal peptide, and the solid lines indicate the untranslated regions of the corresponding Prl mRNA. B, the restriction sites predicted by the DNA sequence were checked by restriction enzyme digestion followed by sizing on polyacrylamide gels. In some instances restriction sites indicated in A were not copied onto B to minimize crowding, although the digestions were performed. Both A and B are drawn to scale as indicated by the calibration in base pairs at the bottom of the figure.
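The check described for Fig. 3B, comparing restriction sites predicted from the sequence with observed fragment sizes, can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function name `digest_linear`, the toy sequence, and the use of the EcoR I site (GAATTC, cut after the first G) are assumptions made for the example, and the fragment is treated as linear.

```python
def digest_linear(seq: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`. `cut_offset` is the number of bases into the
    recognition site at which the top strand is cut (hypothetical helper)."""
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)  # allow overlapping sites
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy 21-base sequence with two EcoR I sites (not taken from the paper).
toy = "AAAGAATTCTTTTGAATTCCC"
fragments = digest_linear(toy, "GAATTC", cut_offset=1)
print(fragments)
```

Predicted fragment lengths from such a routine would then be compared with the band sizes read from the gel against a size marker.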
FIG. 4. Autoradiographs of 32P-labeled DNA fragments generated by Maxam-Gilbert cleavage reactions. A, two representative 15% polyacrylamide sequencing gels are shown in which the DNA was 3'-labeled at the Ava II site that occurs between amino acids -19 and -18 of the signal peptide. The bottom band of the left sequence begins with the third nucleotide of amino acid -19 and proceeds in the 5'-direction on the coding strand, while the right sequence begins at the second nucleotide of amino acid -18 and proceeds in the 3'-direction on the noncoding strand. B, a segment of an 8% polyacrylamide sequencing gel generated from DNA labeled at the Pvu II site shows the first base of the codon for amino acid 79 and continues through amino acid 95.
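Reading a chemical-cleavage sequencing gel such as those in Fig. 4 amounts to noting, for each band position from the bottom of the gel upward, which reaction lanes contain a band. The sketch below assumes four lanes named G, A+G, C+T, and C; that lane set is an assumption for illustration (the formic acid reaction described under "Experimental Procedures" is taken here to cleave at both purines), and the function names are hypothetical.

```python
def call_base(lanes: set[str]) -> str:
    """Call one base from the set of lanes showing a band at one position.
    Assumed lane names: "G", "A+G", "C+T", "C"."""
    if lanes == {"G", "A+G"}:
        return "G"   # band in both purine lanes
    if lanes == {"A+G"}:
        return "A"   # purine lane only
    if lanes == {"C", "C+T"}:
        return "C"   # band in both pyrimidine lanes
    if lanes == {"C+T"}:
        return "T"   # pyrimidine lane only
    return "N"       # ambiguous or unreadable position


def read_ladder(bands: list[set[str]]) -> str:
    """Read a ladder given bands ordered from shortest fragment upward."""
    return "".join(call_base(b) for b in bands)


# Hypothetical band pattern, not read from Fig. 4.
print(read_ladder([{"G", "A+G"}, {"A+G"}, {"C+T"}, {"C", "C+T"}]))
```

Ambiguous positions are returned as N, which is where sequencing both strands, as done here, helps resolve the call.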
FIG. 5. Restriction digests of hPrl cDNA displayed on a 5% polyacrylamide gel stained with ethidium bromide. A, the plasmid containing the hPrl cDNA cleaved with Pst I (1) is displayed next to the molecular size marker pBR322 cleaved with Hpa II (2). From top to bottom the marker fragments are: 622, 527, 403, 309, 242, 238, 217, 201, 190, 180, 160, 147, 122, and 110 base pairs. B, the plasmid containing hPrl cDNA was cleaved with Pst I and the two hPrl cDNA fragments were isolated from a 5% polyacrylamide preparative gel. These two fragments were pooled and electrophoresed on a 5% polyacrylamide analytic gel without (1) and with further cleavage by Bgl II (2), Bst EII (3), Hha I (4), HindIII (5), Taq I (6), or Bgl I (7). Hpa II-cleaved pBR322 was used as size marker (8).

strands were completely sequenced. Representative sequencing gels are shown in Fig. 4. The presence of restriction endonuclease sites predicted from the nucleotide sequence was confirmed by restriction mapping with the indicated enzymes (Fig. 3B), and no missing or additional restriction sites were detected with the enzymes used. A characteristic restriction map of the cDNA insert is shown in Fig. 5.

The hPrl mRNA sequence and the derived amino acid sequence are shown in Fig. 6. If this assignment of amino acids is correct and if translation begins at the Met codon at position -28, then the primary translation product of hPrl mRNA is a protein of 25,880.03 daltons. The sequence of 227 amino acids includes the 28-amino acid signal peptide of hPrl and an additional amino acid, Ser, at position 86, which was not reported previously (Shome and Parlow, 1977). The sequence includes all of the 3'-untranslated region, the AAUAAA seen in most eukaryotic mRNAs, and a part of the poly(dAMP) tail. Our DNA sequence, however, is lacking most of the 5'-untranslated region, possibly due to incomplete synthesis of the first strand or to digestion of the cDNA corresponding to this region with S-1 nuclease following preparation of the second strand. By comparison, the 5'-untranslated region in the analogous rPrl mRNA is at least 51 base pairs in length (Cooke et al., 1980). The codon selection for hPrl is nonrandom, as it is in rPrl. In contrast to rPrl, however, a 63% preference is shown for dGMP or dCMP in the third position of the codons.

The orientation of hPrl in the plasmid was determined by cleavage of the plasmid with HindIII. There is one HindIII site in the insert and one in the plasmid. The fragments predicted for an orientation of 3' to 5' (3801 and 1529 base pairs) were observed (data not shown). This places the hPrl cDNA within and in the same transcriptional orientation as the β-lactamase gene.

Comparisons of Prl, GH, and CS - The sequences of the related human polypeptide hormones Prl, GH, and CS were aligned to emphasize their homologies. As shown in Fig. 7, arbitrary gaps were introduced into the sequences to maximize their nucleotide and amino acid identity and also the number of conservative amino acid replacements (Dayhoff, 1978). Since CS and GH are highly homologous, only those positions at which CS differs from GH are indicated. No attempt was made to introduce gaps in the untranslated areas. Fig. 8 shows a similar alignment of hPrl and rPrl in which identical amino acids have been shaded to emphasize homologous areas. Since rat prePrl is only 225 amino acids in length, two amino acids shorter than human prePrl, a gap was introduced after amino
[Figure: mRNA nucleotide sequence with the predicted amino acid above each codon. The predicted amino acid sequence reads as follows; the signal peptide is numbered -28 to -1 and the secreted hormone 1 to 199.]

 -28  Met Asn Ile Lys Gly Ser Pro Trp Lys Gly
 -18  Ser Leu Leu Leu Leu Leu Val Ser Asn Leu
  -8  Leu Leu Cys Gln Ser Val Ala Pro
   1  Leu Pro Ile Cys Pro Gly Gly Ala Ala Arg
  11  Cys Gln Val Thr Leu Arg Asp Leu Phe Asp
  21  Arg Ala Val Val Leu Ser His Tyr Ile His
  31  Asn Leu Ser Ser Glu Met Phe Ser Glu Phe
  41  Asp Lys Arg Tyr Thr His Gly Arg Gly Phe
  51  Ile Thr Lys Ala Ile Asn Ser Cys His Thr
  61  Ser Ser Leu Ala Thr Pro Glu Asp Lys Glu
  71  Gln Ala Gln Gln Met Asn Gln Lys Asp Phe
  81  Leu Ser Leu Ile Val Ser Ile Leu Arg Ser
  91  Trp Asn Glu Pro Leu Tyr His Leu Val Thr
 101  Glu Val Arg Gly Met Gln Glu Ala Pro Glu
 111  Ala Ile Leu Ser Lys Ala Val Glu Ile Glu
 121  Glu Gln Thr Lys Arg Leu Leu Glu Gly Met
 131  Glu Leu Ile Val Ser Gln Val His Pro Glu
 141  Thr Lys Glu Asn Glu Ile Tyr Pro Val Trp
 151  Ser Gly Leu Pro Ser Leu Gln Met Ala Asp
 161  Glu Glu Ser Arg Leu Ser Ala Tyr Tyr Asn
 171  Leu Leu His Cys Leu Arg Arg Asp Ser His
 181  Lys Ile Asp Asn Tyr Leu Lys Leu Leu Lys
 191  Cys Arg Ile Ile His Asn Asn Asn Cys OC

FIG. 6. Nucleotide sequence and predicted amino acid sequence of the mRNA coding for hPrl. The amino acid sequence is deduced from the mRNA sequence using the genetic code. OC, the ochre terminator.
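The molecular weight quoted in "Results" for the primary translation product (about 25,880 daltons for 227 residues) can be checked from the predicted amino acid sequence of Fig. 6. The following is a minimal sketch: the one-letter strings are a transcription of the Fig. 6 amino acid sequence, and the residue masses are standard average values supplied for the example, not figures from the paper.

```python
# Average residue masses in daltons (standard values; an assumption here).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N and C termini

# Signal peptide (-28 to -1) and secreted hormone (1 to 199) from Fig. 6.
SIGNAL = "MNIKGSPWKGSLLLLLVSNLLLCQSVAP"
MATURE = (
    "LPICPGGAAR" "CQVTLRDLFD" "RAVVLSHYIH" "NLSSEMFSEF" "DKRYTHGRGF"
    "ITKAINSCHT" "SSLATPEDKE" "QAQQMNQKDF" "LSLIVSILRS" "WNEPLYHLVT"
    "EVRGMQEAPE" "AILSKAVEIE" "EQTKRLLEGM" "ELIVSQVHPE" "TKENEIYPVW"
    "SGLPSLQMAD" "EESRLSAYYN" "LLHCLRRDSH" "KIDNYLKLLK" "CRIIHNNNC"
)


def average_mass(protein: str) -> float:
    """Average molecular mass of an unmodified polypeptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER


pre_prl = SIGNAL + MATURE
print(len(SIGNAL), len(MATURE), len(pre_prl))
print(round(average_mass(pre_prl)))
```

With these residue masses the value for the 227-residue precursor falls within a few daltons of the figure given in the text; small differences would be expected from the choice of atomic weights.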
[Figure: aligned mRNA and predicted amino acid sequences of human prolactin, human growth hormone, and human chorionic somatomammotropin.]

FIG. 7. Comparison of the nucleotide sequences and predicted amino acid sequences of the mRNAs coding for human Prl, GH, and CS. The mRNA sequence for hPrl is taken from Fig. 6, for hGH from Martial et al. (1979), and for hCS from Shine et al. (1977) and Goodman et al. (1980). The amino acid sequences were deduced from the mRNA sequences using the genetic code. The hCS sequence is indicated only where it differs from the hGH sequence. The ochre terminator is indicated by OC, and the amber terminator by AM. The gaps were arbitrarily introduced to maximize homology.

acid 8 of the rat sequence. Analysis of the homologies seen in these alignments is presented under "Discussion."

DISCUSSION

We have presented the nucleotide sequence of a 914-base pair cloned DNA fragment containing the hPrl structural gene sequence. The nucleotide sequence of this cDNA enabled
us to predict the amino acid sequence of the human preprolactin signal peptide and to revise the known sequence of the secreted protein. Comparison of the hPrl sequence to the other related human hormones, GH and CS, has permitted us to make some evolutionary speculations concerning the protein-coding areas of these three genes. RNA Isolation, cDNA Cloning, and Nucleotide Sequence
FIG. 8. Comparison of the nucleotide sequences and predicted amino acid sequences of the mRNAs coding for human and rat Prl. The mRNA sequences are taken from Fig. 6 and Cooke et al. (1980). Shaded rectangles indicate identical amino acids. A single gap was introduced in the rat sequence to maximize alignment.
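The identity counts drawn from alignments such as Figs. 7 and 8 exclude positions that fall opposite a gap. A minimal sketch of that bookkeeping is given below; the function name and the two short aligned strings are hypothetical and are not taken from the rat or human sequences.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> tuple[float, int, int]:
    """Compare two aligned sequences of equal length, position by position.
    Positions where either sequence has a gap are excluded.
    Returns (percent identity, identical positions, compared positions)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs), matches, len(pairs)


# Toy alignment with one gap (illustrative only).
pct, same, compared = percent_identity("MNIK-GSP", "MNSKAGTP")
print(f"{same}/{compared} identical ({pct:.1f}%)")
```

The same routine applies to nucleotide alignments, or to lists of codons if whole codons are to be compared.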
Analysis - Polyadenylated RNA was isolated from hPrl-secreting pituitary adenomas. The cell-free translation products of this mRNA were highly enriched in prePrl. This enrichment of hPrl mRNA in pituitary tumors was remarkable in that it appeared to exceed that which we were able to produce previously in rats by estrogen implantation and hypothalamic ablation (Cooke et al., 1980). This assumes, of course, that both Prl mRNAs were translated with equal efficiency in the cell-free system. In the translation of mRNA from human pituitary tumors, no hGH was seen either in the total translation or by immunoprecipitation using hGH antiserum. Although it is possible that the presence of some preGH is obscured by the width of the prePrl signal or by the limited exposure of the gel to x-ray film, these data may indicate that the tumor is not synthesizing GH. In earlier work the mRNA from a Prl-secreting tumor was translated and did contain a small amount of detectable GH mRNA, although it was not determined whether this GH mRNA originated in the tumor tissue or from adjacent normal pituitary tissue (Martial et al., 1979).

The mRNA from these human tumors, naturally enriched in Prl mRNA, yielded a cDNA highly enriched in Prl sequences. Initial attempts to screen the bacterial colonies containing recombinant plasmids with a probe made from cloned rPrl cDNA (Cooke et al., 1980) were unsuccessful. Even under conditions of low stringency of hybridization, rPrl cDNA did not cross-hybridize with hPrl sequences. In retrospect, this result was unexpected, since the nucleotide sequence homology of rat and human Prl is 73%.

The accuracy of the sequence was documented by analyzing both strands of the hPrl cDNA completely and by verifying the presence of most of the predicted restriction sites. Errors in reverse transcriptase synthesis of the first DNA strand (Gopinathan et al., 1979; Fagan et al., 1980) cannot be excluded.

The clone containing hPrl cDNA that was ultimately studied was both tetracycline- and ampicillin-resistant (Villa-Komaroff et al., 1978). Resistance to both antibiotics was retained after retransformation of the hPrl-containing plasmid into the RR1 strain of E. coli. Since only one plasmid species was visible on gel electrophoresis, these results suggest that the ampicillin resistance was not due to co-transformation with a wild type plasmid carrying an intact β-lactamase gene. Restriction endonuclease mapping showed that the cDNA was inserted in the transcriptional direction of β-lactamase. Attempts to sequence from the Alu I site just 5' to the Pst I site of the β-lactamase gene into the inserted cDNA suggested that the hPrl cDNA was not in the β-lactamase reading frame. However, determination of the exact length of the long 5'-dGMP tail was difficult due to base stacking and compressions on the gels. Immunoprecipitation with hPrl antiserum of a 45,000-dalton protein from the extracts of E. coli infected with the hPrl recombinant plasmid suggests that a hybrid of β-lactamase and hPrl is produced by this clone.³ Thus, the retention of ampicillin resistance by the hPrl recombinant plasmid might be explained if the fusion protein retained β-lactamase activity.

The signal peptide sequence of hPrl has not been previously reported. Our sequencing data (Fig. 4A) have shown that it is 28 amino acids in length. It is identical with the signal peptide of rPrl in 16 positions (Fig. 8), including a hydrophobic block of 4 Leu residues. The secreted protein contains 199 amino acids, one more than previously reported by amino acid sequencing techniques (Shome and Parlow, 1977). In addition, these sequences disagree at eight positions. At positions 82, 83, 85, 86, 105, 144, 162, and 163, we report Ser, Leu, Val, Ser (the additional amino acid) (Fig. 4B), Met, Asn, Glu, and Ser, while Shome and Parlow (1977) report Val, Ser, Leu, no amino acid, Asx, Asp, Ser, and Glu, respectively. Partial revision of the amino acid sequence of hPrl is in complete agreement with our results at positions 105, 144, 162, and 163; however, the chymotryptic peptide at 81-89 was not isolated or resequenced, so the discrepancies at positions 82-86 were not resolved.² The reported nucleotide sequence represents a single hPrl molecule produced by a Prl-secreting pituitary adenoma, whereas the amino acid sequence represents the predominant Prl sequence in pools of normal human pituitaries. Consequently, the degree of identity between these sequences is striking and might suggest that both adenomatous and normal pituitaries can produce identical or nearly identical Prl molecules. We cannot determine, however, if the molecule that we sequenced was produced by the adenoma or by the surrounding normal pituitary tissue, although adenomatous tissue predominated in the samples used.

The codons used in hPrl mRNA are nonrandom (data not shown) as found in all eukaryotic mRNAs sequenced to date. The codon choices of hGH (Martial et al., 1979) and hCS (Shine et al., 1977) mRNAs are more similar to each other than either of them is to hPrl mRNA. It seems unlikely that differences in codon choice reflect evolutionary pressures to adapt to different tRNA populations. First, almost all of the codons are used in all of the genes of the Prl set. Second, GH and Prl genes are expressed in the same tissue in vivo and can be expressed by the same cell (the rat GH3 cell line; Dannies and Tashjian, 1973). The differences in codon choice between GH and Prl contrast with the similarities in the codon choices of GH and CS, which are expressed by different tissues. An analogous situation exists in the globin gene family. All of the globins have a nonrandom and unique distribution of codon choices, yet are expressed either sequentially (γ, β) or simultaneously (α, β) in the same tissue. Consequently, no relationship between expression and tRNA availability has been shown in this family (Wilson et al., 1980).

In the Prl set of genes there is a spectrum of preference for guanosine plus cytosine in the third position of the codons: bovine GH, 82% (Miller et al., 1980a); hCS, 80%; hGH, 76%; rGH, 74%; hPrl, 63%; and rPrl, 50% (Cooke et al., 1980). Codons that end with dUMP or dCMP and codons that end with dAMP or dGMP code for the same amino acid (Met and Trp are exceptions). Consequently, a bias in the use of dGMP and dCMP in the third position of codons cannot be explained on the basis of evolutionary constraints against amino acid substitution.
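The third-position preference quoted above (for example 63% for hPrl and 50% for rPrl) is simply the fraction of codons ending in G or C. A minimal sketch of the calculation follows; the function name and the short coding sequence are hypothetical, and a trailing termination codon is excluded here as an assumption about how the figure was tallied.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc3_fraction(cds: str) -> float:
    """Fraction of codons whose third base is G or C.
    Accepts RNA or DNA letters; a trailing stop codon is ignored."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    if not codons:
        raise ValueError("no sense codons to count")
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


# Toy reading frame: AUG GCC AAA UAA (not from the hPrl sequence).
print(f"{100 * gc3_fraction('AUGGCCAAAUAA'):.0f}% G+C in third position")
```

Applied to a full coding sequence, the same count gives the percentage compared across the genes of the Prl set.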
This suggests that synonymous codons might be expected to be neutral in terms of evolution, and as discussed, this appears to be the case.

Molecular Evolution of the Prolactin Set of mRNAs - A direct comparison of the sequences of the Prl family of genes (Table I), aligned as shown in Figs. 7 and 8, shows a greater nucleotide than amino acid sequence homology in each case. In addition, very similar amino acid and nucleotide homologies were obtained in the interspecies comparisons of Prl and GH (i.e. hPrl with rPrl, and hGH with rGH). The same parallelism in homologies was observed in the intraspecies comparisons of Prl and GH (i.e. rPrl with rGH, and hPrl with hGH). There is a very high degree of homology between hGH and hCS, as has been noted previously (Martial et al., 1979).

There can be two types of nucleotide differences between genes when they are compared codon by codon. The first is a difference that results in an amino acid replacement (replacement substitutions); the second results in a synonymous codon (silent substitutions). The percentage of silent substitutions in each comparison in Table I (two identical nucleotides/codon) is greater than the 25% that would be expected when comparing random sequences (Jukes and King, 1979). This may reflect the evolution of these genes from a common ancestor. Surprisingly, the hGH and hCS cDNAs, which have the highest nucleotide sequence homology of the set (92%), also have the lowest ratio of silent to expressed single base substitutions, 0.45 (31%/69%, Table I). In contrast, the hPrl and rPrl genes with only 73% nucleotide homology have a silent to expressed ratio of 1.2 (54%/46%, Table I). These results are consistent with the hypothesis that positive selective influences have caused rapid fixation of replacement substitutions in the genes of these recently diverged hormones (hGH and hCS).

To improve the estimates of divergence among the members of the Prl family of genes, the methods of Perler et al. (1980)
and Efstratiadis et al. (1980) were used. This technique assumes that nucleotide substitutions occur in a Poisson distribution and that transitions (pyrimidine to pyrimidine or purine to purine substitutions) and transversions (purine to pyrimidine or pyrimidine to purine) are equally probable. It attempts to determine the mutation rate more accurately by considering all of the possible ways that codon differences could have arisen and assigns a probability to each of these. The possibility of multiple events at a single site enables the divergence to be greater than 100%. The results of this correction on both silent and replacement sites are displayed in Table II.

TABLE I
Amino acid and nucleotide sequence comparison within the Prl family
The results are derived from the sequence alignments in Figs. 7 and 8. Codons corresponding to gaps were excluded from the calculations, as were the termination codons. Only coding sequences have been compared.
Column headings: Comparison; Amino acids, identical (No., %); Amino acids, conservative differences (No., %); Nucleotides, identical bases (No., %); Identical codons (No., %); Codons differing by one base, silent (No., %); Codons differing by one base, expressed (No., %).
Conservative amino acid substitutions were calculated according to Dayhoff (1978) and include only the most common two categories of substitutions.
Cooke et al. (1980). Martial et al. (1979). Seeburg et al. (1977a). Shine et al. (1977). Goodman et al. (1980).

TABLE II
Percentage of corrected divergence among mRNAs of the prolactin family
The percentage of corrected divergence for each sequence pair was calculated as described under "Discussion." Codons corresponding to gaps were excluded from the calculations.

Comparison            Replacement sites   Silent sites
Human Prl/rat Prl     20.8                76.6
Human GH/rat GH       19.4ᵃ               71.0ᵃ
Human Prl/human GH    87.0                194.0
Human Prl/human CS    88.7                240.0
Human GH/human CS     2.3                 11.5

ᵃ Results are in agreement with data from Perler et al., 1980.

To establish an evolutionary clock for the Prl family of genes, the corrected divergence at replacement sites for each pair of genes has been plotted against divergence time estimated from the fossil record (Fig. 9). Silent site divergences were similarly calculated and plotted. The divergence between human and rat hormones was assumed to occur during the mammalian radiation about 85 to 100 million years ago (Romero-Herrera et al., 1973; McKenna, 1969). The divergence between Prl- and GH-like genes (including CS) was assumed to occur about 400 million years ago, when fish and tetrapods diverged, because the existence of distinct Prl-like and GH-like hormones has been dated to this time (Acher, 1976). The straight line (R) drawn through the origin and the replacement sites serves as the molecular clock for the Prl gene family. Its slope gives the rate of fixation of replacement substitution in this family. The reciprocal of this slope corresponds to a unit evolutionary period (UEP), i.e. the time in millions of years for a 1% sequence divergence to arise between homologous sequences (Dickerson, 1971; Wilson et al., 1977) of 4.5 for the Prl family. Substantiation of this UEP will require additional points on the curve such as might be obtained from the examination of Prl-like genes from birds, reptiles, or fish.

Use of this UEP would predict that Prl and GH diverged about 392 million years ago, and Prl and CS about 399 million years ago; this results in a minimal revision of our previous estimate of 380 million years ago (Cooke et al., 1980), which was based on a Prl UEP derived from amino acid comparisons (Wilson et al., 1977).

Applying the evolutionary clock and UEP for the Prl family to the human GH and CS comparison, we arrive at a divergence time of 10 million years ago. This is very different from previous estimates of 56 million years ago (Martial et al., 1979), and would date the divergence long after the mammalian radiation began, and long after the first presumed requirement for specific placental hormones.

This paradox might be explained as follows. A mechanism for correcting one gene against another during evolution has been postulated based on a structural analysis of the human fetal γ-globin genes (Slightom et al., 1980). A homologous but unequal cross-over may occur between two such mutually correcting genes ("concerted evolution," Zimmer et al., 1980), and a calculated divergence time will date this correction event instead of the original gene duplication. Since the human GH and CS genes reside on chromosome 17 (Owerbach et al., 1980a), and since a human GH gene has been shown to be linked to a CS gene on a single DNA fragment isolated from a human DNA library (Goodman et al., 1980), it might be postulated that the 10 million year divergence time dates an exchange of DNA between the GH and CS loci on sister chromatids during meiosis rather than the original gene duplication. Dating of the original gene duplication by comparison of these two sequences would be more difficult and perhaps impossible if this is true.

Since hPrl and hGH are located on separate chromosomes (6 and 17, respectively), they are less likely to exchange DNA in this manner. Therefore, the 392 million year divergence time between these two genes may more accurately reflect their initial duplication than does the 10 million year divergence between hGH and hCS.

The rate of silent site divergence in the Prl family may be described by a biphasic curve (Fig. 9) as has been reported for
preproinsulin (Perler et al., 1980) and globin (Efstratiadis et al., 1980) genes. The actual breakpoint in the Prl curve cannot be determined without more data points. However, it is clear that the silent sites diverge rapidly at first. They may then continue to diverge at a slower rate. The nature of the subset of rapidly diverging sites is not known. It should be borne in mind that the accuracy of both replacement and silent site curves is no better than the fossil record on which they are based.

There are several problems with this type of evolutionary analysis. First, the assumption that transitions and transversions are equally probable is not necessarily valid. For example, in certain human globin variants, adenine-guanine transitions are observed more frequently than would be expected on a random basis (Fitch, 1971). Second, the divergence of two molecules does not necessarily coincide with and may antedate the divergence of species. Consequently, the divergence dates plotted are crude approximations. Third, rapid and slow periods have been detected in the evolution of the globins (Dickerson, 1971) and when rapid evolution is taking place, a protein may be in the process of assuming new functions. Because of such fluctuations, a linear evolutionary clock may not be accurate. The high rate of replacement substitutions compared to silent substitutions in the hGH and hCS divergence may be an example of such an acceleration of the clock. Another example may be seen in the rabbit β-globin alleles (sequenced by Efstratiadis et al., 1977; Hardison et al., 1979; van Ooyen et al., 1979; and discussed by Perler et al., 1980) in which there are four nucleotide differences, each resulting in an amino acid change with no accompanying silent replacements. These examples suggest that positive selection can rapidly fix replacement mutations without being accompanied by random silent or neutral mutations. Finally, in an unequal cross-over between two correcting genes, dating based on sequence homology will also reflect the extent of the correction. The derived divergence data may therefore be a composite of the dates of divergence and of last correction of the various portions of the genes.

The 3'-untranslated area of hPrl, when compared without gaps to the 3'-untranslated area of rPrl, is identical in 43 of the 94 bases compared (54.3% divergence). The area containing the sequence AAUAAA found in most eukaryotic mRNAs and its surrounding sequences is not available in the rPrl sequence for comparison. Homology in the 3'-end of the adult β-globin family of genes (Efstratiadis et al., 1980) has been highest near this signal, so our 3'-end divergence may be an underestimate. One might speculate that some constraint on drift within this area exists for the Prl molecule. However, in general, no such constraints have been found (Efstratiadis et al., 1980). Comparing hGH and rGH directly without gaps, a 61.8% divergence is found. It is clear by inspection that the homology can be increased by creating appropriate gaps. Human Prl and GH have diverged 87.5%, hPrl and hCS 87.2%, while hGH and hCS only 6.4% in the 3'-untranslated area. It is impossible to compare these results to the silent site replacements within the coding portions of the molecule, since the correction for multiple events within each site cannot be performed. However, compared to the nucleotide identity in Table I, it would appear that the 3'-untranslated areas in each case (except in the comparison of hGH and hCS) have diverged more than the coding areas. This is consistent with results found in comparisons of other eukaryotic genes (Heilig et al., 1980; van den Berg et al., 1978; Bell et al., 1980).

FIG. 9. Curves for divergence of coding sequences among the Prl set of genes. The corrected percentage of divergences at silent (S) and replacement (R) sites for comparisons between hPrl and rPrl (O), hPrl and hGH (m), hGH and rGH (O), and hGH and hCS (0) are taken from Table II. These are plotted against divergence times in millions of years (MY) which are minimal estimates derived from the fossil record (see text for discussion and references). The solid lines (S and R) were drawn through the points. Line R was passed through the origin. The reciprocal slope of R gives an estimate of the unit evolutionary period (UEP, the time in millions of years for a 1% sequence divergence to arise; Dickerson, 1971; Wilson et al., 1977) for replacement substitutions within the Prl set of genes. Line S does not extrapolate through the origin. In analogy with the preproinsulin (Perler et al., 1980) and globin (Efstratiadis et al., 1980) gene families, we presume that the silent sites fixed mutations with a higher rate (dashed line) for approximately 85 to 100 millions of years. The arrows indicate the derived time of last correction between hGH and hCS (see text).

REFERENCES
Acher, R. (1976) Ciba Found. Symp. 41, 31-59
Bell, G. I., Pictet, R. L., Rutter, W. J., Cordell, B., Tischer, E., and Goodman, H. M. (1980) Nature 284, 26-32
Ben-Jonathan, N., Oliver, C., Weiner, H. J., Mical, R. S., and Porter, J. C. (1977) Endocrinology 100, 452-458
Catt, K. J., Moffat, B., and Niall, H. D. (1967) Science 157, 321
Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
Cooke, N. E., Coit, D., Weiner, R. I., Baxter, J. D., and Martial, J. A. (1980) J. Biol. Chem. 255, 6502-6510
Dannies, P. S., and Tashjian, A. J., Jr. (1973) in Tissue Culture: Methods and Applications (Kruse, P. F., and Patterson, M. K., Jr., eds) pp. 561-569, Academic Press, New York
Dayhoff, M. O. (1978) Atlas of Protein Sequence, Vol. 5, Suppl. 3, National Biomedical Research Foundation, Washington, D. C.
Dickerson, R. E. (1971) J. Mol. Evolution 1, 26-45
Efstratiadis, A., Kafatos, F. C., and Maniatis, T. (1977) Cell 10, 571-585
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O'Connell, C., Spritz, R. A., DeRiel, J. K., Forget, B., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C., and Proudfoot, N. J. (1980) Cell 21, 653-668
Fagan, J. B., Pastan, I., and de Crombrugghe, B. (1980) Nucleic Acids Res. 8, 3055-3064
Fitch, W. M. (1971) Syst. Zool. 20, 46-50
Frohman, L. A., Maran, J. W., and Dhariwal, A. P. S. (1971) Endocrinology 88, 1483-1488
Gopinathan, K. P., Weymouth, L. A., Kunkel, T. A., and Loeb, L. A. (1979) Nature 278, 857-859
Goodman, H. M., DeNoto, F., Fiddes, J. C., Hallewell, R. A., Page, G. S., Smith, S., and Tischer, E. (1980) in Mobilization and Reassembly of Genetic Information (Scott, W. A., Worner, R., Joseph, D. R., and Schultz, J., eds) pp. 155-179, Academic Press, New York
Grunstein, M., and Hogness, D. S. (1975) Proc. Natl. Acad. Sci. U. S. A. 72, 3961-3965
Gubbins, E. J., Maurer, H. A., Hartley, J. L., and Donelson, J. E. (1979) Nucleic Acids Res. 6, 915-930
Hardison, R. C., Butler, E. T., III, Lacy, E., Maniatis, T., Rosenthal, N., and Efstratiadis, A. (1979) Cell 18, 1285-1290
Heilig, R., Perrin, F., Gannon, F., Mandel, J. L., and Chambon, P. (1980) Cell 20, 625-627
Jukes, T. H., and King, J. L. (1979) Nature 281, 605-606
Kessler, S. W. (1975) J. Immunol. 115, 1617-1624
Krulich, L., Dhariwal, A. P. S., and McCann, S. M. (1968) Endocrinology 83, 783-790
Laemmli, U. K. (1970) Nature 227, 680-685
Li, C. H., Dixon, J. S., Lo, T. B., Pankov, Y. M., and Schmidt, K. D. (1969) Nature 224, 695
MacLeod, R. M. (1976) in Frontiers in Neuroendocrinology (Martini, L., and Ganong, W. F., eds) Vol. IV, pp. 169-194, Raven Press, Pubs., New York
Martial, J. A., Baxter, J. D., Goodman, H. M., and Seeburg, P. H. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 1816-1820
Martial, J. A., Hallewell, R. A., Baxter, J., and Goodman, H. (1979) Science 205, 602-607
Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 560-564
McKenna, M. C. (1969) Ann. N. Y. Acad. Sci. 167, 217-240
Miller, W. L., Martial, J. A., and Baxter, J. D. (1980a) J. Biol. Chem. 255, 7521-7524
Miller, W. L., Thirion, J. P., and Martial, J. A. (1980b) Endocrinology 107, 851-854
Niall, H. D., Hogan, M. L., Sayer, R., Rosenblum, I. Y., and Greenwood, F. C. (1971) Proc. Natl. Acad. Sci. U. S. A. 68, 866-869
Niall, H. D., Hogan, M. L., Tregar, G. W., Segre, G. V., Hwang, P., and Friesen, H. (1973) Recent Prog. Horm. Res. 29, 387-416
Nilson, J. H., Thomason, A. R., Horowitz, S., Sasavage, N. L., Blenis, J., Albers, R., Saker, W., and Rottman, F. M. (1980) Nucleic Acids Res. 8, 1561-1573
Owerbach, D., Rutter, W. J., Martial, J. A., Baxter, J. D., and Shows, T. B. (1980a) Science 209, 289-292
Owerbach, D., Rutter, W. J., Cooke, N. E., Martial, J. A., and Shows, T. B. (1980b) Science, in press
Pelham, H. R. B., and Jackson, R. J. (1976) Eur. J. Biochem. 67, 247-256
Perler, F., Efstratiadis, A., Lomedico, P., Gilbert, W., Kolodner, R., and Dodgson, J. (1980) Cell 20, 555-566
Romero-Herrera, A. E., Lehman, H., Joysey, K. A., and Friday, A. E. (1973) Nature 246, 389-395
Roskam, W. G., and Rougeon, F. (1979) Nucleic Acids Res. 7, 305-320
Roychoudhury, R., Jay, E., and Wu, R. (1976) Nucleic Acids Res. 3, 863-877
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Seeburg, P. H., Shine, J., Martial, J. A., Baxter, J. D., and Goodman, H. M. (1977a) Nature 270, 486-494
Seeburg, P. H., Shine, J., Martial, J. A., Ullrich, A., Baxter, J. D., and Goodman, H. M. (1977b) Cell 12, 157-165
Sherwood, L. M. (1967) Proc. Natl. Acad. Sci. U. S. A. 58, 2307-2314
Shine, J., Seeburg, P. H., Martial, J. A., Baxter, J. D., and Goodman, H. M. (1977) Nature 270, 494-499
Shome, B., and Parlow, A. F. (1977) J. Clin. Endocrinol. Metab. 45, 1112-1115
Slightom, J. L., Blechl, A. E., and Smithies, O. (1980) Cell 21, 627-638
Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
Takahara, J., Arimura, A., and Schally, A. V. (1974) Endocrinology 95, 462-465
van den Berg, J., van Ooyen, A., Mantei, N., Schambock, A., Grosveld, G., Flavell, R. A., and Weissman, C. (1978) Nature 276, 37-44
van Ooyen, A., van den Berg, J., Mantei, N., and Weissman, C. (1979) Science 206, 337-338
Villa-Komaroff, L., Efstratiadis, A., Broome, S., Lomedico, P., Tizard, R., Naber, S. P., Chick, W. L., and Gilbert, W. (1978) Proc. Natl. Acad. Sci. U. S. A. 75, 3727-3731
Wilson, A. C., Carlson, S. S., and White, T. J. (1977) Annu. Rev. Biochem. 46, 573-639
Wilson, J. T., Wilson, L. B., Reddy, V. B., Cavallesco, C., Ghosh, P. K., deRiel, J. K., Forget, B. G., and Weissman, S. M. (1980) J. Biol. Chem. 255, 2807-2815
Zimmer, E. A., Martin, S. L., Beverly, S. M., Kan, Y. W., and Wilson, A. C. (1980) Proc. Natl. Acad. Sci. U. S. A. 77, 2158-2162