THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1989 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 264, No. 21, Issue of July 25, pp. 12472-12481, 1989. Printed in U.S.A.

Chicken Protamine Genes Are Intronless

THE COMPLETE GENOMIC SEQUENCE AND ORGANIZATION OF THE TWO LOCI*

(Received for publication, December 12, 1988)

Rafael Oliva and Gordon H. Dixon‡

From the Department of Medical Biochemistry, Faculty of Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada

A positive cosmid clone obtained from a pWE-15 rooster DNA library using a chicken protamine cDNA probe reveals the complete sequence of the two loci for the rooster protamine genes. The organization of these two loci within the cosmid clone matches that of genomic DNA. The copy number per haploid genome is two. The sequence for the rooster protamine predicted from the coding region shows differences from that previously determined at the protein level (Nakano, M., Tobita, T., and Ando, T. (1976) Int. J. Peptide Protein Res. 8, 565-578). A recent re-determination of the rooster protamine amino acid sequence (28 residues from the N terminus) matches that predicted from the genome rather than the sequence of Nakano et al. (1976). Both loci are intronless and the gene is extremely GC-rich (88% in the coding region). The 5' region of the gene contains a typical TATAAA box, several CG boxes, as well as other characteristic motifs. The 3' region of the gene contains the polyadenylation signal and several GT repeats of known Z-DNA forming potential. A correlation between the functional map of the gene and the tendency of the DNA to bend or to adopt the Z-conformation is presented, and possible roles for these conformations in the transcription of this gene are discussed.

During spermatogenesis, a dramatic change in chromatin structure involving a complete replacement of the nucleosomal core histones by sperm-specific proteins (protamines) takes place in many species (Bloch, 1969; Marushige and Dixon, 1969; Subirana, 1975). Protamines act by compacting the DNA in the nuclei of the spermatozoa. Various proposals for their function have been made, such as streamlining the sperm cell, protection of the genetic message delivered by the spermatozoa, and contributing to the acquisition of an imprinted blank state in the male germinal line (Subirana, 1975; Kasinsky et al., 1987; Risley, 1988). Knowledge of the primary structure of the protamines has implications for understanding how this transition in chromatin structure takes place, the elucidation of the mode of packaging and organization of the DNA in the sperm, and the elucidation of the mechanisms of DNA unpackaging and pronucleus formation after fertilization (Dixon et al., 1975; Mezquita and Teng, 1977a, 1977b; Mezquita, 1985a, 1985b; Poccia, 1986; Oliva and Mezquita, 1986; Oliva et al., 1987; Risley, 1988).

Here we describe an avian (rooster) protamine gene, showing the complete sequence of the coding region as well as the complete 5'- and 3'-flanking regions of the gene, which occurs at two loci in the genome. The rooster genes are likely to be representative of those from other avian species as judged by the presence of protamines of similar size in different avian orders (Chiva et al., 1987, 1988). The distribution of protamines throughout phylogeny is not completely understood at present. Here we show that the redetermined rooster protamine amino acid sequence is more strikingly similar to mammalian protamines than that determined previously (Nakano et al., 1976). The availability of the protamine gene DNA sequence and its alignment with those of other protamine genes has implications for understanding protamine gene evolution as well as for the identification of those consensus DNA sequences required for the developmentally regulated expression of sperm-specific genes (Dixon et al., 1985; Johnson et al., 1988; Krawetz and Dixon, 1988). Since the expression of protamine genes takes place only in the germ cell line, they provide an excellent model system for the study of the mechanisms of gene expression leading to terminal differentiation. The implications of the distribution of consensus cis-acting sequences in the 5' region of the gene, along with a set of sequences with high Z-DNA forming potential and the predicted nucleosome phasing pattern, are discussed in the context of the control of expression of these genes.

* This work was supported by a Medical Research Council of Canada term operating grant (to G. H. D.) and an Alberta Heritage Foundation for Medical Research fellowship (to R. O.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

‡ To whom correspondence should be addressed: Dept. of Medical Biochemistry, Faculty of Medicine, University of Calgary, 3330 Hospital Dr. N. W., Calgary, Alberta T2N 4N1, Canada.

EXPERIMENTAL PROCEDURES¹

RESULTS

Cloning and Genomic Organization of the Rooster Protamine Genes

A double digest by BglII/BclI of rooster genomic DNA showed an average fragment size distribution of 2-6 kb², while the potential rooster protamine gene(s) were localized in DNA fragments larger than 22 kb (Fig. 1A). The potentially enriched DNA containing the rooster protamine gene and having a size larger than 22 kb was gel-isolated and used to construct a pWE-15 cosmid library. The library was screened with a rooster protamine cDNA probe and a positive clone was isolated and characterized (Figs. 1B and 2). The organization of the cosmid clone matches that of genomic DNA since the same pattern of bands is present in the Southern blots in both cases when a battery of different restriction enzymes is used (Fig. 1B). The presence of two bands in each digest was suggestive of two loci for the rooster protamine gene. The existence of these two loci was confirmed by sequencing (Figs. 3 and 4) and mapping (Fig. 2). Some of the mapping gels are shown in Fig. 2, b and c. Restriction sites represented in large type (Fig. 2a) were first mapped. At this point, the size of the two bands in a NotI digest and the fact that the only two bands generated in a double XhoI/NdeI digest were labeled in the Southern blot narrow the placing of each locus to the flanking NotI-XhoI site for locus 1 and to the flanking NotI sites for locus 2. The presence of a single band in a BamHI digest when using a BamHI-PstI 5'-noncoding flanking probe (Fig. 2c) suggests that this part of both genes is contained in the same fragment and situates the two loci about 6 kb apart. The same digest when hybridized with a coding region probe shows two bands of sizes 1.8 and 5 kb, respectively, which indicates that both loci are probably in opposite orientations. (The unique BamHI site 5' of the gene is conserved in both loci (Fig. 4).) Such an arrangement is supported by the absence of XhoI and MluI sites in the sequenced 3' region of the gene. The relative position of loci 1 and 2 is based on the known flanking sites used for subcloning and sequencing (Fig. 3) and is confirmed by the restriction map of the 7-kb PstI subclone of locus 2 in a Bluescript plasmid (result not shown). The results of five independent copy number determinations were as follows: 1.0 ± 0.15, 0.83 ± 0.3, 0.88 ± 0.3, 1.1 ± 0.1, and 1.05 ± 0.05 copies/haploid genome for each locus. The overall average is 0.97 ± 0.10; therefore, we conclude that the copy number/haploid genome for the rooster protamine gene is two, one copy at each locus.

¹ The "Experimental Procedures" are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.

² The abbreviations used are: kb, kilobase(s); bp, base pair(s); CRE, cAMP regulatory element.

FIG. 1. pWE-15 rooster DNA library construction and characterization of the cosmid clone CPC4. A, ethidium bromide stained gel (Et.) and corresponding Southern blot (South.) of a double digest, BglII/BclI, of rooster testes DNA. B, genomic Southern blot and cosmid CPC4 Southern blot using the same battery of restriction enzymes. The band of the genomic DNA BamHI digest is retarded with respect to the corresponding one in the cosmid clone because of the overloading of DNA (10 µg) in the high molecular weight range for this lane. The quantity of DNA in the cosmid clone Southern blot was adjusted to give a band intensity equivalent to that of 10 µg of genomic DNA containing a single copy gene/locus/haploid genome.

FIG. 2. Mapping the two rooster protamine gene loci. A, map of the cosmid clone CPC4. For restriction enzymes represented with small type only those sites flanking the gene are listed. The numbers correspond to those shown on the partial digest. B, partial digests of the large NdeI-BglII fragment end-labeled at the NdeI site. Arrows indicate increasing digestion. The numbers correspond to those shown in the CPC4 map. C, Southern blots of some complete digests of the cosmid clone CPC4 are shown. The filters were probed with a 5'-noncoding probe. M1 is λ-HindIII size marker; M2 is λ DNA + λ BglII size marker.

Characteristics of the Rooster Protamine Gene Derived from the Sequence

5'-Noncoding Region of the Gene: Numerous consensus sequences and characteristic motifs are localized in this region. Immediately upstream of the ATG start codon, several direct and inverted repeats are found. The mRNA start has been determined by nuclease S1 mapping to be 77 nucleotides upstream from the ATG start codon (Fig. 5). This mRNA start is defined as nucleotide +1 (Fig. 4). Upstream from the mRNA start, the Goldberg-Hogness box (TATAAA) is found at position -28. Further upstream (position -51) the two half-sites of a cAMP receptor element split by 10 bp of CG-rich sequence are found (Roesler et al., 1988; Fink et al., 1988) (Figs. 4, 7, and 8). Several CG boxes (Dynan and Tjian, 1983; Gidoni et al., 1984, 1985) are present along the 5' region of the gene, some of which match with that corresponding to the Sp1 binding sequence of Dynan and Tjian (1985) (Fig. 4).

Coding Region: The amino acid sequence for the rooster protamine derived from the coding region of the gene shows significant differences from that previously determined at the protein level by Nakano et al. (1976) (Fig. 6). We have resequenced 28 residues from the N terminus of the protein and the sequence obtained matches that predicted from the genomic DNA sequence rather than the sequence of Nakano et al. (1976). The genomic DNA sequence encoding the C terminus of the protamine also differed from our previously reported cDNA sequence, which contained some ambiguous bands due to the degeneration of sequence that occurs when reading through an extensive poly(A) tail (Oliva et al., 1988). We have resequenced this cDNA clone from the other end to avoid the poly(A) tail and it has become apparent that in the previously reported cDNA sequence a "CGC" codon was missing due to the presence of a compression artifact in the sequencing gel. The cDNA sequence redetermined from the other end is unambiguous with no signs of compressions and shows clearly seven arginine codons in the final C-terminal cluster, in perfect accordance with the reported genomic DNA sequence for both loci.

3'-Noncoding Region: Overlapping the actual termination codon of the rooster protamine, a sequence exists that, if translated, would encode an additional cluster of arginines followed by another tyrosine and a termination codon. This is in accordance with the previously reported cDNA sequence (Oliva et al., 1988). Downstream of the termination codon, a typical polyadenylation signal is found (AATAAA). The mRNA end and the beginning of the poly(A) tail occur at nucleotide +347 as indicated from the mRNA sequence (Fig. 4). From position +424 to the end, the sequence is markedly enriched in alternating purine/pyrimidine repeats. Some of these repeats have the potential to adopt the Z-DNA form (Fig. 8).

DISCUSSION

We have cloned and sequenced the two loci for the genes encoding the rooster (Gallus domesticus) protamine, galline. Both loci have identical coding regions, therefore predicting the same amino acid sequence. This predicted amino acid sequence shows significant differences from the sequence for the rooster protamine previously determined at the protein level by Nakano et al. (1976). The changes are shown in Fig. 6 and include 8 point changes and a deleted arginine cluster, resulting in a total length of 61 amino acids (predicted from the genes) instead of 65 (Nakano et al., 1976). To make sure that the sequence predicted from the genome is expressed, we have purified the rooster protamine, galline, and resequenced 28 residues from the N terminus. The redetermined protamine sequence matches that predicted from the genes rather than the sequence of Nakano et al. (1976) (Fig. 6). In addition, a partial cDNA sequence also agrees with the sequence predicted from the genes (Fig. 6). The copy number/haploid genome is two, one copy at each locus. We conclude that the DNA sequence reported in this paper corresponds clearly to a functional gene. We can only speculate on the basis for the difference between the sequence reported here and that of Nakano et al. (1976), but it is likely that the determination of the sequence of a polypeptide like galline, which is highly arginine-rich and repetitive, by the classical method of overlapping peptide fragments is prone to error. However, the main point is that the variety of rooster described in this paper (White Leghorn) expresses the protamine sequence predicted from the genome. This provides the basis for using this system as a model to study mechanisms that may control the expression of sperm-specific genes. In addition, knowledge of an accurate amino acid sequence is important for understanding the mechanisms of nucleosome disassembly during spermiogenesis and nucleoprotamine formation (Oliva et al., 1987; Nakano et al., 1989) and for interpreting crystallographic data in the study of the three-dimensional structure of nucleoprotamine. Compared to the well characterized structure of the somatic nucleosome (Richmond et al., 1984; Burlingame et al., 1985; Morse and Simpson, 1988), little is known of the structure of nucleoprotamine (Warrant and Kim, 1978; Subirana, 1982; Balhorn, 1982; Balhorn et al., 1984; Gatewood et al., 1987; Tobita et al., 1988).

Regulatory Sequences: Upstream of the mRNA start, at

position -28 a Goldberg-Hogness or TATA box is found showing a perfect match with the consensus TATAAA (Breathnach and Chambon, 1981). This sequence is known to be important for the correct start of transcription of the gene (Mathis and Chambon, 1981; Dynan and Tjian, 1985; Struhl, 1987) partly through the binding of transcription factors TFIIA and TFIID (Nakajima et al., 1988; Hobson et al., 1988). The distance between the TATA box (-28) and the initiation of transcription (+1) matches well with the average distance in higher eukaryotes (-34 to -26); however, the mRNA start site (GAGCGGC, see Figs. 4 and 5) fits only partially with the consensus (PyrAPyrPyrPyrPyrPyr) (Breathnach and Chambon, 1981). According to such a consensus, a better rule-fitting start site (CACTTCC) would be found 7 nucleotides downstream from the experimentally determined start site (+1) (Figs. 4 and 5). The lack of a close fit of the major S1 site to the CAP consensus sequence and the GC-richness of this region, which would inhibit breathing, raise the possibility that the faint second site found 8 bp downstream from the major site and fitting much more closely to the consensus CAP site may represent an additional start site. Upstream of the TATA box, the two half-site elements of a cAMP regulatory element (CRE) are found at position -51 to -68 (Fig. 4). These half-site elements show a perfect match with the consensus (TGACGTCA (Roesler et al., 1988)) except that they are separated by a 10-bp palindromic CG box


FIG. 3. Subcloning and sequencing the two protamine gene loci. The insert in the upper right corner shows PstI and Sau3A digests of the cosmid clone CPC4 and its corresponding Southern blot. The bands indicated with arrows were gel isolated and subcloned. Subclones CPC4S4 and CPC4S14 cover both orientations of the PstI fragment that contains locus 1. Subclones CPC4S9 and CPC4S10 cover both orientations of the Sau3A fragment that contains locus 1. Subclone CPC4S8 contains locus 2. All these subclones were progressively deleted using the ExoIII/mung bean nuclease system and the serially deleted subclones, indicated by arrows, selected and sequenced. The length of each arrow represents the length of sequence read from each subclone. Arrow numbers correspond to the deleted subclone number.

(TGACCCGGCCGGCCCGTCA). The 3' motif also matches the CGTCA consensus described to be essential for biological activity of cAMP-regulated enhancers (Fink et al., 1988) (Fig. 7). Since the two half-sites TGAC and GTCA are separated by 10 bp of CG-rich sequence, they would be on adjacent turns of the DNA and, therefore, on the same side of the B-helix. Since the CRE binding protein is thought to bind as a dimer to the two half-sites, it is likely that the CRE motif is functional although split. Insertions of 10 bp between the two half-sites generated in vitro only slightly modify the activity of the element (Fink et al., 1988). Spermatogenesis is known to be dependent on hormonal control involving cAMP (Heindel et al., 1975; Eikvar et al., 1985; Smith et al., 1987; Davenport and Heindel, 1987; Oyen et al., 1988); therefore, such a regulatory sequence might represent a link between a hormonal signal and the expression of a sperm-specific gene such as protamine. Upstream of the CRE, three perfect matches corresponding to the cis-acting element of Sp1 are found at positions -118, -138, and -197, respectively (Fig. 4) (Dynan and Tjian, 1983, 1985; Gidoni et al., 1984, 1985). Sp1 proteins usually act as transcription activators (Jackson and Tjian, 1988) although silencing functions have also been proposed (Jankowski and Dixon, 1987). Other potential cis-acting sequences include those corresponding to the yeast GCN4 binding protein and MREd, whose vertebrate homolog is the AP1 binding protein (Lee et al., 1987; Arndt and Fink, 1986; Hattori et al., 1988; Seguin and Hamer, 1987) (positions -289 and -405), the corresponding site for the yeast transcription factor RC2 (Pfeifer et al., 1987) (position -302), the core protected in vivo from dimethyl sulfate in

the alcohol dehydrogenase (ADH1) gene (Ferl and Nick, 1987) (positions -170 and -356), an E4TF1 factor binding site (Jones et al., 1988) (position +8), and a postulated testis-specific protamine P1 regulatory sequence (Krawetz and Dixon, 1988) (Fig. 7). The 3' region of the gene shows the typical polyadenylation signal, AATAAA (Proudfoot and Brownlee, 1976; Humphrey and Proudfoot, 1988). Locus 2 contains an extra "A" at this position, AATAAAA (Fig. 4). This is the only change found at the mRNA level between the products of each locus, and should prove a useful unique marker to distinguish any possible differential expression of the two genes. The poly(A) start is located at position +347 and is derived from the cDNA sequence of two independent clones (Oliva et al., 1988).

Correlation between the Functional Map of the Gene, CpG Distribution, Potential for Z-DNA Formation, and Flexure of the DNA: The gene location correlates well with the presence of a CpG island (Fig. 8). This is in contrast with the usual lack of CpG islands in highly tissue-specific genes (Bird, 1986). The very high frequency of CpG dinucleotide sequences associated with this gene is, at least in part, due to the extremely high G+C content of the gene (88% C+G in the coding region). This unusual abundance of the CpG dinucleotide is in marked contrast to its general rarity in vertebrates (one-fifth (1.25%) of the expected frequency (6.25%)) and may also be significant because CpG sequences are the major target for DNA methylation in eukaryotes. The CpG content of this gene from +1 to +350 is 20.5%, 16.4 times higher than the general frequency in bulk DNA. It has been proposed that the rarity of the CpG dinucleotide is likely to be a consequence

FIG. 4. Sequence of the two rooster protamine gene loci. L1, locus 1 sequence. The sequence for locus 2 is shown between the two asterisks (L2), indicating the changes from the locus 1 sequence. The minus sign indicates absence of a nucleotide at this level. The sequence is numbered from -415 to +1111 relative to the mRNA start; annotated features include the AP1, Sp1, and CRE sites, the mRNA start, the translated coding region, and the poly(A) site.

of DNA methylation (60-90% of the CpGs are methylated) since 5mCpGs are unstable and would tend to mutate to TpG by deamination at position 4 (Coulondre et al., 1978; Cooper et al., 1983). The preservation of such a well defined CpG island might, therefore, depend upon its being kept demethylated. Another consequence of the high CpG content of this gene could be an underlying higher spontaneous mutation tendency to generate certain amino acids. For example, the CGC codon for arginine (which accounts for 31 (88.6%) of the 35 arginine codons) could mutate to TGC, the codon for cysteine, several residues of which are present in all the mammalian protamines. Also the CGA arginine codon could generate, through this mechanism, a TGA termination codon, perhaps accounting for the presence of an additional potential coding region beyond the first termination codon (i.e. it is possible that, in earlier avian forms, the protamine coding region was longer by seven codons but that a CGA arginine codon mutated at position 62 to the present day termination codon, TGA) (see Fig. 4 and Oliva et al., 1988).

The flexure of the DNA for the whole gene has been predicted according to the algorithm of Calladine and Drew (1986). This property is known to be responsible in part for the accurate positioning of nucleosomes in vitro (Calladine and Drew, 1986; Satchwell et al., 1986; Drew and Calladine, 1987; Retief et al., 1987; Kelafas et al., 1988). From Fig. 8 it can be predicted that a nucleosomal core would be preferentially assembled at position -400 ± 50 and another one could be positioned at -225 ± 50. The 5' region of the gene including most of the potential cis-acting elements (-150 to -1) would have a low tendency to allow the positioning of a nucleosome. Analogous regions have been described as lacking nucleosomes or being complexed into particular structures (Almer et al., 1986; Gross and Garrard, 1988; Thomas and Elgin, 1988).

Another feature of interest derived from the DNA sequence is the accumulation of potential Z-DNA forming regions in the 3' region of the gene (Fig. 8). The adoption of a Z-DNA conformation in vitro has already been demonstrated for the 5'-noncoding region of an alternating purine/pyrimidine repeat present in the trout protamine genes (Aiken et al., 1985). Also the negative superhelical density that might be generated during transcription at the 5' end of the gene could drive the potential Z-DNA site found between the TATA box and the CRE to the Z-DNA conformation (Hagen et al., 1985; McLean and Wells, 1988; Wells, 1988). The adoption of such a conformation prior to transcription could affect the binding of controlling trans-acting factors to the cis-acting elements and thus modulate transcription (Hirose and Suzuki, 1988); for example, supercoiling has been described as changing the interactions between operator and repressors (Whitson et al., 1987; Kramer et al., 1988).
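The arithmetic behind the CpG argument can be checked with a short sketch. It is only an illustration and not part of the original analysis: the codon table is a hand-written subset of the standard genetic code, the function name `deaminate_cpg` is hypothetical, and only deamination of a methylated C on the coding strand at codon positions 1-2 is modeled (the complementary-strand event, CpG to CpA, is ignored). The percentages are those quoted in the text; the 6.25% expectation is taken here to be 1/16, the frequency of any dinucleotide under equal base composition.

```python
# Subset of the standard genetic code: only the codons needed here.
CODON_TO_AA = {
    "CGT": "Arg", "CGC": "Arg", "CGA": "Arg", "CGG": "Arg",
    "AGA": "Arg", "AGG": "Arg",
    "TGT": "Cys", "TGC": "Cys", "TGA": "Stop", "TGG": "Trp",
}
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]


def deaminate_cpg(codon):
    """Model 5mC -> T at a CpG in codon positions 1-2 (coding strand only).

    Returns the mutated codon, or None if the codon does not start with CG.
    """
    if codon.startswith("CG"):
        return "T" + codon[1:]
    return None


# Outcome of CpG deamination for each of the six arginine codons.
for codon in ARG_CODONS:
    mutated = deaminate_cpg(codon)
    outcome = CODON_TO_AA[mutated] if mutated else "no CpG at positions 1-2"
    print(codon, "->", mutated, outcome)

# Share of CGC among the 35 arginine codons reported for the gene.
cgc_share = 100 * 31 / 35

# CpG frequencies quoted in the text, in percent.
expected_cpg = 100 / 16            # equal base composition (assumption)
bulk_cpg = 1.25                    # general vertebrate frequency
gene_cpg = 20.5                    # rooster protamine gene, +1 to +350
depletion = bulk_cpg / expected_cpg
enrichment = gene_cpg / bulk_cpg

print(round(cgc_share, 1), expected_cpg, round(depletion, 2), round(enrichment, 1))
```

Under these assumptions CGC gives TGC (cysteine) and CGA gives TGA (termination), the two cases discussed above, while the AGA and AGG arginine codons carry no CpG at these positions and are unaffected by this route.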


FIG. 5. S1 mapping of the 5' end of the rooster protamine mRNA. Lanes 1 and 2 correspond to the Maxam and Gilbert reactions TC and GA, respectively, performed on a sample of the probe used for S1 mapping. Lanes 3 and 4 were mapped with 1 µg of RNA; lanes 5-10 were mapped with 5 µg of RNA; lanes 5 and 8 were digested with 150 units of S1/ml and all other lanes with 1500 units of S1 nuclease/ml. Lanes 3, 6, and 9 were digested at 20 °C and all other lanes at 37 °C.

FIG. 6. Alignment of the amino acid sequence predicted from the genomic clone with that obtained by Nakano et al. (1976) for galline. 1, amino acid sequence predicted from the two genomic loci of the chicken protamine genes. The area underlined at the N terminus corresponds to the region confirmed by resequencing the protamine at the protein level. The number of arginine residues in the C-terminal arginine cluster is confirmed by the sequence of a partial cDNA clone. 2, sequence obtained by Nakano et al. (1976) at the protein level. Gaps and loop-outs have been introduced to maximize similarity (mismatches to the sequence predicted from the genome, and confirmed by protein resequencing, are indicated as asterisks).

Evolution of Protamine Genes: Insights from the Chicken Protamine Sequence: Neither of the two loci for the chicken protamine gene contains introns (Figs. 2 and 4). This is in contrast with the mammalian protamine genes, all of which contain a single intron (Krawetz et al., 1988; Johnson et al., 1988), but is similar to the intronless protamine genes from salmonid fishes (States et al., 1982; Moir and Dixon, 1988). Accordingly, the absence of introns in the chicken protamine genes suggests that they might be more closely related to the salmonid protamine genes than to the mammalian ones. On the other hand, the striking similarity between the N terminus of the mammalian protamines and that of the rooster would place mammalian and bird protamines closer to each other than to fish protamines (Fig. 7). This similarity between mammalian and rooster protamine becomes stronger when the sequence predicted from the genome (and in accordance with the resequence obtained for the N terminus of the protein) is used instead of the sequence obtained by Nakano et al. (1976). It is especially interesting that the only threonine residue found in rooster protamine occupies exactly the position of that present in the ram and bull protamines (Fig. 7). The similarities between the mammalian and chicken protamines raise two possible hypotheses: 1) independent parallel evolution of two "similar in function" domains starting from two independent gene lines or 2) divergence from a single gene line. Divergence from a single gene line by duplication and subsequent independent evolution is a much more common mechanism; therefore we tend to favor this latter hypothesis. How do we then explain the presence or absence of introns between mammalian on one hand and fish-bird protamine genes on the other? It seems more likely that introns were present initially prior to the eukaryote-prokaryote divergence rather than introduced into genes later in evolution (Gilbert et al., 1986; Marchionni and Gilbert, 1986; Quigley et al., 1988; Hawkins, 1988; Shih et al., 1988). This reduces the question to what was the mechanism of this potential intron loss from the fish and chicken protamine genes. A common mechanism of intron loss is the integration of a processed gene into the genome through a retrovirus or retroposon (Temin, 1985; Weiner et al., 1986). In the intronless protamine genes from salmonid fish, evidence has already been presented in favor of this retroposon mechanism due to the presence of long terminal retroviral repeats flanking the gene (Jankowski et al., 1986), the mobility of protamine genes in the trout genome, and their sporadic distribution among different fishes (Dixon et al., 1985). The situation with the rooster protamine genes contrasts with that of the salmonid protamine genes because of their low copy number (2), the constancy of protamine presence and size among different avian orders (Chiva et al., 1987, 1988), and because only weak similarities have been found so far between the flanking regions of the rooster gene and several avian retroviral long terminal repeats (results not shown). However, this is not surprising since the evolutionary time between the potential event rendering protamine intronless and the present might have been sufficient to have changed markedly the sequence of associated retroviral elements. Compelling evidence for the origin of other testis-specific genes as recruited retroposons has been presented in the case of the testis-specific isoform of phosphoglycerate kinase 2 (Boer et al., 1987). However, other possibilities to explain the lack of introns in the fish and bird protamine genes, such as mispairing and unequal crossing-over at meiosis, cannot be excluded at this point.

To our knowledge, the chicken protamine genes possess the most GC-rich coding region described so far (88% (84.7% for the entire transcriptional unit)). Such GC-richness is partly due to the codon usage, which is extremely biased to the use of G and C in silent positions (98.3% C+G at the third codon position) (i.e. out of the six possible codons for arginine (CGX and AGA/G) with the exception of a single AGG, only CGC


Sequence and Organization of the Rooster Protamine Genes

[FIG. 7, panels A-C. The aligned sequences are only partly recoverable.
A, bottom, N-terminal coding DNA (mismatch asterisks not recoverable):
Rooster   GCC CGC TAC CGG CGC AGC AGG ACC
Bull P1   GCC AGA TAC CGA TGC TGC CTC ACC CAT AGC GGG AGC
Mouse P1  GCC AGA TAC CGA TGC TGC CGC AGC AAA AGC AGG AGC
Human P1  GCC AGG TAC AGA TGC TGT CGC AGC CAG AGC
Trout P101 (sequence not legible)
B, potential CRE aligned for rooster, trout P101, chum salmon, bull P1, mouse P1, mouse P2, and somatostatin.
C, two aligned sequences labeled 1 and 2.]

FIG. 7. Alignment of different protamines and protamine gene regions. A, top, alignment of the protamine N-terminal amino acid sequence from different species to the mammalian consensus. The mammalian protamines used were bull (Coelingh et al., 1972; Grimes, 1986; Krawetz et al., 1987, 1988; Lee et al., 1987), ram (Sautiere et al., 1984), boar (Tobita et al., 1983), mouse P1 (Kleene et al., 1985), and human (McKay et al., 1985; Ammer et al., 1986; Lee et al., 1987). Trout P101 sequence is from States et al. (1982). Small numbers indicate the abundance of the variable amino acids. Two dots indicate an amino acid identical to the mammalian consensus, and one dot indicates that the amino acid shares common chemical properties. Bottom, alignment of the DNA sequence corresponding to the N-terminal coding region of several protamine genes. Mismatches are indicated as asterisks. B, alignment of the potential CRE from different protamine genes to that found in the somatostatin gene (Roesler et al., 1988). Chum salmon sequence was from Moir and Dixon (1988) and mouse P2 from Johnson et al. (1988). C, best alignment found between the postulated protamine P1 regulatory sequence (Krawetz and Dixon, 1988) and the rooster protamine gene.
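The kind of comparison summarized in panel B, a candidate element scored against the somatostatin CRE, can be sketched as a mismatch count over sliding windows. This is only an illustration and not the procedure used for the figure: the function names are hypothetical, the core CRE is assumed to be the palindromic octamer TGACGTCA, and the test sequences are invented rather than taken from any protamine gene.

```python
# Minimal sketch: find the window of a sequence that best matches a motif.
# CRE_CORE is an assumption (the palindromic somatostatin-type CRE octamer).
CRE_CORE = "TGACGTCA"


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def best_match(sequence: str, motif: str = CRE_CORE) -> tuple[int, int]:
    """Return (start, mismatches) of the best-matching window.

    Ties are resolved in favor of the leftmost window.
    """
    sequence, motif = sequence.upper(), motif.upper()
    if len(sequence) < len(motif):
        raise ValueError("sequence is shorter than the motif")
    scored = (
        (hamming(sequence[i:i + len(motif)], motif), i)
        for i in range(len(sequence) - len(motif) + 1)
    )
    mismatches, start = min(scored)
    return start, mismatches


if __name__ == "__main__":
    # Invented test sequences, for illustration only.
    print(best_match("AAATGACGTCATTT"))  # perfect match at offset 3
    print(best_match("CCTGACCTCAGG"))    # one mismatch at offset 2
```

A scan of this kind reports only sequence similarity; whether a near-match acts as a functional CRE is a separate experimental question.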

FIG. 8. Correlation between the functional map of the gene, the CpG distribution, the Z-DNA forming potential, and the DNA flexure.


(31) and CGG (4) codons are used). Such specific codon usage contrasts with that of trout (Gedamu et al., 1981; States et al., 1982; Aiken et al., 1983), dogfish (Berlot-Picard et al., 1986), or mammals (Kleene et al., 1985; Krawetz et al., 1987; Lee et al., 1987), where there is either approximately equal usage of both arginine codon sets or a predominance of the AGA/G codons. The C+G richness of the rooster protamine gene fits well with the known tendency for the C+G content to increase in evolution as the body temperature of vertebrates increases (Bernardi et al., 1988). Internal temperature in the chicken is markedly higher than in mammals and, in addition, the testes are internal, in contrast with their external location in mammals; such a marked temperature difference could therefore account for the differences in codon usage found between mammalian and bird protamine genes. It has also been speculated that specific codon usage may play a role in transcriptional regulation through limitation of tRNA availability (Konigsberg and Godson, 1983; Liljenstrom and von Heijne, 1987), stability of the DNA (Bernardi et al., 1988), or secondary structure at the mRNA level (Shpaer, 1985; Wada and Suyama, 1986).

Acknowledgments-We would like to thank Tony Garber, Wayne Connor, Bob Winkfein, Rob Moir, and Paul Cannon for numerous helpful discussions and for transmitting methods, Jacques D. Retief for running the DNA-flexure program and advising on how to interpret the predicted DNA flexure, Fred K. Hagen for making available his Z-DNA site search program and his latest unpublished data, and Don McKay, protein sequencing laboratory, Faculty of Medicine, for performing the resequencing of the N terminus of galline.
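The codon-usage statistics discussed above (overall C+G content, C+G at the third codon position, and the distribution of arginine codons) are simple to recompute from any coding sequence. The following is a minimal sketch and not the authors' procedure: the function names are hypothetical, and the input is only the eight-codon rooster N-terminal fragment shown in Fig. 7A (GCC CGC TAC CGG CGC AGC AGG ACC, encoding ARYRRSRT), not the full coding region.

```python
from collections import Counter

# The six standard arginine codons (CGX plus AGA/AGG).
ARGININE_CODONS = {"CGA", "CGC", "CGG", "CGT", "AGA", "AGG"}


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into codons, ignoring spaces."""
    cds = cds.upper().replace(" ", "")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def gc_fraction(codons: list[str]) -> float:
    """Fraction of all bases that are G or C."""
    bases = "".join(codons)
    return sum(base in "GC" for base in bases) / len(bases)


def gc3_fraction(codons: list[str]) -> float:
    """Fraction of codons whose third (often silent) base is G or C."""
    return sum(codon[2] in "GC" for codon in codons) / len(codons)


def arginine_usage(codons: list[str]) -> Counter:
    """Count how often each arginine codon occurs."""
    return Counter(codon for codon in codons if codon in ARGININE_CODONS)


if __name__ == "__main__":
    # Rooster N-terminal coding fragment from Fig. 7A (first eight codons only).
    fragment = split_codons("GCC CGC TAC CGG CGC AGC AGG ACC")
    print(f"C+G overall:        {gc_fraction(fragment):.3f}")
    print(f"C+G at position 3:  {gc3_fraction(fragment):.3f}")
    print(f"Arginine codons:    {dict(arginine_usage(fragment))}")
```

For this short fragment the third-position C+G fraction is 1.0 and the arginine codons are CGC (2), CGG (1), and AGG (1). The values reported in the text (88% and 98.3%) refer to the complete coding region, which is not reproduced here.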


REFERENCES

Aiken, J. M., McKenzie, D., Zhao, H.-Z., States, J. C., and Dixon, G. H. (1983) Nucleic Acids Res. 11, 4907-4922
Aiken, J. M., Miller, F. D., Hagen, F., McKenzie, D. I., Krawetz, S. A., van de Sande, J. H., Rattner, J. B., and Dixon, G. H. (1985) Biochemistry 24, 6268-6272
Almer, A., Rudolph, H., Hinnen, A., and Horz, W. (1986) EMBO J. 5, 2689-2696
Ammer, H., Henschen, A., and Lee, C. H. (1986) Biol. Chem. Hoppe-Seyler 367, 515-522
Arndt, K., and Fink, G. R. (1986) Proc. Natl. Acad. Sci. U. S. A. 83, 8516-8520
Balhorn, R. (1982) J. Cell Biol. 93, 298-305
Balhorn, R., Weston, S., Thomas, C., and Wyrobek, A. J. (1984) Exp. Cell. Res. 150, 298-308
Bencini, D. A., O'Donovan, G. A., and Wild, J. R. (1984) BioTechniques 2, 4-5
Berk, A. J., and Sharp, P. A. (1977) Cell 12, 721-732
Berlot-Picard, F., Vodjdani, G., and Doly, J. (1986) Eur. J. Biochem. 160, 305-310
Bernardi, C., Mouchiroud, D., Gautier, C., and Bernardi, G. (1988) J. Mol. Evol. 28, 7-18
Bird, A. P. (1986) Nature 321, 209-213
Bloch, D. P. (1969) Genetics 61, (suppl.) 93-111
Boer, P. H., Adra, C. N., Lau, Y.-F., and McBurney, M. W. (1987) Mol. Cell. Biol. 7, 3107-3112
Breathnach, R., and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
Burlingame, R. W., Love, W. E., Wang, B., Hamlin, R., Xuong, N., and Moudrianakis, E. N. (1985) Science 228, 546-553
Calladine, C. R., and Drew, H. R. (1986) J. Mol. Biol. 192, 907-918
Chirgwin, J. M., Przybyla, A. E., MacDonald, R. J., and Rutter, W. J. (1979) Biochemistry 18, 5294-5299
Chiva, M., Kasinsky, H. E., and Subirana, J. A. (1987) FEBS Lett. 215, 237-240
Chiva, M., Kasinsky, H. E., Mann, M., and Subirana, J. A. (1988) J. Exp. Zool. 245, 304-317
Coelingh, J. P., Monfoort, C. H., Rozijn, T. H., Gevers Leuven, J. A., Schiphof, R., Steyn-Parve, E. P., Braunitzer, G., Schranck, B., and Ruhfus, A. (1972) Biochim. Biophys. Acta 285, 1-14
Cooper, D. N., Taggart, M. H., and Bird, A. P. (1983) Nucleic Acids Res. 11, 647-658
Coulondre, C., Miller, J. H., Farabaugh, P. J., and Gilbert, W. (1978) Nature 274, 775-780
Davenport, C. W., and Heindel, J. J. (1987) J. Androl. 8, 307-313
DiLella, A. G., and Woo, S. L. C. (1987) Methods Enzymol. 152, 199-212
Dixon, G. H., Candido, E. M. P., Honda, B. M., Louie, A. J., MacLeod, A. C., and Sung, M. T. (1975) Ciba Found. Symp. 28, 229-258
Dixon, G. H., Aiken, J. M., Jankowski, J. M., McKenzie, D. I., Moir, R., and States, J. C. (1985) in Chromosomal Proteins and Gene Expression (Reeck, G. R., Goodwin, G. A., and Puigdomenech, P., eds) Plenum Publishing Corp., New York
Drew, H. R., and Calladine, C. R. (1987) J. Mol. Biol. 195, 143-173
Dynan, W. S., and Tjian, R. (1983) Cell 35, 79-87
Dynan, W. S., and Tjian, R. (1985) Nature 316, 774-778
Eikvar, L., Levy, F. O., Jutte, N. H. P. M., Cervenka, J., Yoganathan, T., and Hansson, V. (1985) Endocrinology 117, 488-491
Ferl, R. J., and Nick, H. S. (1987) J. Biol. Chem. 262, 7947-7950
Fink, J. S., Verhave, M., Kasper, S., Tsukada, T., Mandel, G., and Goodman, R. H. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 6662-6666
Gatewood, J. M., Cook, G. R., Balhorn, R., Bradbury, E. M., and Schmid, C. W. (1987) Science 236, 962-964
Gedamu, L., Wosnick, M. A., Connor, W., Watson, D. C., Dixon, G. H., and Iatrou, K. (1981) Nucleic Acids Res. 9, 1463-1482
Gidoni, D., Dynan, W. S., and Tjian, R. (1984) Nature 312, 409-413
Gidoni, D., Kadonaga, J. T., Barrera-Saldana, H., Takahashi, K., Chambon, P., and Tjian, R. (1985) Science 230, 511-517
Gilbert, W., Marchionni, M., and McKnight, G. (1986) Cell 46, 151-154
Glen, A. E., and Wahl, G. M. (1987) Methods Enzymol. 152, 604-610
Grimes, S. R. (1986) Comp. Biochem. Physiol. 83B, 495-500
Gross, D. S., and Garrard, W. T. (1988) Annu. Rev. Biochem. 57, 159-197
Hagen, F. K., Zarling, D. A., and Jovin, T. M. (1985) EMBO J. 4, 837-844
Hanahan, D. (1985) in DNA Cloning: A Practical Approach (Glover, D. M., ed) Vol. I, pp. 109-135, IRL Press, Oxford
Hattori, K., Angel, P., Le Beau, M. M., and Karin, M. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 9148-9152
Hawkins, J. D. (1988) Nucleic Acids Res. 16, 9893-9905
Heindel, J. J., Rothenberg, R., Robison, G. A., and Steinberg, A. (1975) J. Cyclic Nucleotide Res. 1, 69-79
Henikoff, S. (1987) Methods Enzymol. 155, 156-165
Hirose, S., and Suzuki, Y. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 718-722
Ho, P. S., Ellison, M. J., Quigley, G. J., and Rich, A. (1986) EMBO J. 5, 2737-2744
Hobson, G. M., Mitchell, M. T., Molloy, G. R., Pearson, M. L., and Benfield, P. A. (1988) Nucleic Acids Res. 16, 8925-8944
Humphrey, T., and Proudfoot, N. J. (1988) Trends Genet. 4, 243-245
Jackson, S. P., and Tjian, R. (1988) Cell 55, 125-133
Jankowski, J. M., and Dixon, G. H. (1987) Biosci. Rep. 7, 955-963
Jankowski, J. M., States, J. C., and Dixon, G. H. (1986) J. Mol. Evol. 23, 1-10
Johnson, P. A., Peschon, J. J., Yelick, P. C., Palmiter, R. D., and Hecht, N. B. (1988) Biochim. Biophys. Acta 950, 45-53
Jones, N. C., Rigby, P. W. J., and Ziff, E. B. (1988) Genes & Dev. 2, 267-281
Kaiser, K., and Murray, N. E. (1985) in DNA Cloning: A Practical Approach (Glover, D. M., ed) Vol. I, pp. 1-47, IRL Press, Oxford
Kasinsky, H. E., Mann, M., Huang, S. Y., Fabrel, L., Coyle, B., and Byrd, E. W. (1987) J. Exp. Zool. 243, 137-151
Kefalas, P., Gray, F. C., and Allan, J. (1988) Nucleic Acids Res. 16, 501-517
King, P. V., and Blakesley, R. W. (1986) Focus, Vol. 8, No. 1, pp. 1-3, Bethesda Research Laboratories, Gaithersburg, MD
Kleene, K. C., Distel, R. J., and Hecht, N. B. (1985) Biochemistry 24, 719-722
Konigsberg, W., and Godson, G. N. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 687-691
Kramer, H., Amouynal, M., Nordheim, A., and Muller-Hill, B. (1988) EMBO J. 7, 547-556
Krawetz, S. A., and Dixon, G. H. (1988) J. Mol. Evol. 27, 291-297
Krawetz, S. A., Connor, W., and Dixon, G. H. (1987) DNA (N. Y.) 6, 47-57
Krawetz, S. A., Connor, W., and Dixon, G. H. (1988) J. Biol. Chem. 263, 321-326
Lee, C.-H., Ahmed, M., Hecht, W., Hecht, N. B., and Engel, W. (1987) Biol. Chem. Hoppe-Seyler 368, 131-135
Liljenstrom, H., and von Heijne, G. (1987) J. Theor. Biol. 124, 43-55
Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Marchionni, M., and Gilbert, W. (1986) Cell 46, 133-141
Marushige, K., and Dixon, G. H. (1969) Dev. Biol. 19, 397-414
Mathis, D. J., and Chambon, P. (1981) Nature 290, 310-315
Maxam, A. M., and Gilbert, W. (1980) Methods Enzymol. 65, 499-560
McKay, D. J., Renaux, B. S., and Dixon, G. H. (1985) Biosci. Rep. 5, 383-391
McKay, D. J., Renaux, B. S., and Dixon, G. H. (1986a) Eur. J. Biochem. 158, 361-366
McKay, D. J., Renaux, B. S., and Dixon, G. H. (1986b) Eur. J. Biochem. 156, 5-8
McLean, M. J., and Wells, R. D. (1988) Biochim. Biophys. Acta 950, 243-254
Mead, D. A., and Kemper, B. (1988) in Vectors, A Survey of Molecular Cloning Vectors and Their Uses (Rodriguez, R. L., and Denhardt, D. T., eds) pp. 85-102, Butterworths Publishers, Stoneham, MA
Mezquita, C. (1985a) in Chromosomal Proteins and Gene Expression (Reeck, G. R., Goodwin, G. A., and Puigdomenech, P., eds) pp. 315-332, Plenum Publishing Corp., New York
Mezquita, C. (1985b) Revisiones Sobre Biologia Celular (Barbera-Guillem, ed) No. 5, pp. 1-114, Leioa-Vizcaya, Spain
Mezquita, C., and Teng, C. S. (1977a) Biochem. J. 164, 99-111
Mezquita, C., and Teng, C. S. (1977b) Biochem. J. 170, 203-210
Moir, R. D., and Dixon, G. H. (1988) J. Mol. Evol. 27, 8-16
Morse, R. H., and Simpson, R. T. (1988) Cell 54, 285-287
Nakajima, N., Horikoshi, M., and Roeder, R. G. (1988) Mol. Cell. Biol. 8, 4028-4040


Nakano, M., Tobita, T., and Ando, T. (1976) Int. J. Peptide Protein Res. 8, 565-578
Nakano, M., Kasai, K., Yoshida, K., Tanimoto, T., Tamaki, Y., and Tobita, T. (1989) J. Biochem. (Tokyo) 105, 133-137
Naylor, L. H., Lilley, D. M. J., and van de Sande, J. H. (1986) EMBO J. 5, 2407-2413
Oliva, R., and Mezquita, C. (1986) Biochemistry 25, 6508-6511
Oliva, R., Bazett-Jones, D., Mezquita, C., and Dixon, G. H. (1987) J. Biol. Chem. 262, 17016-17025
Oliva, R., Mezquita, J., Mezquita, C., and Dixon, G. H. (1988) Dev. Biol. 125, 332-340
Oyen, O., Scott, J. D., Cadd, G. G., McKnight, G. S., Krebs, E. G., Hansson, U., and Jahnsen, T. (1988) FEBS Lett. 229, 381-394
Ozkayanak, E., and Putney, S. D. (1987) BioTechniques 5, 770-773
Peticolas, W. L., Wang, Y., and Thomas, G. A. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2579-2583
Pfeifer, K., Arcangioli, B., and Leonard, G. (1987) Cell 49, 9-18
Poccia, D. (1986) Int. Rev. Cytol. 105, 1-65
Proudfoot, N. J., and Brownlee, G. G. (1976) Nature 263, 211-214
Putney, S. D., Benkovic, S. J., and Schimmel, P. (1981) Proc. Natl. Acad. Sci. U. S. A. 78, 7350-7354
Quigley, F., Martin, W. F., and Cerf, R. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 2672-2676
Retief, J. D., Sewell, B. T., and von Holt, C. (1987) Biochemistry 26, 4449-4453
Richmond, T. J., Finch, J. T., Rushton, B., Rhodes, D., and Klug, A. (1984) Nature 311, 532-537
Risley, M. S. (1988) in Chromosomes: Eukaryotic, Prokaryotic and Viral (Adolph, K., ed) CRC Press, Boca Raton, FL
Roesler, W. J., Vandenbark, G. R., and Hanson, R. W. (1988) J. Biol. Chem. 263, 9063-9066
Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
Satchell, S. C., Drew, H. R., and Travers, A. A. (1986) J. Mol. Biol. 191, 659-675
Sautiere, P., Belaiche, D., Martinage, A., and Loir, M. (1984) Eur. J. Biochem. 144, 121-125
Seguin, C., and Hamer, D. (1987) Science 235, 1383-1387
Shih, M. C., Heinrich, P., and Goodman, H. M. (1988) Science 242, 1164-1166
Smith, H. O. (1980) Methods Enzymol. 65, 371-380
Smith, F. F., Tres, L. L., and Kierszenbaum, A. L. (1987) J. Cell. Physiol. 133, 305-312
Sollner-Webb, B., and Reeder, R. H. (1979) Cell 18, 485-499
Shpaer, E. G. (1985) Nucleic Acids Res. 13, 275-288
States, J. C., Connor, W., Wosnick, M. A., Aiken, J. M., Gedamu, L., and Dixon, G. H. (1982) Nucleic Acids Res. 10, 4551-4563
Struhl, K. (1987) Cell 49, 295-297
Subirana, J. A. (1975) in The Biology of the Male Gamete (Duckett, J. G., and Racey, P. A., eds) pp. 239-244, Academic Press, London
Subirana, J. A. (1982) Proceedings of the Fourth International Symposium on Spermatology (Andre, J., ed) pp. 197-213, Martinus Nijhoff, B. V., The Hague
Tartof, K. D., and Hobbs, C. A. (1987) Focus, Vol. 9, No. 2, p. 12, Bethesda Research Laboratories, Gaithersburg, MD
Temin, H. M. (1985) Mol. Biol. Evol. 2, 455-468
Thomas, G. H., and Elgin, S. C. R. (1988) EMBO J. 7, 2191-2201
Tobita, T., Tsutsumi, H., Kato, A., Suzuki, H., Nomoto, M., and Ando, T. (1983) Biochim. Biophys. Acta 744, 141-146
Tobita, T., Tanimoto, T., and Nakano, M. (1988) Biochem. Int. 16, 163-173
Wada, A., and Suyama, A. (1986) Prog. Biophys. Mol. Biol. 47, 113-157
Warrant, R. W., and Kim, S.-H. (1978) Nature 271, 130-135
Weiner, A. M., Deininger, P. L., and Efstratiadis, A. (1986) Annu. Rev. Biochem. 55, 631-661
Wells, R. D. (1988) J. Biol. Chem. 263, 1095-1098
Whitson, P. A., Hsieh, W.-T., Wells, R. D., and Matthews, K. S. (1987) J. Biol. Chem. 262, 4943-4946
