
Structure and Organization of Mouse U3B RNA Functional Genes*

THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1988 by The American Society for Biochemistry and Molecular Biology, Inc.

Vol. 263, No. 36, Issue of December 25, pp. 19461-19467, 1988 Printed in U.S.A.

(Received for publication, June 8, 1988)

Sylvie Mazan and Jean-Pierre Bachellerie‡

From the Centre de Recherches en Biochimie et Génétique Cellulaires du Centre National de la Recherche Scientifique, Université Paul-Sabatier, 118, Route de Narbonne, 31062 Toulouse Cedex, France

We report the isolation and primary structure of three genes encoding mouse U3B RNA which are expressed after injection into Xenopus laevis oocytes. Over the U3B RNA coding region, their sequences are perfectly identical, showing nine point differences with rat U3B, which do not alter the RNA secondary structure. A comparison of the three mouse sequences for the gene flanks reveals the extensive divergence of the downstream regions, except for a few nucleotides adjacent to the U3 RNA coding region, which contain a motif matching the consensus sequence for the U small nuclear RNA 3' end formation signal. By contrast, the upstream flanking regions are strongly homologous up to position -500, but they completely diverge thereafter. Within the homologous portion of 5' flanks, several motifs can be recognized which are unambiguously related to sequence elements involved in the transcriptional control of other U small nuclear RNA genes: two of these motifs precisely map at the locations (relative to the transcription start site) expected for the proximal and distal (enhancer-like) sequence elements of U small nuclear RNA genes, and "Sp1"-GC boxes and a CCAAT box are also present in their vicinity. The comparison with the rat U3B gene confirms that the preferential preservation of the 5'-flanking sequences extends up to position -500, suggesting the functional importance of sequences well upstream from the distal sequence element of the promoter. Two of the mouse genes are closely linked in genomic DNA (5 kilobase pairs apart, same orientation) and seem to have been homogenized through a recent conversion event. More generally, this small multigene family (at most six to seven copies of functional U3B genes per mouse haploid genome) appears to have undergone a concerted evolution in rodents.

Although long suspected (1-3), the involvement of U snRNAs¹ in nuclear RNA processing has recently received direct experimental support in the case of U7 (4), U1 (5, 6), and U2 (7). The role of these snRNA species appears to be mediated by specific base-pairings with selected portions of pre-mRNA molecules. Due to its selective association with nucleolar pre-ribosomes (8, 9), U3 RNA represents the most likely snRNA candidate for a role in pre-rRNA processing. A substantial base complementarity has been observed between a segment of U3 RNA and the 5'-terminal nucleotides of the internal transcribed spacer 2 (ITS2) of pre-rRNA, suggesting a potential involvement of U3 in the excision of ribosomal ITS2 (10, 11). However, the proposed model remains tentative so far and, more generally, for each of the reactions of the complex pathway of rRNA processing, the recognition signals still appear elusive (12), with the sole exception of the initial cleavage of the primary transcript (13).

Mouse is one of the three eucaryotes, in addition to yeast Saccharomyces cerevisiae and amphibian Xenopus laevis, for which the entire sequence of the rRNA primary transcript is now available (14), thus making this species a valuable system for further studying the role of U3 RNA in pre-rRNA processing. To this end, we have first analyzed the mouse genes coding for U3 RNA. As for other mammalian U snRNA genes (15), the study of U3 genes is impeded by the presence of multiple pseudogenes (16-19). However, here we report on the cloning, sequencing, and genomic organization of mouse U3B RNA genes which appear to be functional.

* This work was supported by Grant CRE 851 001 from the Institut National de la Santé et de la Recherche Médicale and Grant 6158 from ARC. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence should be addressed.
¹ The abbreviations used are: snRNA, small nuclear RNA; SDS, sodium dodecyl sulfate; bp, base pairs; kb, kilobase pairs.
² L. H. Qu and J.-P. Bachellerie, unpublished results.

MATERIALS AND METHODS

Isolation and Analysis of Clones Containing U3 RNA Sequences-A library of mouse liver DNA cloned into the λ vector Charon 4A was screened, according to Ref. 20, for U3 RNA coding sequences as described previously (19). Prehybridization and hybridization were carried out in 6 × SSC, 10 mM EDTA, 5 × Denhardt's solution, 0.5% SDS, 100 µg/ml sonicated Escherichia coli DNA at 45 °C (1 × SSC is 0.15 M NaCl, 0.015 M sodium citrate, pH 7; 1 × Denhardt's solution is 0.02% Ficoll, 0.02% polyvinylpyrrolidone, 0.02% bovine serum albumin). After hybridization in the presence of a nick-translated cloned cDNA probe corresponding to the first 135 5'-nucleotides of rat U3B RNA (kindly provided by A. V. Furano, National Institutes of Health), nitrocellulose filters were washed twice for 30 min in 1 × SSC, 0.1% SDS at 37 °C and twice for 30 min at 55 °C in 0.1 × SSC, 0.1% SDS, thus allowing retention of hybrids with up to about 13% of mismatched base pairs. DNA was isolated from positive clones and restriction fragments containing U3 RNA sequences were identified by Southern blot hybridization with labeled DNA or RNA probes (see analysis of genomic DNA), according to Ref. 20. After a second screening of the positive recombinants (see "Results") and subcloning into pUC8 vectors, sequencing was performed on M13 subclones (21) by the dideoxynucleotide chain termination method (22). Taking advantage of previous partial RNA sequence determinations for mouse U3 RNA,² three synthetic oligonucleotides (22- to 25-mers) mapping on either strand of the U3 coding region were first used as primers for sequencing the gene proximal portion. In order to further extend the sequenced portion, other synthetic primers mapping in the flanks of the U3 coding region were later used, in addition to an M13 universal primer.

Expression of the Mouse U3 Genes in Xenopus Oocytes-After CsCl purification, recombinant plasmid DNAs were tested for expression by microinjection into the nuclei of Xenopus oocytes, according to Ref. 23. Fifteen stage 6 X. laevis oocytes were each injected with 20-40 nl of a solution containing 0.5 µg/ml plasmid DNA and 5 µCi/µl of [α-³²P]CTP (800 Ci/mmol), whereas control oocytes were only injected with labeled CTP. After a 24-h incubation of oocytes at 18 °C


in modified Barth's solution (23) supplemented with 10 µg/ml streptomycin and penicillin, total RNA was extracted by a proteinase K-SDS treatment prior to phenol extraction and analyzed either directly or after a hybrid selection with a DNA fragment specific for the U3 coding region (a PstI-HinfI fragment extending from position +137 to position +214 in the U3 RNA sequence). For hybrid selection, 1-2 µg of this heat-denatured DNA fragment were spotted onto nitrocellulose filters (1 pm, Hybond C, Amersham), and total RNA pooled from 5-10 microinjected oocytes was analyzed according to the indications of the membrane supplier. Hybrid-selected RNA was recovered as previously described (17).

Analysis of Genomic DNA-Genomic DNA was isolated from BALB/c mouse liver by proteinase K-SDS treatment prior to phenol extraction according to Ref. 20. Appropriately digested genomic DNA was fractionated on 0.7, 1, or 2% agarose gels and transferred to nylon membranes (Hybond N, Amersham). Prehybridization and hybridization were carried out at 50 °C, in 50% deionized formamide, 5 × SSC, 50 mM sodium phosphate buffer, pH 7.0, 0.1% SDS, 5 × Denhardt's solution, 250 µg/ml denatured E. coli DNA. Filters were washed twice for 30 min in 1 × SSC, 0.1% SDS at 37 °C and twice for 30 min in 0.1 × SSC, 0.1% SDS at 60 °C.

Several ³²P-labeled RNA probes, obtained by in vitro transcription of T7 recombinants (24), were utilized, which map either within the U3 coding region or within its immediate 5' and 3' flanks (see Fig. 1). The probe specific for the U3 coding region was obtained by subcloning the HinpI-HinfI fragment of pE U3.1 (which extends from position -22 upstream from the coding region to the last nucleotide of the U3 RNA sequence) into a pTZ18R recombinant (in reverse orientation relative to the T7 promoter). Transcription by the T7 polymerase of this PvuII-digested recombinant gave rise to an antisense U3 probe complementary to the segment (+140, +214) of the U3B RNA sequence. A 5' flank probe was obtained by subcloning into a pTZ19R recombinant a 323-bp EagI fragment of pE U3.1. A 3' flank probe for the U3.1 locus was prepared by subcloning a HindIII-ScaI fragment of pE U3.1. Other RNA probes specific for the 3' flanks of the U3.2 and U3.3 loci were also generated by the same procedure. Labeled pTZ transcripts were obtained according to the indications of the enzyme's supplier and checked by gel electrophoresis.

FIG. 1. Restriction maps of the regions surrounding the three mouse U3B genes and sequencing strategy. For the two λ phage recombinants, the hatched boxes denote the Charon 4A arms. Expanded views of the DNA fragments subcloned in pUC recombinants are also shown (above or below each λ insert), with the locations of U3B RNA coding sequences indicated by large horizontal arrows (the arrowhead pointing toward the 3' end). For each restriction enzyme, the sites shown correspond only to the closest ones (both upstream and downstream) from the U3 coding sequence. Sequencing strategy: the horizontal arrows (above the restriction maps of plasmid subclones) delineate the extent of sequence read without any ambiguity from each of the primers. Below the restriction map of each recombinant clone, the bracketed segments represent the restriction fragments also detected in genomic DNA, by Southern blot hybridization with labeled probes (the filled boxes immediately below each bracketed segment show the location and extent of the probe along the sequence).

RESULTS

Isolation of Mouse DNA Clones Containing U3 RNA Coding Regions-About 2 × 10⁵ recombinant phages were screened at moderate stringency with a plasmid recombinant DNA probe containing a U3 RNA coding sequence (19). At this stage, 25 clones showing positive hybridization signals were isolated. In previous screenings, three types of U3 RNA "processed" pseudogenes had been identified (19),³ which appear to have arisen through an RNA-mediated mechanism and insertion of cDNA at staggered nicks in the genome (15). For such pseudogenes, the flanking regions are structurally unrelated to those of bona fide functional genes. Accordingly, in order to eliminate positive recombinants corresponding to any of these three retrogenes, a negative selection was performed with synthetic oligonucleotide probes corresponding to sequences immediately upstream of each retrogene, and negative recombinants were submitted to restriction analyses. Three distinct genomic loci were finally recognized among the phage recombinants, with one of them, represented in λ13 (Fig. 1), containing a pair of U3 RNA coding regions. Sequence analysis revealed that one of these three loci (not represented in Fig. 1) corresponds to an additional specimen of U3 retrogene, with a poly(A)-tailed U3 coding region flanked by a pair of direct repeats (result not shown).

³ S. Mazan and J.-P. Bachellerie, unpublished results.

Genomic Organization of the U3 RNA Coding Loci in Mouse-Total mouse DNA was digested with several restriction enzymes (with a hexanucleotide recognition sequence) which can cut either within or around the U3 coding regions in the cloned DNA inserts. Southern blot hybridization of the digested DNA was performed under stringent conditions with several labeled probes of different locations in the vicinity of the cloned U3 genes (Fig. 1). A probe restricted to the U3 RNA coding sequence results in rather complex patterns of labeled bands, indicating that about 20-25 copies of this sequence are present per mouse haploid genome. Most of these copies correspond to processed pseudogenes (19).³ Conversely, a probe restricted to the immediate 5' flank (-361, -38) of the three cloned genes described in Fig. 1 produces a simpler pattern, indicating the presence of six to seven copies of these sequences per haploid genome (cross-hybridizations are observed between the 5' flanks of the three cloned genes). Whatever the restriction enzymes that were used, for each labeled band obtained for the blotted digests of each of the three cloned genes there is always a precisely co-migrating radioactive band in the pattern obtained by parallel analysis of mouse genomic DNA, as summarized in Fig. 1. Comparison of band intensities indicates that each of these three genes is likely to be present in only one copy per haploid genome. As for the three to four additional copies of 5'-flanking sequences

present per mouse haploid genome, they do represent distinct loci, which have not been further characterized so far but may also contain other U3 functional genes. A probe selective for the 3' flank of the U3.1 clone (a HindIII/ScaI fragment extending from positions +33 to +200) confirms the identity of this cloned DNA with the corresponding locus in the mouse genome (Fig. 1). Moreover, it indicates that, unlike the 5' flank, this proximal downstream sequence is present as a unique copy in the mouse genome (and no cross-hybridization is observed with the two other cloned U3 genes). By contrast, probes specific for the 3' flanks of U3.2 and U3.3 clones reveal the presence of repetitive sequences within less than 1 kb of downstream DNA (see below). Finally, it is important to note that the patterns of blot hybridization of mouse genomic DNA are in full agreement with the arrangement of two linked U3 genes as observed in the λ13 recombinant: a 5.5-kb-long PvuII fragment is also detected for genomic DNA, as for the λ13 recombinant, which shows positive hybridization signals both with the 5' flank probe and with the probe immediately downstream from the U3.1 gene (PvuII cuts within the U3B RNA coding region but not within the U3.1/U3.2 interval).

Transcription of the Cloned Mouse U3 Genes in Xenopus Oocytes-Microinjection of DNA from each of the three plasmid recombinants pE U3.1, pE U3.2, and pEB U3.3 into the nuclei of X. laevis oocytes results in the appearance of RNA transcripts which cross-hybridize with mouse U3 coding sequences (Fig. 2). Results are identical for the three plasmids: as shown in Fig. 2 for pEB U3.3, these transcripts essentially appear as a band doublet: the faster migrating band has precisely the same mobility as U3B RNA isolated from mouse cells, while the slower migrating band can be estimated to be 5-10 nucleotides longer. It is noteworthy that the U3 probe used for hybrid selection allows for the selective recognition of transcripts encoded by injected mouse DNA; no signal was obtained when only labeled CTP was injected (in the absence of cloned mouse DNA). Plasmid pSH U3.1, for which the mouse DNA insert corresponds to a shorter portion of the U3.1 locus (extending

FIG. 2. Transcription of the cloned mouse U3 genes in Xenopus oocytes. RNA from X. laevis oocytes injected with pEB U3.3 DNA and [α-³²P]CTP was analyzed on a 5% polyacrylamide gel in 7 M urea, either directly (d) or after hybrid selection with U3 coding DNA sequences (c). In b, total cellular RNA extracted from mouse 3T3 cells by the LiCl-urea method (25) was hybrid-selected, electrophoresed, and blotted in the same conditions, then hybridized with a labeled RNA probe complementary to mouse U3B RNA sequence (a pTZ anti-U3 transcript extending from position +140 to the 3' end). Labeled RNA size markers (obtained by in vitro transcription of pTZ recombinants) have been analyzed in lanes a.


from position -380 to +33 by reference to the coding region), was also tested by microinjection. Results were essentially identical with those obtained with pE U3.1, indicating that such a restricted portion of flanking sequences is sufficient to direct the appearance of U3B RNA in this heterologous system. Conversely, microinjection of the mouse U3 retrogenes, which lack any sequence homology with the U3.1, U3.2, and U3.3 loci outside their U3 coding region, does not result in the appearance of mouse U3 transcripts (not shown).

Structure of Mouse U3B RNA-For the three genes, the sequences are identical over the 215 nucleotides of the U3 RNA coding portion (Fig. 3) and are in full agreement with a direct RNA sequence analysis by primer extension performed over the 190 5' nucleotides of mouse liver U3 RNA.² Comparison with the rat homologs clearly indicates that the mouse genes code for a U3B-form RNA (only nine point differences with rat U3B instead of 30 with rat U3A). Most of the differences with rat U3B RNA are located in the 3' half of the molecule. The comparison with human U3 RNA (26) confirms the higher rate of sequence change for this portion of the molecule (Fig. 3). The conserved sequence motifs (boxed in Fig. 3), previously recognized by a general comparison of eucaryotic U3 RNA sequences (27), have also been preserved in mouse U3B RNA. Mouse U3B RNA can be folded into a secondary structure (Fig. 4) which fits with the model proposed in Ref. 27. Six of the base changes observed in mouse as compared to rat are located within single-strand loops, while the three others preserve the pairings. The motif complementary to the 5' end of pre-rRNA ITS2 (10) appears in a single-strand configuration (denoted by a broken line in Fig. 4), thus available for base-pairing with the rRNA precursor.

Structure of the Mouse U3B Gene Upstream Flanking Regions-Up to position -500, the 5'-flanking regions of the three mouse U3B genes are extensively homologous to each other and are also clearly related to the rat U3B representative, but they diverge abruptly further upstream (Fig. 5a). However, within these 0.5 kb of gene proximal sequences, two subdomains can readily be distinguished on the basis of both intraspecies and mouse/rat comparisons. The gene-proximal (-1, -255) subdomain is strikingly homogeneous in mouse: except for a single nucleotide position (-76, where U3B.3 differs from U3B.1 and U3B.2), the three gene specimens have an identical sequence there, which differs at 60 positions from the rat U3B gene sequence (17). This subdomain also seems homogeneous for rat, at least for the (-1, -120) portion, for which a second rat gene, termed U3B.4, has been analyzed (17); both rat sequences are identical for this portion, where 33 differences are observed between rat and mouse. By contrast, the degree of homogeneity for the mouse genes is dramatically decreased in the more upstream (-256, -500) region, although the level of mouse-rat divergence remains roughly similar to that observed for the gene-proximal region, except for the presence of two insertions (25 and 69 bp long) in mouse. Although an overall analysis points to the closer relationship of the linked U3.1 and U3.2 genes, a detailed inspection of sequence alignments reveals a more complex situation. Up to position -470 (arrowhead in Fig. 5a), the U3.1 and U3.2 upstream sequences are in fact very closely related (only 8 differences) and quite distinct from U3.3 (70 differences), and these relationships are not modified when the highly variable mouse-specific inserts are excluded from the comparison (with a number of remaining differences then amounting to 2 and 15, respectively). Quite unexpectedly, the situation is reversed over the (-471, -500) portion, with U3.1 and U3.3 then

FIG. 3. Primary structure of the mouse U3B RNA coding region and alignment with other U3 mammalian sequences. The boxed motifs a, b, c, and d denote the highly conserved sequences also found in the Dictyostelium discoideum and yeast U3 homologs (27).

FIG. 4. Secondary structure of mouse U3B RNA. The base substitutions found for rat U3B RNA are also indicated. Pairings where compensatory base changes are observed among the available vertebrate sequences are denoted by a filled circle (the X. laevis and X. borealis sequences (28) have also been taken into consideration).

appearing by far the closest relatives (only 3 differences versus 12 between U3.1 and U3.2). Although these numbers remain small, the discrepancy seems too large to merely result from statistical fluctuations. Comparison of the mouse gene 5' flanks with the rat U3A and human U3 genes reveals no significant homology, except for the two motifs boxed in Fig. 5a. These conserved sequences, which extend upstream from positions -44 and -225, respectively, for the mouse genes, also have precisely the same location upstream from the rat U3A and human U3 genes and are unambiguously related to key promoter elements of U1-U5 snRNA genes (Fig. 5b). However, the more upstream conserved motif (U3 box) contains, in addition to the octamer (enhancer-like) sequence common to the other U snRNA genes, a left-end portion, only shared by the mammalian U3 RNA genes, which matches the classical "CCAAT" box motif (in reverse orientation). Moreover, two GGGCGG motifs are present in the vicinity of the upstream conserved elements of the mouse U3.1 and U3.2 genes (the U3.3 gene has only one copy). The homologous regions of the rat and human U3 genes do not contain such GC motifs.

The 3'-Flanking Regions-Contrary to the 5' flanks, they are extensively divergent, either between mouse and rat, or even between the three mouse genes (Fig. 6a). However, some homology can be detected in the immediate vicinity of the RNA coding region; each of the mouse genes does contain a sequence motif (boxed in Fig. 6b) closely related to the conserved sequence GTTT N0-3 AAA Pu NP AGA required for the 3' end formation of pre-U1 and pre-U2 RNAs (32-34). This motif maps at positions +16 to +20 downstream from the rodent U3 genes. Beyond position +30 downstream from the U3B RNA coding region, no significant homology is apparent. However, over the approximately 100 bp of proximal 3' flanks, the three mouse genes are related in displaying a very high T content (40, 53, and 57% for U3B.1, U3B.2, and U3B.3, respectively). At the U3B.3 locus, a repetitive R element is present in the immediate vicinity of the RNA coding region (and in opposite orientation). This specimen exhibits a 92% homology with the consensus sequence derived for the mouse R elements (35), which exist in about 10⁵ copies in rodent genomes (36, 37). There is no reason so far to suspect a functional significance for this R repeat at such a location, and its occurrence may merely reflect the known preference of mammalian retroposons for inserting into AT-rich sequences (15, 38).

DISCUSSION

Functional Signals in Mouse U3 Genes-In mammalian genomes, a large portion of the sequences hybridizing to the different U snRNAs has been shown to correspond to pseudogenes (present in some cases in up to hundreds of copies), most of them representing retroposons (38). As for U3 RNA, functional genes have been previously characterized for two mammals, rat (17) and human (26), but nonfunctional retrogenes have also been reported for these species (16-18). For mouse, we have recently identified several forms of retrogenes (19),³ but the three U3 coding regions analyzed here do appear as bona fide functional genes, on the basis of Xenopus oocyte injection experiments and of comparative analysis of their sequences.
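The comparative sequence analysis referred to above includes locating short motifs on either strand of a flank, such as the GGGCGG boxes and the CCAAT box found in reverse orientation. The following is only a minimal sketch of such a scan: the function names and the toy sequence are assumptions made for illustration, and the sequence is invented rather than taken from the mouse U3B flanks.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def find_motif(seq: str, motif: str) -> list:
    """List (0-based start, strand) for each exact match of motif.

    '+' means the motif reads 5'->3' on the given strand; '-' means its
    reverse complement is found there (motif in reverse orientation).
    """
    hits = []
    rc = reverse_complement(motif)
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Invented toy flank, for illustration only (not a sequence from this paper).
toy_flank = "TTATTGGCCGGGCGGAACCGCCCTT"

print(find_motif(toy_flank, "CCAAT"))   # CCAAT box, either orientation
print(find_motif(toy_flank, "GGGCGG"))  # GC box, either orientation
```

Exact matching is used here for simplicity; degenerate consensus elements such as the proximal and distal sequence elements would need a mismatch-tolerant comparison instead.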

FIG. 5. Sequence of the 5'-flanking regions of the mouse U3B genes. (a), the three mouse sequences are aligned and compared to the rat U3B homolog (17), a dash indicating nucleotide identity and a star a missing nucleotide. The two boxed motifs are maintained in rat U3A (17) and in human U3 gene (26). Two GC boxes in the mouse sequences are delineated by a thick overline. The pair of arrows denotes a direct repetition bracketing a deletion in the rat sequence, suggesting that DNA slipped strand mispairing (29) during replication may have generated this variation between both rodents. (b), comparison of the two conserved motifs (boxed in a) with homologous regions of other mammalian U3 genes and with the consensus sequences for the distal and proximal sequence elements of vertebrate U snRNA genes (30, 31).
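The notation used in this alignment (a dash for identity with the top line, a star for a missing nucleotide) is what underlies the difference counts quoted in the text. As a minimal sketch, assuming rows of equal aligned length, the notation can be expanded and differences counted as follows; the helper names and the toy rows are hypothetical and do not reproduce Fig. 5.

```python
def expand_row(reference: str, row: str) -> str:
    """Expand a row in which '-' means 'same nucleotide as the reference'.

    A '*' (missing nucleotide) is kept unchanged.
    """
    if len(reference) != len(row):
        raise ValueError("aligned rows must have equal length")
    return "".join(ref if ch == "-" else ch for ref, ch in zip(reference, row))


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count aligned columns where two expanded sequences differ.

    A column with '*' in only one sequence counts as one difference.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Invented toy rows, for illustration only (not taken from Fig. 5).
reference = "GATTACAGGC"
row_2 = "---C------"
row_3 = "--A---*--T"

seq_2 = expand_row(reference, row_2)
seq_3 = expand_row(reference, row_3)
print(seq_2, count_differences(reference, seq_2))
print(seq_3, count_differences(reference, seq_3))
```

Whether a run of adjacent missing nucleotides is scored as one event or as several is a choice; this sketch counts every column separately, which may differ from the convention used for the figures reported in the text.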

First, when injected into oocytes, all three genes direct the synthesis of an RNA which has the characteristics expected for mouse mature U3 RNA. As for the slightly elongated mouse U3 transcript which is also detected in these injection experiments (Fig. 2), similar to what was observed after injection of rat U3 genes (17), it could correspond to a precursor of mature U3 RNA. Precursors of U1, U2, and U4 snRNAs with about 10 extra 3' nucleotides have been detected by both in vivo (39-41) and in vitro (33, 40, 42) experiments. Sequence data also strongly suggest that the three mouse U3 genes are functional, considering either the U3 coding sequence (identical with mouse U3 RNA) or the 5'- and 3'-flanking sequences, which contain the transcriptional signals previously shown to be required for the expression of vertebrate snRNA genes. Upstream of the RNA coding region, two sequence motifs are present which are homologous, in sequence and position, to the two major promoter elements of other U snRNA genes, i.e. the proximal, "TATA" box-like (43, 44), and the distal, enhancer-like (30, 31, 45), sequence elements. Moreover, the promoter region of the mouse U3 genes contains a CCAAT box sequence, a known target site for transcriptional factors (46, 47); its outstanding conservation in mammalian U3 genes (Fig. 5b), where it retains a constant position relative to the octamer (20 bp apart), suggests that it may participate in a factor-mediated control of U3 RNA transcription. Other sequence motifs appear to be involved in the transcriptional control of U snRNA genes (44, 48, 49). Among them, GC boxes, characteristic of housekeeping genes (50), have been detected in some cases (48, 49). Accordingly, the GC box motifs found in the promoter region of the mouse U3 genes may also be functionally significant. At the 3' flank, the short conserved motif located about 20 bp downstream from all the rodent U3 coding regions (Fig. 6b) has the characteristics (in sequence and position) of a 3' box, a sequence found downstream of all vertebrate U snRNA genes (32), and required for the 3' end formation of U1 and U2 RNA precursors (32-34). Although largely conserved, this signal is not invariant and appears quite tolerant to a variety of point mutations (34): in fact, two of the mouse U3 genes present one deviation from the consensus sequence. Beyond

19466

Structure and Organization of Mouse U3B RNA Genes

(a) MOUSE U 5 0 . 1

CTCGGTTTAT CTTAAAGGGT TTTAAATATA

GAAAGCTTCC TTAGTCTTAT GGGTGACAGT GCTCTTTAAG CCTTACCTAG GTATGTCTTT GAACATTTAT CAAAGTGAAG

+ZOO

C T C A A T T T T T TATTGTTTTT T A T A A A G T T A GAGTTTTATT TTTCTTCATT TCTCCTTTAC ACTTTTGGTT T T C T T C C C A T GCTCATATAC

TATGTGTGTC

+lo0

TATGACCACT CAGGTGTTCT GTTCTTGTGA ATTTAGACAC

TACTATCTGG GCCGTGAAAC TGGTCACATA T C A G T T A T T T TGCAGTAAA

YOUSE USB.2

+249

TTTTAAAGTC CTATTTCTAC ATCTTTAAGT TTATCCTTAT TAGTGATTCT T T T T A T A A A C ATTGTTGTTA CAGTGCGTTT ATGGCACGTG GTCTG

MOUSE

U5B.3

*IO0

ACTCAGGAAG G C T A C A C I C T C A A A G A C A C A

TACAGATTGT TATTTGTGAC

CTCGTTTTAT TTTACATGTT TGGGAAAGTC A G A A A A C T T A CAATTTTTTA ATTTTATTTT

+195

T T T T T T T C T T CTTTAAGGTC GTCTCAGTGA TTTGCCCTTT TTTTTTTGTA A T C A T A T G T A

-*"""-

CCATTCTTTA TTAGGTATTT AGCTCATTTA C A T T C C A A T G C T A T A C C A A A A T T C C C C C A T ATCTACCCAC C C C C A C T C C C TC"""" "-c""" -G"""" "&-c"-T-

A"-

-

""""-

+lo0

+ZOO

"""""

*300

COIISE)(SUS

MOUSE U3B.1 MOUSE U3B.2 MOUSE U3B.3 RAT U3B.7 RAT U3B.4

C*TCGTTTA TCTTAM666 -*--AA---T -TA-T+GTTT -*------- T AT--T-CAT-T-----C-A-C-TT-T-CCTCA-AC-

T m f * A M TA T M -AT*"+ T-" - 6 G b TC-A"A*b G A -

+32 +32 +34 +33 +31

FIG.6. Sequence of the 3"flanking regions of the mouse U3B genes. ( a ) ,the sequences start from the first nucleotide downstream of the 3' end of the U3B RNA coding region. For U3B.3, the sequence of the R element (starting 3' of the arrowhead) has been aligned with the R consensus sequence (35) which is shown below. In ( b ) , the sequences for the immediate 3' flanks are compared for the rodent U3B genes; homology with the 3' box sequence of U snRNA genes is denoted by a box (matches with the consensus are indicated by boldface letters). Only the nucleotides which differ from the top line sequence are shown, with identities denoted by hyphens. this 3' box, no homology wasdetected between the rodent U3 genes, not only between species but also within a species, suggesting that thisregion doesnot contain major a functional signal, in line with what has been observed for other U snRNA genes (32). By contrast at the 5' flank, the sequence conservation extends much farther than the 5' most putative transcriptional signal mentioned above (i.e. the U3 box), not only within a species (mouse) but also between rat and mouse. This could be taken to indicate that DNA segments located up to 500 bp upstream of the U3B coding region are submitted to a selective constraint, possibly suggesting their involvement in a transcriptional controlof U3 genes. Organization and Evolution of the Mouse U3 Genes-The rat genome contains only a few U3 genes (17). Our analysis of mouse genomic DNApoints to thepresence of at most six to seven copies of U3B functional genes per haploid genome, which are definitely not organized into a regular array of tandem repetition. However,two unitsare closely linked which deserves some comment since this cluster organization may affect the overall regulation of its members. The presence of an orthologous pair of linked U3 gene loci has not been reported in rat (17),but cannot be ruled out so far. 
While the two linked loci are about 5.5 kb apart, the sequence homology is restricted to a much shorter portion of this interval, around the U3 gene (Fig. 5a). Accordingly, if a tandem duplication has generated this pair of loci, it must correspond to a rather ancient event (possibly predating the mouse/rat separation), which has allowed the major part of the original duplication unit to accumulate a high proportion of changes. Under this hypothesis, the preservation of a near perfect homology over a continuous stretch of about 750 bp, which stops abruptly in both directions, is best explained by the occurrence of a recent gene conversion between the closely linked genes. This seems all the more likely when observing that this sequence identity encompasses the two tracts of the (-256, -500) domain which are highly variable when the other rodent genes are considered (Fig. 5a). The presence, at the 5' flank of this likely conversion unit between U3.1 and U3.2 genes, of a sizable tract (-471, -500), for which the sequence relationships among the three mouse genes are sharply reversed, also points to the occurrence of another (former) conversion between U3.1 and U3.3 loci. In fact, the general comparison of the three mouse and the two rat U3B sequences shows that this small multigene family has undergone a concerted evolution in rodents: a sequence identity within a species, either mouse (Fig. 5a) or rat (17), coupled with a substantial level of interspecies differences, is observed over an uninterrupted domain encompassing the entire U3 RNA coding region and a long portion of 5' flank.
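The tract-by-tract reasoning used above (asking, for each portion of the alignment, which pair of genes is most alike) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sequences are invented toy strings, not the U3B sequences of Fig. 5a; the names `TOY`, `geneA`, `geneB`, `geneC`, `identity`, and `closest_pair_by_window` are hypothetical; and the input is assumed to be already aligned, with gaps written as `-`.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def identity(a: str, b: str) -> float:
    """Fraction of aligned columns that match, skipping columns with a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def closest_pair_by_window(
    aligned: Dict[str, str], window: int, step: int
) -> List[Tuple[int, Tuple[str, str], float]]:
    """For each window, report the most similar pair of sequences.

    Returns a list of (window_start, (name1, name2), identity) tuples.
    A change of the closest pair from one window to the next is the kind
    of reversed relationship that may hint at a separate conversion tract.
    """
    length = len(next(iter(aligned.values())))
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("all sequences must be aligned to the same length")

    results = []
    for start in range(0, length - window + 1, step):
        scores = {
            (n1, n2): identity(
                aligned[n1][start:start + window],
                aligned[n2][start:start + window],
            )
            for n1, n2 in combinations(aligned, 2)
        }
        best = max(scores, key=scores.get)
        results.append((start, best, scores[best]))
    return results


# Toy alignment (invented): geneA matches geneC over the first tract,
# but matches geneB over the second tract.
TOY = {
    "geneA": "ACGTACGTAC" + "GGATCCTTAG",
    "geneB": "ACTTAGGTCC" + "GGATCCTTAG",
    "geneC": "ACGTACGTAC" + "GCATGCTAAG",
}

for start, pair, score in closest_pair_by_window(TOY, window=10, step=10):
    print(start, pair, score)
```

In this toy case the closest pair switches between the two windows. Real data would call for overlapping windows, a proper alignment, and some statistical treatment of the differences, none of which is attempted here.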


It is striking that in mouse the homogenization unit appears precisely congruent with the apparently basic transcription unit; it extends only over the regions which contain the signals already known to be involved in transcriptional control. A similar congruency between a conversion unit and a transcription unit has already been reported in the case of a pair of duplicated goat α-globin genes (51). Such a coincidence is not unexpected after a long period of evolutionary time and a number of conversion events, whenever the frequency of these events is low relative to the rate of sequence divergence tolerated along the functional unit. In this case, the conversion unit is likely to shrink progressively to a minimal domain which must represent an uninterrupted cluster of functional motifs (closely spaced along the DNA (or RNA) sequence), severely constrained not only in sequence but also in distance relative to each other. Any more distal signal, if it is tolerant to some positional variation, will eventually be excluded from these rounds of homogenization; the presence of a sizable tract of intervening sequence, which can accumulate a high rate of nucleotide changes (with subsequent segmental mutations), will inhibit homologous pairing and branch migration, and pose a boundary for conversion events.

The comparison of the mouse unlinked genes shows that the (-256, -500) domain, while containing long stretches of homologous sequences in mouse and rat, has not been efficiently homogenized in all mouse genes, as opposed to both the coding region and its proximal 5' flank. Exclusion of this domain from the concerted evolution of the mouse U3 genes may well result from the presence of the two tracts tolerant to extensive sequence and size variations (Fig. 5a) and does not rule out that functional signals exist within the more upstream conserved stretches.
The structural analysis of the additional U3 loci of the mouse genome (now in progress), together with further characterizations of rat homologs, may bring further insights into this possibility by providing a better understanding of the evolution of the U3 gene family in rodents.

Acknowledgments-We thank Prof. J. P. Zalta for his support, A. V. Furano for providing a rat U3 cDNA clone, S. Gerbi for communicating Xenopus U3 RNA sequences prior to publication, and A. M. Duprat and J. C. Beetschen for facilities for microinjecting oocytes. We also appreciated the help of B. Michot, P. Ramond, and A. Altibelli in computer analysis.

REFERENCES

1. Reddy, R., and Busch, H. (1983) in Progress in Nucleic Acid Research and Molecular Biology (Cohn, W. E., ed) Vol. 30, pp. 127-162, Academic Press, Orlando, FL
2. Reddy, R., and Busch, H. (1981) in The Cell Nucleus (Busch, H., ed) Vol. 8, pp. 261-306, Academic Press, Orlando, FL
3. Busch, H., Reddy, R., Rothblum, L., and Choi, Y. C. (1982) Annu. Rev. Biochem. 51, 617-654
4. Schaufele, F., Gilmartin, G. M., Bannwarth, W., and Birnstiel, M. L. (1986) Nature 323, 777-781
5. Zhuang, Y., and Weiner, A. M. (1986) Cell 46, 827-835
6. Zhuang, Y., Leung, H., and Weiner, A. M. (1987) Mol. Cell. Biol. 7, 3018-3020
7. Parker, R., Siciliano, P. G., and Guthrie, C. (1987) Cell 49, 229-239
8. Prestayko, A. W., Tonato, M., and Busch, H. (1970) J. Mol. Biol. 47, 505-515
9. Epstein, P., Reddy, R., and Busch, H. (1984) Biochemistry 23, 5421-5425
10. Bachellerie, J. P., Michot, B., and Raynal, F. (1983) Mol. Biol. Rep. 9, 79-86
11. Tague, B. W., and Gerbi, S. A. (1984) J. Mol. Evol. 20, 362-367
12. Crouch, R. J., and Bachellerie, J. P. (1986) in DNA Systematics (Dutta, S. K., ed) Vol. 1, pp. 47-80, CRC Press, Boca Raton, FL
13. Craig, N., Kass, S., and Sollner-Webb, B. (1987) Proc. Natl. Acad. Sci. U. S. A. 84, 629-633
14. Bourbon, H., Michot, B., Hassouna, N., Feliu, J., and Bachellerie, J. P. (1988) DNA (NY) 7, 181-191
15. Denison, R. A., and Weiner, A. M. (1982) Mol. Cell. Biol. 2, 815-828
16. Bernstein, L. B., Mount, S. M., and Weiner, A. M. (1983) Cell 32, 461-472
17. Stroke, I. L., and Weiner, A. M. (1985) J. Mol. Biol. 184, 183-193
18. Reddy, R., Henning, D., Chirala, S., Rothblum, L., Wright, D., and Busch, H. (1985) J. Biol. Chem. 260, 5715-5719
19. Ferrer, P., Qu, L. H., Bouche, G., and Bachellerie, J. P. (1986) FEBS Lett. 204, 307-312
20. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
21. Messing, J., Crea, R., and Seeburg, P. H. (1981) Nucleic Acids Res. 9, 309-321
22. Sanger, F., Nicklen, S., and Coulson, A. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 5463-5467
23. Colman, A. (1984) in Transcription and Translation: A Practical Approach (Hames, B. D., and Higgins, S. J., eds) pp. 49-69, IRL Press, Oxford
24. Mead, D. A., Szczesna-Skorupa, E., and Kemper, B. (1986) Protein Eng. 1, 67-74
25. Le Meur, M., Glanville, N., Mandel, J. L., Gerlinger, P., Palmiter, R., and Chambon, P. (1981) Cell 23, 561-571
26. Suh, D., Busch, H., and Reddy, R. (1986) Biochem. Biophys. Res. Commun. 137, 1133-1140
27. Hughes, J. M. X., Konings, D. A. M., and Cesareni, G. (1987) EMBO J. 6, 2145-2155
28. Jeppesen, C., Stebbins-Boaz, B., and Gerbi, S. A. (1988) Nucleic Acids Res. 16, 2127-2148
29. Levinson, G., and Gutman, G. A. (1987) Mol. Biol. Evol. 4, 203-221
30. Ares, M., Jr., Mangin, M., and Weiner, A. M. (1985) Mol. Cell. Biol. 5, 1560-1570
31. Ciliberto, G., Buckland, R., Cortese, R., and Philipson, L. (1985) EMBO J. 4, 1537-1543
32. Hernandez, N. (1985) EMBO J. 4, 1827-1837
33. Yuo, C. Y., Ares, M., Jr., and Weiner, A. M. (1985) Cell 42, 193-202
34. Ach, R. A., and Weiner, A. M. (1987) Mol. Cell. Biol. 7, 2070-2079
35. Soares, M. B., Schon, E., and Efstratiadis, A. (1985) J. Mol. Evol. 22, 117-133
36. Gebhard, W., and Zachau, H. G. (1983) J. Mol. Biol. 170, 2552711
37. Rogers, J. (1983) Nature 306, 113-114
38. Rogers, J. H. (1985) Int. Rev. Cytol. 93, 187-279
39. Madore, S. J., Wieben, E. D., and Pederson, T. (1984) J. Cell Biol. 98, 188-192
40. Wieben, E. D., Nenninger, J. M., and Pederson, T. (1985) J. Mol. Biol. 183, 69-78
41. Madore, S. J., Wieben, E. D., Kunkel, G. R., and Pederson, T. (1984) J. Cell Biol. 99, 1140-1144
42. Kleinschmidt, A. M., and Pederson, T. (1987) Mol. Cell. Biol. 7, 3131-3137
43. Skuzeski, J. M., Lund, E., Murphy, J. T., Steinberg, T. H., Burgess, R. R., and Dahlberg, J. E. (1984) J. Biol. Chem. 259, 8345-8352
44. Murphy, J. T., Skuzeski, J. T., Lund, E., Steinberg, T. H., Burgess, R. R., and Dahlberg, J. E. (1987) J. Biol. Chem. 262, 1795-1803
45. Mangin, M., Ares, M., Jr., and Weiner, A. M. (1986) EMBO J. 5, 987-995
46. Jones, K. A., Kadonaga, J. T., Rosenfeld, P. J., Kelly, J. T., and Tjian, R. (1987) Cell 48, 79-89
47. Dorn, A., Bollekens, J., Staub, A., Benoist, C., and Mathis, D. (1987) Cell 50, 863-872
48. Ares, M., Jr., Chung, J. S., Giglio, L., and Weiner, A. M. (1987) Genes Dev. 1, 808-817
49. Roebuck, K. A., Walker, R. J., and Stumph, W. E. (1987) Mol. Cell. Biol. 7, 4185-4193
50. Kadonaga, J. T., Jones, K. A., and Tjian, R. (1986) Trends Biochem. Sci. 11, 20-23
51. Schon, E. A., Wernke, S. M., and Lingrel, J. B. (1982) J. Biol. Chem. 257, 6825-6835
