The EMBO Journal vol.8 no.13 pp.4091-4097, 1989

Differential expression of two Xenopus proto-oncogenes during development

Sophie Vriz, Michael Taylor¹ and Marcel Méchali
Institut Jacques Monod, Tour 43, 2 place Jussieu, 75251 Paris Cedex 05, France
¹Present address: CRC MERG, Department of Zoology, Downing Street, Cambridge CB2 3EJ, UK
Communicated by J.B.Gurdon

Two distinct Xenopus c-myc cDNA clones have been characterized from an oocyte cDNA library. This allowed a comparison of the c-myc protein sequence across the vertebrate phylum to be made and prominent conservations to be identified. The majority of the sequence differences between the two Xenopus c-myc cDNAs are in the 5' and 3' untranslated regions. Sequence-specific oligonucleotide probes from the 5' untranslated region were used to demonstrate the differential expression of the two c-myc mRNAs during development. One of the mRNAs corresponds to the Xenopus c-myc gene previously reported to be expressed as a stable maternal mRNA uncoupled from cell division during oogenesis (c-myc I). It is the major mRNA species expressed during oogenesis and is expressed again from the zygotic genome in post-gastrula embryos. In contrast, the second c-myc mRNA (c-myc II) is expressed only from the maternal genome during oogenesis. Primer extension experiments show that in the oocyte the transcriptional initiation sites for c-myc I and c-myc II are at different distances from the translational start site. The 'oocyte-specific' and 'somatic-type' developmental regulation of c-myc is reminiscent of polymerase III 5S RNA gene expression in Xenopus, and may provide new insights into the developmental regulation of genes transcribed by RNA polymerase II.
Key words: c-myc/development/transcription/Xenopus

© IRL Press

Introduction

The c-myc gene was first identified as the transforming sequence of the avian retrovirus MC29, and is conserved in evolution across species of the vertebrate phylum. Thus, c-myc has been found in the genome of the human (Bernard et al., 1983), the cat (Stewart et al., 1986), rodents (Bernard et al., 1983; Hayashi et al., 1987), the chicken (Watson et al., 1983), the frog Xenopus laevis (King et al., 1986; Taylor et al., 1986; Nishikura, 1987) and the trout (Van Beneden et al., 1986). Expression of the c-myc proto-oncogene has been implicated in the growth of a variety of both normal and neoplastic cell types (for review see Cole, 1986). Whereas quiescent cells express low levels of both c-myc mRNA and protein, stimulation of virtually all quiescent cells to initiate proliferation leads to a rapid accumulation of c-myc mRNA (Kelly et al., 1983; Cole, 1986). Regulation of c-myc expression occurs both at the transcriptional level (Kelly et al., 1983; Greenberg and Ziff, 1984) and at the post-transcriptional level (Blanchard et al., 1985; Dani et al., 1985). In mammals, three distinct transcription initiation sites have been characterized and designated P0, P1 and P2 (Battey et al., 1983; Bentley and Groudine, 1986a,b). The c-myc protein is located in the nuclei of somatic cells (Alitalo et al., 1983; Hann et al., 1983; Eisenman et al., 1985) and has been characterized as a competence factor required for progression into the S phase (Kaczmarek et al., 1985; Heikkila et al., 1987; Prochownik et al., 1988). Interest in c-myc has recently been increased by the finding of a domain conserved between c-myc and developmentally important regulatory genes such as myoD, Drosophila achaete-scute and twist (Davis et al., 1987; Villares and Cabrera, 1987; Murre et al., 1989).

C-myc has been characterized in the development of the amphibian Xenopus laevis, and was shown to be expressed in the oocyte (King et al., 1986; Taylor et al., 1986; Nishikura et al., 1987) as a member of the class of stable maternal mRNAs (Taylor et al., 1986). The Xenopus c-myc mRNA accumulates to high levels during early oogenesis, and post-transcriptional regulation induced at fertilization results in the degradation of 90% of the c-myc RNA by the time the gastrula stage is reached (Taylor et al., 1986). The gene is further expressed in the whole growing embryo, and high levels of activity are detected in specific tissues such as the epidermis, optic cup and lens placode (Hourdry et al., 1988). Although one major RNA species was detected throughout embryonic development, Southern analysis suggested the possible presence of two distinct c-myc genes in the Xenopus genome (Taylor et al., 1986; Nishikura, 1987). In this study we report the characterization of the complete cDNA sequence of the two Xenopus c-myc genes.
We also show that these genes are differentially regulated, with one c-myc gene active only in oocytes and the other active both in oocytes and in post-gastrula embryos. This situation is reminiscent of 5S RNA gene expression in Xenopus (for review see Wolffe and Brown, 1988), but has not previously been reported for genes transcribed by RNA polymerase II.

Results and Discussion

Characterization of a second c-myc cDNA sequence expressed during oogenesis

We previously reported the partial characterization of a Xenopus c-myc cDNA and its expression during embryonic development (Taylor et al., 1986). Analysis of 29 independent clones of the cDNA library showed that they could be grouped into two classes. Clones of the Xenopus c-myc I class were more abundant than the Xenopus c-myc II clones. The complete sequence of these two cDNAs was determined (Figure 1). The 2333 nucleotide long c-myc I cDNA sequence corresponds to the cDNA previously partially described (Taylor et al., 1986). It contains two potential

[Figure 1: aligned nucleotide and deduced amino acid sequences of the Xenopus c-myc I (2333 bp) and c-myc II (2350 bp) cDNAs.]

Fig. 1. Nucleotide sequence of Xenopus c-myc I and c-myc II cDNA. The comparison of the nucleotide sequence and derived amino acid sequence of Xenopus c-myc I (lines 1 and 2) and c-myc II (lines 3 and 4) cDNA is shown. Dashes represent identity and a gap the absence of a nucleotide or amino acid.

initiation codons and the second ATG codon was designated as the likely initiator of the c-myc I protein, on the basis of homology of this region with the amino terminus of the c-myc protein from other species (Bernard et al., 1983; Watson et al., 1983; Stewart et al., 1986; Van Beneden et al., 1986; Hayashi et al., 1987). In c-myc I, the nucleotide sequence upstream of the initiation codon is in phase with the translated region, raising the possibility of an upstream start site producing a second c-myc protein from the same transcript, as described for human c-myc (Hann et al., 1988). C-myc II has only one initiation codon and, in contrast to c-myc I, there is no open reading frame upstream of the ATG, which suggests the possibility of a functional difference between the two Xenopus c-myc genes described here.

The sequences of these two c-myc cDNAs indicate that the corresponding RNAs are transcribed from two different genes and are not products of different promoters on the same gene, or of differential splicing. Alignment of Xenopus c-myc I and c-myc II shows a high level of sequence similarity, 81% at the nucleotide level and 92% at the amino acid level. The nucleotide sequence similarity in the coding regions of c-myc I and c-myc II is 91.3%. Thus, most of the differences are localized in the 5' and 3' untranslated regions. Both genes have long 3' untranslated sequences, 958 nucleotides for c-myc I and 950 nucleotides for c-myc II, as compared with ~300 nucleotides for mammalian c-myc RNA (Bernard et al., 1983; Stewart et al., 1986; Hayashi et al., 1987). These regions are 66% A-T rich for both genes and contain several ATTTA motifs: c-myc I has six such motifs and c-myc II has four (Figure 3A). These characteristics are common to a number of transiently expressed genes and are thought to mediate selective mRNA degradation (Caput et al., 1986; Shaw and Kamen, 1986). Moreover, conserved elements were identified in the 3' end region which are prevalent in post-transcriptionally regulated genes (Vriz and Méchali, 1989).

[Figure 2: alignment of eight c-myc protein sequences.]

Fig. 2. Amino acid sequence comparison amongst eight c-myc proteins from seven different species. Comparison of the deduced amino acid sequences (in single-letter code) of Xenopus laevis c-myc I and c-myc II proteins with the published sequences for trout, chicken, mouse, rat, cat and human. Dots in the sequence represent gaps which were introduced in c-myc sequences to maximize homology. The alignment was obtained by using a combination of visual fit and computer alignment (Smith and Waterman, 1981). Shaded areas are for an identical amino acid at the same position for at least seven out of eight sequences. Symbols mark conservative and non-conservative substitutions.

Prominent features of the c-myc protein from trout to human

The c-myc protein is highly conserved in vertebrates, the only phylum of organisms where it has been found. The characterization of the cDNA and predicted amino acid sequence of X. laevis c-myc I and c-myc II allowed us to compare the c-myc protein sequence of eight different c-myc genes from different species (Figure 2). Three extensive domains of conservation are revealed, one in the N-terminal and two in the C-terminal region of the c-myc proteins. These regions are essential for the immortalizing activity and for nuclear localization of the c-myc protein (Sarid et al., 1987; Stone et al., 1987). There are also 32 conservative changes in amino acids, which emphasizes further the similarity of the eight proteins (Figure 2). A region rich in acidic amino acids is conserved around position 300. At this position there is a 'PEST' sequence, previously suggested to be a protein instability motif (Rechsteiner et al., 1987), which is conserved amongst the eight c-myc protein sequences (Figure 3B). Basic amino acids are localized in the C-terminal region of the protein, in one of the most conserved domains of the protein, which contains the nuclear migration signal of the human c-myc protein (Dang and Lee, 1988) and, 5' to this signal, the domain conserved between c-myc and myoD, achaete-scute and twist (Murre et al., 1989 and references cited therein). Five cysteine residues are conserved in each of the eight sequences compared, which suggests that they have an important structural role. One of them is found in a cluster of histidines around position 350, in an amino acid sequence characteristic of a potential metal binding domain (Berg, 1986) (Figure 3C). Recently a new category of DNA binding proteins, including fos, jun and the human c-myc protein, was defined as containing a 'leucine zipper' domain (Landschulz et al., 1988). This motif is involved in oligomerization and transforming activity of the human c-myc protein (Dang et al., 1989) and is especially conserved in the evolution of the c-myc protein (Figure 3D). Other conservations were also noted: tryptophan is the least abundant and the most conserved amino acid of c-myc; serine, threonine and proline are the most abundant amino acids, but in contrast to some other nuclear proteins they have no specific localization in the coding sequence. We also noted that a succession of extremely conserved tyrosines (Y) and aspartic acids (D) is located at the beginning of the protein (Figure 2). The comparisons made in this section reveal strong conservations of amino acid sequences that might direct precise mutagenesis studies of the protein in order to dissect specific functions.

[Figure 3: (A) schematic of the c-myc I and c-myc II cDNAs showing the coding sequence, AUUUA repeats and polyadenylation signal; (B-D) aligned protein segments from the eight sequences.]

Fig. 3. Prominent conserved regions in the c-myc protein during evolution. (A) The major observations on the nucleotide and amino acid sequences (boxed areas) of c-myc I and c-myc II are illustrated. (B) Potential protein instability sequence, PEST motif (Pro, Glu, Ser, Thr rich region), see text. (C) Sequence of a possible metal binding domain. (D) Leucine zipper domain. The sequences aligned are trout (T), Xenopus myc I and II (X1 and X2), chicken (CH), mouse (M), rat (R), cat (C) and human (H).
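The similarity figures quoted in this section (percent identity between aligned sequences, A-T richness and ATTTA motif counts in the 3' untranslated regions) are simple to compute once the sequences are in hand. The following is a minimal sketch, not the procedure used in the paper: the function names are hypothetical, gap columns are assumed to be marked with "." or "-", and the short strings in the usage lines are invented toy data rather than c-myc sequence.

```python
def percent_identity(seq_a: str, seq_b: str, gap_chars: str = ".-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns where a gap was introduced
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def at_content(seq: str) -> float:
    """Percentage of A and T (or U) residues in a nucleotide sequence."""
    seq = seq.upper().replace("U", "T")
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


def count_motif(seq: str, motif: str = "ATTTA") -> int:
    """Count motif occurrences, allowing overlaps (ATTTATTTA contains two)."""
    seq = seq.upper().replace("U", "T")
    count = 0
    start = 0
    while True:
        hit = seq.find(motif, start)
        if hit == -1:
            return count
        count += 1
        start = hit + 1  # advance by one so overlapping copies are found


# Toy data, invented for illustration only.
print(percent_identity("ACGT-A", "ACCTTA"))  # 4 matches in 5 ungapped columns -> 80.0
print(count_motif("GGATTTATTTACC"))          # two overlapping ATTTA copies -> 2
```

Whether gap columns are excluded from the denominator, as assumed here, changes the percentage, so any comparison with the published values would need to use the same convention as the original alignment.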

Fig. 4. Characterization of the two c-myc mRNAs. (A) Total cellular RNA isolated from oocytes (stage IV and V) and embryos at different stages of development (Nieuwkoop and Faber, 1956) was analysed for c-myc content by Northern blot analysis (see Materials and methods). The filter was probed with a HincII-SacI DNA fragment conserved in both c-myc I and c-myc II. (B) RNA samples from oocyte stage V (lane 1) and embryo stage 46 (lane 2) were loaded side by side on wide slots to increase resolution. (C) Oocyte poly(A)+ RNA was used for this experiment to increase the sensitivity of the assay. The Northern blot was first probed with an oligonucleotide specific for c-myc I mRNA (lane 1) and further hybridized with an oligonucleotide specific for c-myc II mRNA (lane 2), without dehybridization. Oligonucleotides were the same as used for the primer extension experiment (Figure 6). Oocyte stages are according to Dumont (1972).

Differential expression of myc I and myc II genes

The presence of two different c-myc cDNAs in the oocyte library from oocyte poly(A)+ RNA indicated that both c-myc I and c-myc II were expressed during oogenesis. However, previous analyses of c-myc expression during Xenopus development did not reveal the presence of two different c-myc RNAs (King et al., 1986; Taylor et al., 1986; Nishikura, 1987). Therefore, Northern blot analysis of total

[Figure 5: Northern blots probed with (A) the c-myc I-specific and (B) the c-myc II-specific oligonucleotide.]

Fig. 5. Differential expression of Xenopus c-myc I and c-myc II genes during oogenesis and embryonic development. Total cellular RNA isolated from oocytes and embryos at different stages of development (Nieuwkoop and Faber, 1956) was analysed by Northern blot analysis (see Materials and methods). (A) The filter was probed with the oligonucleotide specific for c-myc I mRNA. (B) The filter was probed with the oligonucleotide specific for c-myc II mRNA. RNA was from total oocyte (O), staged oocyte (O4, O5), egg (1), cleavage stage (2-8), gastrula (11), neurula (16), tailbud (25-28), tadpole (36-45), according to Dumont (1972). Staining of the gel after transfer, as well as detection of 18S and 28S rRNA on the filter, showed that RNA transfer was homogeneous.

RNA extracted from oocytes and embryos was performed after long runs at low voltage on denaturing agarose gels, to increase the resolution of RNA bands. Blots of these gels were hybridized with a Xenopus probe homologous to both c-myc I and c-myc II (positions 675-1170, Figure 1). Figure 4(A) shows that in the pool of maternal RNA the band of 2.5 kb is actually a doublet of 2.5 kb (the major species) and 2.7 kb. There is also a 1.8 kb RNA species detected mainly during oogenesis, as previously described (King et al., 1986; Taylor et al., 1986). Most of the store of maternal 2.5 kb RNA, which is attributable to c-myc I (see below), is degraded during the early cleavage stages with a minimum level at the gastrula. From neurula onward a gradual increase in the steady-state level of this RNA was detected, presumably as a consequence of zygotic transcription. A slight decrease in the mol. wt of this RNA was also observed during this period and confirmed in Figure 4(B), when stage V oocyte and stage 46 embryo RNAs were electrophoresed side by side. This might be due to adenylation changes in the mRNA population during oocyte maturation and embryo development (Darnbrough and Ford, 1979; Dworkin and Dworkin-Rastl, 1985; M.Philippe, personal communication). In contrast to the 2.5 kb RNA, the second transcript of 2.7 kb detected during oogenesis, which is attributable to c-myc II (see below), was degraded during the early cleavage stages and did not reappear during later stages of development (Figure 4A). To demonstrate that the 2.5 and 2.7 kb mRNAs corresponded to the transcription of the Xenopus c-myc I and c-myc II genes, specific oligonucleotide probes for each of c-myc I and c-myc II were used. As the 5' untranslated regions differ between these two mRNAs, two synthetic

[Figure 6: (A) primer extension gel with oocyte and embryo RNA; (B) schematic of the 5' ends.]

Fig. 6. Localization of the 5' end of c-myc I and c-myc II transcription units. (A) Primer extension experiments of c-myc I and c-myc II mRNAs in the oocyte and embryo. Lanes 1 and 2 are oocyte mRNA primed either with c-myc I oligodeoxynucleotide (1) or c-myc II oligodeoxynucleotide (2). Lanes 3 and 4 are embryo mRNA primed either with c-myc I oligodeoxynucleotide (3) or c-myc II oligodeoxynucleotide (4). The marker (M) is an end-labelled pBR322 HpaII digest. (B) Summary of the primer extension experiment: the 5' ends of the mRNA are shown, upstream of the oligonucleotide primers (boxed).

gene-specific antisense 29mer oligodeoxynucleotide probes were prepared from this region. These probes were hybridized to Northern blots of poly(A)+ RNA extracted from oocytes. When the antisense oligonucleotide specific for c-myc I was used, the 2.5 kb RNA was the major species detected (Figure 4C, lane 1). The 1.8 kb RNA was also detected, indicating that it also results from expression of c-myc I. The same Northern blot was then hybridized with the antisense oligonucleotide specific for c-myc II, without previous dehybridization. An additional 2.7 kb mRNA species was detected, indicating that it corresponds to the transcription of Xenopus c-myc II (Figure 4C, lane 2). The c-myc II transcript accounted for 5-10% of the total c-myc RNA in the oocyte (Figure 4A and C), in agreement with the ratio of c-myc I to c-myc II clones obtained during the screening of the oocyte library. A more detailed analysis of the expression of the two Xenopus c-myc genes was carried out with the specific antisense oligodeoxynucleotides. Northern blots of total RNA extracted from different embryonic stages were probed with either the oligonucleotide specific for c-myc I or the oligonucleotide specific for c-myc II. Figure 5(A) shows that c-myc I is expressed during both oogenesis and embryonic development, as previously observed for the major c-myc RNA species (Taylor et al., 1986) (Figure 4). In contrast, the expression of c-myc II does not follow the same pattern (Figure 5B). The 2.7 kb c-myc II RNA is expressed during oogenesis, and is progressively degraded after fertilization, but is not detected again during later stages of development. A faint band at 2.5 kb was also detected with the oligonucleotide specific for c-myc II RNA. It is unlikely to be the result of a cross-hybridization with c-myc I mRNA as this signal did not reappear after the gastrula stage. It is formally possible that the use of 5' end-labelled oligonucleotide probes, together with the fact that c-myc II was less expressed than c-myc I, might have prevented the detection of c-myc II mRNA during late stages. However, overexposures of the Northern blot, together with repeated hybridizations in separate experiments (data not shown), and primer extension experiments (see below) have all failed to detect the zygotic expression of Xenopus c-myc II. It therefore appears that transcription from c-myc II is indeed oocyte specific. The c-myc I and c-myc II RNAs are the products of two distinct genes (see above). As X. laevis duplicated its genome 30 million years ago (Bisbee et al., 1977), most genes detected so far in Xenopus, including c-myc (Taylor et al., 1986; Nishikura et al., 1987), are found in two copies per haploid genome. The conservation of both c-myc sequences in an active form during evolution might indicate that both genes are functionally important. Our finding of oocyte-specific expression from one gene and oocyte and zygotic expression from the other is the first demonstration, as far as we are aware, of differential expression of two duplicated polymerase II-transcribed genes in Xenopus development. It would be interesting to know how widespread this phenomenon is.

Localization of the 5' ends of Xenopus c-myc I and c-myc II transcription units

cDNA sequence analyses revealed no difference in size in the 3' untranslated sequences of c-myc I and c-myc II mRNA that could account for the difference of 200 nucleotides observed by Northern blot analysis.
To detect a possible difference at the 5' end, primer extension experiments were carried out using the antisense oligonucleotides specific for c-myc I or c-myc II RNA as gene-specific primers. The two antisense oligonucleotides are at the same distance (29 nucleotides) from the 5' end of the translated sequence. Oocyte or embryonic poly(A)+ RNA was hybridized with either the c-myc I or c-myc II primer oligonucleotide and extended using reverse transcriptase (Figure 6). With oocyte RNA (lanes 1 and 2) one major end product was found in c-myc I RNA (P2) extending 140 nucleotides beyond the primer, together with a longer product (P1) separated by 47 nucleotides from P2. The 5' end of c-myc II RNA was longer with a maximum end product (P') extending 361 nucleotides beyond the primer. The diffuse pattern of RNA extension observed is at least partly due to pauses by the reverse transcriptase during elongation of the primer rather than multiple initiation points, as this heterogeneity was not apparent on the Northern blot experiments. However, multiple points for transcript initiation have been observed in several other genes, e.g. N-myc (Kohl et al., 1986) and c-myb (Bender and Kuehl, 1986). With embryonic RNA (lanes 3 and 4) one major end product was detected in c-myc I RNA (P2) (Figure 6 and overexposure of the same gel not shown), whereas no signal was detected with the c-myc II antisense oligonucleotide. This result was in agreement with the observation that there is no zygotic expression of c-myc II. In summary (Figure 6B),


the size difference between oocyte c-myc I and c-myc II RNA is due to a 220 nucleotide longer 5' end region in c-myc II RNA, and in post-gastrula embryos c-myc I RNA is transcribed from an initiation site 200 nucleotides upstream of the initiation codon. Taken together these data suggest that Xenopus c-myc I mRNA is transcribed from two promoters, P1 and P2, with a preference for P2, whereas c-myc II mRNA has an upstream start site, P'. The human c-myc gene is mainly transcribed from two promoters, P1 and P2, separated by 150 bases (Battey et al., 1983), and P2 is the major start site. In addition, an upstream promoter, P0, was found to be active in a B cell lymphoma, giving rise to a diffuse pattern of starts and contributing 5% of the total c-myc transcripts (Bentley and Groudine, 1986a). C-myc mRNAs initiated at the P0 site appear to be the most stable (Bentley and Groudine, 1986b). In the normal regulation of c-myc, 5' sequences might be involved in post-transcriptional regulation (Pei and Calame, 1988) or translational regulation (Darveau et al., 1985; Parkin et al., 1988), although the major post-transcriptional sequences involved in c-myc mRNA stability are localized in the 3' end of this RNA (Jones and Cole, 1987). It is striking to note that in Xenopus the major c-myc mRNA population originates from P1 and P2 starts on the c-myc I gene, whereas the minor c-myc mRNA of the oocyte originates from diffuse starts at an upstream P' promoter on the c-myc II gene. This observation could reflect the use of two different genes to produce in Xenopus the range of RNA species observed with mammalian c-myc (Lindsten et al., 1988).

Differential regulation of the two c-myc genes during embryonic development

C-myc I is expressed from the maternal genome in oocytes as well as from the zygotic genome in post-gastrula embryos. In contrast, c-myc II is only expressed from the maternal genome. We do not know the reason for this differential activity of the Xenopus c-myc I and c-myc II genes, but the conservation of their potential for transcription during evolution together with the minor changes observed in their coding sequence suggests it may be significant. The differential activity of two very closely related genes during oogenesis and embryonic development has been described previously for the polymerase III-transcribed Xenopus 5S RNA genes. Two classes of 5S RNA genes, which differ by only 6 out of 120 nucleotides, are expressed in Xenopus: the oocyte type is active only in oocytes, whereas the somatic type is active in both oocytes and post-gastrula embryos (for review see Wolffe and Brown, 1988). The study of the regulation of these genes gave prominence to the role of specific factors in the regulation of genes transcribed by RNA polymerase III. Similarly, the differential expression of the Xenopus c-myc I and c-myc II genes might provide new insights into the mechanism by which a gene transcribed by RNA polymerase II is developmentally regulated.

Materials and methods

Oocytes, eggs and embryos
Xenopus animals imported from South Africa (South Africa Farms, Fish Hoek) were accommodated and fed as described (Gurdon, 1967), and oocytes, eggs and embryos were collected as described previously (Taylor et al., 1986).

Molecular cloning and sequencing
Xenopus c-myc cDNA was isolated from a cDNA library prepared from the poly(A)+ RNA of defolliculated oocytes (Rebagliati et al., 1985). Screening of the recombinant library was performed as described (Taylor et al., 1986). Positive clones were subcloned either in Bluescribe (Genofit) or in M13 vectors as described (Messing, 1983). Sequences were determined by the method of Sanger et al. (1977) on either single-stranded or double-stranded DNA. Subclones were obtained from defined restriction fragments or by progressive digestion of a fragment with exonuclease III (Taylor et al., 1986).

RNA preparation and hybridization
RNA was prepared and separated on an agarose gel as described (Taylor et al., 1986). Northern blots were on Hybond N nylon membranes (Amersham) and hybridization was with Xenopus c-myc single-strand probes prepared as described (Messing, 1983), or with oligonucleotide probes. With the single-strand probe, the radioactive strand was isolated by acrylamide gel electrophoresis followed by electroelution. Hybridization was at 42°C in 50% formamide as described (Maniatis et al., 1982). The oligonucleotide probes specific for c-myc I and c-myc II were 5'-CTAATCCCGGGATAATACGAGTCCAATAT-3' and 5'-TATCGCTGATAGTGATGCAGATCTCAGGA-3', respectively. These were labelled with polynucleotide kinase (Boehringer) as described (Maniatis et al., 1982) and the labelled oligonucleotide probes were purified on a Biogel P6 column. Hybridization was at 50°C and washing was up to 2 x SSPE, 1% SDS, 45°C.

Primer extension assay
Gene-specific oligonucleotide primers (see above) were labelled with [γ-32P]ATP and polynucleotide kinase to ~6 x 10^6 c.p.m./pmol and purified on a DE-52 ion-exchange column. Poly(A)+ RNA (7 µg) and 0.1 pmol of labelled primer were ethanol precipitated, resuspended in 16 µl of 15 mM Tris-HCl, pH 8, 1 mM EDTA and denatured (2 min, 85°C). NaCl (1.3 µl, 5 M) was added and the sample incubated (1 h, 60°C). The following were quickly added: 100 mM Tris-HCl, pH 8.3, 140 mM KCl, 10 mM MgCl2, 20 mM β-mercaptoethanol, 0.4 mM each of dGTP, dCTP, TTP, dATP, 45 µg/ml actinomycin D, 4% PEG 6000 and 20 U reverse transcriptase (Genofit). The mix (80 µl) was incubated at 42°C for 1 h, chloroform extracted, ethanol precipitated and resuspended in TE/formamide sequencing dyes for analysis on a 6% sequencing gel.
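As a rough cross-check of the primer extension figures, the short Python sketch below recomputes the sizes quoted in the text from the two primer sequences given above and the extension lengths reported for Figure 6. It is only an illustrative calculation, not part of the original protocol: the function and variable names are hypothetical, and it assumes that the P2 and P' extensions are measured from equivalent primer positions (the text states that both primers lie at the same distance from the translated sequence).

```python
# Antisense primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "c-myc I": "CTAATCCCGGGATAATACGAGTCCAATAT",
    "c-myc II": "TATCGCTGATAGTGATGCAGATCTCAGGA",
}

# Extension lengths beyond the primer (nucleotides), as reported for Figure 6.
P2_EXTENSION = 140       # major c-myc I product (P2)
P1_P2_SPACING = 47       # the longer c-myc I product (P1) lies this far beyond P2
P_PRIME_EXTENSION = 361  # longest c-myc II product (P')

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the sense-strand sequence that an antisense oligonucleotide anneals to."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in the oligonucleotide."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def extension_summary() -> dict:
    """Derive the P1 extension and the P' minus P2 size difference.

    Assumption (not stated explicitly in the text): both extensions are
    measured from equivalent primer positions, so they can be subtracted.
    """
    return {
        "P1_extension": P2_EXTENSION + P1_P2_SPACING,
        "P_prime_minus_P2": P_PRIME_EXTENSION - P2_EXTENSION,
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), "nt",
              f"GC={gc_fraction(seq):.2f}",
              "target:", reverse_complement(seq))
    print(extension_summary())
```

Under these assumptions the difference between P' and P2 comes out at 221 nucleotides, in line with the "220 nucleotide longer 5' end region" quoted for c-myc II RNA, and both primers are 29-mers.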

Acknowledgements
We thank A.L.Haenni and M.Nadal for critical reading and comments. We also thank M.Leibovici, Y.Andeol and M.Gusse for helpful discussions. This work was supported by the Association pour la Recherche sur le Cancer, INSERM and Ligue Nationale Française contre le Cancer. M.V.T. was the recipient of an EMBO long-term fellowship.

References
Alitalo,K., Ramsay,G., Bishop,J.M., Pfeifer,S., Colby,W. and Levinson,A. (1983) Nature, 306, 274-277.
Battey,J., Moulding,C., Taub,R., Murphy,W., Stewart,T., Potter,H., Lenoir,G. and Leder,P. (1983) Cell, 34, 779-787.
Bender,T.P. and Kuehl,W.M. (1986) Proc. Natl. Acad. Sci. USA, 83, 3204-3208.
Bentley,D.L. and Groudine,M. (1986a) Nature, 312, 702-706.
Bentley,D.L. and Groudine,M. (1986b) Mol. Cell. Biol., 6, 3481-3489.
Berg,J.M. (1986) Science, 232, 485-487.
Bernard,O., Cory,S., Gerondakis,S., Webb,E. and Adams,J.A. (1983) EMBO J., 2, 2375-2383.
Bisbee,C.A., Baker,M.A., Wilson,A.C., Hadji-Azimi,I. and Fischberg,M. (1977) Science, 195, 785-787.
Blanchard,J.M., Piechaczyk,M., Dani,C., Chambard,J.C., Franchi,A., Pouyssegur,J. and Jeanteur,P. (1985) Nature, 317, 443-445.
Caput,D., Beutler,B., Hartog,K., Thayer,R., Brown-Shimer,S. and Cerami,A. (1986) Proc. Natl. Acad. Sci. USA, 83, 1670-1674.
Cole,M.D. (1986) Annu. Rev. Genet., 20, 361-384.
Dang,C.V. and Lee,W.M.F. (1988) Mol. Cell. Biol., 8, 4048-4054.
Dang,C.V., McGuire,M., Buckmire,M. and Lee,W.M.F. (1989) Nature, 337, 664-666.
Dani,C., Mechti,N., Piechaczyk,M., Lebleu,B., Jeanteur,P. and Blanchard,J.M. (1985) Proc. Natl. Acad. Sci. USA, 82, 4896-4899.
Darnbrough,C. and Ford,P.J. (1979) Dev. Biol., 71, 323-340.
Darveau,A., Pelletier,J. and Sonenberg,N. (1985) Proc. Natl. Acad. Sci. USA, 82, 2315-2319.
Davis,R.L., Weintraub,H. and Lassar,A.B. (1987) Cell, 51, 987-1000.
Dumont,J.N. (1972) J. Morphol., 136, 153-164.
Dworkin,M.B. and Dworkin-Rastl,E. (1985) Dev. Biol., 112, 451-457.
Eisenman,R.N., Tachibana,C.Y., Abrams,H.D. and Hann,S.R. (1985) Mol. Cell. Biol., 5, 114-126.
Greenberg,M.E. and Ziff,E.B. (1984) Nature, 311, 433-438.
Gurdon,J.B. (1967) In Wilt,F.H. and Wessels,L.K. (eds), Methods in Developmental Biology. Crowell, New York, pp. 75-84.
Hann,S.R., Abrams,H.D., Rohrschneider,L.R. and Eisenman,R.N. (1983) Cell, 34, 789-798.
Hann,S., King,M., Bentley,D., Anderson,C. and Eisenman,R. (1988) Cell, 52, 185-195.
Hayashi,K., Makino,R., Kawamura,H., Arisawa,A. and Yoneda,K. (1987) Nucleic Acids Res., 15, 6419-6436.
Heikkila,R., Schwab,G., Wickstrom,E., Loke,S.L., Pluznik,D.H., Watt,R. and Neckers,L.M. (1987) Nature, 328, 445-449.
Hourdry,J., Brulefert,A., Gusse,M., Schoevaert,D., Taylor,M.V. and Mechali,M. (1988) Development, 104, 631-641.
Jones,T.R. and Cole,M.D. (1987) Mol. Cell. Biol., 7, 4513-4521.
Kaczmarek,L., Calabretta,B. and Baserga,R. (1985) Proc. Natl. Acad. Sci. USA, 82, 5375-5379.
Kelly,K., Cochran,B.H., Stiles,C.D. and Leder,P. (1983) Cell, 35, 603-610.
King,M.W., Roberts,J.M. and Eisenman,R.N. (1986) Mol. Cell. Biol., 6, 4499-4508.
Kohl,N.E., Legouy,E., DePinho,R.A., Nisen,P.D., Smith,R.K., Gee,C.E. and Alt,F.N. (1986) Nature, 319, 73-77.
Landschulz,W.H., Johnson,P.F. and McKnight,S.L. (1988) Science, 240, 1759-1764.
Lindsten,T., June,C.H. and Thompson,C.B. (1988) EMBO J., 7, 2787-2794.
Maniatis,T., Fritsch,E.F. and Sambrook,J. (1982) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Messing,J. (1983) Methods Enzymol., 101, 21-78.
Murre,C., Schonleber McCaw,P. and Baltimore,D. (1989) Cell, 56, 777-783.
Nieuwkoop,P.D. and Faber,J. (1956) Normal Table of Xenopus laevis Daudin. North Holland, Amsterdam.
Nishikura,K. (1987) Oncogene Res., 1, 179-191.
Parkin,N., Darveau,A., Nicholson,R. and Sonenberg,N. (1988) Mol. Cell. Biol., 8, 2875-2888.
Pei,R. and Calame,K. (1988) Mol. Cell. Biol., 8, 2860-2868.
Prochownik,E.V., Kukonska,J. and Rodgers,C. (1988) Mol. Cell. Biol., 8, 3683-3695.
Rebagliati,M.R., Weeks,D.L., Harvey,R.P. and Melton,D.A. (1985) Cell, 42, 769-777.
Rechsteiner,M., Rogers,S. and Rote,K. (1987) Trends Biochem. Sci., 12, 390-394.
Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463-5467.
Sarid,J., Halazonetis,T.H., Murphy,W. and Leder,Ph. (1987) Proc. Natl. Acad. Sci. USA, 84, 170-173.
Shaw,G. and Kamen,R. (1986) Cell, 46, 659-667.
Smith,T.F. and Waterman,M.S. (1981) Adv. Appl. Math., 2, 482-489.
Stewart,M.A., Forrest,D., McFarlane,R., Onions,D., Wilkie,N. and Neil,J.C. (1986) Virology, 154, 121-134.
Stone,J., de Lange,T., Ramsay,G., Jakobovits,E., Bishop,J.M., Varmus,H. and Lee,W. (1987) Mol. Cell. Biol., 7, 1697-1709.
Taylor,M.V., Gusse,M., Evans,G.I., Dathan,N. and Mechali,M. (1986) EMBO J., 5, 3563-3570.
Van Beneden,R.J., Watson,D.K., Chen,T.T., Lautenberger,J.A. and Papas,T.S. (1986) Proc. Natl. Acad. Sci. USA, 83, 3698-3702.
Villares,R. and Cabrera,C.V. (1987) Cell, 50, 415-424.
Vriz,S. and Mechali,M. (1989) FEBS Lett., 251, 201-206.
Watson,D.K., Reddy,E.P., Duesberg,P.H. and Papas,T.S. (1983) Proc. Natl. Acad. Sci. USA, 80, 2146-2150.
Wolffe,A.P. and Brown,D.D. (1988) Science, 241, 1626-1632.

Received on August 14, 1989; revised on September 6, 1989
