J Mol Evol (1987) 24:297-308

Journal of Molecular Evolution © Springer-Verlag New York Inc. 1987

Nucleotide Sequence of the Delta-Beta-Globin Intergenic Segment in the Macaque: Structure and Evolutionary Rates in Higher Primates

P. Savatier, G. Trabuchet, Y. Chebloune, C. Faure, G. Verdier, and V.M. Nigon

Département de Biologie Générale et Appliquée, UA 92, Université Claude Bernard Lyon I, 43 Boulevard du 11 novembre 1918, 69622 Villeurbanne Cedex, France

Summary. A 5600-base-pair (bp) fragment including the beta-globin gene and about 4000 bp of its 5' flanking sequence was cloned from the DNA of Macaca cynomolgus (an Old World monkey), and the 5' flanking region was sequenced. Comparison with human, chimpanzee, mouse, rabbit, and Xenopus orthologous sequences reveals a tandemly repeated sequence called RS4 at the same position (about 500 bp 5' from the transcription start of the adult beta-globin gene) in all six species. We suggest that a tandemly repeated sequence has been maintained by functional constraints since the divergence between amphibians and reptiles. Excluding tandemly repeated sequences as well as about 400 nucleotides upstream from the cap site, the average base substitution frequencies among human, chimpanzee, and macaque intergenic sequences were calculated. They appear to be strongly correlated with the delta T50 values measured between the corresponding nuclear DNAs. They are also similar to base substitution frequencies calculated by Chang and Slightom (1984) at the pseudo-eta-globin locus. Thus, exclusion of sequences involved in specific modes of variation might allow the use of intergenic sequences for the accurate calculation of genetic distances. Using a time scale based on the dating of the Atlantic split, we estimate the base substitution rate of primate noncoding DNA to be 1.0 x 10^-9 substitution/site/year.

Offprint requests to: P. Savatier

Key words: DNA sequencing -- Macaque -- Intergenic DNA -- Repeated sequences -- Delta T50 -- DNA divergence rate

Introduction

For purposes of studying the tempo of genomic DNA evolution, nucleic acid comparisons have been carried out mainly by three different methods: DNA-DNA annealing and both restriction map and nucleotide sequence comparisons. In the first method, DNA-DNA heteroduplexes are formed between single-copy sequences of the nuclear DNAs. The number of mismatches, corresponding to nucleotide differences, between the two DNAs is measured by the decrease in the thermal stability of these heteroduplexes (Benveniste and Todaro 1975, 1976). This method cannot be used to compare distantly related species since their DNAs do not hybridize with sufficient efficiency. Restriction map comparisons (Helm-Bychowsky and Wilson 1986) require large nucleotide segments to obtain accurate estimates of evolutionary distances. However, as with the heteroduplex method, this method cannot be used to compare distantly related species because their DNAs may have undergone local rearrangements (duplications, deletions, or inversions).

The third method makes use of nucleotide sequence comparisons between orthologous sequences. Until recently, most of these comparisons were carried out using coding sequences. These comparisons led to important findings although the models used in the calculations contained assumptions that were not strictly valid (Miyata and Yasunaga 1980; Perler et al. 1980; Li et al. 1985). Moreover, this method cannot be used to compare closely related species because the number of nucleotide differences between homologous silent positions is too small to give accurate estimates of evolutionary distances. Recently, the analysis of the primate beta-globin pseudogene, on which selection does not act at the translation level, has provided more accurate measurements of neutral substitution rates in various primate lineages (Chang and Slightom 1984; Goodman et al. 1984; Harris et al. 1984). However, these comparisons cannot be extended to more distant mammals, since in artiodactyls, rodents, and lagomorphs, the descendants of the ancestral proto-eta gene either have been deleted or are still functionally active (Collins and Weissman 1984; Goodman et al. 1984). Consequently, none of these methods is able to provide a comparison between both closely and distantly related species with high efficiency.

Extragenic sequence comparisons would most probably resolve this problem since large nucleotide segments are available for these and orthologous relationships between "beta-like" globin intergenic sequences of different mammalian species are now well established (Collins and Weissman 1984; Goodman et al. 1984). However, very few comparisons of large intergenic sequences have been carried out (Maeda et al. 1983; Sawada et al. 1983; Savatier et al. 1985). Consequently, we lack information about both their features and their evolution. Little is known about functional constraints acting on intergenic sequences, although recent studies on globin gene expression suggest that they may contain controlling elements involved in gene expression (Semenza et al. 1984a; Wright et al. 1984; Charnay et al. 1985; Townes et al. 1985; Fordis et al. 1986). Therefore, it is of paramount importance to determine whether or not extragenic sequences evolve at a rate similar to the putatively neutral rate found in pseudogenes.

In this context, we had previously sequenced most of the delta-beta-globin intergenic segment from chimpanzee (Savatier et al. 1985). We report here the homologous sequence from an Old World monkey, Macaca cynomolgus, and compare it with its human and chimpanzee counterparts. We report also a sequence comparison between part of the human delta-beta-globin intergenic segment and the orthologous pseudo-beta-h3-beta-1 dmaj segment from mouse (Gilmour et al. 1984). We have identified local variations of the DNA structure or of its evolutionary rate that might be related to local functional constraints. We have also calculated the base substitution rate in primate extragenic DNA.

Materials and Methods

DNA Isolation. DNA from a macaque (Macaca cynomolgus) was prepared from spleen provided by Rhône-Mérieux. DNA from a lowland gorilla (Gorilla gorilla) was prepared from a blood sample provided by the Centre International de Recherches Médicales de Franceville (Gabon). DNA was purified by standard methods.

Cloning Procedure. Blot hybridization experiments were carried out to identify the EcoRI-generated restriction fragment containing the beta-globin gene in macaque and gorilla DNAs (data not shown). A macaque 5.6-kb EcoRI fragment (see Fig. 1) and a gorilla 6.4-kb EcoRI fragment (see Fig. 1 in Savatier et al. 1987), both containing most of the beta-globin gene as well as 4-5 kb of its 5' flanking region, were cloned into phage lambda gt WES lambda C (Leder et al. 1977) by using size-fractionated EcoRI digests (5-7 kb) of genomic DNAs. The resulting phages were screened with the homologous human 5.5-kb EcoRI fragment. The 5.6-kb and 6.4-kb inserts from positively hybridizing phages were subcloned into the EcoRI site of pBR328 (macaque) or pEMBL8 (gorilla). We refer to these clones as "Ma beta-Eco 5.6kb" and "Go beta-Eco 6.4kb," respectively.

DNA Sequence Determination. The insert DNAs were digested with BamHI. The resulting fragments were subsequently subcloned into the appropriate M13mp10 or mp11 vectors (Messing and Vieira 1982). From these subclones, randomly terminated fragments were generated by Bal 31 nuclease digestion followed by blunt-end ligation into the SmaI site of M13mp10 and mp11 according to the method of Poncz et al. (1982). Enzymatic sequencing was carried out by the method of Sanger et al. (1977) using [35S]dATP, essentially as described by Biggin et al. (1983). Most of the region was analyzed in both directions with overlaps. The strategy for sequencing the Ma beta-Eco 5.6kb clone is presented in Fig. 1B. The strategy for sequencing the gorilla beta-globin gene is presented in Savatier et al. (1987).

Sequence Analysis. Sequencing data were analyzed with the aid of an IBM PC computer and a Hewlett-Packard graphic plotter. For sequence comparisons, we developed a best-alignment program as well as a dot matrix comparison program. The latter allows the selection of an average homology between 0 and 100% inside segments up to 100 nucleotides in length; the algorithm used is similar to that developed by White et al. (1984). The alignment of the human and mouse sequences was done on a VAX computer using the algorithm developed by Needleman and Wunsch (1970). To detect and quantify local repetitions in the sequences studied, we used the method described by Hasson et al. (1984). This method computes a positive value measuring the extent to which each nucleotide falls in or close to repeated sequences. Briefly, the longer a given repetitive stretch, the higher this value is. Two statistical tests were used to investigate local aggregation of nucleotide differences: (1) The "number-of-runs test" analyzes the presence of stretches of differences (Wald and Wolfowitz 1940). (2) The "group test" analyzes the variance of distances between divergence sites and is a powerful detector of local aggregation of nucleotide differences (Dixon 1940).
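The number-of-runs test mentioned above can be illustrated with a short sketch. It is not the authors' program: it assumes the alignment has already been reduced to a 0/1 list (1 = difference, 0 = match) and uses the usual large-sample normal approximation; the function name `runs_test` is hypothetical.

```python
import math


def runs_test(flags):
    """Wald-Wolfowitz number-of-runs test on a 0/1 sequence (illustrative sketch).

    flags[i] is truthy where aligned position i differs between the two
    sequences and falsy where it matches. Returns (runs, z): the observed
    number of runs and its z score under the normal approximation.
    A strongly negative z means fewer runs than expected by chance,
    i.e. differences clustered into stretches.
    """
    n1 = sum(1 for f in flags if f)   # number of differing positions
    n0 = len(flags) - n1              # number of matching positions
    if n0 == 0 or n1 == 0:
        raise ValueError("need at least one match and one difference")
    # A new run starts whenever two neighbouring positions disagree.
    runs = 1 + sum(1 for a, b in zip(flags, flags[1:]) if bool(a) != bool(b))
    n = n0 + n1
    mean = 2.0 * n0 * n1 / n + 1.0
    var = 2.0 * n0 * n1 * (2.0 * n0 * n1 - n) / (n * n * (n - 1.0))
    if var <= 0:
        raise ValueError("sequence too short for the normal approximation")
    return runs, (runs - mean) / math.sqrt(var)
```

For example, ten differences packed into one block of an otherwise matching 50-position alignment give only three runs and a large negative z, whereas differences scattered along the alignment give a z close to zero.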

Results

A 5.6-kb EcoRI fragment from the macaque DNA including the beta-globin gene and its 5' surrounding sequences (Fig. 1A) was cloned. Its nucleotide sequence (5642 nucleotides) was determined and then compared with the homologous human sequence (5548 nucleotides; Poncz et al. 1983) as well as with the chimpanzee sequence (5532 nucleotides; Savatier et al. 1985). The alignment of the 5' flanking regions (4239 bp in macaque, 4154 bp in human, 4138 bp in chimpanzee) is presented in Fig. 2. The nucleotide sequence of both the coding and the intervening parts of the macaque beta-globin gene is reported in Savatier et al. (1987).

Fig. 1A-C. A Linkage map of the macaque delta- and beta-globin gene region deduced from blot hybridization of DNA (data not shown) and showing the Ma beta-Eco 5.6-kb clone. E, EcoRI; B, BamHI; H, HindIII; P, PstI; X, XbaI. B Strategy for sequencing the Ma beta-Eco 5.6-kb clone. Horizontal arrows below map indicate direction of sequencing either from restriction endonuclease sites (filled circles) or from Bal 31-generated ends (open circles). RS1, RS2, RS3, and RS4 indicate locations of the four tandemly repeated sequences and IG1, IG2, IG3, IG4, and IG5 denote the intergenic segments located between the repeat segments (see Results). In bar at right, the hatched area represents the 5' untranslated region, the filled areas represent introns, and the open areas represent exons. C Amount of repetition measured along human and macaque sequences using the method of Hasson et al. (1984). Macaque sequence is represented by solid line, human sequence by dotted line. Numbering begins at the cap site

Intergenic and Repeated Segments

Two kinds of differences are observed: nucleotide substitutions and insertion-deletion events. The latter require one to insert gaps in the sequence to obtain optimal alignment. Insertion-deletions accumulate preferentially in four regions located at homologous positions in the three compared species. Each region corresponds to a long, tandemly repeated sequence of variable length whose probability of random occurrence is lower than 0.001. We refer to these regions as repeated segments (RSs) and number them RS1-RS4. These tandem arrays divide the delta-beta intergenic segment (IG) into five parts, which we designate IG1-IG5 (Fig. 1A).

Using the method developed by Hasson et al. (1984), we calculated a positive value measuring the amount of repetitiveness surrounding each nucleotide of the human and macaque sequences. We took into account homonucleotide repeats Xn with n ≥ 3, and di-, tri-, tetra-, and pentanucleotide repeats with n ≥ 2. A value of zero is attributed to nucleotides that do not belong to tandem arrays. Average values were calculated for overlapping segments of 50 nucleotides (Fig. 1C). In both the human and macaque sequences, we find four main peaks. Each of them is located in a nearly identical position in the two species and corresponds to one of the four RSs described above. These four repeated segments therefore appear as the major repetitive features of the delta-beta intergenic segment.

Their structures are as follows: (1) RS1 is a tandem array of six TG dinucleotides in human, eight in chimpanzee. In the macaque, we find an imperfect repeat (PuPy)52 with 45 TG, 3 TC, and 4 TA dinucleotides. (2) RS2 is a tandem array of 16 or 17 TG dinucleotides in human (Miesfeld et al. 1981), 10 in chimpanzee, and 12 in macaque. (3) RS3 is a tandem array of four, five, or six ATTTT pentanucleotides in human (Spritz 1981; Moschonas et al. 1982), three in chimpanzee, and five in macaque. This perfect repeat is surrounded by imperfect motifs GTTTT, ATTT, CTTTT, or ATTAT in the three compared species. (4) RS4 is represented by dinucleotides AT, AG, or AC leading to an alternating purine-pyrimidine stretch (PuPy)n with n = 26 (Moschonas et al. 1982; Poncz et al. 1983), n = 27 (Semenza et al. 1984b), and n = 30 (Chebloune et al. 1984) in human, n = 32 in chimpanzee, and n = 7 in macaque.

Fig. 2. Nucleotide sequence of the 5' extragenic region of the Ma beta-Eco 5.6-kb genomic clone. In the corresponding human sequence (Poncz et al. 1983) and the corresponding chimpanzee sequence (Savatier et al. 1985), only differences from the macaque sequence are shown. Numbering (from the cap site) is that of the macaque sequence. Gaps used to maximize identity are indicated by x's. Repeated segments (RS) are shown in boldface (RS1 from -3706 to -3595, RS2 from -2687 to -2624, RS3 from -1428 to -1360, RS4 from -569 to -513)

Fig. 3. Dot matrix comparison of part of the nucleotide sequence of the psi-beta-h3-beta-1 dmaj segment from mouse (positions -1 to -1412 with respect to the cap site; Gilmour et al. 1984) with the homologous human sequence (positions -1 to -955 in Fig. 2). The algorithm used allows detection of homologies greater than 50% inside segments of 70 nucleotides
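The repetitiveness profile of Fig. 1C can be approximated with a simplified stand-in. The sketch below is not the statistic of Hasson et al. (1984), whose exact definition is not given here; it only reuses the thresholds quoted in the text (homonucleotide runs of at least 3, di- to pentanucleotide units repeated at least twice) and, as an assumption, scores each nucleotide by the length of the longest qualifying tandem array that covers it. The function names and the window step are hypothetical.

```python
def repeat_scores(seq, max_unit=5):
    """Per-nucleotide tandem-repeat score (simplified, illustrative).

    A position scores 0 if it lies in no tandem array, otherwise the
    length of the longest array (unit size 1..max_unit) covering it, so
    longer repetitive stretches give higher values.
    """
    seq = seq.upper()
    n = len(seq)
    scores = [0] * n
    for k in range(1, max_unit + 1):
        min_copies = 3 if k == 1 else 2   # thresholds quoted in the text
        i = 0
        while i + k <= n:
            # Extend while the sequence keeps repeating with period k.
            j = i + k
            while j < n and seq[j] == seq[j - k]:
                j += 1
            length = j - i
            if length >= k * min_copies:
                for p in range(i, j):
                    scores[p] = max(scores[p], length)
            # Starts between i+1 and j-k would end at the same j, so skip them.
            i = j - k + 1
    return scores


def window_means(scores, window=50, step=10):
    """Average score over overlapping windows (step size is an assumption)."""
    return [sum(scores[i:i + window]) / window
            for i in range(0, len(scores) - window + 1, step)]
```

Applied to a sequence containing a (TG)6 array, every nucleotide of the array scores 12 and the flanking nucleotides score 0; peaks in `window_means` then mark regions such as RS1-RS4.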

A Repeated Sequence Is Located in Mouse DNA at a Position Homologous to That of Primate RS4

To investigate the occurrence of such RSs in more distantly related mammals, we examined available data on the 5' flanking region of the beta-major gene of mouse (1412 nucleotides 5' from the cap site; Gilmour et al. 1984). The 3' end of the mouse pseudo-beta-h3 pseudogene appears by dot matrix analysis to be a descendant of the mammalian ancestral proto delta gene whereas beta-1 dmaj is a descendant of the ancestral proto beta gene (Collins and Weissman 1984; Hardies et al. 1984; Hutchison et al. 1984). Therefore, the mouse psi-beta-h3-beta-1 dmaj intergenic segment and the human delta-beta intergenic segment are presumably orthologous. A dot matrix comparison of this mouse sequence with the homologous human sequence (from position -1 to position -955 in Fig. 2) is presented in Fig. 3. Extensive homology is seen between the human RS4 and a 300-nucleotide segment (positions -835 to -1082 from the cap site) in the mouse sequence. This segment, mainly (AT)n in human DNA, shows a mainly (CATA)n repetitive structure in mouse DNA. Other regions 5' and 3' from RS4 have noticeable homologies with the murine sequence, as indicated by the broken diagonals in Fig. 3. However, the human sequence has suffered a deletion of nearly 300 nucleotides, indicated by the loss of similarity between positions -850 and -550 in the mouse sequence. This deletion is responsible for the difference seen between the positions of human and murine RS4.
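A windowed dot matrix comparison of the kind shown in Fig. 3 can be sketched as follows. This is a minimal illustration, not the authors' program or the algorithm of White et al. (1984): it assumes plain, ungapped windows compared base by base, with the defaults taken from the Fig. 3 legend (more than 50% identity inside 70-nucleotide segments). The function name is hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=70, min_identity=0.5):
    """Return the (i, j) dots of a windowed dot matrix (illustrative sketch).

    A dot is recorded at (i, j) when the windows seq_a[i:i+window] and
    seq_b[j:j+window] are identical at more than min_identity of their
    positions. Runs of dots along a diagonal indicate homologous regions;
    a shift between diagonals indicates an insertion or deletion.
    """
    a, b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(1 for k in range(window) if a[i + k] == b[j + k])
            if matches > min_identity * window:
                hits.append((i, j))
    return hits
```

The brute-force double loop is quadratic in sequence length times the window size, which is acceptable for segments of about a thousand nucleotides such as those compared here.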

Percentage Differences in IGs

Percentage differences calculated among human, chimpanzee, and macaque homologous IGs are presented in Table 1. Roughly similar values were obtained from IG1, IG2, IG3, and IG4. However, in both comparisons (human-chimpanzee and human-macaque), IG5 displays the lowest base substitution frequency. In the human-macaque comparison, the percentage difference for IG5 is significantly lower than those calculated in other segments (P < 0.01). Therefore, we excluded IG5 and calculated an average percentage difference for IG1-IG4: 1.7 ± 0.2% in the human-chimpanzee comparison, and 6.0 ± 0.5% in the human-macaque comparison, both after correction for multiple substitutions.

We used the macaque sequence as an external reference for using the maximum parsimony procedure to determine the homologous sequence in the last common ancestor of human and chimpanzee. We were then able to calculate percentage differences between the ancestral sequence and human or chimpanzee (Table 2). We found no significant difference in base substitution frequency between the Homo and Pan lineages (P > 0.35). In contrast, at the pseudogene eta locus, the chimpanzee sequence has accumulated twice as many substitutions as its human counterpart since they first began to diverge (P < 0.01) (Chang and Slightom 1984). A possible explanation for such a discrepancy is discussed in Savatier et al. (1987).

Analysis of the distribution of substitution sites between human and macaque along each IG segment using the number-of-runs test and the group test shows that substitutions occur entirely randomly with respect to position.

Based on the dot matrix comparison in Fig. 3, we selected two segments orthologous between the mouse psi-beta-h3-beta-1 intergenic segment and the human delta-beta intergenic segment. The first, located on the 5' side of RS4, is 250 nucleotides long. The second, corresponding to the primate IG5, is 500 nucleotides long. Percentages of homology were computed inside subsegments of 50 nucleotides (deletions and insertions were excluded from the calculations). The results are illustrated in Fig. 4. The four subsegments (i.e., 200 nucleotides) located immediately upstream from the beta-globin cap site display more than 70% homology between the two species. For all other subsegments, homology ranges from 54 to 70%. In IG4, we estimated the percentage difference (insertions and deletions excluded) to be 59.5 ± 7.0% after correction for multiple substitutions. Interestingly, substitution sites between human and mouse homologous sequences are not randomly distributed: The group test detects local aggregations of nucleotide differences in IG5 (P < 0.001). This suggests the occurrence of highly mutable regions and/or unequally distributed selective constraints. The occurrence of mutational processes different from independent single substitutions must be considered too.

Table 1. Numbers and percentages of nucleotide substitutions that have accumulated in human, chimpanzee, and macaque intergenic segments

                                          Human-chimpanzee (b)            Human-macaque (b)
Segment (positions in Fig. 2) (a)  Length   T   t1   t2   % Corr.          T   t1   t2   % Corr.
IG1 (-4158 to -3621)                  538   6    1    2   1.7 ± 0.6       20    4    5   5.6 ± 1.1
IG2 (-3572 to -2710)                  863   6    4    0   1.2 ± 0.3       37    6    9   6.3 ± 0.9
IG3 (-2635 to -1449)                 1187  13    4    3   1.7 ± 0.4       43    9    8   5.3 ± 0.7
IG4 (-1380 to -599)                   782   9    5    3   2.2 ± 0.6       30   13    9   7.0 ± 1.0
Total IG1-IG4                        3370  34   14    8   1.7 ± 0.2      130   32   31   6.0 ± 0.5
IG5 (-513 to -1)                      513   7    0    0   1.4 ± [illegible]  11  2    3   3.2 ± 0.8

(a) Numbering is that of the macaque sequence
(b) T, number of transitions; t1, number of T <-> A and C <-> G transversions; t2, number of T <-> G and A <-> C transversions; % corr., percentages of substitutions corrected for multiple hits using the three-substitution type model of Kimura (1981), plus or minus standard error

Table 2. Numbers and percentages of nucleotide substitutions that have accumulated in higher primate intergenic segments since divergence from a common ancestor

Comparison                            T   t1   t2   % Corr.
Homo sapiens vs common ancestor      16    5    7   0.73 ± 0.14
Pan troglodytes vs common ancestor   23    2    7   0.83 ± 0.15

Abbreviations are as defined in Table 1
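The correction for multiple substitutions used in Tables 1 and 2 is attributed to the three-substitution-type model of Kimura (1981). As a hedged illustration, the sketch below applies the distance formula of that model, K = -(1/4) ln[(1 - 2P - 2Q)(1 - 2P - 2R)(1 - 2Q - 2R)], where P, Q, and R are the observed fractions of transitions (T), t1 transversions, and t2 transversions. The function name is hypothetical, and the standard error reported in the tables is not computed here.

```python
import math


def kimura_3st(transitions, tv1, tv2, length):
    """Corrected substitutions per site under Kimura's 3ST model (sketch).

    transitions, tv1, tv2 are the counts T, t1, t2 of Table 1;
    length is the number of aligned sites compared.
    """
    p = transitions / length
    q = tv1 / length
    r = tv2 / length
    a = 1.0 - 2.0 * p - 2.0 * q
    b = 1.0 - 2.0 * p - 2.0 * r
    c = 1.0 - 2.0 * q - 2.0 * r
    if min(a, b, c) <= 0.0:
        raise ValueError("sequences too divergent for this correction")
    return -0.25 * math.log(a * b * c)


# Totals for IG1-IG4 taken from Table 1 (3370 sites compared).
human_macaque = 100.0 * kimura_3st(130, 32, 31, 3370)
human_chimp = 100.0 * kimura_3st(34, 14, 8, 3370)
print(f"human-macaque: {human_macaque:.2f}%  human-chimpanzee: {human_chimp:.2f}%")
```

With the IG1-IG4 totals, the formula gives values close to the corrected percentages reported in the text (about 6.0% and 1.7%), which suggests the counts and the correction are being read consistently.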

Relative Evolutionary Distances and Base Substitution Rates in Noncoding Sequences

In Table 3, we summarize our results for the delta-beta IGs as well as previously reported results on beta-related globin pseudogenes and on intervening and intergenic sequences in the epsilon-gamma-pseudo-eta-delta-beta-globin cluster. For each pair of species (human-chimpanzee, human-gorilla, human-macaque, human-Aotus, and human-Ateles), we report the percentage difference calculated from different nucleotide segments (Table 3, column A), the delta T50H value between the corresponding nuclear DNAs determined by either Sibley and Ahlquist (1984) or O'Brien et al. (1985) (Table 3, column B), and the relative immunological distance calculated by Sarich and Cronin (1976) considering the distance between Old World and New World monkeys as unity (Table 3, column C).

In each pairwise comparison, we did not find any significant difference in percentage difference between the various nucleotide segments, except for the large intervening sequence (IVS2) of the G-gamma-globin gene in both the human-chimpanzee and the human-gorilla comparison. In particular, the percentage differences calculated between either human and chimpanzee or human and macaque eta-globin pseudogenes are very close to those calculated for the delta-beta IGs. We conclude that the delta-beta IGs have evolved at a rate similar to the putatively neutral rate of pseudogenes.

Human-chimpanzee and human-gorilla percentage differences calculated from all available nucleotide segments in the beta-globin gene cluster are not statistically different (human-chimpanzee, 1.61 ± 0.12%; human-gorilla, 1.76 ± 0.22%). This demonstrates once again the close proximity of the human-chimpanzee, human-gorilla, and chimpanzee-gorilla divergence nodes, which is such that the resolving power of nucleotide sequence comparisons is not yet able to separate them.

For each pair of species, we plotted the average percentage difference (value uncorrected for multiple hits) against the corresponding delta T50H value (Fig. 5). The T50H parameter measures the thermal stability of DNA-DNA hybrids between single-copy genomic DNA sequences. Delta T50H measures the effect of mismatching on this stability. We found a linear correlation between both parameters, starting from the origin. The slope gives the percentage of substitutions corresponding to 1°C of delta T50H: 0.8%. Therefore, substitution percentages calculated from IGs are fully consistent and collinear with results obtained from thermal stabilities of DNA-DNA heteroduplexes.

Considering the distance between human and platyrrhines (Aotus and Ateles) as unity, we used percentage differences between homologous sequences to calculate relative values for the human-chimpanzee, human-gorilla, and human-macaque distances (Table 3, column C). Values calculated in this way appear very similar to the relative immunological distances calculated by Sarich and Cronin (1976) from analyses of albumin and transferrin. In both cases, the hominoid-cercopithecoid distance represents about 50% of the catarrhine-platyrrhine distance. In summary, the different methods used to estimate the relative distances between primates were found to give concordant results.

Fig. 4. Percentage homology (deletions and insertions excluded) between human and mouse nucleotide sequences. Two homologous segments, located 5' and 3' to RS4, were selected from the dot matrix comparison (Fig. 3), then aligned using the algorithm developed by Needleman and Wunsch (1970). Each was subsequently divided into 50-nucleotide-long subsegments and the percentage of substitutions in each subsegment calculated

Discussion

Significance of Repeated Segments The four RSs we report in the macaque, chimpanzee, and human DNAs exhibit high variability in the number of repeat units, presumably as a result of insertion--deletion events (Savatier et al. 1985). Interestingly, we found a tandemly repeated sequence in mouse DNA in a position similar to that of the RS4 segment in primate DNA. A length difference of 300 bp between RS4 and the beta-major globin gene is responsible for the difference seen in

304 Table 3. Compilationofcomparisonsbetweenprimatesofnoncodingsequencesin the epsilon-gamma-pseudo-eta--delta-betacluster~ Segment Name

A. Nucleotide differencesb

Ref. Length

T

tl

t2

% Obs.

% Corr.

C. Relative distance from d

B. Delta T50H c SE

Ref. 10Ref. 11

DNA

Proteins

1.8

1.85

0.13

0.125

2.4

1.8

0.15

7.7

7.7

0.50

0.55

1

1

Human-chimpanzee IGI-IG4 IVS2 beta Pseudogene eta 5' Delta IVS2 G-gamma

1 7 2 3 8

Total

3,370 850 2,148 3,150 912

34 6 20 28 18

14 3 5 9 5

8 3 4 6 3

1.66 1.41 1.35 1.37 2.85

1.68 1.42 1.36 1.38 2.91

0.23 0.41 0.25 0.21 0.58

10,430

106

36

24

1.59

1.61

0.12

2,149 851 850

24 16 5

5 5 3

2 4 3

1.44 2.94 1.29

1.46 3.00 1.31

0.26 0.61 0.39

3,850

45

13

9

i.74

1.76

0.22

3,370 2,251

130 99

32 23

31 25

5.72 6.53

5.99 6.88

0.44 0.58

5,621

229

55

56

6.05

6.34

0.35

2,082

156

39

27

10.7

889

61

25

18

11.7

12.9

1.3

2,971

217

64

45

10.9

!2.0

0.7

Human-gorilla Pseudogene eta IVS2 G-gamma IVS2 beta

2 4 1

Total Human-macaque IGI-IG4 Pseudogene eta

1 9

Total

Human-Aotus Pseudogene eta

5

11.7

0.8

Human-A teles IVS2 gamma

6

Total platyrrhines-human

13.0

IG, intergenic segment; IVS, intervening sequence; SE, standard error ~ References: 1, this paper; 2, Chang and Slightom (1984); 3, Maeda et al. (1983); 4, Scott et al. (1984); 5, Harris et al. (1984); 6, Giebel et al. (1985); 7, Savatier et al. (1985); 8, Siightom et al. (1985); 9, Koop et al. (1986); 10, Sibley and Ahlquist (1984); 11, O'Brien et al. (1985) h % Obs., observed percentage differences in nucleotide sequence before correction for multiple hits and exclusion of insertions and deletions; other abbreviations as in Table 1 c Thermal stabilities of D N A - D N A heteroduplexes between human, chimpanzee, gorilla, macaque, and Ateles DNAs d Relative distances between human, chimpanzee, gorilla, and macaque calculated taking the distance between human and platyrrhines as unity. DNA, relative distances calculated from corrected percentage substitutions in the "beta-like" globin gene cluster; Proteins, relative distances calculated from immunological comparisons (Sarich and Cronin 1976)

their locations. Therefore, the human RS4 segment and its mouse counterpart can be considered orthologous. The two segments, which are mainly (AT)n in primate DNAs and (CATA)n in mouse DNA, are regions of potential Z-DNA. Greaves and Patient (1985) described a similar structure, (AT)34, 450 nucleotides in front of the adult beta-globin gene in Xenopus, although orthology between the primate beta-globin gene and the amphibian beta-globin gene cannot be ascertained. Finally, a polypyrimidine stretch, (TC)14, is found about 400 nucleotides upstream from the orthologous rabbit beta-globin gene (Dierks et al. 1981; Moschonas et al. 1982). This set of observations suggests that a repeated structure located in front of the adult beta-globin gene could be functionally constrained and therefore would have been conserved at this location since the early mammals and possibly since the beginning of the reptile-amphibian divergence.

Whether the RS4 segment is involved in any particular function remains an open question. Gilmour et al. (1984) have suggested that the mouse (CATA)n segment would have a specific negative regulatory activity on the beta-globin promoter. More generally, it has been suggested that a switch from B to Z conformation might be involved in gene regulation (Rich 1983), and Hamada et al. (1984) have shown that (TG)n segments have an enhancer effect when they are carried on expression vectors. However, this activity does not rely on the Z conformation, since (CG)n segments do not display any enhancer activity. Struhl (1985) demonstrated that for at least three different yeast genes, naturally occurring stretches of Tn serve as upstream promoter elements for constitutive expression. In addition, it appears that longer Tn stretches are more efficient upstream promoter elements. He suggests that these transcription effects may be due to exclusion of nucleosomes from the Tn region. Therefore, it is possible that RS4 segments play a role in beta-globin gene expression. Preliminary data on the nucleotide sequence of the rabbit psi-beta-2-beta-1 intergenic segment, which is orthologous to the human delta-beta intergenic segment (Hardison and Margot 1984), reveal the homonucleotide stretch T14 found in a position nearly identical to that of the primate RS3 segment (ATTTT)n. The homologous sequence in mouse is currently being analyzed to determine whether it too contains an orthologous RS3 segment.

Stronger Homology in IG5 Segments

We found IG5 segments to be the most conserved regions in the three sequence comparisons investigated (human-chimpanzee, human-macaque, and human-mouse). Of the 600 nucleotides compared between human and mouse DNAs, the 200 nucleotides immediately upstream from the cap site display the highest homology. Regulatory functions of the IGs neighboring the beta-globin gene are now well documented by analyses of thalassemic mutations (Orkin and Kazazian 1984; Antonarakis et al. 1985), analyses of transient expression following directed mutagenesis against putative control elements (Grosveld et al. 1982; Dierks et al. 1983; Wright et al. 1984), and analyses of stable introduction of these genes into erythroleukemic cells (Charnay et al. 1985) and into transgenic animals (Townes et al. 1985). Therefore, the high degrees of homology reported in this part of the delta-beta IG between primates (Table 1) and between human and mouse (Fig. 4) can be correlated with the functional behavior of the region.

Fig. 5. Plot of delta T50H values (from Sibley and Ahlquist 1984) measured between human (H), chimpanzee (C), gorilla (G), macaque (M), catarrhine (Ca), and platyrrhine (Pl) genomic DNAs versus the corresponding uncorrected percentages of substitutions calculated from noncoding sequences (see Table 3)

Correlation of Substitution Percentages with Delta T50 Measurements

Intervening and intergenic sequences represent over 90% of the nonrepetitive DNA in a mammalian genome. This implies that the delta T50 parameter mainly detects mismatches in these two kinds of noncoding sequences. Thus, we were able to compare base substitution frequencies in noncoding parts of the beta-globin gene cluster with thermal stabilities of corresponding DNA-DNA heteroduplexes. We found a linear correlation between the two parameters, which implies that the thermal stabilities of DNA-DNA hybrids can be used with high efficiency to estimate genetic distances between species for which the base substitution divergence between homologous sequences is lower than 12%.
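The linear relation described above can be illustrated with an ordinary least-squares fit. The sketch below is only an illustration of the technique: the function names `fit_line` and `estimate_substitutions` are hypothetical, and the paired values are synthetic placeholders built to lie exactly on a line. They are not the measurements of Sibley and Ahlquist (1984), nor the percentages of Table 3.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


def estimate_substitutions(delta_t50, slope, intercept):
    """Invert the fitted line to estimate % substitutions from a delta T50 value."""
    return (delta_t50 - intercept) / slope


# Synthetic placeholder data (NOT from the paper): y = 1.0 * x + 0.5 exactly.
percent_substitutions = [1.0, 2.0, 4.0, 8.0]
delta_t50_values = [1.5, 2.5, 4.5, 8.5]

slope, intercept, r = fit_line(percent_substitutions, delta_t50_values)
estimated = estimate_substitutions(6.5, slope, intercept)
```

With real data, the fitted line would only be meaningful inside the range covered by the measurements, which the text limits to divergences below 12%.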

Base Substitution Rates in Noncoding DNA

Conversion of base substitution frequencies into base substitution rates requires the use of a time scale. As mentioned above, two estimates have been proposed for the date of the catarrhine-platyrrhine divergence. Simons (1976) considered the Atlantic split, which ended about 55 million years ago, to provide the best estimate for the date of the Old World-New World monkey divergence. Based on this estimate, we calculated the base substitution rate of noncoding DNA to be 1.0 x 10^-9 substitution/site/year. Alternatively, from the paleontological record, Sarich and Cronin (1976) estimated the beginning of the primate radiation as 70 million years ago. From immunological comparisons, they estimated that the Old World-New World monkey divergence node was located at half the distance between the beginning of the primate radiation and the present time (see Table 3), i.e., 35 million years ago. For this case, we estimate the base substitution rate of noncoding DNA to be 1.6 x 10^-9 substitution/site/year.

Other estimates of base substitution rates have been obtained from nucleotide sequence comparisons among distantly related species. The paleontological records indicate that the primate and rodent lineages diverged about 80-110 million years ago (Britten 1986). From nucleotide comparison between human and mouse IG4 segments, we estimate the average base substitution rate in the primate and rodent lineages to be 3.0 x 10^-9 substitution/site/year. From comparisons among primate, artiodactyl, and rodent genes at fourfold degenerate sites, Li et al. (1985) estimated the evolutionary rate of neutral DNA to be 4.2 x 10^-9 substitution/site/year. Koop et al. (1986) estimated the base substitution rate in the lineage leading to lemurs, tarsiers, and simians to be 2.9 x 10^-9 substitution/site/year. All these values were calculated for distantly related species for which the divergence dates were estimated from the paleontological record, and all are markedly higher than the divergence rates calculated within the higher primates, irrespective of the reference time used for calibration.

Based on these data as well as others (Wu and Li 1985; Britten 1986; Helm-Bychowsky and Wilson 1986), it is now generally accepted that very different rates of DNA divergence have occurred in different systematic groups. Since most nucleotide sequence comparisons between closely related species have been between higher primates, it has been hypothesized that the base substitution rate has slowed down in primates (Chang and Slightom 1984; Goodman et al. 1984; Harris et al. 1984). Nucleotide comparisons involving rodents, artiodactyls, and early primates give higher DNA divergence rates. However, these comparisons do not provide information on the DNA divergence rates in present-day rodents and artiodactyls. Hence, we do not know if the slowing down of base substitution rates is limited to primates or if other systematic groups have undergone the same change. Wu and Li (1985) showed that rodent DNA may have accumulated twice as many substitutions as primate DNA since the point at which rodents and primates diverged. Although this result is not dependent on knowledge of the exact divergence time between species, it remains controversial, since it relies on the assumption that the primate-ungulate split coincided with the primate-rodent split (Easteal 1985; Lanave et al. 1985). To confirm whether artiodactyl and rodent DNAs have actually evolved faster than primate DNA would probably require comparing Old World and New World species as has been done for primates. Analysis of groups of species separated by a common geological event such as the Atlantic split would probably provide a general method for comparing base substitution rates among systematic groups.
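The conversion from a substitution frequency to a rate can be sketched as follows. This is a minimal illustration, assuming the conventional relation r = d / (2T), where d is the pairwise divergence and T the time since the split; the text does not state the exact formula used. The divergence of 0.11 is a hypothetical value chosen only because it reproduces the two catarrhine-platyrrhine rates quoted above; the measured percentages are those of Table 3. The function name `substitution_rate` is likewise hypothetical.

```python
def substitution_rate(divergence_fraction, divergence_time_years):
    """Return substitutions/site/year along one lineage.

    Both lineages accumulate changes independently after the split,
    so the observed pairwise divergence d is spread over 2 * T years.
    """
    if divergence_time_years <= 0:
        raise ValueError("divergence time must be positive")
    return divergence_fraction / (2.0 * divergence_time_years)


# Hypothetical divergence (fraction of sites), NOT a value taken from Table 3.
ASSUMED_DIVERGENCE = 0.11

# Calibration dates quoted in the text.
ATLANTIC_SPLIT_YEARS = 55e6       # Simons (1976)
IMMUNOLOGICAL_NODE_YEARS = 35e6   # Sarich and Cronin (1976)

rate_atlantic = substitution_rate(ASSUMED_DIVERGENCE, ATLANTIC_SPLIT_YEARS)
rate_immunological = substitution_rate(ASSUMED_DIVERGENCE, IMMUNOLOGICAL_NODE_YEARS)

print(f"{rate_atlantic:.1e}")       # 1.0e-09
print(f"{rate_immunological:.1e}")  # 1.6e-09
```

Whatever the formula, the two estimates differ only by the ratio of the calibration dates (55/35, about 1.57), which is why the choice of time scale matters more than the sequence data for the absolute rate.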

Acknowledgments. We are grateful to Dr. M. Gouy for his help with computer alignments and sequencing analyses. We thank Drs. Jaeger and DeBonis for helpful discussions about paleontological records as well as Dr. Legay for his advice on the statistical analyses. We thank Dr. Moulin from Rhône-Mérieux for providing spleen samples from Macaca cynomolgus. This work was supported by research grants from INSERM and CNRS.

References

Antonarakis SE, Kazazian HH, Orkin SH (1985) DNA polymorphism and molecular pathology of the human globin gene cluster. Hum Genet 69:1-14
Benveniste RE, Todaro GJ (1975) Evolution of type C viral genes: preservation of ancestral murine type C viral sequences in pig cellular DNA. Proc Natl Acad Sci USA 72:4090-4094
Benveniste RE, Todaro GJ (1976) Evolution of type C viral genes: evidence for an Asian origin of man. Nature 261:101
Biggin MD, Gibson TJ, Hong GF (1983) Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc Natl Acad Sci USA 80:3963-3965
Britten RJ (1986) Rates of DNA sequence evolution differ between taxonomic groups. Science 231:1393-1398
Chang LYE, Slightom JL (1984) Isolation and nucleotide sequence analysis of the beta type globin pseudogene from human, gorilla and chimpanzee. J Mol Biol 180:767-784
Charnay P, Mellon P, Maniatis T (1985) Linker scanning mutagenesis of the 5' flanking region of the mouse beta-major gene: sequence requirements for transcription in erythroid and non-erythroid cells. Mol Cell Biol 5:1498-1511
Chebloune Y, Trabuchet G, Poncet D, Cohen-Solal M, Faure C, Verdier G, Nigon VM (1984) A new method for detection of small modifications in genomic DNA applied to the human delta-beta globin gene cluster. Eur J Biochem 142:473-480
Collins FS, Weissman SM (1984) The molecular genetics of human hemoglobin. Prog Nucleic Acid Res Mol Biol 31:317-461
Dierks P, van Ooyen A, Mantei N, Weissmann C (1981) DNA sequences preceding the rabbit beta-globin gene are required for formation in mouse L cell of beta-globin RNA with correct 5' terminus. Proc Natl Acad Sci USA 78:1411-1415
Dierks P, van Ooyen A, Cochran MD, Dobkin C, Reiser J, Weissmann C (1983) Three regions upstream from the cap site are required for efficient and accurate transcription of the rabbit beta-globin gene in mouse 3T6 cells. Cell 32:695-706
Dixon WJ (1940) A criterion for testing the hypothesis that two samples are from the same population. Ann Math Stat 11:199-204
Easteal S (1985) Generation time and the rate of molecular evolution. Mol Biol Evol 2:450-453

Fordis CM, Nelson N, McCormick M, Padmanabhan R, Howard B, Schechter AN (1986) The 5' flanking sequences of human globin genes contribute to tissue specific expression. Biochem Biophys Res Commun 134:128-133
Giebel LB, van Santen VL, Slightom JL, Spritz RA (1985) Nucleotide sequence, evolution, and expression of the fetal globin gene of the spider monkey Ateles geoffroyi. Proc Natl Acad Sci USA 82:6985-6989
Gilmour RS, Spandidos DA, Vass JK, Gow JW, Paul J (1984) A negative regulatory sequence near the mouse beta-maj globin gene associated with a region of potential Z-DNA. EMBO J 6:1263-1272
Goodman M, Koop BF, Czelusniak J, Weiss ML, Slightom JL (1984) The eta-globin gene: its long evolutionary history in the beta-globin gene family of mammals. J Mol Biol 180:803-823
Greaves DR, Patient RK (1985) (AT)n is an interspersed repeat in the Xenopus genome. EMBO J 4:2617-2626
Grosveld GC, Rosenthal A, Flavell RA (1982) Sequence requirements for the transcription of the rabbit beta-globin gene in vivo: the -80 region. Nucleic Acids Res 10:4951-4971
Hamada H, Seidman M, Howard BH, Gorman CM (1984) Enhanced gene expression by the poly(dT-dG).poly(dC-dA) sequence. Mol Cell Biol 4:2622-2630
Hardies SC, Edgell MH, Hutchison CA III (1984) Evolution of the mammalian beta-globin gene cluster. J Biol Chem 259:3748-3756
Hardison RC, Margot JB (1984) Rabbit globin pseudogene psi-beta 2 is a hybrid of delta- and beta-globin gene sequences. Mol Biol Evol 4:302-316
Harris S, Barrie PA, Weiss ML, Jeffreys AJ (1984) The primate psi-beta 1 gene: an ancient beta-globin pseudogene. J Mol Biol 180:785-801
Hasson J-F, Mougneau E, Cuzin F, Yaniv M (1984) Simian virus 40 illegitimate recombination occurs near short direct repeats. J Mol Biol 177:53-69
Helm-Bychowsky KM, Wilson AC (1986) Rates of nuclear DNA evolution in pheasant-like birds: evidence from restriction maps. Proc Natl Acad Sci USA 83:688-692
Hutchison CA III, Hardies SC, Padgett RW, Weaver S, Edgell MH (1984) The mouse globin pseudogene beta h3 is descended from a premammalian delta-globin gene. J Biol Chem 259:12881-12889
Kimura M (1981) Estimation of evolutionary distances between homologous nucleotide sequences. Proc Natl Acad Sci USA 78:454-458
Koop BF, Goodman M, Xu P, Chan K, Slightom JL (1986) Primate eta-globin DNA sequences and man's place among the great apes. Nature 319:234-237
Lanave C, Preparata G, Saccone C (1985) Mammalian genes as molecular clocks? J Mol Evol 21:346-350
Leder P, Tiemeier D, Enquist L (1977) EK2 derivatives of bacteriophages lambda useful in the cloning of DNA from higher organisms: the lambda gt WES system. Science 196:175-176
Li W-H, Wu C-I, Luo C-C (1985) A new method for estimating synonymous and nonsynonymous rates of nucleotide substitution considering the relative likelihood of nucleotides and codon changes. Mol Biol Evol 2:150-174
Maeda N, Bliska JB, Smithies O (1983) Recombination and balanced chromosome polymorphism suggested by DNA sequences 5' to the human delta-globin gene. Proc Natl Acad Sci USA 80:5012-5016
Messing J, Vieira J (1982) A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19:269-276
Miesfeld R, Krystal M, Arnheim N (1981) A member of a new repeated sequence family which is conserved throughout eucaryotic evolution is found between the human delta- and beta-globin genes. Nucleic Acids Res 9:5931-5947
Miyata T, Yasunaga T (1980) Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitution from homologous nucleotide sequences and its application. J Mol Evol 16:23-36
Moschonas N, de Boer E, Flavell RA (1982) The DNA sequence of the 5' flanking region of the human beta-globin gene: evolutionary conservation and polymorphic differences. Nucleic Acids Res 10:2109-2120
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48:443-453
O'Brien SJ, Nash WG, Wildt DE, Bush ME, Benveniste RE (1985) A molecular solution to the riddle of the giant panda's phylogeny. Nature 317:140-144
Orkin SH, Kazazian HH (1984) The mutation and polymorphism of the human beta-globin gene and its surrounding DNA. Annu Rev Genet 18:131-171
Perler F, Efstratiadis A, Lomedico P, Gilbert W, Kolodner R, Dodgson J (1980) The evolution of genes: the chicken preproinsulin gene. Cell 20:555-566
Poncz M, Solowiejczyk D, Ballantine M, Schwartz E, Surrey S (1982) "Nonrandom" DNA sequence analysis in bacteriophage M13 by the dideoxy chain-termination method. Proc Natl Acad Sci USA 79:4298-4302
Poncz M, Schwartz E, Ballantine M, Surrey S (1983) Nucleotide sequence analysis of the delta-beta globin gene region in humans. J Biol Chem 258:11599-11609
Rich A (1983) Right-handed and left-handed DNA: conformational information in genetic material. Cold Spring Harbor Symp Quant Biol 47:1-12
Sanger F, Coulson AR, Barrell BG, Smith AJH, Roe BA (1977) DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467
Sarich VM, Cronin JE (1976) Molecular systematics of the primates. In: Goodman M, Tashian RE (eds) Molecular anthropology. Plenum Press, New York, pp 141-170
Savatier P, Trabuchet G, Faure C, Chebloune Y, Gouy M, Verdier G, Nigon VM (1985) Evolution of the primate beta-globin gene region: high rate of variation in CpG dinucleotides and in short repeated sequences between man and chimpanzee. J Mol Biol 182:21-29
Savatier P, Trabuchet G, Chebloune Y, Faure C, Verdier G, Nigon VM (1987) Nucleotide sequence of the beta-globin genes in gorilla and macaque: the origin of nucleotide polymorphisms in human. J Mol Evol 24:309-318
Sawada I, Beal MP, Shen CJ, Chapman B, Wilson AC, Schmid C (1983) Intergenic DNA sequences flanking the pseudo alpha globin genes of human and chimpanzee. Nucleic Acids Res 11:80-87
Scott AF, Heath P, Trusko S, Boyer SH, Prass W, Goodman M, Czelusniak J, Chang LYE, Slightom JL (1984) The sequence of the gorilla fetal globin genes: evidence for multiple gene conversions in human evolution. Mol Biol Evol 5:373-389
Semenza GL, Delgrosso K, Poncz M, Malladi P, Schwartz E, Surrey S (1984a) The silent carrier allele: beta-thalassemia without a mutation in the beta-globin gene or its immediate flanking regions. Cell 39:123-128
Semenza GL, Malladi P, Surrey S, Delgrosso K, Poncz M, Schwartz E (1984b) Detection of a novel DNA polymorphism in the beta-globin gene cluster. J Biol Chem 259:6045-6048
Sibley CG, Ahlquist JE (1984) The phylogeny of the hominoid primates as indicated by DNA-DNA hybridization. J Mol Evol 20:2-15
Simons EL (1976) The fossil record of primate phylogeny. In: Goodman M, Tashian RE (eds) Molecular anthropology. Plenum Press, New York, pp 35-62

Slightom JL, Chang L-YE, Koop BF, Goodman M (1985) Chimpanzee fetal G-gamma and A-gamma gene nucleotide sequences provide further evidence of gene conversions in hominine evolution. Mol Biol Evol 2:370-389
Spritz RA (1981) Duplication/deletion polymorphism 5' to the human beta-globin gene. Nucleic Acids Res 9:5037-5047
Struhl K (1985) Naturally occurring poly(dA-dT) sequences are upstream promoter elements for constitutive transcription in yeast. Proc Natl Acad Sci USA 82:8419-8423
Townes TM, Lingrel JB, Chen HY, Brinster RL, Palmiter RD (1985) Erythroid-specific expression of human beta-globin genes in transgenic mice. EMBO J 4:1715-1723
Wald A, Wolfowitz J (1940) On a test whether two samples are from the same population. Ann Math Stat 11:147-162
White CT, Hardies SC, Hutchison CA III, Edgell MH (1984) The diagonal-traverse homology search algorithm for locating similarities between two sequences. Nucleic Acids Res 12:751-766
Wright S, Rosenthal A, Flavell R, Grosveld F (1984) DNA sequences required for regulated expression of beta-globin genes in murine erythroleukemia cells. Cell 38:265-272
Wu C-I, Li W-H (1985) Evidence for higher rates of nucleotide substitution in rodents than in man. Proc Natl Acad Sci USA 82:1741-1745

Received June 17, 1986/Revised September 5, 1986
