Copyright © 1986 by the Genetics Society of America
DNA SEQUENCE COMPARISON AMONG CLOSELY RELATED DROSOPHILA SPECIES IN THE MULLERI COMPLEX
DAN H. SCHULZE¹ AND C. S. LEE
Department of Zoology, University of Texas, Austin, Texas 78712

Manuscript received September 16, 1985
Revised copy accepted February 1, 1986

ABSTRACT

DNA hybridization was used to establish DNA sequence relationships among seven Drosophila species. Single-copy DNA was isolated from four species within the Drosophila mulleri complex: D. mojavensis, D. arizonensis, D. ritae and D. starmeri. These single-copy DNAs were used as tracers and hybridized with each other and with DNA from one additional member of the mulleri complex, D. aldrichi; a member of a closely related complex, D. hydei; and a distantly related species, D. melanogaster. Two methods were used to determine the relatedness between these species: (1) the extent of duplex formed, as measured by binding to hydroxyapatite, and (2) the thermal stability of the duplexed DNA. Moderately repetitive DNA was purified from these species and used similarly to determine the divergence of this family of sequences. The rate of nucleotide substitution was estimated to be 0.2 ± 0.1% base pair change per million years for both single-copy and middle-repetitive DNAs. The size of the D. arizonensis genome, a representative of the mulleri complex, was calculated from its kinetic complexity to be 2.2 × 10⁸ base pairs, similar to that of D. hydei. The relative amounts (18%) and average reiteration frequency (100 copies) of the middle-repetitive DNA are similar for all Drosophila species studied. Finally, the data are presented in a phylogenetic tree.
¹ Present address: Department of Microbiology, University of Texas Medical Branch, Galveston, Texas 77550.

Genetics 115: 287-303 June, 1986.

THE divergence of species can be determined by analysis of biological macromolecules. Recent work has demonstrated that the accumulation of changes in macromolecules is linear with time (WILSON, CARLSON and WHITE 1977). This permits the use of molecular approaches to discover the routes, rates and degree of divergence between species. Biochemical approaches, including protein electrophoretic properties and immunological techniques, have been used together with the more classical methods of morphology and chromosome inversion analysis to determine phylogeny for various species [for example, in the Drosophila mulleri complex, see RICHARDSON and SMOUSE (1976), BEVERLEY and WILSON (1982), WASSERMAN (1962a,b)]. Nucleic acid hybridization provides a quantitative measure of sequence divergence that can be used to evaluate phylogenetic relationships (for example, see ANGERER, DAVIDSON and BRITTEN 1976). Although the ultimate comparison would ideally be made at the nucleotide level by direct sequence analysis, this is possible only for a
small fraction of the genome. Therefore, an overall picture of genomic evolution can better be obtained through DNA hybridization techniques. Recently, SIBLEY and AHLQUIST (1984a), using nucleic acid hybridization techniques, reconstructed the phylogeny of numerous avian species. The same researchers used the thermal stability of duplexed DNA to estimate relatedness and determine the phylogeny of the hominoid primates (SIBLEY and AHLQUIST 1984b).

For such phylogenetic studies using DNA hybridization, we chose several Drosophila species in the mulleri complex. This complex belongs to the mulleri subgroup, which in turn belongs to the repleta group (WASSERMAN 1954; THROCKMORTON 1975). The species of the mulleri complex range from the central United States through Mexico and Central and South America (THROCKMORTON 1975). A phylogeny has been suggested for a few of the species of the complex using morphological characteristics (WASSERMAN 1962a,b), chromosome inversions (WASSERMAN 1962b), protein electrophoresis (RICHARDSON and SMOUSE 1976) and microcomplement fixation (BEVERLEY and WILSON 1982).

Among the species in the mulleri complex, the following were chosen for our study. D. mojavensis and D. arizonensis are presumed to be quite closely related, possibly sibling species (HUBBY and THROCKMORTON 1968). D. ritae and D. starmeri appear in other phylogenetic representations to be the most divergent from D. mojavensis and D. arizonensis, thus defining the limits of the mulleri complex. D. aldrichi is presumed to be intermediate in its relationship to the other species. D. hydei is a member of a subgroup closely related to the mulleri complex. Finally, D. melanogaster is a greatly divergent form from another subgroup.

The following general strategy for interspecies DNA hybridization was used. First, single-copy DNA and middle-repetitive DNA were isolated from representative species.
¹²⁵I-labeled DNA of a reference species was hybridized to a large excess of unlabeled DNA of the species to be compared. The extent of hybridization between tracer and driver DNA was determined to evaluate the amount of sequence homology between these species. Thermal stabilities of the heteroduplexes were also determined to examine DNA mismatching, which is a measure of sequence divergence. The phylogenetic relationships obtained from our results agree, in general, with previous, somewhat qualitative studies of relatedness. The rate of DNA sequence divergence obtained in this study, 0.2 ± 0.1% base pair (bp) change per million years, is similar to rates obtained for other organisms.

MATERIALS AND METHODS

Growth of Drosophila cultures: The species of Drosophila used in this study were obtained from the University of Texas Stock Center (now at the National Drosophila Species Resource Center, Bowling Green State University), except for D. hydei from T. GREGG (Miami University), D. melanogaster from Y. HIRAIZUMI (University of Texas) and D. ritae from R. H. RICHARDSON (University of Texas). All stocks were grown in half-pint milk bottles at room temperature on a standard cornmeal-agar food except
for D. starmeri and D. ritae, which were raised on banana-cactus food (RICHARDSON and KAMBYSELLIS 1968).

Isolation of DNA: About 25 g of flies were homogenized in cold Drosophila homogenizing medium (DHM) containing 0.35 M sucrose, 0.05 M Tris, pH 8, 0.025 M KCl and 0.005 M MgCl₂. The homogenate was filtered through eight layers of cheesecloth. The material retained on the cheesecloth was rehomogenized and filtered. The filtrate was centrifuged at 4000 rpm for 10 min. The crude pellet was resuspended in DHM plus 0.1% Triton X-100 and was centrifuged. This process was repeated three times. The nuclei were lysed in 30 ml of 0.1 M Tris, pH 8, 0.1 M EDTA, 0.05 M NaCl, 0.04 mM 1-phenyl-2-thiourea, 0.5% SDS and 1 mg/ml pronase. 1-Phenyl-2-thiourea was added to inhibit phenol oxidase and the formation of melanin derivatives during the procedure (DICKINSON and SULLIVAN 1975). The mixture was shaken gently at 37° for 2 h. The NaCl concentration was raised to 0.5 M, and three chloroform-isoamyl alcohol (24:1) extractions were performed. The DNA was precipitated with ethanol and dissolved in 1 mM sodium phosphate buffer (PB), pH 6.8. The DNA was further purified by hydroxyapatite (HAP) chromatography. Before loading onto the column, the DNA solution was passed through a 27-gauge syringe needle. The column-purified DNA was dialyzed extensively against 0.24 M PB and 1.4 mM EDTA. Calf thymus DNA (Sigma) and E. coli DNA (from M. CORDEIRO-STONE) were further purified by chloroform extraction and HAP chromatography.

Shearing of DNA: DNA at a concentration of 300 µg/ml in 0.24 M PB and 1.4 mM EDTA was passed twice through a French pressure cell (Aminco) at 12,000 p.s.i. The molecular weights of single-stranded and double-stranded DNAs were determined by sucrose gradient centrifugation, boundary velocity sedimentation using a Spinco model E, or agarose gel electrophoresis. Sheared single-stranded DNA gave a size distribution of 300-600 bases.

Hydroxyapatite chromatography: HAP was prepared according to the procedure of MIYAZAWA and THOMAS (1965). Each batch of HAP was tested to determine the appropriate PB concentration for selective elution of single-stranded and double-stranded DNA, usually 0.16 M PB.

Fractionation of single-copy and middle-repetitive DNA: To separate single-copy sequences from the repetitive fraction, total DNA was denatured by boiling and incubated in 0.24 M Na⁺ at 66° to an equivalent Cot of 22. The unreassociated DNA was separated on HAP from the reassociated DNA. The unreassociated fraction was used as single-copy DNA. The reassociated DNA was again denatured and incubated as described above to an equivalent Cot of lo-'. After HAP chromatography, the unreassociated fraction was used as the middle-repetitive fraction.

Thermal denaturation: Thermal denaturation using HAP chromatography was performed as follows. The annealed mixture (100-200 µg DNA and 1-5 × 10⁵ cpm) in 0.24 M PB was diluted 10-fold with 1 mM PB and was applied to a water-jacketed (45°) HAP column. Three 2-ml samples of 0.16 M PB were sequentially passed through the column. The ionic strength of the buffer was then decreased to 0.13 M PB to bring the DNA melting into a convenient temperature range and to sharpen the melting transition (MARTINSON and WAGENAAR 1974). The temperature was raised in 4° increments, with 7 min allowed for temperature equilibration at each step before eluting with three 2-ml samples of 0.13 M PB. After the 94° elution, three 2-ml samples of 0.4 M PB were rinsed through the column to elute any residual unmelted double-stranded material. No more than 1% of the total material was ever eluted at this step.

Reassociation conditions for the Cot curve: Sheared D. arizonensis DNA, at 752 µg/ml, was denatured and incubated at 40° in 0.126 M Na⁺ and 30% formamide (MCCONAUGHY, LAIRD and MCCARTHY 1969). E.
coli DNA at 45 µg/ml and 38 µg/ml was denatured and reassociated under the same conditions. One hundred micrograms of DNA was analyzed per data point for percent reassociation on HAP.

Preparation of tracer DNA: DNA was labeled with ¹²⁵I according to COMMERFORD (1971). The iodinated DNA was purified on a Sephadex G-50 column and was dialyzed against 0.24 M PB and 1.4 mM EDTA. Specific activities of about 4 × 10⁶ cpm/µg DNA were routinely obtained. Single-strand molecular weights were decreased by a factor of 2 during the iodination procedure.

Construction of phylogenetic trees: The phylogenetic trees were constructed according to the method of FITCH and MARGOLISH (1967) when reciprocal data were available. When only unidirectional data were available, these data were added to the basic tree, described above, by the methods of BEVERLEY and WILSON (1982). To evaluate different phylogenetic constructions, the percent standard deviation (FITCH and MARGOLISH 1967) and F value (PRAGER and WILSON 1976) were used to measure goodness of fit.

RESULTS

FIGURE 1.-Reassociation profile of Drosophila and E. coli DNAs. Sheared D. arizonensis DNA (U) was reassociated as described in MATERIALS AND METHODS. One hundred micrograms was used per point, and the amount of reassociation was measured by binding to HAP. The normalized curve (0) simulates single-copy reassociation, assuming that 69% of the total DNA is unique with a Cot½ = 179. The normalized curve (A) represents the reassociation of middle-repetitive DNA, representing 18% of the genome, with a Cot½ = 2.5. The rate standard, E. coli DNA, was used at (0) 45 µg/ml and (A) 38 µg/ml. The Cot½ of E. coli DNA = 4.8.
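
The normalized curves named in the Figure 1 legend can be illustrated with a minimal Python sketch of ideal second-order reassociation, in which a component with half-reassociation value Cot½ is a fraction Cot/(Cot + Cot½) reassociated at a given Cot. This is an illustration under stated assumptions, not the authors' fitting procedure: the function names are hypothetical, the average nucleotide mass of 330 g/mol is an assumed round figure, the legend's Cot½ values are taken as observed values in total DNA, and the salt and formamide corrections behind an "equivalent Cot" are not modeled.

```python
# Sketch of ideal second-order reassociation kinetics (assumptions noted above).

NUCLEOTIDE_MASS = 330.0  # g/mol, ASSUMED average mass of one nucleotide residue


def cot(conc_ug_per_ml, seconds, nucleotide_mass=NUCLEOTIDE_MASS):
    """Return Cot (mol nucleotide x s / L) for a DNA concentration and a time."""
    molar = conc_ug_per_ml * 1e-3 / nucleotide_mass  # ug/ml equals mg/L, so g/L / (g/mol)
    return molar * seconds


def fraction_reassociated(cot_value, cot_half):
    """Fraction of one kinetic component in duplex form at a given Cot."""
    if cot_value < 0 or cot_half <= 0:
        raise ValueError("cot_value must be >= 0 and cot_half must be > 0")
    return cot_value / (cot_value + cot_half)


def genome_curve(cot_value, components):
    """Sum the components; each is (fraction_of_genome, cot_half)."""
    return sum(frac * fraction_reassociated(cot_value, half)
               for frac, half in components)


if __name__ == "__main__":
    # Fractions and Cot1/2 values as quoted in the Figure 1 legend.
    legend_components = [(0.69, 179.0), (0.18, 2.5)]
    for c in (0.1, 1.0, 10.0, 100.0, 1000.0, 10000.0):
        print(f"Cot {c:>8}: {genome_curve(c, legend_components):.3f} reassociated")
```

The rapidly reassociating 13% of the genome is left out of `legend_components` because the source gives only an upper bound for its Cot½.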

Genome organization of D. arizonensis: The genome organization of a member of the mulleri complex was examined to determine the complexity of the genome and the reiteration frequency of repetitive DNA. These results provide the conditions required for fractionation of the genome into different reiteration frequency classes. D. arizonensis DNA was denatured and reassociated to various Cot values. Cot is the product of the initial concentration of DNA in moles of nucleotide per liter and time in seconds (BRITTEN and KOHNE 1968). The amount of reassociation was determined by HAP chromatography. The results for both D. arizonensis and E. coli DNAs are presented in Figure 1. The reassociation profile for D. arizonensis is typical of other higher eukaryotes. About 13% of the DNA reassociates rapidly, with a Cot½ (the Cot at which one-half of the DNA has reassociated) of less than lo-*. The highly reiterated components must therefore be present in more than 8 × 10⁴ copies per haploid genome. Another 18% of the genome reassociates at a corrected Cot½ of 0.5 (see the legend to Figure 1 for appropriate normalization), and this represents the middle-repetitive DNA sequences. The
remainder of the genome, 69%, is considered single-copy DNA and has a corrected Cot½ of 179. Figure 1 also includes a reassociation profile of E. coli DNA, with a Cot½ of 4.8.

These data were used to determine the genome size of D. arizonensis and the reiteration frequency of its middle-repetitive DNA sequences. Since the E. coli genome, with a size of 4 × 10⁶ bp, has a Cot½ of 4.8 and the Cot½ of D. arizonensis single-copy DNA is 179, the kinetic complexity of the single-copy DNA is 1.5 × 10⁸ bp (179/4.8 × 4 × 10⁶ = 1.5 × 10⁸). As this value represents 69% of the genome, we calculate the total genome size to be 2.2 × 10⁸ bp. The kinetic complexity of the D. arizonensis middle-repetitive DNA fraction was determined to be 4.2 × 10⁵ bp, which gives an average reiteration frequency of 94 copies per haploid genome.

These reassociation experiments provided conditions for isolating single-copy and middle-repetitive DNAs. We designated the DNA that is not reassociated at a Cot of 22 as single-copy DNA and designated the DNA that is reassociated at this Cot, but not at a Cot of lo-*, as middle-repetitive DNA. All of the species examined in this study contain 70-76% of the genome as single-copy DNA and 13-20% as middle-repetitive DNA.

Divergence of single-copy DNA by extent of hybridization: The extent or amount of hybridization can be used as a measure of relatedness among species because, under stringent hybridization conditions, more divergent sequences will fail to form stable hybrid duplexes. It has also been demonstrated that slightly dissimilar molecules reassociate at a lower rate (BONNER et al. 1973). These two factors contribute to differences in the amount of hybridization observed when various DNAs are hybridized. Measurements are made by reassociating trace amounts of radiolabeled single-copy DNA from one species with an excess (>1000-fold) of unlabeled single-copy DNA. The amount of reassociation (duplex formation) is measured using HAP chromatography.
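
The genome-size and reiteration-frequency arithmetic given above under "Genome organization of D. arizonensis" can be rechecked with a short Python sketch. It assumes only what the text states: kinetic complexity scales linearly with Cot½ relative to the E. coli standard (4 × 10⁶ bp, Cot½ = 4.8), single-copy DNA is 69% of the genome, and middle-repetitive DNA is 18% of the genome with a corrected Cot½ of 0.5. The function and variable names are hypothetical.

```python
# Recheck of the kinetic-complexity arithmetic; names are illustrative only.

E_COLI_GENOME_BP = 4e6   # E. coli genome size used in the text
E_COLI_COT_HALF = 4.8    # Cot1/2 of the E. coli rate standard


def kinetic_complexity(cot_half,
                       standard_cot_half=E_COLI_COT_HALF,
                       standard_size_bp=E_COLI_GENOME_BP):
    """Complexity in bp, scaled linearly from the rate standard."""
    return cot_half / standard_cot_half * standard_size_bp


# Single-copy fraction: Cot1/2 = 179, representing 69% of the genome.
single_copy_bp = kinetic_complexity(179.0)
genome_bp = single_copy_bp / 0.69

# Middle-repetitive fraction: corrected Cot1/2 = 0.5, 18% of the genome.
middle_complexity_bp = kinetic_complexity(0.5)
reiteration = 0.18 * genome_bp / middle_complexity_bp

if __name__ == "__main__":
    print(f"single-copy complexity: {single_copy_bp:.2e} bp")
    print(f"total genome size:      {genome_bp:.2e} bp")
    print(f"middle-rep. complexity: {middle_complexity_bp:.2e} bp")
    print(f"average reiteration:    {reiteration:.0f} copies")
```

Carried through without intermediate rounding, the reiteration frequency comes out at about 93 copies; the 94 quoted in the text follows when the rounded values 2.2 × 10⁸ bp and 4.2 × 10⁵ bp are used instead.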
Several parameters can be measured from such an experimental procedure. (1) The amount of reassociation measured optically reflects the extent of reassociation of unlabeled (driver) DNA. (2) The amount of hybridization measured by radioactivity under these conditions gives the amount of hybridization between labeled and unlabeled DNA molecules. Control hybridization with radioactive DNA alone or in the presence of an excess of completely heterologous DNA (calf thymus DNA) contributes