Genetics: Published Articles Ahead of Print, published on February 3, 2005 as 10.1534/genetics.104.035048

Molecular characterization and chromosomal distribution of Galileo, Kepler and Newton, three foldback transposable elements of the Drosophila buzzatii species complex

Ferran Casals*,1, Mario Cáceres†, Maura Helena Manfrin‡, Josefa González*, and Alfredo Ruiz*

*Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain.

†Department of Human Genetics, Emory University School of Medicine, 615 Michael Street, Atlanta, GA 30322.

‡Departamento de Biologia – FFCLRP, Universidade de São Paulo, Ribeirão Preto-SP 14040-901, Brazil.

1Present address: Unitat de Biologia Evolutiva, Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, 08003, Barcelona, Spain.

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under accession nos.xxxxx-xxxxx.

Running head: Foldback transposable elements in Drosophila

Keywords: foldback, transposable elements, chromosome evolution, Drosophila

Corresponding author: Dr. Alfredo Ruiz, Departament de Genètica i de Microbiologia, Facultat de Ciències-Edifici C, Universitat Autònoma de Barcelona, 08193 Bellaterra (Barcelona), Spain. Phone: 93-5812729; FAX: 93-5812387; E-mail: [email protected]


ABSTRACT

Galileo is a foldback-like transposable element of Drosophila buzzatii that has been implicated in the generation of two polymorphic chromosomal inversions in this species. The analysis of the inversion breakpoints led to the discovery of two additional elements, called Kepler and Newton, that share sequence and structural similarities with Galileo. Here, we describe in detail the molecular structure of these three elements, based on the 13 copies found at the inversion breakpoints plus ten additional copies isolated for this work. Like the foldback elements described in other organisms, these elements have long inverted terminal repeats, which in the case of Galileo possess a complex structure with different domains, and display a high degree of internal variability between copies. We have constructed a phylogenetic tree and estimated the age of the elements as 5.5 million years. We have also carried out an exhaustive analysis of the abundance and chromosomal distribution of these elements in D. buzzatii and other species of the repleta group by Southern analysis and in situ hybridization. Overall, the results of these analyses suggest that the Galileo elements invaded the buzzatii complex some time before the divergence of its species and may have played an important role in shaping the genome of these species. In addition, we show that the recombination rate is the main factor determining the current chromosomal distribution of these elements.


INTRODUCTION

Transposable elements (TEs) have been described in both prokaryotic and eukaryotic organisms, and probably constitute ancient components of genomes (BERG and HOWE 1989; CAPY et al. 1998). In some cases, TEs make up an important fraction of the genome, as in humans or mice, where they represent nearly half of the DNA content (INTERNATIONAL HUMAN GENOME SEQUENCING CONSORTIUM 2001; MOUSE GENOME SEQUENCING CONSORTIUM 2002). Traditionally, transposable elements have been considered to be intracellular parasites or selfish DNA (DOOLITTLE and SAPIENZA 1980; ORGEL and CRICK 1980) that are maintained in genomes simply due to their ability to replicate (CHARLESWORTH et al. 1994). As a result of their parasitic activity, TEs would mainly have a negative effect on the host genome by producing deleterious mutations. However, there is an increasing body of evidence showing that TEs may be beneficial to the host genome and that they may have played an important role in the evolution of species (MCDONALD 1993; WESSLER et al. 1995; KIDWELL and LISCH 1997; KAZAZIAN 2004).

Transposable elements are classified into two major groups according to their mechanism of transposition (FINNEGAN 1989). Class I elements, including retrotransposons and retroposons, alternate DNA and RNA phases: they are first transcribed into RNA and then reverse transcribed to DNA and inserted into the genome. Class II elements, or DNA transposons, are mobilized through a “cut and paste” mechanism, in which the element excises from its original location and inserts into a new site (PLASTERK 1995). These elements are characterized by having inverted repeats (IRs) at both ends, flanking an internal region that codes for a transposase enzyme involved in their mobilization.

Foldback elements are a particular group of Class II elements with some distinctive characteristics (FINNEGAN 1989; CAPY et al. 1998). These elements were first described in D. melanogaster and took their name from the FB (or Foldback) element of this species (POTTER et al. 1980; TRUETT et al. 1981). Since then, similar elements have been described in several other organisms: TU in Strongylocentrotus purpuratus (HOFFMAN-LIEBERMANN et al. 1985), TFB1 in Chironomus thummi (HANKELN and SCHMIDT 1990), TC4 in Caenorhabditis elegans (YUAN et al. 1991), SoFT in several species of Solanaceae (REBATCHOUK and NARITA 1997), Hairpin and FARE in Arabidopsis thaliana (ADÉ and BELZILE 1999; WINDSOR and WADDELL 2000), an unnamed foldback element in Ciona intestinalis (SIMMEN and BIRD 2000), Tnr-8 in Oryza glaberrima (CHENG et al. 2000), and Galileo, Kepler and Newton in Drosophila buzzatii (CÁCERES et al. 2001).

All foldback elements share some common structural features that distinguish them from the other Class II TEs and are summarized in Figure 1. We will use the nomenclature of ADÉ and BELZILE (1999) and SIMMEN and BIRD (2000) to describe the typical structure of these elements. First of all, foldback elements contain very long inverted terminal repeats (IRs) that usually extend along almost the entire element and that are separated by a middle domain (M) of variable length and composition. As a consequence of this particular structure, when the element is denatured its two IRs can fold back and pair, giving rise to the very stable stem-loop secondary structures that inspired their name (POTTER et al. 1980). The IRs have a modular structure, with three possible sequence domains: the most external flanking domain (IR-FD); the outer domain (IR-OD), which includes several imperfect repeats in tandem; and the inner domain (IR-ID), which usually contains A-T rich sequences (Figure 1). In addition, like all other TEs, foldback transposons are flanked upon insertion by two short direct repeats generated by the duplication of the target site sequence. The generation of these duplications and the existence of polymorphic insertions strongly support the mobilization of foldback elements, although their mechanism of transposition is not fully understood (CAPY et al. 1998).

No coding capability has been found in the vast majority of foldback transposons described. In D. melanogaster, FB transposition depends on the presence of the 4-kb NOF sequence, which is found in the M domain of approximately 10% of the elements (HARDEN and ASHBURNER 1990; SMITH and CORCES 1991). The NOF sequence encodes a 120-kDa protein, but its function is still unknown (TEMPLETON and POTTER 1989; HARDEN and ASHBURNER 1990). More recently, the FARE2 foldback transposons of A. thaliana have been predicted to harbor a transposase (WINDSOR and WADDELL 2000).

At the functional level, foldback elements are characterized by their capacity to induce genetic instability and recombination processes, leading to the generation of spontaneous mutations and chromosomal rearrangements in D. melanogaster (LEVIS et al. 1982; BINGHAM and ZACHAR 1989; SMITH and CORCES 1991). These processes appear to occur by ectopic recombination between different copies of the same element (COLLINS and RUBIN 1984). Ectopic recombination events between sequences of the same element could also explain the great structural heterogeneity described in most elements of this family (HOFFMAN-LIEBERMANN et al. 1985; CHENG et al. 2000; WINDSOR and WADDELL 2000). The capacity of foldback elements to mediate these rearrangements is probably caused by the presence of long IRs, which have been demonstrated to be a source of genomic instability (ZHOU et al. 2001). IRs have the ability to form secondary structures, which in turn stimulate the production of double-strand breaks (LOBACHEV et al. 1998).
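The fold-back property described above follows directly from the sequence: the 5' end of the element reads the same as the reverse complement of its 3' end. As a minimal illustrative sketch (not the procedure used in this work), the function below measures the length of a perfect terminal inverted repeat. The toy sequence and the function names are hypothetical, and real IRs are imperfect, so an alignment-based approach would be needed in practice.

```python
# Illustrative sketch only: detects a PERFECT terminal inverted repeat.
# The toy sequence below is hypothetical, not a real Galileo copy.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def terminal_inverted_repeat_length(seq: str) -> int:
    """Number of terminal bases at the 5' end that pair perfectly with the
    3' end when the single strand folds back on itself."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    length = 0
    # Each IR can cover at most half of the element.
    for i in range(len(seq) // 2):
        if seq[i] != rc[i]:
            break
        length = i + 1
    return length


# Toy element: 7-bp left IR, 5-bp middle (M) region, 7-bp right IR.
toy_element = "CACTAGG" + "AAAAA" + reverse_complement("CACTAGG")
ir_length = terminal_inverted_repeat_length(toy_element)
```

In this toy case the two 7-bp ends would form the stem and the 5-bp middle region the loop of the stem-loop structure.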


Interestingly, Galileo, a transposon of D. buzzatii, is implicated in the generation of two natural polymorphic inversions through ectopic recombination between copies of the element (CÁCERES et al. 1999; CASALS et al. 2003). Moreover, the four inversion breakpoints have become hotspots for genetic instability and TE insertions (CÁCERES et al. 2001; CASALS et al. 2003). These insertions consisted mainly of class II elements, and three of them, Galileo, Kepler and Newton, share sequence and structural similarities, including the presence of long inverted repeats, and were tentatively classified as foldback elements (CÁCERES et al. 2001). Here, we have isolated additional copies of the three foldback-like elements and carried out a detailed molecular characterization of all available copies. We have also studied the abundance and chromosomal distribution of these elements in D. buzzatii and other species of the repleta species group. These analyses have allowed us to estimate the age of these elements and to gain insight into the factors controlling their distribution in the chromosomes and their relationship with other chromosomal inversions described in this complex.

MATERIALS AND METHODS

Drosophila Stocks: Twenty-two lines of D. buzzatii and twelve lines of other Drosophila species were used (Table 1). These lines were isolated from different natural populations, and for D. buzzatii they cover most of the distribution range of the species. All but three D. buzzatii lines are homokaryotypic for one of five different natural chromosomal arrangements: 2 standard (2st), 2j, 2jz3, 2jq7 and 2y3 (Table 1). Line s-1 is segregating for the 4 standard (4st) and 4s chromosomal arrangements (RUIZ and WASSERMAN 1993). Lines j-23 and j-24 are homokaryotypic for the inversion 5I and the translocation t(5,1), respectively, which were induced by introgressive hybridization (NAVEIRA and FONTDEVILA 1985).

PCR Amplification: PCR was carried out in a volume of 50 µl, including 100-200 ng of genomic DNA, 20 pmol of each primer, 200 µM dNTPs, 1.5 mM MgCl2 and 1-1.5 units of Taq DNA polymerase. Temperature cycling conditions were 30 rounds of 30 sec at 94ºC, 30 sec at the annealing temperature, and 60 sec at 72ºC. To isolate new copies of the D. buzzatii foldback elements, primers G7 (5’-CCATACAACACATAGACTGGACA-3’) and G8 (5’-TCGTATTTGCTCGGGTTCTTACT-3’), corresponding to the IRs of Galileo (Figure 2), were used. PCRs with different combinations of these primers and varying annealing temperatures (59-62.5ºC) and elongation times (1 min 30 sec to 2 min 30 sec) were carried out on genomic DNA from lines st-2, st-3, st-4, j-7 and j-8. PCR products were cloned into the pGEM-T vector (Promega) and sequenced.
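As a small worked check of the reaction set-up above, the sketch below converts the primer amount into a final concentration (20 pmol in 50 µl) and reports the length and GC content of the two primer sequences given in the text. The helper names are hypothetical, and the pairing of the names G7 and G8 with each sequence is taken from the order in which they appear in the text, which should be treated as an assumption.

```python
# Illustrative arithmetic only; helper names are hypothetical.
# Assumption: G7/G8 are paired with the sequences in the order given in the text.

PRIMERS = {
    "G7": "CCATACAACACATAGACTGGACA",
    "G8": "TCGTATTTGCTCGGGTTCTTACT",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def final_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Final concentration in micromolar: 1 pmol/µl equals 1 µM."""
    return amount_pmol / volume_ul


# 20 pmol of each primer in a 50 µl reaction.
primer_um = final_concentration_um(20, 50)

# Length (nt) and GC fraction for each primer.
primer_stats = {
    name: (len(seq), round(gc_fraction(seq), 3)) for name, seq in PRIMERS.items()
}
```

Under these assumptions each primer is present at 0.4 µM, and both primers are 23 nt long with the same GC content.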

Screening of Genomic Library and Southern Analysis: Screening of the genomic library and Southern hybridizations were performed according to standard procedures (SAMBROOK et al. 1989). Probes were labeled by random priming with digoxigenin-11-dUTP under the conditions specified by the supplier (Roche). Hybridization was performed overnight in standard buffer with 50% formamide at 42ºC for intraspecific and 37ºC for interspecific hybridizations. Stringency washes were performed with 0.1× SSC, 0.1% SDS solution at 68ºC and 50ºC for intraspecific and interspecific hybridizations, respectively.

Additional copies of the foldback elements were isolated by plaque hybridization of a lambda genomic library of the j-1 line (CÁCERES et al. 1999) with a pGPE107.2.1.1 probe containing the entire Galileo-2 element (0.7 kb) (CÁCERES et al. 1999). The j-1 library was previously amplified as described in SAMBROOK et al. (1989). Positive phages were analyzed by restriction mapping and Southern hybridization. DNA fragments of interest were then gel-purified, subcloned into the Bluescript II SK vector (Stratagene), and sequenced.

To quantify the abundance of foldback elements, Southern hybridizations were carried out with D. buzzatii lines representative of the different chromosomal arrangements (st-1, st-3, st-7, st-10, j-2, j-9, j-19, j-23, j-24, jq7-1, jq7-4, jz3-6, jz3-7, y3-1 and s-1) and several lines from other Drosophila species (H84, J79, KO-2, MA-4, 1371.5, 1611.2, SD-12, D62C2B, 1451.0, SM-3, and UN-2) (Table 1). Genomic DNA of the different lines was digested with BamHI and HindIII restriction enzymes. Southern blots were hybridized with a 1.1 kb probe containing approximately half of the Galileo-12 element. The probe was obtained by PCR amplification with primers G19 (5’-CTTATCCACGAATCATTTTCAG-3’) (CÁCERES et al. 2001) and E14 (5’-CACTAACCATACAACACATAG-3’) (Figure 2) from clone pGPE208 (CASALS et al. 2003). Prior to the amplification, DNA from pGPE208 was digested with DraI to separate the IRs and prevent the formation of secondary structures.

In situ hybridization: Intraspecific in situ hybridizations were carried out on polytene chromosomes from D. buzzatii lines including all the chromosomal arrangements studied: st-1, j-2, j-23, j-24, jq7-4, jz3-6, jz3-7, y3-1 and s-1. Interspecific hybridizations were carried out on polytene chromosomes from lines H84, J79, KO-2, MA-4, 1371.5, 1611.2, SD-12, D62C2B, 1451.0, SM-3, UN-2 and VZ-12. When the line was polymorphic for different chromosomal arrangements, hybridization was performed on heterokaryotypes. Hybridization to the larval salivary gland chromosomes was carried out according to the procedure described by MONTGOMERY et al. (1987), using the same 1.1 kb Galileo probe as in the Southern hybridizations. The probe was subcloned into the pGEM-T vector (Promega) and labeled with biotin-16-dUTP (Roche) by nick translation. Detection was carried out using the ABC Elite kit from Vector Laboratories. Interspecific hybridizations were performed at 25ºC and intraspecific hybridizations at 37ºC. Chromosomal localization of the hybridization signals was determined using the cytological maps of D. buzzatii, D. koepferae, D. gouveai and D. seriema (RUIZ and WASSERMAN 1993) and of D. antonietae and D. serido (RUIZ et al. 2000). The coincidence of TE insertion sites with the cytological localization of chromosomal inversion breakpoints described in D. buzzatii and the buzzatii species complex (Table S1, supplemental material at http://www.genetics.org/supplemental/) was tested with the statistical method described by ZELENTSOVA et al. (1999).

DNA Sequencing and Sequence Analysis: Sequences were obtained on an ABI 373 A (Perkin-Elmer) automated DNA sequencer. Fragments cloned into Bluescript II SK or pGEM-T were sequenced using M13 universal forward and reverse primers. PCR products were gel-purified using the Geneclean Spin Kit (Bio 101) and sequenced directly with the same primers used for amplification. Nucleotide sequences were analyzed using GeneToolLite software (BioTools). Similarity searches in the GenBank/EMBL databases were carried out using Blastx, Tblastx and Fasta. Multiple sequence alignments were performed with ClustalW (THOMPSON et al. 1994), followed by analysis with the DnaSP version 4.0 program (ROZAS et al. 2003). Phylogenetic analysis was performed using the PHYLIP software package (FELSENSTEIN 1989).


RESULTS

Isolation of new foldback elements of D. buzzatii: We have obtained ten additional copies of the three foldback elements Galileo, Kepler and Newton through PCR amplification and genomic library screening. PCR amplification was carried out using primers located in the IRs of the elements. PCRs with primer G7, corresponding to the most external part of the IR, which is common to the three elements (Figure 2), yielded some smear and four clear bands of 0.6, 1.0, 1.2 and 1.3 kb in all D. buzzatii lines analyzed. The two most intense bands, of 0.6 kb and 1.3 kb, were cloned from line st-3 and sequenced. The 1.3 kb fragment was formed by partial copies of new Galileo (Galileo-8) and Newton (Newton-3) elements arranged consecutively, whereas the 0.6 kb fragment contained a small copy of a Kepler element (Kepler-4) (Table 2 and Figure 2). Amplification with primer G8, corresponding to a more internal sequence exclusive to the Galileo IR, resulted in two bands of 2.8 and 3.8 kb. The more intense band (2.8 kb) of line j-8 was cloned and its ends were sequenced. Both ends of the amplified fragment contained sequences homologous to Galileo IRs (Table 2 and Figure 2), plus an ISBu2 element (CÁCERES et al. 2001) of 553 bp located 8 bp away from the left Galileo IR and an ISBu1 element (CÁCERES et al. 2001) of 61 bp located 191 bp away from the right IR. These sequences were separated by 1 kb of unknown composition, and the sequences homologous to Galileo were tentatively classified as belonging to Galileo-7. Therefore, apparently only partial and chimeric elements were amplified by PCR, probably because the formation of secondary structures between the long IRs of complete elements when denatured hinders PCR amplification (CÁCERES et al. 2001).

As an alternative approach, we also searched for new copies of the elements through library screening. A genomic library of line j-1 (CÁCERES et al. 1999) was screened with a probe containing the Galileo-2 element (see Methods). Many positive phages were obtained, and several of them were selected for further study by restriction enzyme digestion and Southern hybridization. Bands of interest from four of these phages (λj-1/1, λj-1/2, λj-1/3 and λj-1/4) were subcloned and partially sequenced. Overall, five new copies of Galileo and one copy of Kepler were found in these clones (Table 2 and Figure 2), together with some additional known TEs. λj-1/1 contained the Galileo-5 and Kepler-6 elements arranged in tandem. λj-1/3 contained the Galileo-6 element only. λj-1/4 consisted of two Galileo elements (Galileo-9 and Galileo-13) separated by approximately 11 kb, plus several copies of other TEs. Galileo-9 is surrounded by a 446 bp fragment of BuT3 (CÁCERES et al. 2001) and a BuT1 element of 932 bp (CÁCERES et al. 2001), located 0.9 kb and 21 bp away from the ends of the IRs, respectively. The BuT1 element contains an insertion of a 1731-bp element homologous to the Tc1-like Paris transposon of D. virilis (PETROV et al. 1995), which had not been previously described in D. buzzatii. Galileo-13 is inserted into an ISBu1 element of 894 bp, which is separated by 12 bp from a 103 bp fragment of an Osvaldo retrotransposon (LABRADOR and FONTDEVILA 1994) located at the end of the λj-1/4 clone. Finally, λj-1/2 included the Galileo-14 element, flanked on one side by an ISBu1 element (CÁCERES et al. 2001) 79 bp away from its left IR and on the other side by an immediately adjacent BuT6 element (CASALS et al. 2003). The inclusion of several different TEs in most of these phages suggests that they probably originated from the heterochromatin. In fact, the new foldback elements isolated by PCR amplification and library screening in this work are heterochromatic copies, as they are all truncated and all but one of the PCR products and lambda clones characterized contain more than one transposable element.


Structure of the foldback elements of D. buzzatii: Together with those obtained in previous studies (CÁCERES et al. 1999 and 2001; CASALS et al. 2003), we have analyzed a total of fourteen copies of Galileo, six of Kepler, and three of Newton (Table 2 and Figure 2). The main characteristic of the three types of elements is the presence of very long inverted terminal repeats (IRs), often spanning almost the entire element. Long IRs are characteristic of foldback elements and confer on them the ability to form secondary structures when double-stranded DNA is denatured. This was demonstrated early on in the D. melanogaster FB element by electron microscopy (TRUETT et al. 1981) and could explain the negative results in some of the PCR amplifications of fragments containing complete Galileo-like elements in this work and previous studies (CÁCERES et al. 2001; CASALS et al. 2003). We have tested the capacity of the longest copies of Galileo, Kepler and Newton to form secondary structures using the mFold software (http://BiBiServ.TechFak.Uni-Bierfeld.DE/fold/). Sequences from the two IRs of Galileo-12, Kepler-5 and Newton-2 pair with each other and give rise to a very stable stem-loop structure, with free energy values of -1423.1, -550.8 and -994.1 kcal/mol, respectively (Figure S1, supplemental material at http://www.genetics.org/supplemental/). In addition, these elements show the high degree of structural variability between and within copies that is also typical of other foldback TEs. However, despite the differences between the multiple copies and the fact that many of them are incomplete, a consensus canonical structure for each of the elements can be inferred.

The Galileo elements found vary in size from 20 bp to 2304 bp, with IRs from 8 bp to 1115 bp (Table 2), and show all the structural domains previously described in other foldback elements (Figure 2). The most external part of the Galileo IRs contains a 479 bp terminal region (IR-FD) that is relatively constant between the different copies. This domain is followed by an internal region formed by up to three imperfect tandem repeats of a 136 bp sequence, with an average identity of 91%, plus a fourth smaller incomplete repeat of 43 bp (IR-OD). The average number of these repeats per IR is 2.3, and this number usually differs between the two sides of the element. The region comprising the IR-ID domain is extremely heterogeneous in length and sequence composition and is characterized by an increased AT content (up to 75%). A 141 bp sequence of the right IR-ID domain of Galileo-3 and -6 shows some similarity (34% amino acid identity, E = 0.00018) with the transposase of Hoppel, a P-like element of D. melanogaster (REISS et al. 2003) (Figure 2). Finally, several of the elements contain a central region of 81-158 bp that has not been found to form part of the IRs and can be considered the M domain.

Kepler and Newton differ from Galileo in that they apparently contain only IR-FD and M domains, but not tandem repeats (IR-OD) or IR-ID domains (Figure 2). The Kepler elements analyzed show considerable variability and include mostly partial copies, with sizes between 381 bp and 930 bp (Table 2). According to the longest copy (Kepler-5), the IRs are formed by an IR-FD domain of ~346 bp and flank an M section of 254 bp. The two longest Newton elements (Newton-1 and -2) are almost identical, with 566-575 bp IR-FD domains and a 363-378 bp M segment. The sequence of Newton shows 90% nucleotide identity with Kepler, including both the IR-FD and most of the M region. Neither ORFs coding for more than 100 amino acids nor any sequence similarity with a transposase has been observed in Kepler and Newton elements.

Besides these structural similarities, the three foldback elements of D. buzzatii also share a high degree of sequence homology in their IRs. On average, the first ~600 bp of Galileo (corresponding to the IR-FD and part of the first tandem repeat), Kepler (corresponding to the IR-FD and most of the M region) and Newton (corresponding to the IR-FD) have ~73% nucleotide identity (CÁCERES et al. 2001). In particular, the most terminal 40 bp are almost identical between the three foldback elements, and more internally there is a region of 200 bp that shows a sequence identity of 86% (Figure 2). Another shared characteristic of Galileo, Kepler and Newton is the generation of a 7 bp duplication of the target site during insertion. The comparison of the flanking sequences of all these elements (Table 2) suggests that they have a common preferential insertion sequence, and a consensus sequence of G16T17a10g10T15A18c10 can be inferred. Interestingly, this sequence is palindromic, and most of the elements have 7 bp flanking sequences in which at least two of the three bases on each side are complementary. Furthermore, the putative preferential insertion site resembles the ends of these elements (CA...TG).
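The palindromic character of the 7 bp target site can be checked mechanically: the three bases on the left of the central position should be complementary to the three bases on the right, read in the opposite direction. The sketch below is a minimal illustration of that check. It assumes the consensus reads GTAGTAC once the numbers accompanying each base are stripped, and the function name is hypothetical.

```python
# Illustrative sketch; assumes the 7-bp consensus reads GTAGTAC once the
# numbers accompanying each base are stripped.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_flank_pairs(target_site: str) -> int:
    """Count how many of the three bases on the left side of a 7-bp target
    site are complementary to the mirrored base on the right side.
    The central (fourth) base has no partner and is ignored."""
    site = target_site.upper()
    if len(site) != 7:
        raise ValueError("expected a 7-bp target site duplication")
    return sum(COMPLEMENT[site[i]] == site[6 - i] for i in range(3))


consensus_pairs = complementary_flank_pairs("GTAGTAC")
```

A value of 3 means the site is fully palindromic around its central base; a site with a value of 2 or 3 meets the "at least two of the three bases" criterion mentioned above.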

Phylogenetic analysis of D. buzzatii foldback elements: The structural and sequence similarity of Galileo, Kepler and Newton suggests that they are related and belong to the same family of TEs. To determine the phylogenetic relationship between the different copies of the three elements described in this work, we have built an unrooted neighbor-joining tree using the two homologous sequences of the IRs (totalling ~250 bp; Figure 2). Only the elements that contain both complete regions were included in the analysis and, when available, the two ends of each element have been considered independently (Figure 3). In this tree, Kepler and Newton copies form separate monophyletic groups and both elements cluster together, apparently deriving from one particular type of Galileo element (Figure 3). Galileo copies show a higher degree of variation and form several different groups. In most cases, the two copies of the IR of the same element group closely together, although there are exceptions such as the Galileo-3 and -12 elements.
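Neighbor-joining works from a matrix of pairwise distances between aligned sequences. The text does not state which distance measure was used, so the following is only a sketch under assumptions: it computes the proportion of differing sites (p-distance) between two aligned sequences and applies the Jukes-Cantor correction, one common choice. The function names and the toy sequences used in the checks are hypothetical.

```python
import math

# Illustrative sketch; the distance model is an assumption, since the text
# does not specify one.


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in columns) / len(columns)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) ln(1 - 4p/3)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned pair: one difference in four sites.
toy_p = p_distance("ACGT", "ACGA")
toy_d = jukes_cantor(toy_p)
```

A matrix of such corrected distances for all pairs of IR sequences is the kind of input a neighbor-joining program takes.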


Abundance and chromosomal distribution of foldback elements in D. buzzatii: To estimate the number of Galileo elements, Southern blots of genomic DNA from 15 different D. buzzatii lines were hybridized with a Galileo-12 probe. Genomic DNA of each line was digested with the restriction enzymes BamHI and HindIII, which do not have restriction sites in any of the 23 copies of the Galileo, Kepler, and Newton elements reported here. The number of hybridization bands thus provides a minimum estimate of the number of these three transposable elements in the D. buzzatii genome, since the probe used contains sequence similarities with Kepler and Newton. The number of hybridization bands per line varied between 21 and 29, with an average of 26.73 (Table 3), but no significant differences between the different lines were observed (χ2 = 3.63; df = 14; P = 1.00).

In situ hybridization to the polytene chromosomes of nine different D. buzzatii lines with the same Galileo-12 probe allowed us to examine the chromosomal distribution of the TE copies. In all cases, several signals in the euchromatic part of the chromosomes were obtained, whereas the centromeres were always strongly stained (Figure 4). The cytological localization of these signals is shown in Table S2 (supplemental material at http://www.genetics.org/supplemental/). The average number of euchromatic signals in each line was 56.11 (Table 3), and no significant differences between lines were observed (χ2 = 5.32; df = 8; P = 0.72). In addition, we have compared the distribution of these signals between chromosomes, pooling together the data of all D. buzzatii lines (Table 4). A very significant deviation was found between the observed number of signals in each chromosome and that expected according to a random distribution (χ2 = 1042.25; df = 5; P < 0.001). This difference was mainly due to a great accumulation of insertions in the dot chromosome (chromosome 6) (Table 4). When this chromosome is excluded from the analysis, there are still significant differences in the number of signals among chromosomes (χ2 = 22.27; df = 4; P < 0.001), apparently due to an excess of insertions in chromosome 3 and a deficit of insertions in chromosome 5 (Table 4). However, no significant differences between the X chromosome and the autosomes were found (χ2 = 3.13; df = 1; P = 0.08).

To analyze the intrachromosomal distribution of these elements, we have compared the observed and expected number of signals in the distal, central and proximal regions of the chromosomes (the distal and the proximal regions include the 10% of chromosomal bands closest to the telomere and centromere, respectively) (Table 5). Galileo elements clearly tend to accumulate in the proximal regions of the chromosomes, both when the data of all chromosomes are pooled together (χ2 = 22.27; df = 2; P < 0.001) and when each chromosome is considered individually (data not shown).

Finally, we have analyzed the relationship between the distribution of the euchromatic insertions of Galileo elements and five natural inversions present in the lines used in this work. When the observed and expected numbers of signals inside and outside the inverted region were compared (Table 6), a significant association between the TE insertions and the chromosomal inversions was found (χ2 = 31.14; df = 1; P < 0.001). The accumulation of insertions was especially clear in inversions 2q7 (χ2 = 22.77; df = 1; P < 0.001) and 2y3 (χ2 = 17.82; df = 1; P < 0.001). Conversely, the only inversion induced by introgressive hybridization did not show any association with insertions. In addition, we have examined the coincidence between Galileo insertion sites and the breakpoints of the inversions. The Galileo element was located at the breakpoints of inversions 2j and 2q7 (Figure 4a and b), as shown previously (CÁCERES et al. 1999 and 2001; CASALS et al. 2003), and one of the in situ hybridization signals in line s-1 was located precisely at the proximal breakpoint of inversion 4s (Figure 4c). However, a global comparison of the chromosomal distribution of Galileo insertions and the cytological position of the breakpoints of the 16 natural polymorphic inversions described in D. buzzatii and 18 inversions induced by introgressive hybridization did not find any significant association between them (Table 7).
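The P values quoted in this section can be reproduced from the reported χ2 statistics and degrees of freedom alone. The sketch below is a minimal, standard-library implementation of the chi-square upper-tail probability (closed-form series for integer degrees of freedom), applied to three of the values given above. The function name is hypothetical, and the observed and expected counts themselves (Tables 3 to 6) are not reproduced here.

```python
import math

# Illustrative sketch: upper-tail probability of the chi-square distribution
# for integer degrees of freedom, using the standard closed-form series.


def chi2_sf(x: float, df: int) -> float:
    """P(X >= x) for a chi-square variable with df degrees of freedom."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be a positive integer")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2.0))
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)  # chi^(2r-1) / (1*3*...*(2r-1))
        total += term
    pdf = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return tail + 2.0 * pdf * total


# (chi-square, df) pairs as reported in the text.
p_lines_southern = chi2_sf(3.63, 14)   # bands per line
p_lines_in_situ = chi2_sf(5.32, 8)     # euchromatic signals per line
p_x_vs_autosomes = chi2_sf(3.13, 1)    # X chromosome versus autosomes
```

Rounded to two decimals, these three values agree with the P = 1.00, P = 0.72 and P = 0.08 reported above.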

Abundance and chromosomal distribution of foldback elements in the buzzatii species complex: Southern blot and in situ hybridization were also carried out in ten additional species of the buzzatii species complex (Table 1) and two other species of the repleta group, D. mulleri (mulleri subgroup), and D. repleta (repleta subgroup). The number of bands obtained by Southern hybridization in the species of the buzzatii cluster (D. seriema, D. koepferae, D. antonietae, D. serido, and D. gouveai) was similar to that obtained in D. buzzatii, whereas fewer bands were obtained in the species included in the martensis cluster (D. martensis, D. uniseta, D. venezolana and D. starmeri) or the stalkeri cluster (D. stalkeri). In D. mulleri and D. repleta, further distant from D. buzzatii, no bands could be observed (Table 3). In situ hybridization to the polytene chromosomes of these species (Table S2, supplemental material at http://www.genetics.org/supplemental/) yielded similar results to the Southern analysis. Few hybridization signals were obtained in the martensis and stalkeri cluster species, and all of them were restricted to chromosome 6 and the proximal regions of the other chromosomes (Figure 4f). In the species belonging to the buzzatii cluster the number of signals and their inter- and intra-chromosomal distribution were similar to those for the D. buzzatii lines (Table 3, 4 and 5, and Figure 4d and e). First, signals tend to accumulate at the dot chromosome in all species (Table 4). When chromosome 6 is excluded from the analysis, there were still significant differences in the distribution of insertions between chromosomes in D. gouveai (χ2 = 18.72; df = 4; P < 0.001) and D. koepferae (χ2 = 12.73; df = 4; P = 0.012). In addition,

18

there were significant differences in the number of insertions between the X chromosome and the autosomes in D. gouveai (χ2 = 11.98; df = 1; P < 0.001), D. koepferae (χ2 = 9.72; df = 1; P = 0.002), D. serido (χ2 = 6.24; df = 1; P = 0.012), and D. seriema (χ2 = 4.58; df = 1; P = 0.032). Second, in all cases TE insertions accumulate in the proximal regions of the chromosomes, with a highly significant excess compared to a random distribution (Table 5). However, when the proximal regions are excluded from the analysis, no significant differences are detected between the X chromosome and the autosomes in any species, and there are significant differences between chromosomes only in D. buzzatii (χ2 = 21.55; df = 4; P < 0.001). When the relationship between TE insertion sites and the cytological position of inversion breakpoints was examined, hybridization signals were found precisely at the two breakpoints of the 2a8 inversion of D. serido (Figure 4d). Overall, no significant association was observed between chromosomal inversion breakpoints and hybridization signals for the three species where polymorphic inversions have been reported (D. koepferae, D. serido and D. seriema) (Table 7). However, taking into account the breakpoints of all the fixed and polymorphic inversions described in the buzzatii cluster species results in a strong association with the location of foldback insertions (Table 7).
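The P values quoted in this section can be recovered from the reported χ2 statistics and their degrees of freedom. The following minimal Python sketch is an illustration only and is not the procedure used in this work: the function names chi2_gof and chi2_sf are hypothetical, the insertion counts and the equal expected fractions in the toy example are invented, and the way expected values were actually derived (for instance from chromosome length) is an assumption that is not stated here.

```python
import math


def chi2_gof(observed, expected_fractions):
    """Chi-square goodness-of-fit statistic for insertion counts.

    expected_fractions is an assumption of this sketch (e.g. the share of
    the genome contributed by each chromosome); it should sum to 1.
    """
    total = sum(observed)
    expected = [total * f for f in expected_fractions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    h = x / 2.0
    # Closed-form starting value for df = 2 (even) or df = 1 (odd).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while 2 * a < df:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Statistics quoted in the text: (species, comparison, chi-square, df).
reported = [
    ("D. gouveai", "between chromosomes", 18.72, 4),
    ("D. koepferae", "between chromosomes", 12.73, 4),
    ("D. gouveai", "X vs autosomes", 11.98, 1),
    ("D. koepferae", "X vs autosomes", 9.72, 1),
    ("D. serido", "X vs autosomes", 6.24, 1),
    ("D. seriema", "X vs autosomes", 4.58, 1),
]
for species, comparison, stat, df in reported:
    print(f"{species:13s} {comparison:20s} P = {chi2_sf(stat, df):.4f}")

# Invented counts on five chromosomes, tested against equal expectations.
toy_counts = [30, 20, 25, 15, 10]
toy_stat = chi2_gof(toy_counts, [0.2] * 5)
print(f"toy statistic = {toy_stat:.2f}, P = {chi2_sf(toy_stat, 4):.4f}")
```

The tail probability is computed with the standard library only, so the sketch runs without a statistics package; a dedicated library routine would normally be preferred for real analyses.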

DISCUSSION Similarities of Galileo, Kepler and Newton with other foldback elements: The exhaustive analysis and characterization of all the available copies of the D. buzzatii elements Galileo, Kepler and Newton carried out in this work allows us to classify them confidently as foldback TEs. As we have seen, they have very long IRs and a heterogeneous composition, with a high degree of variability within and between copies.


In addition, it has also been shown that these elements have the ability to induce genetic instability and chromosomal rearrangements in D. buzzatii populations (CÁCERES et al. 1999, 2001; CASALS et al. 2003). The general organization and main features of the foldback elements described in other eukaryotic organisms are summarized in Figure 1 and Table 8. According to REBATCHOUK and NARITA (1997) and SIMMEN and BIRD (2000), foldback elements can be classified into five types depending on which domains form the IRs (Figure 1): IR-OD and IR-ID, type 1 elements; IR-OD only, type 2 elements; IR-ID only, type 3 elements; IR-FD and IR-OD, type 4 elements; and all three described domains (IR-FD, IR-OD and IR-ID), type 5 elements. However, the number, type and limits of the domains included in an element are usually not easy to define, making this classification somewhat arbitrary. Galileo elements fall into the type 5 class, since in most cases they include IR-FD, IR-OD and IR-ID domains (Figure 2 and Table 8). It is noteworthy that the internal tandem repeats of Galileo IR-OD are longer than those of other foldback elements, which usually range between 7 and 32 bp. Kepler and Newton elements, on the other hand, show only the flanking domain surrounding the middle region (Figure 2 and Table 8). This structural organization resembles that of the Hairpin elements described in A. thaliana (Table 8). Despite their similarities with MITEs, Hairpin elements have been classified as type 3 foldback elements due to the presence of AT-rich IRs (equivalent to the IR-ID of other foldback elements) and their ability to form secondary structures (ADÉ and BELZILE 1999). Therefore, Kepler and Newton can be considered as a new type 6 of foldback elements, including only the IR-FD and M sequences, although there is also the possibility that all the analyzed copies are defective.
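The classification rules listed in this section amount to a lookup from the set of IR domains present in an element to a type number. The Python sketch below is only an illustration of that lookup: the names FOLDBACK_TYPES and foldback_type are hypothetical, type 6 is the tentative class proposed here for elements carrying only the IR-FD, and real elements often have domain limits that are too poorly defined for such a clean assignment.

```python
# Domain combinations and type numbers as defined in the text
# (types 1 to 5) plus the tentative type 6 proposed in this work.
FOLDBACK_TYPES = {
    frozenset({"IR-OD", "IR-ID"}): 1,
    frozenset({"IR-OD"}): 2,
    frozenset({"IR-ID"}): 3,
    frozenset({"IR-FD", "IR-OD"}): 4,
    frozenset({"IR-FD", "IR-OD", "IR-ID"}): 5,
    frozenset({"IR-FD"}): 6,  # proposed for Kepler and Newton
}


def foldback_type(domains):
    """Return the type number for a collection of IR domain names.

    Returns None when the combination is not covered by the scheme,
    which is itself described in the text as somewhat arbitrary.
    """
    return FOLDBACK_TYPES.get(frozenset(domains))


print(foldback_type(["IR-FD", "IR-OD", "IR-ID"]))
print(foldback_type(["IR-FD"]))
print(foldback_type(["IR-FD", "IR-ID"]))
```

Using frozenset keys makes the lookup independent of the order in which the domains are listed.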
The IRs of Kepler and Newton show two blocks of high sequence similarity to the IR-FD and first tandem repeat of the IR-OD of Galileo (Figure 2). A similar situation, in which foldback elements with an overall different organization contain homologous modules or sequence blocks, has been found in other organisms. For example, in tomato the IR-OD of the type 1 elements SoFT1 and SoFT2 shows a high level of similarity, and the sequence of the repeats in this domain is also similar to those of the type 2 SoFT3 elements described in potato (REBATCHOUK and NARITA 1997). In A. thaliana, FARE1 and FARE2 elements share sequence and structural similarities at the terminal regions of the IR-OD, although the arrangement of the tandem repeats of these elements is more complex (WINDSOR and WADDELL 2000) (Table 8). In general, the mechanism of transposition of foldback elements and the proteins involved remain largely unknown. Similarly, the characterized copies of the three D. buzzatii foldback elements do not show any strong evidence of coding capacity or the presence of ORFs with homology to a known protein. Only in a few of the Galileo elements has a small region with low similarity to the Hoppel transposase been found. The high level of structural heterogeneity described in this and other foldback elements (HOFFMAN-LIEBERMANN et al. 1985; CHENG et al. 2000; WINDSOR and WADDELL 2000; this work) suggests that all the elements described in this work could be defective and have lost the transposase coding sequence. However, at least Kepler must contain some transcriptional promoter, since it has been shown to induce the expression of an antisense RNA (PUIG et al. 2004). The analysis of all the FB elements described in the genome of D. melanogaster showed that only 40.62% of them are complete (KAMINKER et al. 2002), which contrasts with their high frequency of excision (COLLINS and RUBIN 1983). Alternatively, these elements could be mobilized thanks to the proteins produced by other TEs or cellular processes


(REBATCHOUK and NARITA 1997). In this sense, it has been proposed that FARE2 elements of A. thaliana encode proteins that could also interact with homologous structures in FARE1 elements, which do not show coding capacity, and promote their mobilization (WINDSOR and WADDELL 2000). In D. buzzatii, the similarities between the sequences of the target insertion sites and the IRs of the three foldback elements suggest that they are probably all mobilized by the same mechanism of transposition. In addition, the conservation of certain regions of the IRs, especially the ends, could be indicative of a possible role in the recognition and binding of the transposase. Finally, the internal tandem repeats could also play an important role in the recognition and binding of proteins. These sequences could be the target of the transposase, which would first bind to them and then move to the end of the element. Increasing the number of tandem repeats would thus raise the probability that an element is mobilized (CHENG et al. 2000).

Estimate of the age of D. buzzatii foldback elements: The presence of some polymorphic Galileo, Kepler and Newton insertions in the 2j and 2q7 inversion breakpoints (Table 2) indicates that these elements were active in D. buzzatii at least until recently. The comparison of the nucleotide sequences between the different copies allows us to estimate the age of these elements. Similar analyses have been carried out with different LTR retrotransposons in S. cerevisiae (JORDAN and MCDONALD 1999), maize (SANMIGUEL et al. 1998), and Drosophila (BOWEN and MCDONALD 2001), and with a family of human endogenous retroviruses (COSTAS and NAVEIRA 2000). In general, the age of retrotransposons has been obtained by comparing the number of substitutions between the two LTRs of the element to the nucleotide substitution rate of the species. This approach is valid due to the specific nature of the


retrotransposition process, where the two LTRs of an element are synthesized from the same template and are therefore identical at the moment of the insertion (BOWEN and MCDONALD 2001). However, the same method is probably not valid for class II transposable elements, which have a different transposition mechanism, or foldback elements, whose mechanism is unknown. Alternatively, the age of TEs can be estimated by calculating the average pairwise nucleotide diversity between the different copies of the element (KAPITONOV and JURKA 1996; COSTAS and NAVEIRA 2000; BOWEN and MCDONALD 2001), and the two methods have been shown to yield similar results (BOWEN and MCDONALD 2001). We have calculated the nucleotide diversity values of the three foldback elements of D. buzzatii, considering all the sequence information available for each of the copies and excluding the gaps only in pairwise comparisons (Table 9). To estimate the age of Galileo, Kepler and Newton (Table 9) we have used the average synonymous substitution rate of Drosophila, 0.016 substitutions per nucleotide per million years (LI 1997). Newton elements show a higher degree of diversity than Galileo, which is not in agreement with the phylogeny in Figure 3. This disagreement is probably due to the different sequences considered in each analysis and to the fact that nucleotide diversity varies along the element, increasing from the ends to the middle domain. In the phylogenetic analysis only the sequences common to the three elements were considered, whereas in the variability analysis we used the complete sequences of the elements. When the sequences common to the three elements are considered, an age of 5.51 million years is obtained. This value provides a minimum estimate of the time when the common ancestor of the three elements was present in the D. buzzatii genome. However, it must be taken into account that the nucleotide substitution rate corresponding to the TE sequences may be quite different from that


used and could result in an overestimation of the age of the elements. For example, a higher degree of nucleotide variation in the TE insertions than in the single-copy adjacent sequences was found in the sequence analysis of the 2j and 2q7 inversion breakpoints (CÁCERES et al. 2001; CASALS et al. 2003). In any case, the estimated age is similar to the divergence time of the species included in the buzzatii complex. The divergence time of the buzzatii cluster (where D. buzzatii is included) from the other two clusters of this complex, the martensis and stalkeri clusters, is 5.8 and 6.2 million years, respectively (RUSSO et al. 1995; RODRÍGUEZ-TRELLES et al. 2000). By Southern blot and in situ hybridization analysis (Table 3 and Figure 4), we have shown that Galileo elements are widely present in the buzzatii cluster species at levels similar to those of D. buzzatii. In contrast, the species included in the martensis and stalkeri clusters show a much lower number of elements, and those are located only in the proximal regions of the chromosomes and in the dot chromosome (Table 3 and Figure 4). This suggests that these elements are not active outside of the buzzatii cluster, and only inactive and heterochromatic elements remain in the genomes of the other species. In addition, the continuous distribution across the different species shown by these foldback elements suggests that they were transmitted vertically before the radiation of the buzzatii cluster. No evidence of horizontal transmission was found in previous studies of the interspecific distribution of other foldback elements, such as that of the D. melanogaster FB element in the genus Drosophila (SILBER et al. 1989) or the SoFT elements in the genera Solanum and Lycopersicon (REBATCHOUK and NARITA 1997). These results differ considerably from what has been observed in several Class II transposons, which have frequently been transmitted horizontally between distant species (CAPY et al. 1998).
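The age estimate discussed in this section reduces to simple arithmetic on the average pairwise diversity between copies. The Python sketch below illustrates one common way of doing that calculation and should not be read as the exact procedure behind Table 9: the function names are hypothetical, the three aligned sequences are invented, no correction for multiple substitutions is applied, and the division by two lineages (age = diversity / (2 × rate)) is an assumption, since the convention actually used is not spelled out here. Only the rate of 0.016 substitutions per nucleotide per million years comes from the text.

```python
from itertools import combinations

GAP = "-"


def pairwise_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned copies.

    Columns with a gap in either copy are skipped for this pair only,
    mirroring the pairwise exclusion of gaps described in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites shared by this pair")
    return differences / compared


def nucleotide_diversity(aligned_copies):
    """Average pairwise distance over all pairs of copies."""
    pairs = list(combinations(aligned_copies, 2))
    if not pairs:
        raise ValueError("at least two copies are required")
    return sum(pairwise_distance(a, b) for a, b in pairs) / len(pairs)


def age_in_myr(diversity, rate=0.016, lineages=2):
    """Age in million years; `lineages` = 2 is an assumption of this sketch."""
    return diversity / (lineages * rate)


# Invented toy alignment of three copies (not data from this work).
toy_copies = ["ACGTACGTAC", "ACGTACGAAC", "ACG-ACGTAT"]
pi = nucleotide_diversity(toy_copies)
print(f"diversity = {pi:.4f}, age = {age_in_myr(pi):.2f} million years")
```

Setting lineages to 1 gives the alternative convention in which the diversity is divided by the rate alone, which doubles the estimated age.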


Factors determining the chromosomal distribution of the elements: Galileo, Kepler and Newton elements are found in high copy numbers in the genomes of the species included in the buzzatii cluster (see Results). The studied elements are not randomly distributed along the chromosomes, and they clearly tend to accumulate in the proximal regions of the chromosomes (Table 5) and in the dot chromosome (Table 4). Two main factors have been proposed to account for the distribution of TEs in the genome: recombination rate and gene density. Recombination rate is expected to inversely affect TE abundance in two different ways. First, the reduction of recombination rate in certain regions will hinder the elimination of slightly deleterious insertions by natural selection (HILL and ROBERTSON 1966; GORDO and CHARLESWORTH 2001). Second, TEs inserted in regions with low recombination rates will probably avoid the generation of deleterious chromosomal rearrangements by ectopic recombination and their subsequent elimination (LANGLEY et al. 1988; MONTGOMERY et al. 1991; GOLDMAN and LICHTEN 1996). On the other hand, TEs could accumulate in regions of low gene density, where their insertion is less likely to cause deleterious effects. The accumulation of D. buzzatii foldback elements close to the centromeric regions and in the dot chromosome is consistent with the expected effect of recombination rate. The same chromosomal distribution pattern was described for the TEs of D. melanogaster (BARTOLOMÉ et al. 2002; KAMINKER et al. 2002), where the elements tend to be localized in known regions of low or null recombination rate (ASHBURNER 1989; CHARLESWORTH 1996). In addition, previous studies also showed a high density of TEs in the heterochromatin of several Drosophila species (CHARLESWORTH et al. 1994; CARMENA and GONZÁLEZ 1995; PIMPINELLI et al. 1995; DIMITRI 1997; JUNAKOVIC et al. 1998; DIMITRI et al. 2003). Finally, the


colonization by four retrotransposons of the neo-Y chromosome of D. miranda, which arose approximately 1 million years ago, is further evidence of the accumulation of TEs in low recombination regions (STEINEMANN and STEINEMANN 1991, 1997; BACHTROG 2003). It has been proposed that regions with high recombination rates contain a higher number of genes, which could hinder the survival of TE insertions. However, the accumulation of TEs in the dot chromosome argues against this hypothesis. This chromosome contains an excess of transposable elements, but its genic density is similar to that of regions with high recombination rates in the other chromosomes (CHARLESWORTH et al. 1992). Interestingly, both in D. melanogaster (BARTOLOMÉ et al. 2002; KAMINKER et al. 2002) and in D. buzzatii (this work) TEs do not accumulate in telomeric regions, where the recombination rate has also been shown to decrease (CHARLESWORTH 1996). One possible explanation is that ectopic recombination is not reduced in these regions (BARTOLOMÉ et al. 2002), as has been found in subtelomeric regions in yeast (HABER et al. 1991). In addition, the low recombination regions close to the telomere are generally shorter than those in the centromeric regions, and the presence of TE insertions in low recombination regions appears to be related to the distance from the high recombination region (MASIDE et al. 2001). The distribution of TEs with respect to the chromosomal inversions provides additional support for the view that recombination rate is the main factor determining the distribution of TEs in the genome. Recombination rate drops inside the inverted region in heterokaryotypes, and this reduction extends for approximately 1 Mb from the breakpoints (ANDOLFATTO et al. 2001). In D. buzzatii, foldback elements clearly tend to accumulate in the inverted regions (Table 6). Notably, this accumulation is mainly due to the two inversions with a lower frequency in natural populations. The


majority of chromosomes carrying these inversions will be found in the heterozygous state, and thus the effect of reduced recombination would be greater than in inversions with higher frequency (EANES et al. 1992). Furthermore, the recombination reduction is especially pronounced at the inversion breakpoints (NAVARRO et al. 1997), which could explain the presence of multiple TE insertions in the breakpoints of the 2j and 2q7 inversions (CÁCERES et al. 2001; CASALS et al. 2003). However, a preference of certain TEs to insert inside each other could also be involved (CÁCERES et al. 2001). In general, no differences in the number of foldback element insertions between chromosomes have been found once the dot chromosome and the proximal regions were excluded from the analysis. In addition, no differences between autosomes and the X chromosome have been found either. The larger effect of deleterious TE insertions in the X chromosome due to its hemizygosity in males would predict a lower frequency of elements in this chromosome (MONTGOMERY et al. 1987). However, despite initial studies showing a smaller number of insertions in the X chromosome of D. melanogaster (BARTOLOMÉ et al. 2002), this observation was not confirmed later in a more complete analysis of the chromosomal distribution of transposable elements in this species (KAMINKER et al. 2002). Finally, chromosome 2 contains the vast majority of polymorphic and fixed inversions in the buzzatii species complex (WASSERMAN 1992). A higher number of transposable element insertions in this chromosome could provide some explanation for its high rate of fixation of paracentric inversions (GONZÁLEZ et al. 2002), but according to our results this chromosome does not contain an excess of foldback elements. The precise localization of a foldback element at one breakpoint of the 4s inversion of D. buzzatii (Figure 4c) and the two breakpoints of the 2a8 inversion of D. serido (Figure 4d), together with the implication of Galileo in the generation of D. buzzatii inversions 2j and 2q7 (CÁCERES et al. 1999; CASALS et al. 2003), suggests that these elements could play an important role in the genome evolution of the buzzatii species complex. Moreover, there is a strong association between foldback element insertion sites and the cytological location of the breakpoints of the chromosomal inversions of the buzzatii species complex (Table 7). This kind of association has been interpreted as evidence supporting the implication of TEs in the generation of the inversions in the virilis group of Drosophila (ZELENTSOVA et al. 1999; EVGEN’EV et al. 2000). However, although suggestive of a certain tendency of TEs to be inserted in regions harboring breakpoints, indirect evidence based on cytological information must be interpreted with caution in relation to the origin of the inversions. Moreover, as we have mentioned, the location of TEs at the inversion breakpoints may be due to secondary invasions. It has been suggested that the breakpoints of the two D. buzzatii inversions characterized so far are genetically unstable regions, probably induced by the presence of foldback elements (CÁCERES et al. 2001; CASALS et al. 2003), where TEs may insert and remain due to the weak effect of natural selection. Studies of the chromosomal distribution of other TEs found in these species may help to corroborate this point and to check whether foldback elements are able to induce hotspots for TE insertions at other genomic regions.


Acknowledgments We would like to thank J. M. Ranz for the initial in situ hybridization analyses, M. Puig for the information on the Hoppel transposase homology, and M. Badal for helpful comments on the structure of foldback elements. Work was supported by grant BMC2002-01708 from the Dirección General de Enseñanza Superior e Investigación Científica (Ministerio de Educación y Cultura, Spain) awarded to A. R. and a personal fellowship from the Fundação de Amparo à Pesquisa do Estado de São Paulo (Brazil) proc. 01/06373-6 to M.H.M.


LITERATURE CITED
ADÉ, J. and F. J. BELZILE, 1999 Hairpin elements, the first family of foldback transposons (FTs) in Arabidopsis thaliana. The Plant Journal 19: 591-597.
ANDOLFATTO, P., F. DEPAULIS, and A. NAVARRO, 2001 Inversion polymorphisms and nucleotide variability in Drosophila. Genet. Res. Camb. 77: 1-8.
ASHBURNER, M., 1989 Drosophila. A laboratory handbook. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
BACHTROG, D., 2003 Accumulation of Spock and Worf, two novel non-LTR retrotransposons, on the Neo-Y chromosome of Drosophila miranda. Mol. Biol. Evol. 20: 173-181.
BARKER, J. S. F., F. M. SENE, P. D. EAST, and M. A. Q. R. PEREIRA, 1985 Allozyme and chromosomal polymorphism of Drosophila buzzatii in Brazil and Argentina. Genetica 67: 161-170.
BARTOLOMÉ, C., X. MASIDE, and B. CHARLESWORTH, 2002 On the abundance and distribution of transposable elements in the genome of Drosophila melanogaster. Mol. Biol. Evol. 19: 926-937.
BERG, D. E. and M. M. HOWE, 1989 Mobile DNA. American Society for Microbiology, Washington, D.C.
BINGHAM, P. M. and Z. ZACHAR, 1989 Retrotransposons and the FB transposon from Drosophila melanogaster. Pp. 485-502 in Mobile DNA, D. E. Berg and M. M. Howe, eds. American Society for Microbiology, Washington, D.C.
BOWEN, N. J. and J. F. MCDONALD, 2001 Drosophila euchromatic LTR retrotransposons are much younger than the host species in which they reside. Genome Research 11: 1527-1540.


BRIERLEY, H. L. and S. POTTER, 1985 Distinct characteristics of loop sequences of two Drosophila foldback transposable elements. Nucleic Acids Res. 13: 485-500.
CÁCERES, M., J. M. RANZ, A. BARBADILLA, M. LONG, and A. RUIZ, 1999 Generation of a widespread Drosophila inversion by a transposable element. Science 285: 415-418.
CÁCERES, M., M. PUIG, and A. RUIZ, 2001 Molecular characterization of two natural hotspots in the Drosophila buzzatii genome induced by transposon insertions. Genome Res. 11: 1353-1364.
CAPY, P., C. BAZIN, D. HIGUET, and T. LANGIN, 1998 Dynamics and evolution of transposable elements. Springer-Verlag, Heidelberg, Germany.
CARMENA, M. and C. GONZÁLEZ, 1995 Transposable elements map in a conserved pattern of distribution extending from beta-heterochromatin to centromeres in Drosophila melanogaster. Chromosoma 103: 676-684.
CASALS, F., M. CÁCERES, and A. RUIZ, 2003 The Foldback-like transposon Galileo is involved in the generation of two different natural chromosomal inversions of Drosophila buzzatii. Mol. Biol. Evol. 20: 674-685.
CHARLESWORTH, B., 1996 Background selection and patterns of genetic diversity in Drosophila melanogaster. Genet. Res. Camb. 68: 131-149.
CHARLESWORTH, B., A. LAPID, and D. CANADA, 1992 The distribution of transposable elements within and between chromosomes in a population of Drosophila melanogaster. II. Inferences on the nature of selection against elements. Genet. Res. Camb. 60: 115-130.
CHARLESWORTH, B., P. SNIEGOWSKI, and W. STEPHAN, 1994 The evolutionary dynamics of repetitive DNA in eukaryotes. Nature 371: 215-220.


CHENG, C., S. TSUCHIMOTO, H. OHTSUBO, and E. OHTSUBO, 2000 Tnr-8, a foldback transposable element from rice. Genes Genet. Syst. 75: 327-333.
COLLINS, M. and G. M. RUBIN, 1983 High-frequency precise excision of the Drosophila foldback transposable element. Nature 303: 259-260.
COLLINS, M. and G. M. RUBIN, 1984 Structure of chromosomal rearrangements induced by the FB transposable element in Drosophila. Nature 308: 323-327.
COSTAS, J. and H. NAVEIRA, 2000 Evolutionary history of the human endogenous retrovirus family ERV9. Mol. Biol. Evol. 17: 320-330.
DIMITRI, P., 1997 Constitutive heterochromatin and transposable elements in Drosophila melanogaster. Genetica 100: 85-93.
DIMITRI, P., N. JUNAKOVIC, and B. ARCÀ, 2003 Colonization of heterochromatic genes by transposable elements in Drosophila.
DOOLITTLE, W. F. and F. C. SAPIENZA, 1980 Selfish genes, the phenotype paradigm, and genome evolution. Nature 284: 601-603.
EANES, W. F., C. WESLEY, and B. CHARLESWORTH, 1992 Accumulation of P elements in minority inversions in natural populations of Drosophila melanogaster. Genet. Res. Camb. 59: 1-9.
EVGEN’EV, M. B., H. ZELENTSOVA, H. POLUECTOVA, G. T. LYOZIN, and V. VELEIKODVORSKAJA et al., 2000 Mobile elements and chromosomal evolution in the virilis group of Drosophila. Proc. Natl. Acad. Sci. USA 97: 11337-11342.
FELSENSTEIN, J., 1989 PHYLIP: phylogeny inference package (Version 3.2). Cladistics 5: 164-166.
FINNEGAN, D. J., 1989 Eukaryotic transposable elements and genome evolution. Trends Genet. 5: 103-107.


GOLDMAN, A. S. H. and M. LICHTEN, 1996 The efficiency of meiotic recombination between dispersed sequences in Saccharomyces cerevisiae depends upon their chromosomal location. Genetics 144: 43-55.
GONZÁLEZ, J., J. M. RANZ, and A. RUIZ, 2002 Chromosomal elements evolve at different rates in the Drosophila genome. Genetics 161: 1137-1154.
GORDO, I. and B. CHARLESWORTH, 2001 Genetic linkage and molecular evolution. Curr. Biol. 11: R684-R686.
HABER, J. E., W. Y. LEUNG, R. H. BORTS, and M. LICHTEN, 1991 The frequency of meiotic recombination in yeast is independent of the number and position of homologous donor sequences: implications for chromosome pairing. Proc. Natl. Acad. Sci. USA 88: 1120-1124.
HANKELN, T. and E. R. SCHMIDT, 1990 New Foldback transposable element TFB1 found in histone genes of the midge Chironomus thummi. J. Mol. Biol. 215: 477-482.
HARDEN, N. and M. ASHBURNER, 1990 Characterization of the FOB-NOF transposable element of Drosophila melanogaster. Genetics 126: 387-400.
HASSON, E., C. RODRÍGUEZ, J. J. FANARA, H. NAVEIRA, O. A. REIG, and A. FONTDEVILA, 1995 The evolutionary history of Drosophila buzzatii. XXVI. Macrogeographic patterns of inversion polymorphism in New World populations. J. Evol. Biol. 8: 369-384.
HILL, W. G. and A. ROBERTSON, 1966 The effect of linkage on limits to artificial selection. Genet. Res. Camb. 8: 269-294.
HOFFMAN-LIEBERMANN, B., D. LIEBERMANN, L. H. KEDES, and S. N. COHEN, 1985 TU elements: a heterogeneous family of modularly structured eucaryotic transposons. Molecular and Cellular Biology 5: 991-1001.


INTERNATIONAL HUMAN GENOME SEQUENCING CONSORTIUM, 2001 Initial sequencing and analysis of the human genome. Nature 409: 860-921.
JORDAN, I. K. and J. F. MCDONALD, 1999 Tempo and mode of Ty element evolution in Saccharomyces cerevisiae. Genetics 151: 1341-1351.
JUNAKOVIC, N., A. TERRINONI, C. DI FRANCO, C. VIEIRA, and C. LOEVENBRUCK, 1998 Accumulation of transposable elements in the heterochromatin and on the Y chromosome of Drosophila simulans and Drosophila melanogaster. J. Mol. Evol. 46: 661-668.
KAMINKER, J. S., C. BERGMAN, B. KRONMILLER, J. CARLSON, and R. SVIRSKAS et al., 2002 The transposable elements of the Drosophila melanogaster euchromatin: a genomics perspective. Genome Biology 3: research0084.1-0084.20.
KAPITONOV, V. V. and J. JURKA, 1996 The age of Alu subfamilies. J. Mol. Evol. 42: 59-65.
KAZAZIAN, H. H., 2004 Mobile elements: Drivers of genome evolution. Science 303: 1626-1632.
KIDWELL, M. G. and D. LISCH, 1997 Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. USA 94: 7704-7711.
KUHN, G. C. S., A. RUIZ, M. A. R. ALVES, and F. M. SENE, 1996 The metaphase and polytene chromosomes of Drosophila seriema (repleta group; mulleri subgroup). Rev. Brasil. Genét. 19: 209-216.
LAAYOUNI, H., M. SANTOS, and A. FONTDEVILA, 2000 Toward a physical map of Drosophila buzzatii: use of randomly amplified polymorphic DNA polymorphisms and sequence-tagged site landmarks. Genetics 156: 1797-1816.


LABRADOR, M. and A. FONTDEVILA, 1994 High transposition rates of Osvaldo, a new Drosophila buzzatii retrotransposon. Mol. Gen. Genet. 245: 661-674.
LANGLEY, C. H., E. MONTGOMERY, R. HUDSON, N. KAPLAN, and B. CHARLESWORTH, 1988 On the role of unequal exchange in the containment of transposable element copy number. Genet. Res. 52: 223-235.
LEVIS, R., M. COLLINS, and G. M. RUBIN, 1982 FB elements are the common basis for the instability of the wDZL and wc Drosophila mutations. Cell 30: 551-565.
LI, W., 1997 Molecular Evolution. Sinauer, Sunderland, MA.
LIEBERMANN, D., B. HOFFMAN-LIEBERMANN, J. WEINTHAL, G. CHILDS, and R. MAXSON et al., 1983 An unusual transposon with long terminal inverted repeats in the sea urchin Strongylocentrotus purpuratus. Nature 306: 342-347.
LOBACHEV, K. S., B. M. SHOR, H. T. TRAN, W. TAYLOR, and J. D. KEEN et al., 1998 Factors affecting inverted repeat stimulation of recombination and deletion in Saccharomyces cerevisiae. Genetics 148: 1507-1524.
MASIDE, X., C. BARTOLOMÉ, S. ASSIMACOPOULOS, and B. CHARLESWORTH, 2001 Rates of movement and distribution of transposable elements in Drosophila melanogaster: in situ hybridization vs Southern blotting data. Genet. Res. Camb. 78: 121-136.
MCDONALD, J. F., 1993 Evolution and consequences of transposable elements. Curr. Opin. Genet. Dev. 3: 855-864.
MONTGOMERY, E., B. CHARLESWORTH, and C. H. LANGLEY, 1987 A test for the role of natural selection in the stabilization of transposable element copy number in a population of Drosophila melanogaster. Genet. Res. Camb. 49: 31-41.


MONTGOMERY, E., S.-M. HUANG, C. H. LANGLEY, and B. H. JUDD, 1991 Chromosome rearrangement by ectopic recombination in Drosophila melanogaster: Genome structure and evolution. Genetics 129: 1085-1098.
MOUSE GENOME SEQUENCING CONSORTIUM, 2002 Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562.
NAVARRO, A., E. BETRÁN, A. BARBADILLA, and A. RUIZ, 1997 Recombination and gene flux caused by gene conversion and crossing over in inversion heterokaryotypes. Genetics 146: 695-709.
NAVEIRA, H. and A. FONTDEVILA, 1985 The evolutionary history of Drosophila buzzatii. IX. High frequencies of new chromosome rearrangements induced by introgressive hybridization. Chromosoma 91: 87-94.
ORGEL, L. E. and F. H. C. CRICK, 1980 Selfish DNA: the ultimate parasite. Nature 284: 604-607.
PETROV, D. A., J. L. SCHUTZMAN, D. L. HARTL, and E. R. LOZOVSKAYA, 1995 Diverse transposable elements are mobilized in hybrid dysgenesis in Drosophila virilis. Proc. Natl. Acad. Sci. USA 92: 8050-8054.
PIMPINELLI, S., M. BERLOCO, L. FANTI, P. DIMITRI, and S. BONACCORSI et al., 1995 Transposable elements are stable structural components of Drosophila melanogaster heterochromatin. Proc. Natl. Acad. Sci. USA 92: 3804-3808.
PLASTERK, R. H. A., 1995 Mechanisms of DNA transposition. In Mobile Genetic Elements. Ed. D. J. Sherratt. Oxford University Press, Oxford.
POTTER, S. S., 1982 DNA sequence of a foldback transposable element in Drosophila. Nature 297: 201-204.
POTTER, S. S., M. TRUETT, M. PHILLIPS, and A. MAHER, 1980 Eukaryotic transposable genetic elements with inverted terminal repeats. Cell 20: 639-647.


PUIG, M., M. CÁCERES, and A. RUIZ, 2004 Silencing of a gene adjacent to the breakpoint of a widespread Drosophila inversion by a transposon-induced antisense RNA. Proc. Natl. Acad. Sci. USA 101: 9013-9018.
REBATCHOUK, D. and J. O. NARITA, 1997 Foldback transposable elements in plants. Plant Molecular Biology 34: 831-835.
REISS, D., H. QUESNEVILLE, D. NOUAUD, O. ANDRIEU, and D. ANXOLABEHERE, 2003 Hoppel, a P-like element without introns: a P-element ancestral structure or a retrotranscription derivative? Mol. Biol. Evol. 20: 869-879.
RODRÍGUEZ-TRELLES, F., L. ALARCÓN, and A. FONTDEVILA, 2000 Molecular evolution of the buzzatii complex (Drosophila repleta group): a maximum-likelihood approach. Mol. Biol. Evol. 17: 1112-1122.
ROZAS, J., J. C. SÁNCHEZ-DELBARRIO, X. MESSEGUER, and R. ROZAS, 2003 DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19: 2496-2497.
RUIZ, A. and A. FONTDEVILA, 1981 Ecología y evolución del subgrupo mulleri de Drosophila en Venezuela y Colombia. Acta Científica Venezolana 32: 338-345.
RUIZ, A. and M. WASSERMAN, 1993 Evolutionary cytogenetics of the Drosophila buzzatii species complex. Heredity 70: 582-596.
RUIZ, A., A. FONTDEVILA, and M. WASSERMAN, 1982 The evolutionary history of Drosophila buzzatii. III. Cytogenetic relationships between two sibling species of the buzzatii cluster. Genetics 101: 503-518.
RUIZ, A., H. NAVEIRA, and A. FONTDEVILA, 1984 La historia evolutiva de Drosophila buzzatii. IV. Aspectos citogenéticos de su polimorfismo cromosómico. Genética Ibérica 36: 13-35.


RUIZ, A., A. M. CANSIAN, G. C. S. KUHN, M. A. R. ALVES, and F. M. SENE, 2000 The Drosophila serido speciation puzzle: putting new pieces together. Genetica 108: 217-227.
RUSSO, C. A. M., N. TAKEZAKI, and M. NEI, 1995 Molecular phylogeny and divergence times of drosophilid species. Mol. Biol. Evol. 4: 406-425.
SAMBROOK, J., E. F. FRITSCH, and T. MANIATIS, 1989 Molecular cloning, a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
SANMIGUEL, P., B. S. GAUT, A. TIKHONOV, Y. NAKAJIMA, and J. L. BENNETZEN, 1998 The paleontology of intergene retrotransposons of maize. Nature Genetics 20: 43-45.
SILBER, J., C. BAZIN, F. LEMEUNIER, S. AULARD, and M. VOLOVITCH, 1989 Distribution and conservation of the foldback transposable element in Drosophila. J. Mol. Evol. 28: 220-224.
SIMMEN, M. W. and A. BIRD, 2000 Sequence analysis of transposable elements in the sea squirt, Ciona intestinalis. Mol. Biol. Evol. 17: 1685-1694.
SMITH, P. A. and V. G. CORCES, 1991 Drosophila transposable elements: Mechanisms of mutagenesis and interactions with the host genome. Advances in Genetics 29: 229-300.
STEINEMANN, M. and S. STEINEMANN, 1991 Preferential Y-chromosome location of TRIM, a novel transposable element of Drosophila miranda, obscura group. Chromosoma 101: 169-179.
STEINEMANN, M. and S. STEINEMANN, 1997 The enigma of Y chromosome degeneration: TRAM, a novel retrotransposon is preferentially located on the Neo-Y chromosome of Drosophila miranda. Genetics 145: 261-266.


TEMPLETON, N. S. and S. S. POTTER, 1989 Complete foldback transposable elements encode a novel protein found in Drosophila melanogaster. EMBO J. 8: 1887-1894.

THOMPSON, J. D., D. G. HIGGINS, and T. J. GIBSON, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673-4680.

TOSI, D. and F. M. SENE, 1989 Further studies on chromosomal variability in the complex taxon Drosophila serido (Diptera, Drosophilidae). Rev. Brasil. Genet. 12: 729-745.

TRUETT, M. A., R. S. JONES, and S. S. POTTER, 1981 Unusual structure of the FB family of transposable elements in Drosophila. Cell 24: 753-763.

WASSERMAN, M., 1992 Cytological evolution of the Drosophila repleta species group. In: Drosophila inversion polymorphism, pp. 455-552. Edited by C. B. Krimbas and J. R. Powell. CRC Press, Boca Raton, FL.

WASSERMAN, M. and H. R. KOEPFER, 1979 Cytogenetics of South American Drosophila mulleri complex: the martensis cluster. More sharing of inversions. Genetics 93: 935-946.

WASSERMAN, M. and R. H. RICHARDSON, 1987 Evolution of Brazilian Drosophila mulleri complex species. J. Hered. 78: 282-286.

WESSLER, S. R., T. E. BUREAU, and S. E. WHITE, 1995 LTR-retrotransposons and MITEs: important players in the evolution of plant genomes. Curr. Opin. Genet. Dev. 5: 814-821.

WINDSOR, A. J. and C. S. WADDELL, 2000 FARE, a new family of Foldback transposons in Arabidopsis. Genetics 156: 1983-1995.


YUAN, J., M. FINNEY, N. TSUNG, and R. HORVITZ, 1991 Tc4 Caenorhabditis elegans transposable element with an unusual fold-back structure. Proc. Natl. Acad. Sci. USA 88: 3334-3338.

ZELENTSOVA, H., H. POLUECTOVA, L. MNJOIAN, G. LYOZIN, and V. VELEIKODVORSKAJA et al., 1999 Distribution and evolution of mobile elements in the virilis species group of Drosophila. Chromosoma 108: 443-56.

ZHOU, Z-H., E. AKGÜN, and M. JASIN, 2001 Repeat expansion by homologous recombination in the mouse germ line at palindromic sequences. Proc. Natl. Acad. Sci. USA 98: 8326-8333.


Table 1 Drosophila lines used in this study.

| Line | Species | Geographic origin | Chromosomal arrangement |
|---|---|---|---|
| st-1 | D. buzzatii | Carboneras (Spain) | Standard |
| st-2 | D. buzzatii | Carboneras (Spain) | Standard |
| st-3 | D. buzzatii | Vipos (Argentina) | Standard |
| st-4 | D. buzzatii | Guaritas (Brazil) | Standard |
| st-7 | D. buzzatii | Termas de Río Hondo (Argentina) | Standard |
| st-10 | D. buzzatii | Termas de Río Hondo (Argentina) | Standard |
| j-1 | D. buzzatii | Carboneras (Spain) | 2j |
| j-2 | D. buzzatii | Carboneras (Spain) | 2j |
| j-7 | D. buzzatii | Caldetes (Spain) | 2j |
| j-8 | D. buzzatii | San Luis (Argentina) | 2j |
| j-9 | D. buzzatii | Quilmes (Argentina) | 2j |
| j-12 | D. buzzatii | Guaritas (Brazil) | 2j |
| j-13 | D. buzzatii | Guaritas (Brazil) | 2j |
| j-19 | D. buzzatii | Ticucho (Argentina) | 2j |
| j-23 | D. buzzatii | San Luis (Argentina) | 2j, 5I |
| j-24 | D. buzzatii | San Luis (Argentina) | 2j, t (5,1) |
| jq7-1 | D. buzzatii | Carboneras (Spain) | 2jq7 |
| jq7-4 | D. buzzatii | Otamendi (Argentina) | 2jq7 |
| jz3-2 | D. buzzatii | Carboneras (Spain) | 2jz3 |
| jz3-6 | D. buzzatii | Carboneras (Spain) | 2jz3 |
| jz3-7 | D. buzzatii | Pingado, Canary Islands (Spain) | 2jz3 |
| y3-1 | D. buzzatii | Pingado, Canary Islands (Spain) | 2y3 |
| s-1 | D. buzzatii | Pingado, Canary Islands (Spain) | 2y3, 4st/s |
| H84 | D. antonietae | Serrana (Brazil) | 2st/y8 |
| J79 | D. gouveai | Ibotirama (Brazil) | Standard |
| KO-2 | D. koepferae | Sierra San Luis (Argentina) | Standard |
| MA-4 | D. martensis | Guaca (Venezuela) | Standard |
| 1371.5 | D. mulleri | Bowling Green Center, Ohio (USA) | Standard |
| 1611.2 | D. repleta | Bowling Green Center, Ohio (USA) | Standard |
| SD-12 | D. serido | Rio Paraguaçu (Brazil) | 2a8/b8 |
| D62C2B | D. seriema | Mucugê (Brazil) | Standard |
| 1451.0 | D. stalkeri | Bowling Green Center, Ohio (USA) | Standard |
| SM-3 | D. starmeri | Curaçao (The Netherlands Antilles) | Standard |
| UN-2 | D. uniseta | Guaca (Venezuela) | Standard |
| VZ-12 | D. venezolana | Los Frailes (Venezuela) | Standard |

42

Table 2 Summary of the main characteristics of the different copies of the foldback elements of Drosophila buzzatii.

| Element | Size (bp) | IRs (bp) a | Target sequence b | Origin | Reference |
|---|---|---|---|---|---|
| Galileo-1 | 1,589 | 228/443 | GTAGTAG | 2j inversion proximal breakpoint | Cáceres et al. 1999 |
| Galileo-2 | 392 | 106 | TTTGTAT | 2j inversion distal breakpoint | Cáceres et al. 1999 |
| Galileo-3 | 2,204 | 683/684 | GAAGAAC | 2j inversion proximal breakpoint | Cáceres et al. 2001 |
| Galileo-4 | 1,948 | 782/917 | GTGATAC | 2j inversion distal breakpoint | Cáceres et al. 2001 |
| Galileo-5 | 794 | - | GTAATAT | Library screening | This work |
| Galileo-6 | 1,531 | - | GTAGTAC | Library screening | This work |
| Galileo-7 | 962 | 163 | TTCTAGC | PCR amplification (j-8 line) | This work |
| Galileo-8 | 691 | - | - | PCR amplification (st-3 line) | This work |
| Galileo-9 | 186 | - | GTGATAC | Library screening | This work |
| Galileo-10 | 1,139 | 337/331 | GTTATAC | 2q7 inversion distal breakpoint | Casals et al. 2003 |
| Galileo-11 | 20 | 8 | CTTGTTC | 2q7 inversion proximal breakpoint | Casals et al. 2003 |
| Galileo-12 | 2,304 | 1,115/959 | GTATTAT/GTGGTAT | 2q7 inversion proximal breakpoint | Casals et al. 2003 |
| Galileo-13 | 609 | - | GCTGAAC | Library screening | This work |
| Galileo-14 | 1,298 | 375/393 | GTATAAG | Library screening | This work |
| Kepler-1 | 722 | 150 | GCCATAT/CATATAT | 2j inversion proximal breakpoint | Cáceres et al. 2001 |
| Kepler-2 | 743 | - | GTAGTAT | 2j inversion proximal breakpoint | Cáceres et al. 2001 |
| Kepler-3 | 692 | 40/42 | GTTATAC/GTAGTAG | 2j inversion distal breakpoint | Cáceres et al. 2001 |
| Kepler-4 | 564 | 28 | - | PCR amplification (st-3 line) | This work |
| Kepler-5 | 930 | 346/330 | ATATGAT/CCATACA | 2q7 inversion proximal breakpoint | Casals et al. 2003 |
| Kepler-6 | 381 | - | GTTTTAG | Library screening | This work |
| Newton-1 | 1,510 | 566 | GTAGTAT | 2j inversion proximal breakpoint | Cáceres et al. 2001 |
| Newton-2 | 1,512 | 575/574 | GTGATAC | 2j inversion distal breakpoint | Cáceres et al. 2001 |
| Newton-3 | 567 | - | - | PCR amplification (st-3 line) | This work |

a When different, the sizes of the left and right inverted terminal repeats (IRs) are indicated.

b We have considered both orientations of the sequences adjacent to the different Galileo, Kepler and Newton copies and represent here the one that best fits the consensus sequence. When two identical target site duplications have been found at each side of the element, their sequence is shown in boldface; if they differ, both sequences are indicated.

44

Table 3 Number of observed insertions of foldback elements in lines of D. buzzatii and other Drosophila species by Southern blot and in situ hybridization analysis.

| D. buzzatii line | Southern blot | In situ hybridization a |
|---|---|---|
| st-1 | 29 | 50 |
| st-3 | 28 | ND |
| st-7 | 22 | ND |
| st-10 | 27 | ND |
| j-2 | 29 | 46 |
| j-9 | 29 | ND |
| j-19 | 29 | ND |
| j-23 | 28 | 64 |
| j-24 | 27 | 62 |
| jq7-1 | 28 | ND |
| jq7-4 | 29 | 62 |
| jz3-6 | 26 | 62 |
| jz3-7 | 25 | 52 |
| y3-1 | 21 | 54 |
| s-1 | 24 | 53 |
| Mean | 26.73 | 56.11 |

| Taxon | Species | Southern blot | In situ hybridization b |
|---|---|---|---|
| buzzatii cluster | D. antonietae | 36 | 93 |
| buzzatii cluster | D. buzzatii c | 38 | ND |
| buzzatii cluster | D. gouveai | 35 | 89 |
| buzzatii cluster | D. koepferae | 34 | 72 |
| buzzatii cluster | D. serido | > 40 | 86 |
| buzzatii cluster | D. seriema | 37 | 83 |
| martensis cluster | D. martensis | 13 | prox |
| martensis cluster | D. starmeri | 21 | prox |
| martensis cluster | D. uniseta | 20 | prox |
| martensis cluster | D. venezolana | ND | prox |
| stalkeri cluster | D. stalkeri | 10 | prox |
| mulleri subgroup | D. mulleri | 0 | ND |
| repleta subgroup | D. repleta | 0 | ND |

prox = in situ hybridization signals distinguishable only in chromosome 6 and centromeric regions of chromosomes. ND = not determined.

a Intraspecific hybridizations.

b Interspecific hybridizations.

c st-1 line.


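Table 4 Observed and expected distribution of foldback element insertions in the chromosomes of six species of the buzzatii cluster.

| Species | Value | Chr. X | Chr. 2 | Chr. 3 | Chr. 4 | Chr. 5 | Chr. 6 | P c |
|---|---|---|---|---|---|---|---|---|
| D. buzzatii | Obs. | 65 | 91 | 110 | 89 | 55 | 95 | |
| D. buzzatii | Exp. a | 95.95 | 117.16 | 98.48 | 91.41 | 94.94 | 7.58 | *** |
| D. buzzatii | Exp. b | 79.13 | 96.76 | 81.18 | 75.44 | 77.90 | - | ** |
| D. antonietae | Obs. | 24 | 17 | 22 | 12 | 13 | 5 | |
| D. antonietae | Exp. a | 17.67 | 21.58 | 18.14 | 16.83 | 17.48 | 1.40 | ** |
| D. antonietae | Exp. b | 16.98 | 20.77 | 17.42 | 16.19 | 16.72 | - | ns |
| D. gouveai | Obs. | 29 | 14 | 22 | 13 | 7 | 4 | |
| D. gouveai | Exp. a | 16.91 | 20.65 | 17.36 | 16.11 | 16.73 | 1.34 | *** |
| D. gouveai | Exp. b | 16.98 | 20.77 | 17.42 | 16.19 | 16.72 | - | *** |
| D. koepferae | Obs. | 23 | 14 | 15 | 9 | 6 | 5 | |
| D. koepferae | Exp. a | 13.68 | 16.70 | 14.04 | 13.03 | 13.54 | 1.08 | *** |
| D. koepferae | Exp. b | 12.93 | 15.81 | 13.27 | 12.33 | 12.73 | - | * |
| D. serido | Obs. | 25 | 15 | 20 | 13 | 10 | 3 | |
| D. serido | Exp. a | 16.34 | 19.95 | 16.77 | 15.57 | 16.17 | 1.29 | * |
| D. serido | Exp. b | 16.02 | 19.59 | 16.43 | 15.27 | 15.77 | - | ns |
| D. seriema | Obs. | 23 | 17 | 14 | 15 | 11 | 3 | |
| D. seriema | Exp. a | 15.77 | 19.26 | 16.19 | 15.02 | 15.60 | 1.25 | ns |
| D. seriema | Exp. b | 15.44 | 18.88 | 15.84 | 14.72 | 15.20 | - | ns |

a Expected values according to a random distribution in which the number of insertions per chromosome is proportional to chromosome size.

b Expected values excluding chromosome 6.

c χ² test P-values. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.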


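Table 5 Observed and expected distribution of foldback element insertions in three different regions of the chromosomes of six species of the buzzatii cluster.

| Species / chromosome | Value | Distal | Central | Proximal | P-value a |
|---|---|---|---|---|---|
| D. buzzatii Chr X | Obs. | 1 | 24 | 40 | *** |
| D. buzzatii Chr X | Exp. | 6.5 | 52 | 6.5 | |
| D. buzzatii Chr 2 | Obs. | 2 | 36 | 53 | *** |
| D. buzzatii Chr 2 | Exp. | 9.1 | 72.8 | 9.1 | |
| D. buzzatii Chr 3 | Obs. | 0 | 22 | 88 | *** |
| D. buzzatii Chr 3 | Exp. | 11 | 88 | 11 | |
| D. buzzatii Chr 4 | Obs. | 5 | 11 | 73 | *** |
| D. buzzatii Chr 4 | Exp. | 8.9 | 71.2 | 8.9 | |
| D. buzzatii Chr 5 | Obs. | 0 | 4 | 51 | *** |
| D. buzzatii Chr 5 | Exp. | 5.5 | 44 | 5.5 | |
| D. buzzatii Total | Obs. | 8 | 97 | 305 | |
| D. buzzatii Total | Exp. | 41.0 | 328.0 | 41.0 | |
| D. antonietae | Obs. | 6 | 28 | 54 | *** |
| D. antonietae | Exp. | 8.80 | 70.40 | 8.80 | |
| D. gouveai | Obs. | 3 | 16 | 66 | *** |
| D. gouveai | Exp. | 8.50 | 68 | 8.50 | |
| D. koepferae | Obs. | 7 | 13 | 47 | *** |
| D. koepferae | Exp. | 6.70 | 53.60 | 6.70 | |
| D. serido | Obs. | 2 | 24 | 57 | *** |
| D. serido | Exp. | 8.30 | 66.40 | 8.30 | |
| D. seriema | Obs. | 2 | 28 | 50 | *** |
| D. seriema | Exp. | 8 | 64 | 8 | |

a χ² test P-value. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.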
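As an illustration of how the χ² comparison in Table 5 can be reproduced, the following minimal Python sketch recomputes the statistic for D. buzzatii chromosome X (observed 1, 24 and 40 insertions against expected 6.5, 52 and 6.5). The function names are hypothetical, and the use of 2 degrees of freedom (three regions, expected counts from fixed proportions) is an assumption, since the table does not state it. With 2 degrees of freedom the upper-tail probability has the closed form exp(-χ²/2), so no external library is needed.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_two_df(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For 2 df the survival function is exactly exp(-x / 2).
    Assumption: 2 df is appropriate for three regions with fixed proportions.
    """
    return math.exp(-statistic / 2.0)


# D. buzzatii chromosome X row of Table 5: distal, central, proximal.
observed_x = [1, 24, 40]
expected_x = [6.5, 52.0, 6.5]

statistic_x = chi_square_statistic(observed_x, expected_x)
p_x = p_value_two_df(statistic_x)

print(f"chi-square = {statistic_x:.2f}, P = {p_x:.3g}")
```

Under these assumptions the resulting P-value falls far below 0.001, in agreement with the *** reported for this row.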


Table 6 Observed and expected distribution of foldback element insertions inside and outside the inverted region of five natural chromosomal inversions and one inversion induced by introgressive hybridization of D. buzzatii.

| Inversion | Geographic distribution a | Frequency a | Galileo signals, inv. region b: Obs. | Galileo signals, inv. region b: Exp. | Galileo signals, non-inverted: Obs. | Galileo signals, non-inverted: Exp. | P value c |
|---|---|---|---|---|---|---|---|
| 2j | cosmopolitan | 0.192 – 1 | 6 | 4.44 | 6 | 7.56 | ns |
| 2z3 | cosmopolitan | 0 – 0.315 | 4 | 2.15 | 1 | 2.85 | ns |
| 2q7 | mod. frequent | 0 – 0.126 | 8 | 2.08 | 0 | 5.92 | *** |
| 2y3 | endemic | 0 – 0.040 | 14 | 6.16 | 0 | 7.84 | *** |
| 4s | mod. frequent | 0 – 0.943 | 3 | 2.15 | 2 | 2.85 | ns |
| 5I d | - | - | 3 | 2.1 | 2 | 2.9 | ns |
| Total e | | | 35 | 16.98 | 9 | 27.02 | *** |

a Data from RUIZ et al. 1984 and HASSON et al. 1995.

b The inverted region includes the inversion plus the additional chromosomal bands located outside of each of the breakpoints.

c χ² test P-value. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.

d Inversion induced by introgressive hybridization (NAVEIRA and FONTDEVILA 1985).

e Including only the five natural chromosomal inversions.


Table 7 Association between foldback insertion sites and chromosomal inversion breakpoints.

| Inversions | Co-localization of TE insertions and breakpoints a: Obs. | Exp. b | P-value c |
|---|---|---|---|
| D. buzzatii polymorphic inversions | 4 | 1.81 | ns |
| D. buzzatii introgressive hybridization inversions d | 2 | 2.15 | ns |
| D. koepferae polymorphic inversions | 1 | 0.35 | ns |
| D. serido polymorphic inversions | 2 | 0.43 | ns |
| D. seriema polymorphic inversions | 0 | 0.05 | ns |
| buzzatii complex species inversions | 33 | 19.76 | *** |

a For each species, only the breakpoints of the inversions and the in situ hybridization signals observed in this species are considered. For the buzzatii complex, the breakpoints of the fixed and polymorphic inversions and the insertions found in all the species are considered (Table S1, supplemental material at http://www.genetics.org/supplemental/).

b The expected values are obtained according to the formula: number of breakpoints × number of foldback insertions / number of chromosomal bands. The total number of bands (excluding chromosome 6 and the proximal regions) is 1094.

c P-value was calculated as described in ZELENTSOVA et al. 1999. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.

d NAVEIRA and FONTDEVILA 1985.
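The formula in footnote b of Table 7 can be sketched in a few lines of Python. The function name and the input counts below are hypothetical and serve only to show the arithmetic; only the total of 1094 bands comes from the footnote. The per-species numbers of breakpoints and insertions actually used for the table are not reproduced here.

```python
TOTAL_BANDS = 1094  # bands excluding chromosome 6 and proximal regions (footnote b)


def expected_colocalizations(n_breakpoints, n_insertions, n_bands=TOTAL_BANDS):
    """Expected number of bands carrying both a breakpoint and an insertion.

    Implements: breakpoints x insertions / chromosomal bands, which assumes
    that breakpoints and insertions fall on bands independently and uniformly.
    """
    if n_bands <= 0:
        raise ValueError("n_bands must be positive")
    return n_breakpoints * n_insertions / n_bands


# Hypothetical illustrative counts, not taken from the paper.
example_expected = expected_colocalizations(n_breakpoints=20, n_insertions=100)
print(f"expected co-localizations: {example_expected:.2f}")
```

The expectation grows linearly with both counts, which is why the pooled buzzatii complex row, with many more breakpoints and insertions, has a much larger expected value than any single species.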


Table 8 Foldback elements molecularly characterized in eukaryotic organisms.

| Element (copy) a | Organism | Size (bp) | IRs (bp) b | Type | TS (bp) | Ref. |
|---|---|---|---|---|---|---|
| Foldback (FB4) | Drosophila melanogaster | 2,437 c | 1,018/1,329 | 4 | 9 | POTTER 1982 |
| TU (TU1) | Strongylocentrotus purpuratus | 2,880 | ~840 | 1 | 8 | LIEBERMANN et al. 1983 |
| TFB1 (TFB1) | Chironomus thummi | 1,048 | 162/140 | 2 | 10 | HANKELN and SCHMIDT 1990 |
| Tc4 (Tc4) | Caenorhabditis elegans | 1,605 | 774 | 1 | 5 | YUAN et al. 1991 |
| SoFT1 (SoFT1) | Lycopersicon esculentum | 697 | 302/369 | 1 | 10 | REBATCHOUK and NARITA 1997 |
| SoFT2 (SoFT2) | Lycopersicon esculentum | 1,043 | 360/364 | 1 | - | REBATCHOUK and NARITA 1997 |
| Hairpin (Hairpin-3) | Arabidopsis thaliana | 245 | 116/123 | 3 | 5 | ADÉ and BELZILE 1999 |
| FARE1 (FARE1.1) | Arabidopsis thaliana | 1,122 | 561/551 | 2 | 9 | WINDSOR and WADDELL 2000 |
| FARE2 (FARE2.11) | Arabidopsis thaliana | ~16,700 | ~400 | 2 | 9 | WINDSOR and WADDELL 2000 |
| Unnamed | Ciona intestinalis | 2,444 | 748/727 | 5 | 9 | SIMMEN and BIRD 2000 |
| Tnr8 (Tnr8-1) | Oryza glaberrima | 418 | 189 | 4 | 9 | CHENG et al. 2000 |
| Galileo (Galileo-12) | Drosophila buzzatii | 2,304 | 1,115/949 | 5 | 7 | CASALS et al. 2003 |
| Kepler (Kepler-5) | Drosophila buzzatii | 930 | 346/330 | 6 | 7 | CASALS et al. 2003 |
| Newton (Newton-2) | Drosophila buzzatii | 1,512 | 575/574 | 6 | 7 | CÁCERES et al. 2001 |

a When data from more than one copy of the element are available, the copy from which data are taken is indicated in brackets. When more than one copy is fully characterized, we have chosen the largest one that is representative of the rest.

b When different, the two values of the IR length are shown.

c Excluding the 1,652 bp of the HB element inserted in FB4 (BRIERLEY and POTTER 1985).


Table 9 Nucleotide variation and age estimate for the three foldback elements of D. buzzatii.

| Element | n | m | S | π a | Age (myr) |
|---|---|---|---|---|---|
| Galileo | 14 | 2,442 | 328 | 0.0606 | 3.79 |
| Kepler | 6 | 815 | 69 | 0.0494 | 3.09 |
| Newton | 3 | 1,502 | 115 | 0.0762 | 4.76 |
| All (common sequences) b | 23 | 508 | 179 | 0.0881 | 5.51 |

n = number of sequences of each element; m = total number of nucleotides being compared; S = number of segregating sites; π = nucleotide diversity.

a Nucleotide diversity (π) was estimated excluding gaps only in pairwise comparisons.

b Common sequences correspond to those homologous in the three types of elements shown in Figure 2.
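The ages in Table 9 scale linearly with nucleotide diversity: in every row the ratio π / age is close to 0.016 per site per million years. The sketch below uses that ratio to convert π into an age. The constant is inferred here from the table values and is an assumption, not a rate quoted from this section, and the names are hypothetical.

```python
# Assumed scaling constant (per site per myr), inferred from the ratio
# pi / age in Table 9 rather than stated in this section.
DIVERGENCE_PER_MYR = 0.016

# Nucleotide diversity (pi) and reported age (myr) from Table 9.
TABLE_9 = {
    "Galileo": (0.0606, 3.79),
    "Kepler": (0.0494, 3.09),
    "Newton": (0.0762, 4.76),
    "All (common sequences)": (0.0881, 5.51),
}


def age_from_diversity(pi, rate=DIVERGENCE_PER_MYR):
    """Convert nucleotide diversity into an age in millions of years."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return pi / rate


for element, (pi, reported_age) in TABLE_9.items():
    estimate = age_from_diversity(pi)
    print(f"{element}: estimated {estimate:.2f} myr, reported {reported_age} myr")
```

Each estimate agrees with the reported age to within rounding, which suggests a single conversion constant was applied to all four rows.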


Figure legends

Figure 1. Schematic representation of the structure of foldback elements and their different domains. White triangles represent the target site duplications. IR, inverted repeat; FD, flanking domain; OD, outer domain; ID, inner domain; M, middle domain. L and R refer to the left and right sides of the elements, respectively. Arrows below the element include the complete IRs. Based on REBATCHOUK and NARITA (1997) and SIMMEN and BIRD (2000).

Figure 2. Schematic representation of the twenty-three foldback-like elements of D. buzzatii characterized in this study. (A) Galileo elements. (B) Kepler elements. (C) Newton elements. The different copies of each element are represented with respect to the longest copy found (top). The designations and colors of the structural modules of each element are the same as in Figure 1. The dashed and numbered regions correspond to different copies of the internal tandem repeats. Arrows below the elements indicate the regions repeated at each side of the element. Small arrows above Galileo-12 represent primers used for PCR amplification. Small rearrangements inside some elements are indicated by dup (duplication) and inv (inversion). Only insertions/deletions larger than 20 bp are indicated. The dashed rectangle indicates the location of sequences with similarity to a transposase. Sequences F1, F2, F3 and F4 are homologous in the three elements.

Figure 3. Neighbor-joining tree of the 251 bp long homologous regions of the IRs of the Galileo, Kepler and Newton elements described in this work. When available, a and b designate, respectively, the left and right half of the element according to Figure 2. Only the IRs that contained the complete sequences were included in the analysis. Bootstrap percent values at nodes were based on 1,000 bootstrap replicates.

Figure 4. In situ hybridization of a Galileo-12 probe to the salivary gland chromosomes of D. buzzatii lines jz3-7 (a), jq7-4 (b), s-1 (c), D. serido (d), D. antonietae (e), and D. stalkeri (f). Arrows indicate the position of a hybridization signal at the breakpoints of the 2j, 2q7 and 4s inversions of D. buzzatii (a, b and c, respectively), and the 2a8 inversion of D. serido (d).


Table S1. Chromosomal inversions described in the buzzatii species complex.

| Inversion a | Species | Breakpoints | Reference |
|---|---|---|---|
| Xj | - | E2d-G1a | (1) |
| Xr | - | D3a-F3a | (1) |
| Xs | D. starmeri | F1e-F3d | (1) |
| Xq | D. starmeri | D3d-G1a | (1) |
| Xy | D. starmeri | F1a-E2d | (2) |
| 2m | - | D3d-F2a | (3) |
| 2n | - | F2a-G1g | (3) |
| 2l | - | C7e-D5a | (3) |
| 2e2 | - | F6a-F3a | (3) |
| 2u6 | - | D1g-F2a | (3) |
| 2w7 | - | D1g-G1a | (3) |
| 2x7 | - | D5c-F2a | (3) |
| 2y7 | - | G1a-D1g | (3) |
| 2z7 | - | F2a-F6a | (3) |
| 2e8 | - | B2a-C6a | (3) |
| 2j9 | - | D4a-F4d | (3) |
| 2f8 | D. borborema | C7d-E6a | (4) |
| 2g8 | D. borborema | C6c-E2h | (4) |
| 2h8 | D. borborema | C6a-C1h | (4) |
| 2j | D. buzzatii | C6b-E5a | (5) |
| 2y3 | D. buzzatii | D1a-E3a | (5) |
| 2z3 | D. buzzatii | E4b/c-F1f | (5) (6) |
| 2q7 | D. buzzatii | D3c-G2f | (5) (7) |
| 2c9 | D. buzzatii | C1a-D1a | (5) |
| 2d9 | D. buzzatii | D5b-G3d | (5) |
| 2e9 | D. buzzatii | C3c-C6h | (5) |
| 2f9 | D. buzzatii | C6h-G2f | (5) |
| 2g9 | D. buzzatii | B3e-C3a | (5) |
| 2h9 | D. buzzatii | E1a-E2e | (5) |
| 2i9 | D. buzzatii | F3c-G2f | (5) |
| 2r9 | D. buzzatii | C5f-D2a | (5) |
| 2s9 | D. buzzatii | E1a-G2a | (5) |
| 2k9 | D. koepferae | B1d-D4a | (8) |
| 2l9 | D. koepferae | A1a-B3e | (8) |
| 2m9 | D. koepferae | C3b-E1e | (8) |
| 2n9 | D. koepferae | E5c-F4a | (8) |
| 2u9 | D. koepferae | F6a-C3b | (3) |
| 2v9 | D. koepferae | F1c-F4g | (3) |
| 2w9 | D. koepferae | A4d-E5a | (3) |
| 2x9 | D. koepferae | E5a-F6a | (3) |
| 2g2 | D. martensis | B4c-D5a | (1) |
| 2o9 | D. martensis | B4c-C4e | (2) |
| 2p9 | D. martensis | C6a-D3d | (2) |
| 2p8 | D. richardsoni | F2a-G2b | (3) |
| 2q8 | D. richardsoni | E5e-D3e | (3) |
| 2a8 | D. serido | D1c-F3a | (4) |
| 2b8 | D. serido | B4a-F2a | (4) |
| 2c8 | D. serido | C1a-A4a | (4) |
| 2d8 | D. serido | F2a-E4a | (4) |
| 2w8 | D. serido | E4g-D3d | (9) |
| 2x8 | D. serido | B2a-C6e | (9) |
| 2y8 (2"e") | D. serido | E1g-F6a | (9) (10) |
| 2z8 (2"d") | D. serido | D3d-G1g | (9) (10) |
| 2y9 | D. serido | C4f-E1d | (9) |
| 2"a" | D. serido (IV) | C2c-E1d | (10) |
| 2f2 | D. starmeri | B3a-C7e | (3) |
| 2t6 | D. starmeri | D1g-D5a | (3) |
| 2w6 | D. starmeri | C7e-C6a | (1) |
| 2x6 | D. starmeri | F4a-E2e | (1) |
| 2y6 | D. starmeri | B1b-B3a | (1) |
| 2z6 | D. starmeri | F1c-D5a | (1) |
| 2a7 | D. starmeri | E6a-E2e | (1) |
| 2e7 | D. starmeri | B4e-C3f | (3) |
| 2b7 | D. starmeri | F4a-E6g | (1) |
| 2c7 | D. starmeri | E2e-E6g | (1) |
| 2r7 | D. starmeri | B1b-C4a | (1) |
| 2q9 | D. starmeri | F6b-G4b | (2) |
| 2w6 | D. uniseta | B3f-D2e | (1) |
| 2t9 | D. venezolana | D2b-E4a | (3) |
| 3k | - | D5a-G1h | (3) |
| 3v | - | D4b-E4a | (3) |
| 3w | - | E4a-F4c | (3) |
| 3r2 | - | B1c-C5d | (3) |
| 3u | D. borborema | D4b-F4g | (11) |
| 3j2 | D. buzzatii | A2c-F2b | (5) |
| 3k2 | D. koepferae | D2b-F3g | (8) |
| 3y | D. starmeri | C5d-B5a | (1) |
| 3z | D. starmeri | D1h-E5d | (1) |
| 3a2 | D. starmeri | C1b-C5e | (1) |
| 3e2 | D. starmeri | F3f-G1a | (1) |
| 4s | D. buzzatii | D1d-F1c | (5) |
| 4m | D. koepferae | E3d-G2c | (8) |
| 5g | - | D3a-F2d | (8) |
| 5d2 | - | E1a-F1a | (3) |
| 5c2 | D. buzzatii | D4d-E3g | (12) |
| 5w | D. koepferae | D4g-F1a | (8) |
| 5e (5"e") | D. serido | C3a-F1a | (9) (10) |
| 5d | D. seriema | F2a-G2e | (11) |
| 5q | D. starmeri | C2c-D4a | (1) |

(1) WASSERMAN and KOEPFER 1979; (2) RUIZ and FONTDEVILA 1981; (3) RUIZ and WASSERMAN 1993; (4) WASSERMAN and RICHARDSON 1987; (5) RUIZ et al. 1984; (6) LAAYOUNI et al. 2000; (7) CASALS et al. 2003; (8) RUIZ et al. 1982; (9) RUIZ et al. 2000; (10) TOSI and SENE 1989; (11) KUHN et al. 1996; (12) BARKER et al. 1985.

a Previous denominations of some inversions are indicated in brackets.


Table S2. In situ hybridization signals produced by the Galileo probe in the buzzatii cluster species.

Species / Signals a

D. buzzatii (st-1)

XA4b, XC1d, XD3d, XE1d, XG1b, XF1c, [XH1e-cen], 2E1g, 2F5a, 2G1e, [2G5e-cen], 3B4hb, [3G4a-cen], [4G4d-G5b], [4G5e-cen], 5A4a-b, 5G2fb, 5G4c, [5G4i-cen], [6 A1b-f], [6A1i-cen]

D. buzzatii (j-2)

XF1c, [XH1e-cen], 2E1g, 2F5a, [2G5e-cen], 3G4e, 3G4f, [3G5a-cen], [4G4d-G5b], [4G5e-cen], 5G4cb, [5G4i-cenb], [6A1c-cen]

D. buzzatii (j-23)

XA3c, XA4b, XB1f, XC1db, XD3g, XD3e, XC4h, XF5gb, XF2b, XG2b, [XH1e-cen], 2A1a, 2E2g, 2D3e, 2F1c, [2G5e-cen], 3A4eb, 3A5e, 3D2d, 3D5db, [3G4e-cen], 4A2ab, 4A3cb, 4A3d, 4G1c, [4G4d-G5b], [4G5e-cen], 5D5b, 5G4cb, [5G4i-cenb], [6A1b-f], [6A2b-cen]

D. buzzatii (j-24)

XD2i, XF5g, [XH1e-cen], 2A3a, 2E2fb, 2D3a, 2E5ab, 2F1c, 2G2g, 2G2h, [2G5e-cen], 3B3a, 3B4h, 3E2e, 3F1b, [3G4d-cen], 4A2a, 4A3d, 4E5ab, 4F4i, 4G1g, [4G4d-G5b], [4G5e-cen], 5G2f, 5G2hb, 5G4c, 5G4e, [5G4i-cen], 6A1e-f, 6A1i, [6A2a-cen], 5A4f(t 5,1)

D. buzzatii (jq7-4)

XF3c, XF1c, XG2i-j, [XH1e-cen], 2G1g, 2D3c, 2G2e, 2G2d, 2G2c, 2F1a, 2E5a, 2G2f, 2G3d, 2G4c, 2G5b, [2G5c-cen], 3E1i, 3G1a, 3G3b, 3G4c, 3G4e, [3G4g-cen], [4G4d-G5b], [4G5e-cen], 5G2fb, 5G2hb, 5G4c, [5G4i-cen], [6A1a-f], [6A2c-cen]

D. buzzatii (jz3-6)

XD3d, XE1d, XF5g, XF1c, XH1bb, XH1cb, [XH1e-cen], 2E4gb, 2E5a, 2D4a, 2E2g, 2G2d, [2G5e-cen], 3B5c, 3E2e, 3E4c, 3F1b, 3G3h, [3G4d-cen], 4E5a, [4G4d-G5b], [4G5e-cen], 5G2f, 5G2h, 5G4c, 5G4d, [5G4i-cen], 6A1a, [6A1d-g], [6A2b-cen]

D. buzzatii (jz3-7)

XC1d, XD3d, XD3b, XF5g, XF1c, XG2k, [XH1c-cen], 2F5a, [2G5e-cen], 3D5d, [3G4e-cen], 4E5a, 4G2fb, [4G4d-G5b], [4G5e-cen], 5G2fb, 5G2hb, 5G4c, [5G4i-cen], [6A1b-d], [6A2c-cen]

D. buzzatii (y3-1)

XC3a, XD3a, XG2k, [XH1g-cen], 2E1f, 2D5c, 2D4e-fb, 2D3e, 2F6i, 2F6f, 2C7b, [2G5e-cen], 3A5eb, 3B2c, 3E3a, 3G2a, 3G3c, [3G4d-cen], 4E4g-hb, [4G4d-G5b], [4G5e-cen], 5D5d, 5G2f, 5G2h, 5G4c, 5G4d, [5G4i-cen], [6A1a-f], 6A2c

D. buzzatii (s-1)

XD3d, [XH1e-cen], 2E1f, 2D5c, 2D4e-f, 2D3e, 2F6i, 2F6f, 2C7b, [2G5e-cen], 3A5c, 3B2f, [3G4e-cen], 4E5c, 4F1c, 4F2h, 4F4j, [4G4d-G5b], [4G5e-cen], 5G2h, 5G4c, [5G4i-cen], 6A1c-d, [6A2a-cen]


D. antonietae

XA1e, XA2i, XB2b, XC3a, XD3f, XD1a, [XG2a-cen], 2A1c, 2C4h, 2C5f, 2C6f, 2D5d, 2D5b, 2G2d, 2G3e-f, [2G5b-cen], 3A2c, 3B1d, 3B4c, 3C1b, 3C3fb, 3C5d, 3D5b, 3E5a, 3F2b, 3G1e, [3G4a-cen], 4A1b-c, 4B1c, 4D5d, [4G4d-cen], 5B1f, 5B5e, 5C1b, 5D1c, 5E2a, 5E3c, 5F2f, 5F2i, 5G1b, [5G4h-cen], 6A1b-c, 6A1f, 6A2b, [6A2h-cen]

D. koepferae

XA3k, XC2e, XD4d, XE2a, XE4g, [XG2a-cen], 2B3d, “C1m, 2F4g, 2F3a, 2G2f, 2G2g, 2G4c, [2G5d-cen], 3A1h, 3A3gb, 3B3e, [3G4a-cen], 4B4f, 4D1a, 4G1c, 4G4d, 4F4a-bb, [4G5e-cen], 5A1c, 5A3a, 5A3c, 5A3e, 5G4cb, [5H1a-cen], 6A1c-d, 6A1g, [6A2g-cen]

D. serido

XA3a, XB1f, XB2b, XC1e, XE1d, XF3d, XF2b, [XG2a-cen], 2B4a, 2E5e-f, 2E1c-d, 2E1a, 2F2a, 2G2f, 2G3d, [2G5c-cen], 3A4f, 3C4b, 3D1g, 3D5cb, 3F1c-d, 3F1f, 3F4a, 3G1g, [3G4a-cen], 4C2g, 4F2h, 4G3c, 4G4a, 4G4b, [4G5a-cen], 5A5d, 5B1eb, 5E3e, 5F3e-f, 5G2g, [5G4g-cen], 6A1e, 6A2g, 6H1a

D. seriema

XA3k, XB1db, XC1h, XC4c, XD3j, [XG2a-cen], 2A1f, 2C4a-b, 2D2a, 2D3f-g, 2F5e, 2F3a, 2G2h, 2G4a, [2G5b-cen], 3A2a, 3C2h, 3D5e, 3D3a-b, 3F3g, 3G1g, 3G3d, 3G4e-f, [3G4g-cen], 4B1h, 4C1gb, 4E3b, 4E4e, 4E4g, 4F3b, 4G1d, 4G3e-g, [4G5b-cen], 5B2a, 5B5a-bb, 5C4a, 5D2a, 5F1a-b, 5F3d, [5G4g-cen], 6A1b, 6A1d, 6A2g

cen = centromere.

a When an interval of bands is stained, its limits are indicated in square brackets.

b Signals of weaker intensity.


Figure S1. Predicted secondary structures produced by Galileo-12 (A), Kepler-5 (B), and Newton-2 (C).
