Molecular Biology, Vol. 35, No. 2, 2001, pp. 157–167. Translated from Molekulyarnaya Biologiya, Vol. 35, No. 2, 2001, pp. 196–207. Original Russian Text Copyright © 2001 by Arkhipova.
UDC 575.11:595.773.4
Transposable Elements in the Animal Kingdom
I. R. Arkhipova
Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
Received August 23, 2000
Abstract: Transposable elements (TEs) are commonly thought to be of universal occurrence in eukaryotes. Analysis of complete higher eukaryotic genomes confirms the status of TEs as substantial genome components and provides insights into their role in shaping the genome structure of extant eukaryotes. This review addresses several recently investigated problems in transposon biology, including the potential roles of promoter organization in transposon function and evolution, the ubiquity of TEs in numerous phyla of the animal kingdom, and the possible connection between transposon content and mode of reproduction.
Key words: DNA transposons, retrotransposons, animal phyla, sexual/asexual reproduction
INTRODUCTION
As the genome sequencing era rapidly advances our understanding of what eukaryotic genomes are in fact composed of, it is time to take a new look at the role that transposable elements (TEs) have played in shaping the present structure of these genomes over millions of years of evolution. In this review, I will address a number of recently investigated problems in the field of transposon research. The emphasis will be placed on TEs in the animal kingdom, in accordance with the primary research interests of the author and also because reviews on TEs in other kingdoms were published relatively recently [1–3]. However, a comprehensive review is not the main purpose of this paper; rather, it is concerned with several subjects which have been of particular interest to the author in recent years and are especially attractive for presenting speculations rarely affordable within the framework of a research paper.
For convenience, I will consider transposons and host cells as separate interacting entities, with transposons being foreign DNA and the host cell providing them with the appropriate environment. However, in many cases transposons have become so tightly associated with host genomes and so intertwined with host systems of transcription, translation, etc. that such distinctions become difficult to make. Moreover, it is plausible that transposons have become such an integral part of higher eukaryotic genomes that it would be almost impossible to imagine the evolutionary consequences resulting from the loss of the transposon component.
One of the main unanswered questions is how transposons became established so profoundly in the genomes of higher eukaryotes. In prokaryotes, they do not normally form stable associations with the host genomes.
In lower eukaryotes with streamlined genomes, they proliferate in relatively small numbers, mostly by finding “safe havens” for insertion in which they do not cause much damage to their hosts [4]. With an increase in genome size, however, comes an increase in transposon content. What are the properties of higher eukaryotic organisms and their TEs that allow such permanent associations? Some of these properties are discussed below, while others still await further consideration.
TRANSCRIPTION AND PROMOTERS
One of the first aspects of transposon–cell interaction concerns the ability of a transposon to function as a transcriptional unit. For the two major classes of eukaryotic transposons, class I or retrotransposons and class II or DNA transposons, transcription serves slightly different purposes. In DNA transposons, it is primarily needed to generate mRNA serving as a template for synthesis of transposase, in most cases the only enzyme needed for transposition. Hence, their level of transcription need not be as high as for retrotransposons, which, in addition to enzymes such as reverse transcriptase (RTase), also produce large amounts of structural proteins involved in RNP formation and must generate enough RNA to serve as a template for reverse transcription. Consequently, there is often a pronounced difference between levels of transcription for DNA transposons and retrotransposons. Transcript levels of the former can be relatively low or even undetectable, and fusion to inducible promoters is typically used to achieve higher levels of expression [5–8]. In fact, DNA transposon-based constructs are the genetic tools of choice over retrovirus-based constructs, which require prior
0026-8933/01/3502-0157$25.00 © 2001 MAIK “Nauka /Interperiodica”
removal of enhancers, in large-scale enhancer-trap screens based on the ability of weak promoters to acquire expression patterns determined by adjacent enhancers upon insertion in their vicinity [9–13]. Retrotransposon transcripts, on the contrary, are relatively abundant at the stages and in the tissues in which they are produced. This is correlated with the presence or absence of various stage- and tissue-specific enhancer elements, which are far more diversified and individualized in different retrotransposon families than their coding regions. The variety of expression patterns covers virtually all developmental stages and tissues (reviewed in [14–16]), implying that the process of acquisition of such control elements could be more or less random. However, transposition should take place in the germ line in order to establish the transposon in the host genome and transmit it to the progeny. Retrotransposons cannot transpose without an RNA template, and their transcripts/reverse transcripts need to be present in the germ line in order to give rise to heritable transposition events. It is therefore not surprising that a large number of retrotransposons have been demonstrated to possess enhancer elements providing expression in the germ line and/or to generate transcripts detectable in germline cells [17–28].
Let us summarize the strategies available to transposons as transcriptional units with respect to transcription control elements.
Class I, or retrotransposons: RNA is not only a template for protein synthesis but is also used as genetic material during transposition; therefore, transcription of the entire unit should be ensured by either avoiding loss of non-transcribed control regions or compensating for it.
Non-LTR retrotransposons could achieve this by:
(i) having completely internal promoter/enhancer elements at the 5' end, with an RNA start site located upstream at the 5' boundary of the transposon, with no dependence on adjacent transcriptional control elements, although possibly subject to influence by such elements;
(ii) the same, but at the 3' end, thus requiring a tandem head-to-tail arrangement;
(iii) repeating the promoter sequence several times at the 5' end;
(iv) having no promoter elements of their own, but inserting site-specifically into a sequence which provides a promoter for readthrough transcription;
(v) relying on a “master copy” which by chance landed in the proximity of a cellular promoter and has since been giving rise to all subsequent copies.
LTR-containing retrotransposons:
(vi) the promoter (and terminator) is located in the LTR and regenerated after each retrotransposition cycle, so that loss of the 5'-nontranscribed promoter sequences after reverse transcription is avoided. The uniform LTR structure (U3-R-U5) places the transcription start site upstream of the transcription termination site, so that the short region in between gives rise to a terminal redundancy (R) at the ends of the transcript [29].
Strategy (i) appears to be the most common one, used by a large number of the non-LTR retrotransposons studied (e.g., jockey, F, I, Doc, L1Hs, TRAS1) [30–36]. It is especially widespread in Drosophila, perhaps because downstream RNA polymerase II promoter elements work efficiently in this genus and occur in a very large number of cellular TATA-less promoters [37, 38]. Strategy (ii) is an ingenious invention used by HeT-A, the telomere-associated retrotransposon of Drosophila melanogaster, in which an element transposing to the end of the chromosome attaches the 3'-terminally located promoter to its downstream neighbor, being therefore quite unselfish [39]. Strategy (iii) has so far been described in rodent L1 elements [40, 41]. Strategy (iv) was hypothesized for the site-specific ribosomal insertion elements [42]. Finally, strategy (v) is usually invoked whenever all the other options are ruled out (e.g., [36]).
Promoters of LTR-retrotransposons, while possessing the uniform retrovirus-like structure, usually conform to the standards used by other promoters in the host cell, since they exploit the same machinery. Thus, the LTRs of retroviruses typically contain TATA boxes [43], while many Drosophila LTR retrotransposons possess TATA-less promoters with well-pronounced initiator sequences and downstream promoter elements [44, 45].
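The U3-R-U5 arrangement described under strategy (vi) can be illustrated with a minimal Python sketch. It is only a string-level toy under stated assumptions: the bracketed segment labels and the function names `provirus` and `transcript` are hypothetical placeholders, polyadenylation is ignored, and the start and termination sites are assumed to lie exactly at the U3/R and R/U5 boundaries.

```python
def provirus(u3: str, r: str, u5: str, internal: str) -> str:
    """Integrated DNA copy: identical LTRs (U3-R-U5) flank the internal region."""
    ltr = u3 + r + u5
    return ltr + internal + ltr


def transcript(u3: str, r: str, u5: str, internal: str) -> str:
    """RNA made from the integrated copy (toy model, no poly(A) tail).

    Assumption: transcription starts at the U3/R boundary of the 5' LTR
    and terminates at the R/U5 boundary of the 3' LTR.
    """
    dna = provirus(u3, r, u5, internal)
    start = len(u3)             # skip U3 of the 5' LTR
    end = len(dna) - len(u5)    # stop before U5 of the 3' LTR
    return dna[start:end]


# Hypothetical placeholder segments, used only to make the layout visible.
rna = transcript("[U3]", "[R]", "[U5]", "[gag-pol]")
print(rna)  # R is present at both ends of the transcript
```

In this toy layout the transcript lacks the 5' U3 and the 3' U5, yet carries R at both ends, which is the terminal redundancy referred to in the text.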
Moreover, even such nonstandard combinations as overlapping RNA polII and polIII promoters have been described, for both LTR and non-LTR retrotransposons [46, 47], strengthening the belief that their acquisition was largely a matter of chance. Transcriptional enhancers and other control elements typically reside in the U3 region of the LTR, although they may also be localized in the 5'- and 3'-untranslated regions (reviewed in [14, 15]), offering targeting opportunities for control of transposition at the transcriptional level.
Class II elements, or DNA transposons, in principle do not need to carry efficient promoter/enhancer elements and may possess very weak promoters (see above); in fact, a basal level of transcription would do almost as well. A basal promoter could be located anywhere between the 5'-ITR and the ORF, and a precise RNA start would not be necessary, because RNA is not the genetic material (even readthrough transcription might occasionally suffice). This offers very limited opportunities to control transposition at the level of transcription, compared to retrotransposons, and posttranscriptional control mechanisms appear to be
favored, including splicing, RNA interference, transposase subunit interaction, etc. [6, 48–53].
TANDEM PROMOTERS AND HYPOTHESES FOR EVOLUTIONARY ORIGINS OF LTR-RETROTRANSPOSONS
The tandem type of transcriptional unit organization, such as the one discovered in the HeT-A retrotransposon [39], deserves special attention. It might represent a mechanistically possible pathway of transition from non-LTR to LTR retrotransposons and could have resulted, at some distant point in evolutionary history, in acquisition of the LTR structure previously lacking in non-LTR retrotransposons. In this respect, it is worth mentioning that HeT-A might not be the only element using this strategy: the Penelope transposon of D. virilis [53a] appears to possess a promoter organization which is in fact very similar to that of HeT-A (M.B. Evgen’ev, personal communication). It has been noted that the arrangement of two transcriptional units in tandem bears resemblance to the LTR structure: both promoter and terminator sequences are present in duplicate on either side of the coding region, and the resulting RNA has a small terminal redundancy similar to the transcripts of retroviruses and LTR-retrotransposons [39]. However, the actual transition requires acquisition of a different pathway of reverse transcription. Non-LTR retrotransposons are undoubtedly the most diverse and arguably the most ancient group of retroelements, with significant sequence homology to group II introns and telomerase reverse transcriptases [54–57]. They were around long before the fusion of the RTase and integrase domains, characteristic of LTR-retrotransposons, took place early in evolutionary history [58]; this fusion perhaps occurred independently in the progenitors of copia-like and gypsy-like retrotransposons, which have a reciprocal arrangement of these domains.
If we assume that these fusions indeed occurred independently, it is of interest to trace common features that might have been acquired, making different superfamilies of LTR-retrotransposons so structurally similar. The features that are important for an evolutionary transition from non-LTR to LTR-retrotransposons are acquisition of internal priming (tRNA-dependent, self-priming, etc.), the ability of RTase to perform template jumps, and participation of more than one RNA molecule in the act of reverse transcription. An imaginary intermediate step would be a LINE-like element with a tendency to form a tandem head-to-tail arrangement of identical transcriptional units. If the propensity of LINE elements for “3' transduction” [59] could simultaneously lead to capture of a cellular promoter at the 3' end, its activity might
supersede the previously used 5'-terminal promoter, which tends to be lost during incomplete reverse transcription, and result in production of a terminally redundant transcript (Fig. 1). If the RTase previously possessed affinity for the sequence or secondary structure at the 3' end of a nonredundant transcript, as seems to be the case for some LINE-like RTases [60], then a terminally redundant transcript, when used as a template, could have RTase bound to either the 3' or the 5' R region with equal probability. At this stage, acquisition of an internal priming mode should be postulated. Namely, the enzymatic machinery bound at the 5' end could find a way to introduce a cut into the secondary structure formed by the 5' end (Tf1-like self-priming mechanism [61]), or to recruit one of the cellular tRNAs, especially if the RTase displays a certain affinity for this tRNA (similar to the RTase of a Neurospora mitochondrial plasmid that recognizes a tRNA-like structure at the 3' end of the RNA template [62]). Note that LTR formation occurs automatically once priming is moved to an internal location and the ability to switch templates allows full-length cDNA synthesis to complete. The process of reverse transcription is eventually relocated from the nucleus to cytoplasmic nucleoprotein particles containing another molecule of the RNA template to facilitate extension of newly formed strong-stop cDNAs. The RNA binding capacity, which is typically provided by the ORF1 located upstream from the pol gene in both LTR- and non-LTR retroelements, should be an important factor in the transition to sequestering cDNA synthesis within RNP particles.
HORIZONTAL TRANSFER AND ITS POSSIBLE PREREQUISITES
To relocate into another host species, a TE must undergo an act of horizontal transfer: introduce a copy of itself, previously residing in a TE-containing host species, into the genomic DNA of another, naive species (reviewed in [63, 64]).
To succeed, it needs to get established in the host germ line for further vertical transmission to the progeny, and to proliferate in individual genomes and ultimately throughout the entire population. All this, however, should be accomplished without doing a lot of harm to the host, as this will also result in death of a TE. If a transposon starts over as a foreign DNA invading the host cell, the first challenge it faces is to evade any restriction system to which it might be vulnerable. Then, it needs to establish a long-term association with the host by becoming compatible with the host systems involved in transcription, translation, posttranscriptional and post-translational processes. There are several options available to transposons to establish themselves in a new environment:
Fig. 1. Possible steps in the evolution of the LTR structure from tandem arrays of retroelements. P and T, promoter and terminator sequences arranged to generate terminal redundancy R at both ends of the polyadenylated (AAA) RNA transcript. Step A represents transcription; B, translation; and C, acquisition of internal priming (in this case tRNA-mediated) versus target priming by reverse transcriptase (RT), leading to evolution of the retroviral pathway of reverse transcription.
(i) If a transposon comes from a species that is not too distantly related to the new host, there is a good chance that its control elements, such as promoters/enhancers, might still be compatible with the new host species, with no need to acquire novel regulatory elements. Introgression-type mechanisms might then be responsible for exchange of genetic material, including TEs, between related species. This would imply that horizontal transfers are more likely to occur between species belonging to the same genus than between species belonging to different phyla. Indeed, the best-documented cases of horizontal transfer are known for host species belonging to the same genus Drosophila [65, 66]. (ii) There is limited chance that regulatory elements, especially transcriptional regulators, would be compatible between very distantly related species. Therefore, such distant horizontal transfers should be quite infrequent. To circumvent the non-functionality of control elements, the easiest way would be to acquire the resident ones by de novo incorporation into transposon structure, rather than to modify the pre-existing sequences. For this event to occur, a retrotransposon must land in the vicinity of a functional promoter/enhancer, and readthrough transcription from the cell promoter or 3'-transduction could lead to formation of a hybrid message containing the necessary regulatory regions. After that, RTase would be able to convert such transcript into cDNA, either in the chromosome or in a nucleoprotein particle, assuming that the synthesized RTase would still preserve its intrinsic ability to initiate reverse transcription of the corresponding RNA. Indeed, most of the conservation is usually observed within open reading frames of a
TE and not in the adjacent untranslated regions which exhibit a lot of variability, and insertions of retroelements into promoter/enhancer sequences are a rule rather than an exception (reviewed in [15]). For LINE-like elements, the RTase and, in most cases, the associated endonuclease activity would be sufficient to perform subsequent integration via target-primed reverse transcription [67]. LTR-containing elements, in addition, would need to incorporate the cis-acting packaging and priming signals compatible with their enzymatic machinery. Ability to associate more than one molecule during reverse transcription would increase the chances of such incorporation by recombinational processes at the RNA level, mediated by intermolecular jumps of RTase within a nucleoprotein particle. In addition to the above-mentioned problem of interchangeability of regulatory elements between different host species, a similar problem would also emerge if intermediate hosts serving as shuttle vectors are used to transfer TEs between host species [68]. It is conceivable that such complex entities as retrotransposons would require vectors compatible with eukaryotic hosts, such as eukaryotic viruses or parasites. In contrast, the simplest transposition units such as mariner-like transposons essentially consist of a single intron-lacking transposase gene embedded between two short inverted terminal repeats, with very small untranslated regions and little room for regulatory elements such as enhancers, insulators, etc. These simplest units could be temporarily maintained in prokaryotic vectors, thereby expanding the range of available host organisms and facilitating horizontal transfers. However, experimental observations capturing such acts are yet to be made. It is therefore not surprising that for DNA transposons horizontal transfer appears to be the predominant mode of transmission [51, 69]. 
This, however, can easily be reversed if the transposition process, which is essentially host factor-free for mariner-like transposons [70–72], becomes dependent on host-encoded proteins (e.g., [73]). LTR retrotransposons seem to undergo horizontal transfers only sporadically [66, 74, 75], although the gypsy group elements containing an env-like gene have the potential to be transmitted as extracellular virus-like particles, without resorting to any vectors [76, 77]. Finally, non-LTR retrotransposons apparently move horizontally very rarely, if ever, probably in part because they do not have a relatively stable extrachromosomal DNA intermediate in their transposition cycle ([78], but see [79]).
ARE TRANSPOSONS UBIQUITOUS IN THE ANIMAL KINGDOM?
The universality of TE occurrence and the mode of their transmission may be assessed to a certain extent
by analyzing the distribution of TEs on a wide scale in a large number of phylogenetically diverse organisms. Previously, screens of this nature were conducted for copia-like retrotransposons in plants [80, 81] and revealed the ubiquity of copia-like elements throughout the plant kingdom and the predominance of the vertical mode of transmission, with rare cases of suspected horizontal transfer. However, copia-like retrotransposons in the animal kingdom are found only sporadically, in a few isolated species, and the most prominent superfamilies of TEs are LINE-like and gypsy-like retrotransposons, as well as DNA transposons of the mariner/Tc1 superfamily. An approach that has proven effective for rapid large-scale screening for transposons is a single step of PCR amplification with degenerate primers spanning the most conserved domains of RTases or transposases. This approach has successfully been used to amplify those transposon sequences which contain at least two six-amino-acid blocks of similarity to a given query transposon-encoded protein [80–83]. A problem with this approach lies in the relative shortness of the blocks of conserved amino acids in many RTases and transposases, making the detection of distantly related transposons uncertain. RTases contain at least seven characteristic “signature” domains distributed over more than a kilobase with variable spacings (Fig. 2). However, each domain displays only two or three amino acids that are well conserved across the entire superfamily [54, 84]. In DNA transposons, several analogous short conserved blocks of homology exist in most transposases [85]. To compensate for the lack of extensive conservation in reverse transcriptases, a two-step PCR amplification procedure with nested primers was designed [86]. The procedure takes advantage of the most conserved reverse transcriptase domains, designated A, B, C, and E (Fig. 2a).
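To give a feel for what "degenerate" means for primers designed against a short amino-acid block, the following Python sketch counts how many distinct nucleotide sequences a fully degenerate primer pool would contain. It is an illustration only, not the primer design used in [86]: the function name `primer_degeneracy` and the example block `YVDDIL` are hypothetical, and the only biological input is the number of codons per amino acid in the standard genetic code.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def primer_degeneracy(block: str) -> int:
    """Number of distinct coding sequences for an amino-acid block.

    This is the product of the codon counts of the residues, i.e. the
    size of a pool that covers every synonymous codon combination.
    """
    return prod(CODON_COUNTS[aa] for aa in block.upper())


# Hypothetical six-residue block, chosen only for illustration.
example_block = "YVDDIL"
print(example_block, primer_degeneracy(example_block))
```

Blocks rich in Leu, Ser, or Arg inflate the pool quickly, whereas Met and Trp contribute no degeneracy, which is one reason why short blocks with only two or three well-conserved residues are hard targets.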
The first set of highly degenerate primers is targeted to domains A and E, and the first-round amplification products, substantially enriched in retrotransposon-related sequences, are subjected to a second round of amplification using highly degenerate primers against the most conserved residues in the internally located superfamily-specific domains B and C. This procedure typically yields bands of sizes diagnostic for the corresponding superfamily. The identity of such bands can be confirmed by cloning and sequencing. A similar strategy can be employed for amplification of DNA transposases, using primer pools directed against conserved residues in the domains depicted in Fig. 2b. Retrotransposons are commonly believed to be of universal occurrence in eukaryotes, even though only a few major phyla have actually been examined. We tested the validity of this belief by screening representatives of 24 animal phyla for two retrotransposon superfamilies. PCR assays for LINE-like RTase
sequences yielded one or more prominent bands of 120–170 bp, the size range of the B–C interval in known LINE-like RTase clades (table). Comparisons of amino acid sequences in the B–C interval place most sequences in the known LINE-like clades R4, L1, RTE, R1, Jockey, or CR1, all of which are believed to be of ancient origin [78]. Groups of LINE-like RTase sequences apparently representing new clades were found in an acanthocephalan, a flatworm, and a diplomonad protozoan. Within a given clade, more closely related species typically contain more similar RTase sequences, although it is not always possible to determine phylogenetic relationships because the segments in question are fairly short and in many cases quite divergent. It is notable, however, that the region between the second-step primers exhibits significant clade-specific conservation of amino acid sequences, which only becomes evident in a large-scale comparison between many members of the clade. Similarly, bands of 120–130 bp, diagnostic for gypsy-like RTases, were found in two-step amplifications of DNA from 35 species, representing 21 of the 23 phyla tested (table). Amplicons from the species further investigated by sequencing could be assigned to known clades of gypsy-like RTases [87, 88], such as gypsy, Osvaldo, sushi, and ZAM. These data demonstrate that LINE-like elements are virtually universal and gypsy-like elements are ubiquitous throughout the animal kingdom, being easily detectable by the two-step PCR procedure in the overwhelming majority of the phyla tested. The genome sequence data [89, 90] for model higher eukaryotic organisms such as D. melanogaster, C. elegans, and humans indicate that recognizable TEs, and especially retrotransposons, represent a substantial fraction of the total genomic DNA (10–20%, not to mention plant genomes such as maize where they represent 50–85% of the genome).
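The size criteria quoted above (120–170 bp for LINE-like and 120–130 bp for gypsy-like second-step products) lend themselves to a simple scoring rule. The Python sketch below is a hedged illustration of such a rule, not the procedure actually used to fill in the table: the names `DIAGNOSTIC_RANGES`, `is_diagnostic`, and `score_lane` are hypothetical, and the band sizes in the example are invented.

```python
# Diagnostic size ranges (bp) for second-step PCR products, as quoted in the text.
DIAGNOSTIC_RANGES = {"LINE": (120, 170), "gypsy": (120, 130)}


def is_diagnostic(superfamily: str, band_bp: int) -> bool:
    """True if a band falls inside the size range for the given superfamily."""
    low, high = DIAGNOSTIC_RANGES[superfamily]
    return low <= band_bp <= high


def score_lane(superfamily: str, bands_bp: list[int]) -> str:
    """Score a gel lane as '+' if any band is of diagnostic size, else '-'."""
    return "+" if any(is_diagnostic(superfamily, b) for b in bands_bp) else "-"


# Invented band sizes for two lanes, for illustration only.
print(score_lane("LINE", [150, 400]))   # one band within 120-170 bp
print(score_lane("gypsy", [150, 400]))  # no band within 120-130 bp
```

Size alone is only a first filter: the two assays use different primer sets, and, as noted in the text, the identity of a band is confirmed by cloning and sequencing.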
Comparison of band intensities in DNA from model organisms and from representatives of the major phyla of the animal kingdom indicates that in most of them retrotransposons also form a substantial component of genomic DNA. The multicopy nature of these sequences is also confirmed by the fact that none of the cloned copies from any LINE-like family in any of the species studied had identical nucleotide sequences. They were always present as multiple divergent copies and have apparently inhabited these genomes for a long period of time. The gypsy-like elements are generally present in lower copy number than LINE-like elements, since it takes an additional 20 cycles of amplification to detect the corresponding bands at the second PCR step. It is also evident that no closely related retrotransposon sequences are present in distantly related species, indicating that horizontal transfers of retroelements occur very rarely. DNA transposons of the mariner/Tc1 superfamily, in agreement with previous studies [51, 69, 83, 91], exhibit patchy distribution, and it is not uncommon to find closely related sequences in very distant species, indicating that the horizontal mode of transmission for these elements is predominant.
Fig. 2. Domain structure of reverse transcriptases (a) and transposases (b) indicating location of nested PCR primers. Panel (a) shows LINE and Gypsy RTases (domains 1, 2, A–E); panel (b) shows Mariner and Tc transposases. Conserved domains are denoted by filled boxes. Domains A–E [84] correspond to RTase domains 3–7 [54], respectively. One or two feathers on the arrows indicate first- and second-step amplification primers, respectively. Filled bars represent second-step amplification products specific for each superfamily.
TRANSPOSON CONTENT AND MODE OF REPRODUCTION
It has long been noted that sexual reproduction allows mobile elements, even if deleterious, to spread in populations, and the loss of sex was predicted to eventually result in populations free of such elements [92]. Therefore, it was of particular interest to test this expectation on a very special taxonomic group, rotifers of the class Bdelloidea, comprising 4 families, 18 genera, and some 360 species. This is the largest metazoan taxon in which males, hermaphrodites, and meiosis are unknown, and the only taxon in which ancient asexuality was supported by molecular genetic evidence [93]. The molecular data obtained were in agreement with the expectation that after millions of years of evolution without sexual reproduction or genetic exchange individual genomes of bdelloid rotifers would no longer contain closely similar haplotypes and instead would contain highly divergent descendants of former alleles. The PCR-based assays described above were performed on five species of bdelloid rotifers representing three of the four known families of class Bdelloidea, and, for comparison, on five species of rotifers belonging to classes in which sexual reproduction is either constitutive (Acanthocephala) or facultative (Monogononta). All five species of non-bdelloid rotifers, as expected, tested positive for LINE-like RTases, and three of them also tested positive for gypsy-like RTases. In contrast, no diagnostic LINE-like or gypsy-like bands were visible under the same amplification conditions in any of the five species of bdelloid rotifers tested. Although the complete absence of these retrotransposons cannot be demonstrated without knowing the entire genomic sequence, it is evident that very few if any copies are present in the bdelloid genomes we examined. Transposable elements, even if deleterious, are able to spread through a sexually reproducing population if a given copy can be transmitted to more than half the progeny [92, 94]. If sexual reproduction ceases, spread within a population is limited to rare horizontal transmission events. After a sufficiently long time, if the population has not become extinct, random mutation and selection for lineages with reduced insertional load will eventually result in a population free of deleterious transposons. In bdelloid rotifers, which appear to have abandoned sexual reproduction many millions of years ago [93], any such elements should have been lost or have diverged so greatly as to become undetectable by PCR. Although the apparent lack of retrotransposons may be interpreted as a consequence of long-term
Table. Tests for LINE-like and gypsy-like RTases and mariner/Tc1 transposases in diverse eukaryotic species
Phyla: Sarcomastigophora; Porifera; Cnidaria (L, M); Ctenophora; Platyhelminthes (M); Acanthocephala; Rotifera (Monogononta); Rotifera (Bdelloidea); Gastrotricha; Nemertea; Priapulida; Sipuncula; Annelida; Echiura; Mollusca (L); Brachiopoda; Bryozoa; Phoronida; Nematoda (L, G, M, T); Onychophora; Arthropoda (L, G, M, T); Tardigrada; Chaetognatha; Echinodermata (G); Hemichordata; Chordata (L, G, M, T)
Species: Giardia lamblia; Halichondria bowerbanki; Spongilla sp.; Hydra littoralis; Aurelia aurita; Condylactus sp.; Dugesia tigrina; Moniliformis moniliformis; Brachionus plicatilis; Brachionus calyciflorus; Sinantherina socialis; Monostyla sp.; Philodina roseola; Philodina rapida; Habrotrocha constricta; Adineta vaga; Macrotrachela quadricornifera; Lepidodermella sp.; Lineus sp.; Priapulus caudatus; Themiste alutacea; Glycera sp.; Lissomyema mellita; Chione cancellata; Glottidea pyramidata; Amathia convoluta; Phoronis architecta; Caenorhabditis elegans; Euperipatoides rowelli; Drosophila melanogaster; Drosophila pseudoobscura; Drosophila virilis; Lasius niger; Formica polyctenum; Aphis sp.; Milnesium sp.; Sagitta sp.; Echinometra mathaei; Strongylocentrotus purpuratus; Saccoglossus kowalevskii; Branchiostoma floridae; Danio rerio; Onchorhynchus keta; Xenopus laevis; Mus musculus; Bos taurus
Mariner/Tc1 + + +S + – +S – + – – – + + +S +S +S + + – – – + +S – + + + + – + +
+ +
+
+S
– – – + – – – – – + +
–
+ + +S + + +
LINE +S + + + + + +S +S +S + +S +S – – – – – +S +S +S +S +S +S +S +S +S + +S +S +S + + +S + +S + +S + + +S +S + + + + +
Gypsy +S +S – + +S + +S + – – – – – – + + + + +S + + + + + +S +S +S +S + +S + + + +S + + + +S + + +S +
Note: Presence or absence of diagnostic PCR bands is indicated by + or –, respectively. S, verified by sequencing; blank, not done. Superfamilies previously reported to be present in representatives of a phylum are indicated in parentheses: L, LINE; G, gypsy; M, mariner; T, Tc1.
Fig. 3. Ubiquity of LINE-like RTases as demonstrated by nested PCR. Lane labels: Escherichia coli; Giardia lamblia; Halichondria bowerbanki; Hydra littoralis; Dugesia tigrina; Lineus sp.; Priapulus caudatus; Themiste alutacea; Glottidea pyramidata; Amathia convoluta; Phoronis architecta; Glycera sp.; Lissomyema mellita; Chione cancellata; Euperipatoides rowelli; Drosophila melanogaster; Lepidodermella sp.; Monostyla sp.; Habrotrocha constricta; Caenorhabditis elegans; Sagitta sp.; Saccoglossus kowalevskii; Branchiostoma floridae; 100 bp ladder.
asexuality, it is also possible that it has allowed bdelloid rotifers to avoid the early extinction that usually follows the loss of sex in other taxa. This could be the case if sexual reproduction plays a significant role in limiting the load of deleterious insertions, by mechanisms involving recombination [95–97], or by other mechanisms dependent upon sexual reproduction or meiosis. Indeed, a major advantage of sex may be in limiting the deleterious insertional load (see [15], chapter 8). If so, the loss of sex would allow the load to increase, driving the population to extinction. The bdelloid lineage may be unusual in having escaped this fate by somehow becoming free of active retrotransposons, either before or not long after abandoning sex.
It will be of particular interest to obtain sequences of large stretches of genomic DNA from bdelloid rotifers in an attempt to identify remnants of transposons that were apparently present in the genomes of bdelloid ancestors, which are presumed to have been sexual diploids. While retrotransposons are not expected to constitute a major component of bdelloid genomes, it is plausible that identification and sequence comparison of such inactive transposon relics might provide some clues as to the timing and possible mechanisms of their loss. Although deleterious transposons are not expected to be retained in ancient asexual lineages, those that have been co-opted to perform certain functions in the host have the potential to be preserved by natural selection. It is difficult to guess what kind of function, if any, they might perform. So far, the best-known example of useful transposons is the telomere-associated retrotransposons of Drosophila. Upon loss of the telomerase gene, they have apparently taken over the function of the telomeric repeats normally synthesized by telomerase to protect chromosome ends from DNA loss during replication. This is not too surprising, because telomerase is a specialized reverse transcriptase and the underlying principle of telomere maintenance is restoration of lost DNA by RNA-dependent DNA synthesis [57, 98].
While RTases diagnostic for retrotransposons were not detected in bdelloids, bands diagnostic for mariner-like DNA transposases were readily detectable in at least three of the bdelloid species tested. Sequence analysis demonstrated that they belong to the known lineata and elegans subfamilies of mariner-like transposons [91], and Southern analysis revealed that they are present in high copy numbers.
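Sequence comparisons among transposase copies of this kind start from aligned coding sequences, in which each differing codon either changes the encoded amino acid or does not. The sketch below is an illustration only, not the analysis used in the study: it counts synonymous and nonsynonymous codon differences between two aligned, ungapped, in-frame sequences using the standard genetic code. The function name `codon_differences` and the two toy sequences are invented for the example, and the raw counts are not normalized by the number of synonymous and nonsynonymous sites, as a proper substitution-rate estimate would require.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codon_differences(seq_a, seq_b):
    """Count (synonymous, nonsynonymous) differing codons.

    Both inputs must be aligned, ungapped, in-frame DNA coding sequences
    of equal length (assumptions of this sketch).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal length, in frame")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different codon
        else:
            nonsynonymous += 1   # amino acid replaced
    return synonymous, nonsynonymous


if __name__ == "__main__":
    # Toy sequences (hypothetical): M-A-K-G versus M-A-K-D.
    print(codon_differences("ATGGCTAAAGGT", "ATGGCCAAGGAT"))
```

A pronounced excess of synonymous over nonsynonymous differences among copies is the kind of pattern expected when the encoded protein has been maintained by selection.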
The degree of sequence divergence between subfamily members varies from very low to relatively high, indicating that some of the copies proliferated relatively recently while others have been present in the genome for a long time (an alternative explanation would be multiple reinvasions from an unknown source). The most interesting feature of these transposase sequences was a strong bias toward synonymous substitution within subfamilies, indicating that they are, or recently were, under prolonged selection for function. This is very unusual for mariner-like transposases: the majority of copies in a given genome are usually nonfunctional [51, 69], partly because a transposase mobilizes functional and mutated copies with equal efficiency.
While sexually transmitted deleterious transposons are expected to be absent in ancient asexuals, transposons that undergo frequent horizontal transfer and are not significantly deleterious may occur in both sexual and ancient asexual taxa. This may explain why bdelloids lack retrotransposons, which seldom move horizontally and typically possess enhancer and/or suppressor elements that can disrupt the
expression of nearby genes, while mariner-like transposons are often transmitted horizontally, lack such control elements and, as perhaps indicated by the conservation of amino acid sequence in bdelloids, may not be significantly detrimental to their hosts.
Are there any other organisms in which these correlations could be tested? There are many species and a few genera for which sexual reproduction is not known, and even more that reproduce asexually but resort to sexual reproduction once in a while. The uniqueness of the bdelloid rotifers lies in their ancient asexuality, which has manifested itself in changes in their allelic structure over many millions of years of evolution. So far, bdelloid rotifers top the list of organisms claimed to represent “ancient asexual scandals” [99], since in many other asexual groups cryptic sex has been discovered upon close examination and those groups have been removed from the list. The most recent example is the pathogenic yeast Candida albicans, which for a very long time was thought to be completely asexual. Following progress in the Candida genome sequencing project and the discovery of a number of mating-type genes, two groups were able to induce a sexual cycle in this yeast [100, 101]. Interestingly, Candida also differs substantially from the model S. cerevisiae genome in transposon content. While the latter contains a few dozen Ty retrotransposon copies belonging to five families, most of which are intact and apparently active [102, 103], the Candida genome contains several hundred fragments of highly rearranged retrotransposons belonging to at least 34 families [104]. The vast majority of these are nonfunctional degenerate elements, and only three copies remain intact. This peculiar pattern might be connected with the overwhelming predominance of the asexual mode of reproduction in this yeast. It is also of interest to note that the majority of the C.
elegans retrotransposons are apparently inactive and have never been reported to cause insertional mutations, while in D. melanogaster retrotransposons represent the main cause of spontaneous mutation and most of the copies appear to be functional [89, 90]. Whether this is in any way connected with the primarily self-fertilizing mode of C. elegans reproduction and the enormous outcrossing capabilities of fruit flies remains to be investigated. It has been proposed that there is an inverse relationship between transposon aggressiveness and the ratio of asexual to sexual cycles of reproduction in the host, with selection for benign transposons in hosts that are mostly selfing or asexual [105].

CONCLUSIONS

Transposable elements continue to bring us new surprises in the field of genome structure and evolution. Remembering the skepticism with which Howard Temin’s hypothesis about the origin of retroviruses from cellular movable genetic elements [106] was initially met, it is only fair that these elements are no longer regarded as products of viral degeneration, and the search for putative progenitors of retroviruses in vertebrate genome sequences holds much promise. Similarly, extensive analyses of TE phylogenies should reveal a complete picture of their evolution from the simplest to very complex units via domain acquisition and shuffling, very much in parallel with the way protein modules evolved during the transition from primitive to more complex organisms. The enormous potential of TEs for genome restructuring is difficult to deny, and future studies will reveal novel examples of TE participation in the evolution of eukaryotic genomes. At the same time, the bulk of RTase-containing elements are apparently dispensable and even deleterious, as indicated by their absence from uniparentally transmitted organelles and from ancient asexual taxa.

REFERENCES

1. Kumar, A. and Bennetzen, J.L., Annu. Rev. Genet., 1999, vol. 33, pp. 479–532.
2. Kempken, F. and Kuck, U., BioEssays, 1998, vol. 20, pp. 652–659.
3. Daboussi, M.J., Genetica, 1997, vol. 100, pp. 253–260.
4. Boeke, J.D. and Devine, S.E., Cell, 1998, vol. 93, pp. 1087–1089.
5. Karess, R.E. and Rubin, G.M., Cell, 1984, vol. 38, pp. 135–146.
6. Steller, H. and Pirrotta, V., Mol. Cell. Biol., 1986, vol. 6, pp. 1640–1649.
7. Calvi, B.R. and Gelbart, W.M., EMBO J., 1994, vol. 13, pp. 1636–1644.
8. Lohe, A.R. and Hartl, D.L., Genetics, 1996, vol. 143, pp. 1299–1306.
9. O’Kane, C.J. and Gehring, W.J., Proc. Natl. Acad. Sci. USA, 1987, vol. 84, pp. 9123–9127.
10. Smith, D., Wohlgemuth, J., Calvi, B.R., Franklin, I., and Gelbart, W.M., Genetics, 1993, vol. 135, pp. 1063–1076.
11. Klimyuk, V.I., Nussaume, L., Harrison, K., and Jones, J.D., Mol. Gen. Genet., 1995, vol. 249, pp. 357–365.
12. Sundaresan, V., Springer, P., Volpe, T., Haward, S., Jones, J.D., Dean, C., Ma, H., and Martienssen, R., Genes Dev., 1995, vol. 9, pp. 1797–1810.
13. Sablitzky, F., Jonsson, J.I., Cohen, B.L., and Phillips, R.A., Cell Growth Differ., 1993, vol. 4, pp. 451–459.
14. Arkhipova, I.R. and Ilyin, Y.V., BioEssays, 1992, vol. 14, pp. 161–168.
15. Arkhipova, I.R., Lyubomirskaya, N.V., and Ilyin, Y.V., Drosophila Retrotransposons, Austin, TX: R.G. Landes Co., 1995.
16. Labrador, M. and Corces, V.G., Annu. Rev. Genet., 1997, vol. 31, pp. 381–404.
17. Lachaume, P., Bouhidel, K., Mesure, M., and Pinon, H., Development, 1992, vol. 115, pp. 729–735.
18. Tanda, S., Mullor, J.L., and Corces, V.G., Mol. Cell. Biol., 1994, vol. 14, pp. 5392–5401.
19. Tatout, C., Docquier, M., Lachaume, P., Mesure, M., Lecher, P., and Pinon, H., Int. J. Dev. Biol., 1994, vol. 38, pp. 27–33.
20. Smith, P.A. and Corces, V.G., Genetics, 1995, vol. 139, pp. 215–228.
21. Udomkit, A., Forbes, S., McLean, C., Arkhipova, I., and Finnegan, D.J., EMBO J., 1996, vol. 15, pp. 3174–3181.
22. Dupressoir, A. and Heidmann, T., Mol. Cell. Biol., 1996, vol. 16, pp. 4495–4503.
23. Kerber, B., Fellert, S., Taubert, H., and Hoch, M., Mol. Cell. Biol., 1996, vol. 16, pp. 2998–3007.
24. Pasyukova, E., Nuzhdin, S., Li, W., and Flavell, A.J., Mol. Gen. Genet., 1997, vol. 255, pp. 115–124.
25. Haoudi, A., Rachidi, M., Kim, M.H., Champion, S., Best-Belpomme, M., and Maisonhaute, C., Gene, 1997, vol. 196, pp. 83–93.
26. Zhao, D. and Bownes, M., Mol. Gen. Genet., 1998, vol. 257, pp. 497–504.
27. Biessmann, H., Walter, M.F., Le, D., Chuan, S., and Yao, J.G., Insect Mol. Biol., 1999, vol. 8, pp. 201–212.
28. Tchenio, T., Casella, J.F., and Heidmann, T., Nucleic Acids Res., 2000, vol. 28, pp. 411–415.
29. Arkhipova, I.R., Mazo, A.M., Cherkasova, V.A., Gorelova, T.V., Schuppe, N.G., and Ilyin, Y.V., Cell, 1986, vol. 44, pp. 555–563.
30. Mizrokhi, L.J., Georgieva, S.G., and Ilyin, Y.V., Cell, 1988, vol. 54, pp. 685–691.
31. Minchiotti, G. and Di Nocera, P.P., Mol. Cell. Biol., 1991, vol. 11, pp. 5171–5180.
32. McLean, C., Bucheton, A., and Finnegan, D.J., Mol. Cell. Biol., 1993, vol. 13, pp. 1042–1050.
33. Contursi, C., Minchiotti, G., and Di Nocera, P.P., J. Biol. Chem., 1995, vol. 270, pp. 26570–26576.
34. Minchiotti, G., Contursi, C., and Di Nocera, P.P., J. Mol. Biol., 1997, vol. 267, pp. 37–46.
35. Swergold, G.D., Mol. Cell. Biol., 1990, vol. 10, pp. 6718–6729.
36. Takahashi, H. and Fujiwara, H., Nucleic Acids Res., 1999, vol. 27, pp. 2015–2021.
37. Arkhipova, I.R., Genetics, 1995, vol. 139, pp. 1359–1369.
38. Burke, T.W. and Kadonaga, J.T., Genes Dev., 1997, vol. 11, pp. 3020–3031.
39. Danilevskaya, O.N., Arkhipova, I.R., Traverse, K.L., and Pardue, M.L., Cell, 1997, vol. 88, pp. 647–655.
40. Furano, A.V., Robb, S.M., and Robb, F.T., Nucleic Acids Res., 1988, vol. 16, pp. 9215–9231.
41. Severynse, D.M., Hutchison, C.A. III, and Edgell, M.H., Mamm. Genome, 1992, vol. 2, pp. 41–50.
42. George, J.A. and Eickbush, T.H., Insect Mol. Biol., 1999, vol. 8, pp. 3–10.
43. Retroviruses, Coffin, J.M., Hughes, S.H., and Varmus, H.E., Eds., Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, 1997.
44. Arkhipova, I.R. and Ilyin, Y.V., EMBO J., 1991, vol. 10, pp. 1169–1177.
45. Jarrell, K. and Meselson, M., Proc. Natl. Acad. Sci. USA, 1991, vol. 88, pp. 102–104.
46. Arkhipova, I.R., Nucleic Acids Res., 1995, vol. 23, pp. 4480–4487.
47. Kurose, K., Hata, K., Hattori, M., and Sakaki, Y., Nucleic Acids Res., 1995, vol. 23, pp. 3704–3709.
48. Laski, F.A., Rio, D.C., and Rubin, G.M., Cell, 1986, vol. 44, pp. 7–19.
49. Tabara, H., Sarkissian, M., Kelly, W.G., Fleenor, J., Grishok, A., Timmons, L., Fire, A., and Mello, C.C., Cell, 1999, vol. 99, pp. 123–132.
50. Ketting, R.F., Haverkamp, T.H., van Luenen, H.G., and Plasterk, R.H., Cell, 1999, vol. 99, pp. 133–141.
51. Hartl, D.L., Lohe, A.R., and Lozovskaya, E.R., Annu. Rev. Genet., 1997, vol. 31, pp. 337–358.
52. Kaufman, P.D. and Rio, D.C., Proc. Natl. Acad. Sci. USA, 1991, vol. 88, pp. 2613–2617.
53. Simmons, M.J., Raymond, J.D., Grimes, C.D., Belinco, C., Haake, B.C., Jordan, M., Lund, C., Ojala, T.A., and Papermaster, D., Genetics, 1996, vol. 144, pp. 1529–1544.
53a. Evgen’ev, M.B., Zelentsova, H., Shostak, N., Kozitsina, M., Barskyi, V., Lankenau, D.H., and Corces, V.G., Proc. Natl. Acad. Sci. USA, 1997, vol. 94, pp. 196–201.
54. Xiong, Y. and Eickbush, T.H., EMBO J., 1990, vol. 9, pp. 3353–3362.
55. Nakamura, T.M., Morin, G.B., Chapman, K.B., Weinrich, S.L., Andrews, W.H., Lingner, J., Harley, C.B., and Cech, T.R., Science, 1997, vol. 277, pp. 955–959.
56. Eickbush, T.H., Science, 1997, vol. 277, pp. 911–912.
57. Nakamura, T.M. and Cech, T.R., Cell, 1998, vol. 92, pp. 587–590.
58. Capy, P., Bazin, C., Higuet, D., and Langin, T., Dynamics and Evolution of Transposable Elements, Austin, TX: Landes Bioscience, 1998.
59. Moran, J.V., DeBerardinis, R.J., and Kazazian, H.H., Jr., Science, 1999, vol. 283, pp. 1530–1534.
60. Luan, D.D. and Eickbush, T.H., Mol. Cell. Biol., 1995, vol. 15, pp. 3882–3891.
61. Levin, H.L., Mol. Cell. Biol., 1996, vol. 16, pp. 5645–5654.
62. Wang, H. and Lambowitz, A., Cell, 1993, vol. 75, pp. 1071–1081.
63. Kidwell, M.G., Curr. Opin. Genet. Dev., 1992, vol. 2, pp. 868–873.
64. Kidwell, M.G., Annu. Rev. Genet., 1993, vol. 27, pp. 235–256.
65. Daniels, S.B., Peterson, K.R., Strausbaugh, L.D., Kidwell, M.G., and Chovnick, A., Genetics, 1990, vol. 124, pp. 339–355.
66. Jordan, I.K., Matyunina, L.V., and McDonald, J.F., Proc. Natl. Acad. Sci. USA, 1999, vol. 96, pp. 12621–12625.
67. Luan, D.D., Korman, M.H., Jakubczak, J.L., and Eickbush, T.H., Cell, 1993, vol. 72, pp. 595–605.
68. Houck, M.A., Clark, J.B., Peterson, K.R., and Kidwell, M.G., Science, 1991, vol. 253, pp. 1125–1128.
69. Robertson, H.M. and Lampe, D.J., Annu. Rev. Entomol., 1995, vol. 40, pp. 333–357.
70. Lampe, D.J., Churchill, M.E., and Robertson, H.M., EMBO J., 1996, vol. 15, pp. 5470–5479.
71. Gueiros-Filho, F.J. and Beverley, S.M., Science, 1997, vol. 276, pp. 1716–1719.
72. Sherman, A., Dawson, A., Mather, C., Gilhooley, H., Li, Y., Mitchell, R., Finnegan, D., and Sang, H., Nat. Biotechnol., 1998, vol. 16, pp. 1050–1053.
73. Beall, E.L. and Rio, D.C., Genes Dev., 1996, vol. 10, pp. 921–933.
74. Gonzalez, P. and Lessios, H.A., Mol. Biol. Evol., 1999, vol. 16, pp. 938–952.
75. Terzian, C., Ferraz, C., Demaille, J., and Bucheton, A., Mol. Biol. Evol., 2000, vol. 17, pp. 908–914.
76. Kim, A., Terzian, C., Santamaria, P., Pelisson, A., Prud’homme, N., and Bucheton, A., Proc. Natl. Acad. Sci. USA, 1994, vol. 91, pp. 1285–1289.
77. Song, S.U., Gerasimova, T., Kurkulos, M., Boeke, J.D., and Corces, V.G., Genes Dev., 1994, vol. 8, pp. 2046–2057.
78. Malik, H.S., Burke, W.D., and Eickbush, T.H., Mol. Biol. Evol., 1999, vol. 16, pp. 793–805.
79. Kordis, D. and Gubensek, F., Gene, 1999, vol. 238, pp. 171–178.
80. Flavell, A.J., Smith, D.B., and Kumar, A., Mol. Gen. Genet., 1992, vol. 231, pp. 233–242.
81. Voytas, D.F., Cummings, M.P., Konieczny, A., Ausubel, F.M., and Rodermel, S.R., Proc. Natl. Acad. Sci. USA, 1992, vol. 89, pp. 7124–7128.
82. Wichman, H. and Van Den Bussche, R.A., BioTechniques, 1992, vol. 13, pp. 258–264.
83. Robertson, H.M., Nature, 1993, vol. 362, pp. 241–245.
84. Poch, O., Sauvaget, I., Delarue, M., and Tordo, N., EMBO J., 1989, vol. 8, pp. 3867–3874.
85. Doak, T.G., Doerder, F.P., Jahn, C.L., and Herrick, G., Proc. Natl. Acad. Sci. USA, 1994, vol. 91, pp. 942–946.
86. Arkhipova, I. and Meselson, M., Proc. Natl. Acad. Sci. USA, 2000, vol. 97, pp. 14473–14477.
87. Miller, K., Lynch, C., Martin, J., Herniou, E., and Tristem, M., J. Mol. Evol., 1999, vol. 49, pp. 358–366.
88. Malik, H.S. and Eickbush, T.H., J. Virol., 1999, vol. 73, pp. 5186–5190.
89. The C. elegans Sequencing Consortium, Science, 1998, vol. 282, pp. 2012–2018.
90. Adams, M.D. et al., Science, 2000, vol. 287, pp. 2185–2195.
91. Robertson, H.M., J. Hered., 1997, vol. 88, pp. 195–201.
92. Hickey, D.A., Genetics, 1982, vol. 101, pp. 519–531.
93. Mark Welch, D. and Meselson, M., Science, 2000, vol. 288, pp. 1211–1215.
94. Zeyl, C. and Bell, G., Trends Ecol. Evol., 1996, vol. 11, pp. 10–15.
95. Kondrashov, A.S., J. Hered., 1993, vol. 84, pp. 372–387.
96. Crow, J.F., Dev. Genet., 1994, vol. 15, pp. 205–213.
97. Muller, H.J., Mutat. Res., 1964, vol. 1, pp. 2–9.
98. Pardue, M.L., Danilevskaya, O.N., Lowenhaupt, K., Slot, F., and Traverse, K.L., Trends Genet., 1996, vol. 12, pp. 48–52.
99. Judson, O.P. and Normark, B.B., Trends Ecol. Evol., 1996, vol. 11, pp. 41–45.
100. Hull, C.M., Raisner, R.M., and Johnson, A.D., Science, 2000, vol. 289, pp. 307–310.
101. Magee, B.B. and Magee, P.T., Science, 2000, vol. 289, pp. 310–313.
102. Hani, J. and Feldmann, H., Nucleic Acids Res., 1998, vol. 26, pp. 689–696.
103. Kim, J.M., Vanguri, S., Boeke, J.D., Gabriel, A., and Voytas, D.F., Genome Res., 1998, vol. 8, pp. 464–478.
104. Goodwin, T.J.D. and Poulter, R.T.M., Genome Res., 2000, vol. 10, pp. 174–191.
105. Bestor, T.H., Sex Brings Transposons and Genomes into Conflict, Proc. of the Georgia Genetics Symp. “Transposable Elements and Evolution,” Atlanta, GA, 1999.
106. Temin, H.M., Cell, 1980, vol. 21, pp. 599–600.