Copyright 2003 by the Genetics Society of America
Continuous Exchange of Sequence Information Between Dispersed Tc1 Transposons in the Caenorhabditis elegans Genome Sylvia E. J. Fischer, Erno Wienholds and Ronald H. A. Plasterk1 Hubrecht Laboratory, Center for Biomedical Genetics, 3584 CT Utrecht, The Netherlands Manuscript received October 1, 2002 Accepted for publication January 14, 2003 ABSTRACT In a genome-wide analysis of the active transposons in Caenorhabditis elegans we determined the localization and sequence of all copies of each of the six active transposon families. Most copies of the most active transposons, Tc1 and Tc3, are intact but individually have a unique sequence, because of unique patterns of single-nucleotide polymorphisms. The sequence of each of the 32 Tc1 elements is invariant in the C. elegans strain N2, which has no germline transposition. However, at the same 32 Tc1 loci in strains with germline transposition, Tc1 elements can acquire the sequence of Tc1 elements elsewhere in the N2 genome or a chimeric sequence derived from two dispersed Tc1 elements. We hypothesize that during double-strand-break repair after Tc1 excision, the template for repair can switch from the Tc1 element on the sister chromatid or homologous chromosome to a Tc1 copy elsewhere in the genome. Thus, the population of active transposable elements in C. elegans is highly dynamic because of a continuous exchange of sequence information between individual copies, potentially allowing a higher evolution rate than that found in endogenous genes.
W
ITH the availability of complete genome sequences, the extent of repetitive DNA can be analyzed in detail. Half of the human genome consists of transposon-derived repeats, of which 7% are remnants of class II transposons, transposons that move as DNA. None of these transposons are thought to be active at present in either the human or the mouse genomes (International Human Genome Sequencing Consortium 2001). The genome of the nematode Caenorhabditis elegans has at least six families of active DNA transposons, Tc1–5 and Tc7 (Moerman and Waterston 1984; Eide and Anderson 1985; Collins and Collins et al. 1989; Yuan et al. 1991; Ruvolo et al. 1992; Anderson 1994; Rezsohazy et al. 1997). Tc1 and Tc3 have been extensively studied (van Luenen et al. 1994; Vos et al. 1996). They are members of the widespread Tc1/mariner superfamily of transposons (Robertson 1995). Movement of these transposons requires the action of one protein, the element-encoded transposase. The catalytic domain of this protein contains the conserved DDE/D motif (Doak et al. 1994), which is required for activity. The transposase protein binds specifically to the terminal inverted repeats of the transposon, cuts the transposon out of its original location, and pastes it into new site. The double-strand break is repaired via either nonhomologous end joining (van Luenen et al. 1994) or homologous recombination, using the sister chromatid or the homologous chromosome as a template (Engels et
1 Corresponding author: Center for Biomedical Genetics, Hubrecht Laboratory, Uppsalalaan 8, 3584 CT Utrecht, The Netherlands. E-mail:
[email protected]
Genetics 164: 127–134 (May 2003)
al. 1990; Plasterk 1991). The simple requirements for transposition of this family of transposons have allowed for successful introduction and mobilization of this type of element in other systems like the zebrafish (Fadool et al. 1998; Raz et al. 1998) and the germline of the mouse (Dupuy et al. 2001; Fischer et al. 2001; Horie et al. 2001). Tc7 is a nonautonomous element (Oosumi et al. 1996); it does not code for its own transposase protein but can be mobilized by Tc1 transposase (Rezsohazy et al. 1997). Tc2 (Ruvolo et al. 1992), Tc4v (Li and Shaw 1993), and Tc5 all have terminal inverted repeats and each contains an element-specific transposase gene. The similarity of these transposase proteins to those of pogo-like elements classifies them into the extended IS630-Tc1-mariner family (Smit and Riggs 1996; Plasterk et al. 1999). Tc4 does not contain an open reading frame and is presumably mobilized by Tc4v transposase. Although somatic Tc1 transposition (Emmons and Yesner 1984) occurs in all C. elegans strains, including the sequenced strain N2 (Harris and Rose 1986), germline activity of these six transposons is regulated in C. elegans (Collins et al. 1987) and requires the presence of a mutation like mut-7(pk204) (Ketting et al. 1999). This mutator mutation activates not only Tc1 and Tc3, members of the Tc1/mariner family, but also Tc4, Tc5, and Tc7. Many Tc1/mariner elements isolated from natural populations are nonautonomous because of insertions, deletions, and also missense mutations. We analyzed all active transposons in C. elegans and show that most elements are intact and that almost every copy has a
128
S. E. J. Fischer, E. Wienholds and R. H. A. Plasterk
unique sequence defined by single-nucleotide polymorphisms (SNPs). In strains that show transposition activity in the germline, we find that Tc1 transposition contributes to genome variation not only by insertion, but also through a continuous exchange of sequence information between dispersed Tc1 copies, presumably as a consequence of template switching during repair. The Tc1 family in C. elegans is rapidly evolving in strains proficient for germline transposition because of the formation of chimeric elements and the occurrence of internal deletions resulting from incomplete repair. MATERIALS AND METHODS Transposon mining: Transposons in the C. elegans genome were identified by BLAST (Altschul et al. 1990, 1997) searches using the Sanger Institute C. elegans blast server. Multiple sequence alignments were done using ClustalX1.81 (Thompson et al. 1994). Neighbor-joining trees were created using ClustalX1.81 and viewed using TreeviewPPC (Page 1996). Statistics: To test the randomness of the distribution profile of the transposons over the chromosomes and of the chromosomal origin of the Tc1 elements inserted into unc-22, chisquare goodness-of-fit tests were performed. As a random distribution a linear correlation between the chromosome size and the number of transposons or between the number of transposons on the chromosome and the number of transposons originating from that chromosome was used. C. elegans strains and culture: C. elegans strains used in this study are (Bristol) N2, NL7 [unc-54::Tc1 (r323); Ketting et al. 1999], NL917 [mut-7(pk204); Ketting et al. 1999], and NL3115 [mut-7(pk204)]. Growth conditions were as described (Sulston and Hodgkin 1987). Independent transposon insertions into unc-22 were derived from NL917 and the mutants were outcrossed once to N2. Analysis of unc-22 insertions: C. elegans genomic DNA was isolated by SDS/proteinase K treatment, phenol/chloroform extraction, and isopropanol precipitation. The nature of the transposons inserted into the unc-22 gene and the exact insertion sites were determined by Southern blotting (Sambrook et al. 1989) and transposon displays (Wicks et al. 2000). Cosmids ZK617 and C18D3 were used as probes for unc-22 and PCR fragments as probes for Tc4 and Tc6. The genomic location of additional Tc1 elements present in NL7 and NL917 was determined using transposon displays. The number of Tc1 elements in NL917 and NL7 was confirmed by Southern blotting. Sequencing: All Tc1 elements present in N2, NL7, NL917, NL3115, eight mut-7 strains derived from NL3115, and NL666 [mut-6(st702)] were sequenced from PCR products. Two PCR products were made per Tc1 locus using nested primers in flanking sequences and nested primers within Tc1. The Tc1 elements that had inserted into unc-22 were sequenced similarly. The sequencing reactions were done using the ABI PRISM Big Dye Terminator cycle sequencing kit and were analyzed on an ABI 377 or an ABI 3700 DNA analyzer. Primer sequences are available upon request. RESULTS
Genomic distribution and copy number of active C. elegans transposons: Six families of DNA transposons are known to be active in C. elegans, Tc1–Tc5 and Tc7.
TABLE 1 Copy numbers of active and newly identified DNA transposons in the C. elegans strain N2
Transposon Tc1 Tc2a Tc3 Tc4a Tc4v Tc5 Tc7a Tc3-CeIIa Tc3-CeIIb Tc9 Tc10
Copy number in N2
Intragenic localization
32 4 22 5 5 4 12 10 3 1 (⫹15) 3 (⫹15)
3 2 2 0 0 0 1 1 1 1 1
Blast searches were done using the published sequences. The cosmid or YAC clone names that contain the transposons can be found in the supplemental information at http:// www.genetics.org/supplemental/. Tc4 and Tc7 do not encode transposase proteins and are presumably mobilized by, respectively, the Tc4v and Tc1 transposase proteins. Similarly, for Tc9 and Tc10 a number of presumable nonautonomous copies were identified (copy number in parentheses). Activity of Tc6 (Dreyfus and Emmons 1991), Tc8 (Le et al. 2001), or mariner-like elements (Sedensky et al. 1994) has not been demonstrated. a Tc2, Tc4, and Tc7 have, besides the indicated copies, more highly diverged members (data not shown; Rezsohazy et al. 1997).
In the sequenced strain N2 these transposons are not active in the germline. However, they do show somatic activity or they can be activated in the germline by mutation of a single gene in the N2 strain (Ketting et al. 1999). A genomic survey of all active transposons was done to determine their chromosomal distribution and to analyze the degree of sequence heterogeneity within transposon families. The sequences of the complete transposons, the transposase proteins, and the terminal inverted repeats were used to screen the genomic sequence of the N2 strain. A total of 84 copies of seven different transposons that are highly similar to the published sequences (Table 1; Tc1–7) were identified. Most of the transposons are found in intergenic regions; only 10% of the transposons were found within introns of predicted genes although intronic sequence constitutes 26% of the genome sequence (C. elegans Sequencing Consortium 1998). The chromosomal distribution of the transposons over the six chromosomes (Figure 1A) deviates significantly (d.f. ⫽ 5, 2 ⫽ 19.9) from a random distribution among the chromosomes. Most notably, chromosome III has relatively few transposons. Other repetitive elements in C. elegans (Surzycki and Belknap 2000; Ganko et al. 2001) do not show this pattern. Intrachromosomally, relatively more transposable elements are found on the autosomal arms than in the middle third of the chromosome, which is relatively
Tc1 Transposons Exchange SNPs in C. elegans
129
Figure 1.—Computational analysis of active transposons in C. elegans. (A) Genomic distribution of all active transposable elements in the C. elegans genome. The positions of the transposons on the physical map are based on the positions of the clones annotated in Wormbase. The sizes of the chromosomes are in megabases. The transposons are color coded as follows: Tc1, blue; Tc2, green; Tc3, red; Tc4, white; Tc4v, black; Tc5, light blue; and Tc7, yellow. (B) Phylogenetic relationship between C. elegans transposase proteins and the related transposases of pogo, Mos1, and Sleeping Beauty. Tc3-CeIIb is not included because it is presumed to be inactive on the basis of sequence information; Tc4 and Tc7 were not included because these elements do not encode transposase proteins. The catalytic domains of the transposases were aligned and a phylogram was constructed using the neighbor-joining method. The number of bootstrap replications out of 1000 confirming the branch point is indicated.
gene rich. However, the clustering on the relatively gene-poor ends is not as striking as what has been observed for some other repetitive elements (C. elegans Sequencing Consortium 1998; Surzycki and Belknap 2000). Identification and activity of new transposons related to Tc3 and Tc4: Analysis of the C. elegans genome sequence revealed several new transposons (Table 1; Tc3CeIIa, Tc3-CeIIb, Tc9, and Tc10). Phylogenetic analysis of the catalytic domains of the predicted transposase proteins shows that these elements are related to Tc3 and Tc4 (Figure 1B). Tc3-CeIIa has recently been described (Tu and Shao 2002) and its transposase is highly similar (74.8%) to Tc3 transposase. Of the 10 copies of Tc3-CeIIa in the genome, 3 encode full-length transposase proteins. The related element Tc3-CeIIb is present in 3 copies, of which only 1 encodes a full-length transposase. This transposase protein is probably inactive because of a missense mutation in the catalytic triad DDE. The Tc3 family has a very high sequence identity between individual copies, indicative of their present activity. The Tc3-CeII family is less conserved and activity of Tc3-CeIIa in a mut-7 mutator strain could not be detected (data not shown). Two additional full-length transposon families related to Tc4(v) were identified, Tc9 (1 copy) and Tc10 (3 copies). Of both these transposons, only 1 copy is predicted to encode a full-length transposase with 65% similarity to Tc4v transposase and each contains a DD(37)D motif. However, for both Tc9 and Tc10, 15 copies of a smaller 1.6-kb transposon were found, which have terminal inverted repeats that are nearly identical to those of Tc9 or Tc10, respectively, but do not encode
a transposase. Possibly, similar to Tc4 and Tc4v, these elements may have been mobilized by Tc9 and Tc10 transposase. These elements may no longer be active: activity in mut-7 strains could not be detected (data not shown). Interestingly, the inverted repeats of 15 Tc4, Tc4v, and Tc9 elements contain a 73-bp tRNA-Lys pseudogene, accounting for 24 Lys-CTT type tRNA pseudogenes predicted in the C. elegans genome. The functional significance of these sequences is unclear. tRNA genes have been found associated with repetitive elements before, but mainly with retrotransposons like mammalian SINEs (Daniels and Deininger 1985) or C. elegans Cer7 (Frame et al. 2001), where they could function in priming the reverse transcription reaction. Individual Tc1 and Tc3 elements almost all have a unique sequence: Most Tc1/mariner elements that have been identified in a wide variety of organisms are inactive as a consequence of deletions, insertions, and substitutions (Robertson 1993; Feschotte and Wessler 2002). The sequences of the most abundant and active transposons in C. elegans, Tc1 and Tc3, were aligned to determine the extent of sequence heterogeneity between individual copies. Out of 22 Tc3 copies, 3 have internal deletions, and 4 do not encode a full-length transposase. Thirty-six SNPs are found within the Tc3 sequence, many of which are shared between different Tc3 copies. However, all Tc3 elements have a unique combination of SNPs. The N2 genome contains 31 Tc1 elements with two terminal inverted repeats and 1 element that misses an inverted repeat (Figure 2). Four copies have internal deletions and 1 has a 55-bp insertion. The remainder of the Tc1 copies are full length, including 3 consensus elements. However, 73 SNPs are
130
S. E. J. Fischer, E. Wienholds and R. H. A. Plasterk
Figure 2.—Alignment of all Tc1 elements present in the N2 genome. The names of the clones on which the Tc1 elements are located and the chromosomes they map to are specified. Mutations are indicated by solid boxes above each line representing a Tc1 copy. The nature of the mutations [deletions (⌬) and insertions (ⵜ), above the line; missense (䉬) and nonsense (*) mutations, below the line] and their nucleotide positions within the Tc1 sequence are indicated. The structure of Tc1, including the inverted repeats (boxes) and the open reading frame, is drawn schematically.
found within the Tc1 sequence; 30 of these are missense mutations, and 5 are nonsense mutations, resulting in 25 copies encoding 16 different polymorphic transposase proteins. Almost all of the SNPs are unique; i.e., they are not found in multiple Tc1 copies. Therefore almost all copies of Tc1 in the N2 genome are unique and can be identified by their sequence. Three out of 32 copies are not unique and share an identical SNP pattern (T21B4, F35E8, and C31A11). The observation that 5 (G557T, G991T, G1107A, C1413T, and T1514A) of the 6 other SNPs that are shared between different Tc1 copies are found on elements that are close together on the same chromosome is suggestive of local transposition. These elements can, however, be distinguished from each other by additional unique SNPs. Donor site analysis of Tc1 transposition into the target gene unc-22: The Tc1-like element Sleeping Beauty has
Figure 3.—Tc1 transposition into the target gene unc-22. (A) Schematic representation of the intron-exon structure of the unc-22(IV) gene and the transposon insertions that were identified. (B) The 53 Tc1 elements present in the mut-7 mutator strain NL917 used in the screen for unc-22 mutants are indicated on the C. elegans physical map. The unc-22 gene is located at position 12.0 Mb on chromosome IV. Arrows indicate the origin of transposons that had inserted into unc-22.
a strong preference (83%) for reintegration into the chromosome of origin in the mouse germline (Fischer et al. 2001). In these Sleeping Beauty local hopping experiments, multiple target sites of a single transposon copy were analyzed. We now wanted to analyze which transposon copies from a set of multiple dispersed transposon copies would integrate into one specific target site. Tc1 elements were trapped into a target gene, and, on the basis of the Tc1 sequence, the donor site of the inserted elements was deduced. The muscle gene unc-22 was used as a target gene: this large gene was previously shown to be a good target for transposon insertion (Moerman et al. 1986), and nematodes with unc-22 mutations can be easily identified by their twitching phenotype. unc-22 mutants were derived from the mut-7 mutator strain NL917, which is proficient for germline transposition. The nature of the insertions was determined using transposon insertion displays (Wicks et al. 2000) and Southern blotting. Out of 35 unc-22 mutants, 26 had Tc1 insertions, 2 had Tc3 insertions, 1 had a Tc4 insertion, and 1 a Tc4v insertion (Figure 3A). Two mutants have an unknown insertion, and 3 mutants had no restriction fragment length polymor-
Tc1 Transposons Exchange SNPs in C. elegans
phism: possibly the transposon excised shortly after inserting, leaving a short footprint. Since the unc-22 mutants were derived from a mutator strain, the number and positions of Tc1 elements in its genome were not identical to those in the N2 genome. Therefore, we determined the position and sequence of all Tc1 elements in the mut-7 strain, from which the unc-22 mutants were isolated (Figure 3B); of the 53 Tc1 copies identified, 35 had a unique sequence. Sequence analysis of the 26 Tc1 insertions into unc-22 showed that 15 could be traced back to a chromosome. The remaining 11 Tc1 elements were not unique and could have come from several different chromosomes. Eight elements came from chromosome IV, where unc-22 is located, while 7 came from other chromosomes. The distribution pattern of the chromosomes of origin of the Tc1 transposons shows that multiple different elements, from dispersed genomic locations, can insert into a specific target site. Furthermore, the distribution pattern is significantly different from a random distribution (d.f. ⫽ 5, 2 ⫽ 12.26). This distribution pattern might be the result of the difference in proximity of Tc1 elements to the target gene (local hopping), combined with a difference in activity (i.e., how prone an element is to mobilization) of Tc1 elements: 2 elements, 1 on chromosome IV and 1 on chromosome V, account for more than half of the insertions into unc-22, suggesting that these 2 elements are relatively prone to mobilization. This difference in excision and/or reintegration capability between elements might be due to the sequence differences between individual elements described in this article. Alternatively, local chromatin structure might influence the ability of a transposon to be excised. Chimeric Tc1 copies arise in situ in mutator strains: Of the 53 Tc1 copies identified in the mutator strain NL917, 31 are located at exactly the same sites as the 31 full-length Tc1 copies in the N2 background. However, the sequence of 2 of the 31 copies differed from the N2 sequence (C28F5 and C31A11 in Figure 4A); 1 of these had a new combination of SNPs that are found in dispersed Tc1 copies in the N2 genome. To rule out that recombination between dispersed Tc1 copies occurs in the N2 background, we sequenced the 31 Tc1 copies in the N2 strain used in our lab, which was maintained isolated for many hundreds of generations from the N2 strain that was sequenced by the C. elegans Sequencing Consortium. No differences were found in our N2 strain compared to the Tc1 elements in the published genomic sequence. All Tc1 elements in the strain NL7 from which the strain NL917 was derived after EMS mutagenesis (Ketting et al. 1999) were sequenced as well. Also in this strain, the Tc1 elements at the two N2 loci did not have the SNP patterns of NL917, and the new SNP patterns observed in NL917 were not present anywhere in the genome of the nonmutator strain NL7.
131
Figure 4.—Exchange of sequence between dispersed Tc1 elements in C. elegans mutator strains. (A) Examples of chimeric Tc1 elements identified in mut-7 strains. The chimeric Tc1 elements at loci C28F5 and C31A11 were found in the mut-7 strain NL917. The other four elements were identified in mut-7 strains that were grown clonally for 30 generations. The Tc1 elements were sequenced in the starting strain (black transposon above the arrow) and after 30 generations (black transposon below the arrow). The red and blue transposons indicate transposons elsewhere in the genome that could have served (indicated with “e.g.”) or that are highly likely to have served as templates for repair after a template switch. The chromosomes on which the transposons are located are indicated in parentheses. Since the flanking sequences were unchanged, in all cases the repair process presumably started in the flanking regions using an allelic template and switched to an ectopic template. (B) The presence of certain SNPs in the left, the right, or both terminal inverted repeats of dispersed Tc3 copies suggests that sequence exchange could have occurred between Tc3 copies. Indicated in blue and red are two SNPs that can be found in the left inverted repeat in one Tc3 copy and/or at the same position in reverse complement sequence of the right inverted repeat in another Tc3 copy. For example, SNP C116A is found in the left inverted repeat of Tc3-F11D11 and in both the left and the right inverted repeats of Tc3ZC239. The inverted repeats are drawn as solid boxes. Other polymorphisms are indicated in black. A schematic representation of Tc3 shows the direction of the transposase open reading frame.
To determine whether sequence exchange between Tc1 elements is dependent on the mutator activity and is not a consequence of double-strand breaks induced by the EMS treatment, the Tc1 elements were followed through 30 generations in a mut-7 mutator background.
132
S. E. J. Fischer, E. Wienholds and R. H. A. Plasterk
First, the mut-7 mutation was extensively outcrossed to N2 to obtain a strain, NL3115, that has a Tc1 pattern similar to that of the N2 strain (verified by sequence analysis of all Tc1 copies). Eight C. elegans lines derived from this strain were then grown clonally for 30 generations, and the Tc1 elements at the N2 loci were again sequenced. Of the 25 Tc1 elements sequenced in eight strains that were grown for 30 generations, 12 had a sequence different from the N2 element at that location (4 are shown in Figure 4A), a frequency of 0.2% per element per generation. Five of these elements were chimeric elements with a new combination of SNPs found in dispersed elements in N2 and 4 had new deletions ranging in size between 330 and 1047 bp, indicative of incomplete repair. Three elements had the sequence of an element elsewhere in the N2 genome. In addition, also in the mut-6 mutator background (Mori et al. 1988), Tc1 elements with new SNP combinations (3 out of 25 elements sequenced) were found at N2 loci (data not shown). Double-strand breaks caused by transposon excision are repaired via nonhomologous end joining (van Luenen et al. 1994; Fischer et al. 2001) or via homologous recombination (gene conversion; Plasterk 1991). It was shown previously that, as a template for homologous recombination in C. elegans, an ectopic template can be used that contains Tc1 with flanking sequences highly homologous to those of the excised Tc1 element (Plasterk and Groenen 1992). A likely explanation for the occurrence of mosaic Tc1 elements is the following: Tc1 excises and the repair process initiates using the homologous flanks of the Tc1 element on the sister chromatid or homologous chromosome as a template; after repair has extended into the Tc1 sequence, template switching occurs to a Tc1 element elsewhere in the genome. This results in a Tc1 element that either has the SNP pattern of another element or is a chimera of two or more different elements. In some cases the repair process is disrupted, resulting in a partial deletion of the Tc1 element. Similarly, partial deletions of transposon sequences were also found to occur in conjunction with P-element and mariner excision (Engels et al. 1990; Lohe et al. 2000). Engels et al. (1994) have shown that, during recombinational repair in Drosophila, preferentially a template in cis (located on the same chromosome) is used. For two of the eight Tc1 elements that presumably resulted from template switching the secondary template could be determined because a unique SNP was introduced (F18C5 and T21B4 in Figure 4A). In these cases a template located on a chromosome other than the chromosome containing the double-strand break was used. An indication that template switching can occur also after Tc3 and Tc4 excision comes from the observation that in the N2 strain certain SNPs in the terminal inverted repeat (TIR) are, in some elements, found in the left TIR (with respect to the open reading frame), in some in the right TIR, and in some in both TIRs
(Figure 4B). Possibly, while the repair machinery is extending from the flanking sequence into the long terminal inverted repeats, the template can be switched to another Tc3 element that is aligned in the inverse orientation. This would result in inversion of the open reading frame with respect to the SNPs in the TIRs. DISCUSSION
Chromosomal distribution and local hopping: The families of active transposons in C. elegans are highly dynamic and rapidly evolving. Their localization on the gene-poor autosomal arms and, on a finer scale, in intergenic regions, might reflect the—for the host— deleterious effects of insertion into coding regions, which could disrupt gene function. On the other hand, it might reflect a preference for integration into DNA of a certain GC content or into certain DNA structures, e.g., the highly repetitive sequences on the autosomal arms that may form centromere-like domains. Seven different Tc1 elements, originating from four chromosomes, were trapped into the target gene unc22. Local transposition has been observed for several DNA transposons (Tower et al. 1993; Machida et al. 1997) and probably reflects the higher probability of encountering DNA of the same chromosome because it is physically closer to the excision site. However, strong differences in activity between individual Tc1 copies might obscure this effect when tracing multiple different transposons for their integration efficiency into a specific target site. Tc1 and Tc3 elements are polymorphic: Many Tc1/ mariner elements isolated from natural populations are nonautonomous because of insertions, deletions, and also missense mutations. Several models have been proposed to explain the abundance of defective elements (Hartl et al. 1997): Tc1/mariner transposons are not selected for function and therefore are expected to be short lived; the ability to be mobilized in trans extends their short life span. Furthermore, there might be a positive selection pressure for defective elements, because they could downregulate transposition, which is favorable for the host. Experiments using the mariner transposon indeed showed that a large proportion of transposons with missense mutations produce inactive transposases that downregulate the activity of wild-type transposase (Lohe et al. 1997). The majority of Tc1 and Tc3 elements in the sequenced N2 genome are full length but do contain a large number of unique SNPs. Because of these SNPs, the Tc1 and Tc3 elements might encode transposase proteins that have a varying degree of activity. At least one polymorphic version of Tc3 transposase (the consensus element) is more active than the element that was used in in vitro experiments (Fischer et al. 2001). Also, mutations in the terminal inverted repeats can affect the susceptibility of the transposon to mobilization. Lampe et al. (2001) showed that a small divergence in inverted repeat sequence of two related
Tc1 Transposons Exchange SNPs in C. elegans
mariner transposons had a dramatic effect on the efficiency of mobilization by the related transposase protein. Four Tc1 elements in the N2 genome have mutations (with respect to the consensus sequence) in the terminal inverted repeats. Chimerization of transposon copies may contribute to rapid evolution: The exchange of Tc1 sequences with sequences of other Tc1 copies or with sequences of chimeric Tc1 elements is probably a consequence of double-strand-break repair after transposon excision. The conclusion that it is dependent on transposon excision is supported by the observation that chimeras do not arise in nonmutator strains. To start the repair process, the repair machinery uses the sequences flanking the double-strand break, which are in this case the sequences flanking Tc1, on an allelic template; once the machinery has reached Tc1 sequences it may switch to another Tc1 template. Recombinational repair of double-strand breaks has been proposed to occur via the synthesis-dependent-strand-annealing (SDSA) model (Nassif et al. 1994) or similar models or via the model originally proposed by Szostak et al. (1983) for yeast meiotic recombination. One important difference between the two models is that the resolution of Holliday junctions in the model by Szostak would lead to chromosome rearrangements (because of crossovers) and heteroduplex products; in the SDSA model, this would not occur. We have not tested for chromosome rearrangements at any of the chimeric Tc1 elements that were identified. In Drosophila, the vast majority of recombinational repair, e.g., after P-element excision, is consistent with the SDSA model (reviewed in Lankenau and Gloor 1998). The population of Tc1 elements in C. elegans can be viewed as a highly dynamic entity in which transposon copies are formed that carry new combinations of SNPs. The number of possible new SNP combinations and, thus, the number of different Tc1 elements that can be formed is large. This allows for rapid evolution and diversification within the transposon family. If SNPs that alter the transposase binding site in the transposon are combined with SNPs that change the DNA-binding domain in the transposase so that it can bind the new binding site, a new subfamily arises. The Tc3 and Tc3CeII families could be the result of such diversification. We are grateful to Sander van der Linden for providing strains and to Henri van Luenen and Rene´ Ketting for help and advice. We thank Stephen Wicks and Rene´ Ketting for comments on the manuscript. We thank Alan Coulson (The Sanger Institute) for providing cosmids and the Caenorhabditis Genetics Center for providing some strains. This research has been financially supported by the Council for Chemical Sciences of the Netherlands Organization for Scientific Research (CWNWO).
LITERATURE CITED Altschul, S. F., W. Gish, W. Miller, E. W. Myers and D. J. Lipman, 1990 Basic local alignment search tool. J. Mol. Biol. 215: 403– 410.
133
Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang et al., 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25: 3389– 3402. C. elegans Sequencing Consortium, 1998 Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282: 2012–2018. Collins, J., and P. Anderson, 1994 The Tc5 family of transposable elements in Caenorhabditis elegans. Genetics 137: 771–781. Collins, J., B. Saari and P. Anderson, 1987 Activation of a transposable element in the germ-line but not the soma of Caenorhabditis elegans. Nature 328: 726–728. Collins, J., E. Forbes and P. Anderson, 1989 The Tc3 family of transposable elements in Caenorhabditis elegans. Genetics 121: 47–55. Daniels, G. R., and P. L. Deininger, 1985 Repeat sequence families derived from mammalian tRNA genes. Nature 317: 819–822. Doak, T. G., F. P. Doerder, C. L. Jahn and G. Herrick, 1994 A proposed superfamily of transposase genes: transposon-like elements in ciliated protozoa and a common “D35E” motif. Proc. Natl. Acad. Sci. USA 91: 942–946. Dreyfus, D. H., and S. W. Emmons, 1991 A transposon-related palindromic repetitive sequence from C. elegans. Nucleic Acids Res. 19: 1871–1877. Dupuy, A. J., S. Fritz and D. A. Largaespada, 2001 Transposition and gene disruption in the male germline of the mouse. Genesis 30: 82–88. Eide, D., and P. Anderson, 1985 Transposition of Tc1 in the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA 82: 1756– 1760. Emmons, S. W., and L. Yesner, 1984 High-frequency excision of transposable element Tc1 in the nematode Caenorhabditis elegans is limited to somatic cells. Cell 36: 599–605. Engels, W. R., D. M. Johnson-Schlitz, W. B. Eggleston and J. Sved, 1990 High-frequency P element loss in Drosophila is homolog dependent. Cell 62: 515–525. Engels, W. R., C. R. Preston and D. M. Johnson-Schlitz, 1994 Long-range cis preference in DNA homology search over the length of a Drosophila chromosome. Science 263: 1623–1625. Fadool, J. M., D. Hartl and J. E. Dowling, 1998 Transposition of the mariner element from Drosophila mauritania in zebrafish. Proc. Natl. Acad. Sci. USA 95: 5182–5186. Feschotte, C., and S. R. Wessler, 2002 Mariner-like transposases are widespread and diverse in flowering plants. Proc. Natl. Acad. Sci. USA 99: 280–285. Fischer, S. E. J., E. Wienholds and R. H. A. Plasterk, 2001 Regulated transposition of a fish transposon in the mouse germ line. Proc. Natl. Acad. Sci. USA 98: 6759–6764. Frame, I. G., J. F. Cutfield and R. T. Poulter, 2001 New BEL-like LTR-retrotransposons in Fugu rubripes, Caenorhabditis elegans, and Drosophila melanogaster. Gene 263: 219–230. Ganko, E. W., K. T. Fielman and J. F. McDonald, 2001 Evolutionary history of Cer elements and their impact on the C. elegans genome. Genome Res. 11: 2066–2074. Harris, L. J., and A. M. Rose, 1986 Somatic excision of transposable element Tc1 from the Bristol genome of Caenorhabditis elegans. Mol. Cell. Biol. 6: 1782–1786. Hartl, D. L., E. R. Lozovskaya, D. I. Nurminsky and A. R. Lohe, 1997 What restricts the activity of mariner-like transposable elements. Trends Genet. 13: 197–201. Horie, K., A. Kuroiwa, M. Ikawa, M. Okabe, G. Kondoh et al., 2001 Efficient chromosomal transposition of a Tc1/marinerlike transposon Sleeping Beauty in mice. Proc. Natl. Acad. Sci. USA 98: 9191–9196. International Human Genome Sequencing Consortium, 2001 Initial sequencing and analysis of the human genome. Nature 409: 860–921. Ketting, R. F., T. H. Haverkamp, H. G. van Luenen and R. H. Plasterk, 1999 Mut-7 of C. elegans, required for transposon silencing and RNA interference, is a homolog of Werner syndrome helicase and RNaseD. Cell 99: 133–141. Lampe, D. J., K. K. Walden and H. M. Robertson, 2001 Loss of transposase-DNA interaction may underlie the divergence of mariner family transposable elements and the ability of more than one mariner to occupy the same genome. Mol. Biol. Evol. 18: 954–961. Lankenau, D. H., and G. B. Gloor, 1998 In vivo gap repair in
134
S. E. J. Fischer, E. Wienholds and R. H. A. Plasterk
Drosophila: a one-way street with many destinations. Bioessays 20: 317–327. Le, Q. H., K. Turcotte and T. Bureau, 2001 Tc8, a Tourist-like transposon in Caenorhabditis elegans. Genetics 158: 1081–1088. Li, W., and J. E. Shaw, 1993 A variant Tc4 transposable element in the nematode C. elegans could encode a novel protein. Nucleic Acids Res. 21: 59–67. Lohe, A. R., D. De Aguiar and D. L. Hartl, 1997 Mutations in the mariner transposase: the D,D(35)E consensus sequence is nonfunctional. Proc. Natl. Acad. Sci. USA 94: 1293–1297. Lohe, A. R., C. Timmons, I. Beerman, E. R. Lozovskaysa and D. L. Hartl, 2000 Self-inflicted wounds, template-directed gap repair and a recombination hotspot: effect of the mariner transposase. Genetics 154: 647–656. Machida, C., H. Onouchi, J. Koizumi, S. Hamada, E. Semiarti et al., 1997 Characterization of the transposition pattern of the Ac element in Arabidopsis thaliana using endonuclease I-SceI. Proc. Natl. Acad. Sci. USA 94: 8675–8680. Moerman, D. G., and R. H. Waterston, 1984 Spontaneous unstable unc-22 IV mutations in C. elegans var. Bergerac. Genetics 108: 859–877. Moerman, D. G., G. M. Benian and R. H. Waterston, 1986 Molecular cloning of the muscle gene unc-22 in Caenorhabditis elegans by Tc1 transposon tagging. Proc. Natl. Acad. Sci. USA 83: 2579–2583. Mori, I., D. G. Moerman and R. H. Waterston, 1988 Analysis of a mutator activity necessary for germ-line transposition and excision of Tc1 transposable elements in Caenorhabditis elegans. Genetics 120: 397–407. Nassif, N., J. Penney, S. Pal, W. R. Engels and G. B. Gloor, 1994 Efficient copying of nonhomologous sequences from ectopic sites via P-element-induced gap repair. Mol. Cell. Biol. 14: 1613–1625. Oosumi, T., B. Garlick and W. R. Belknap, 1996 Identification of putative nonautonomous transposable elements associated with several transposon families in Caenorhabditis elegans. J. Mol. Evol. 43: 11–18. Page, R. D., 1996 TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12: 357–358. Plasterk, R. H., 1991 The origin of footprints of the Tc1 transposon of Caenorhabditis elegans. EMBO J. 10: 1919–1925. Plasterk, R. H., Z. Izsvak and Z. Ivics, 1999 Resident aliens: the Tc1/mariner superfamily of transposable elements. Trends Genet. 15: 326–332. Plasterk, R. H. A., and J. T. M. Groenen, 1992 Targeted alterations of the Caenorhabditis elegans genome by transgene instructed DNA double strand break repair following Tc1 excision. EMBO J. 11: 287–290. Raz, E., H. G. A. M. van Luenen, B. Schaerringer, R. H. A. Plasterk and W. Driever, 1998 Transposition of the nematode Caenorhabditis elegans Tc3 element in the zebrafish Danio rerio. Curr. Biol. 8: 82–88. Rezsohazy, R., H. G. A. M. van Luenen, R. M. Durbin and R. H. A.
Plasterk, 1997 Tc7, a Tc1 hitch hiking transposon in Caenorhabditis elegans. Nucleic Acids Res. 25: 4048–4054. Robertson, H. M., 1993 The mariner transposable element is widespread in insects. Nature 362: 241–245. Robertson, H. M., 1995 The Tc1/mariner superfamily of transposons in animals. J. Insect Physiol. 41: 99–105. Ruvolo, V., J. E. Hill and A. Levitt, 1992 The Tc2 transposon of Caenorhabditis elegans has the structure of a self-regulated element. DNA Cell Biol. 11: 111–122. Sambrook, J., E. F. Fritsch and T. Maniatis, 1989 Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Sedensky, M. M., S. J. Hudson, B. Everson and P. G. Morgan, 1994 Identification of a mariner-like repetitive sequence in C. elegans. Nucleic Acids Res. 22: 1719–1723. Smit, A. F. A., and A. D. Riggs, 1996 Tiggers and other DNA transposon fossils in the human genome. Proc. Natl. Acad. Sci. USA 93: 1443–1448. Sulston, J., and J. Hodgkin, 1987 Methods, pp. 587–606 in The Nematode Caenorhabditis elegans, edited by W. B. Wood. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Surzycki, S. A., and W. R. Belknap, 2000 Repetitive-DNA elements are similarly distributed on Caenorhabditis elegans autosomes. Proc. Natl. Acad. Sci. USA 97: 245–249. Szostak, J. W., T. L. Orr-Weaver, R. J. Rothstein and F. W. Stahl, 1983 The double-strand-break repair model for recombination. Cell 33: 25–35. Thompson, J. D., D. G. Higgins and T. J. Gibson, 1994 CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673– 4680. Tower, J., G. H. Karpen, N. Craig and A. C. Spradling, 1993 Preferential transposition of Drosophila P elements to nearby chromosomal sites. Genetics 133: 347–359. Tu, Z., and H. Shao, 2002 Intra- and inter-specific diversity of Tc3like transposons in nematodes and insects and implications for their evolution and transposition. Gene 282: 133–142. van Luenen, H. G. A. M., S. D. Colloms and R. H. A. Plasterk, 1994 The mechanism of transposition of Tc3 in Caenorhabditis elegans. Cell 79: 293–301. Vos, J. C., I. de Baere and R. H. A. Plasterk, 1996 Transposase is the only nematode protein required for in vitro transposition of Tc1. Genes Dev. 10: 755–761. Wicks, S. R., C. J. de Vries, H. G. van Luenen and R. H. Plasterk, 2000 CHE-3, a cytosolic dynein heavy chain, is required for sensory cilia structure and function in Caenorhabditis elegans. Dev. Biol. 221: 295–307. Yuan, J., M. Finney, N. Tsung and H. R. Horvitz, 1991 Tc4, a Caenorhabditis elegans transposable element with an unusual foldback structure. Proc. Natl. Acad. Sci. USA 88: 3334–3338. Communicating editor: S. Sandmeyer