
Current Pharmacogenomics, 2006, 4, 000-000


Transposable Elements and their Use for Target Site Specific Gene Delivery

Anton Buzdin*

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow 117871, Russia

Abstract: Transposable elements (TEs), which occupy nearly 40% of eukaryotic DNA, are selfish repetitive sequences able to proliferate in host genomes either via their DNA copies or via RNA intermediates. The latter route relies on the mechanism termed ‘reverse transcription’ and on an RNA-dependent DNA polymerase called reverse transcriptase. The newly formed DNA copy of the element then integrates into the genome using a combination of host and self-encoded proteins, depending on the origin of the transposable element. As important model objects for the study of many fundamental processes in molecular biology, and as active participants in the gene regulation network, TEs are of great interest for basic research in molecular genetics and genomics. Their practical use, however, is currently limited to some fields of forensic science, phylogenetic studies and population genetics. In this mini-review I have tried to bring together theoretical, experimental and speculative data on the use of transposable elements as tools for gene delivery into host eukaryotic genomes, producing stable transgenic transformants. The strength of TE-based constructs compared with popular viral vectors would be predictable, target sequence-specific transgene integration, mediated by the enzymatic machinery of some TEs. These and other implications of transposable elements for the biomedical sciences are discussed.

INTRODUCTION

Many geneticists and medical scientists share the belief that gene therapy is the way some currently untreatable illnesses will be cured. Indeed, numerous promising approaches have been published that may prove useful for the treatment of inherited disorders, infectious diseases, and cancer. However, the existing gene therapy techniques are expensive, sophisticated, labour-intensive, sometimes unsafe and, finally, rather ineffective. In their present state these multi-factor, multi-step approaches are underdeveloped for routine clinical use and certainly need further optimization. There are at least three crucial stages common to all gene therapy techniques: (i) cell type-specific transfection, (ii) gene transfer into the nucleus and (iii) gene expression. This review deals with the last stage, which is aimed at stable, controllable transgene expression in the patient's body. There are many different vectors for gene delivery and nuclear expression, which, as a rule, can be classified as linear double-stranded DNA, plasmid-based sequences or viral vectors [Tomanin, 2004]. It is essential that the transferred gene carries all the regulatory sequences required for its normal functioning, such as a promoter, enhancer, positive/negative regulatory elements, transcription terminator and polyadenylation signal. Another important requirement is the ability of the vector to mediate transgene insertion into the host cell genome: otherwise, the transferred DNA sequence will be lost in the descendants of the transfected cell. Such insertion can proceed through homologous or (far more often) non-homologous recombination, and vector integration can be either random or target site-specific.
*Address correspondence to this author at the Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow 117871, Russia; Tel.: 7(095)3306329; Fax: 7(095)3306538; E-mail: [email protected]
1570-1603/06 $50.00+.00 ©2006 Bentham Science Publishers Ltd.

Site-specific gene delivery, which can be provided by internal cellular or external vector-mediated integration mechanisms, has some obvious advantages over techniques based on random transgene insertion. First, randomly inserted transgene sequences frequently land in transcriptionally silent or heterochromatic compartments, which do not allow efficient gene expression. Second, random integrants may disrupt the exon-intron structure of pre-existing host genes and alter genomic regulatory networks. Finally, a randomly inserted gene may fall under the control of unexpected genomic regulatory sequences, which will unpredictably bias the transgene expression pattern. Although in some cases this last shortcoming can be overcome by introducing so-called insulator bordering sequences, such as matrix attachment regions (MARs) or scaffold attachment regions (SARs), into the “random” vector [Goetze, 2005; Chernov, 2004], the design of vectors with known integration site specificity seems to be of incomparably greater value. The most popular approaches to integration site-specific vector design are based either on homologous recombination between the vector and unique genomic sequences, or on site-specific gene delivery using adeno-associated viral vectors. It should be noted, however, that homologous recombination in normal mammalian cells is an extremely rare event [te Riele, 1992], which makes efficient transgene delivery to an exact genomic location problematic. The use of adeno-associated parvoviral vectors rests on the remarkable ability of the viral protein Rep to recognize a unique genomic region on human chromosome 19 and to mediate, under special conditions, targeted DNA integration at this locus [Goncalves, 2005]. Such integrations are not very precise, occurring rather randomly within the ~1 kb long target locus [Recchia, 2004]. The mechanism of Rep target site recognition remains unknown. Moreover, the very use of viral vectors for clinical gene therapy is not welcomed by a large part of the biomedical community [Burt, 2003]. The introduction of vector viral genes into the patient's body may provoke recombination with wild-type virus and, consequently, may


Current Pharmacogenomics, 2006, Vol. 4, No. 1

lead to the production of more contagious active viruses. Indeed, the use of adenoviral vectors was recently reported to have caused treatment-related deaths of several patients following adenovirus administration [Raper, 2003; Reid, 2002]. Therefore, highly efficient non-viral vectors are needed for safety reasons. Possible candidates for such a role are transposable elements [Burt, 2003].

Transposable elements (TEs) are DNA fragments able to self-reproduce and to change their location in the host genome, that is, to transpose. TEs were discovered about 50 years ago in maize DNA by Barbara McClintock [McClintock, 1956]. Since then TEs have been found in the genomes of almost all organisms. Moreover, they are now known to make up a large portion of eukaryotic DNA. TEs constitute, for example, more than 50% of the maize (Zea mays) genome [Wessler, 1998; Kidwell, 1997], 22% of the Drosophila genome [Kapitonov, 2003] and 42% of human DNA [Lander, 2001]. However, different TE groups differ strikingly in copy number, from a few copies to millions. TEs also differ from each other in structure and in transposition features. They can be subdivided into two principal classes [Wessler, 1998; Kidwell, 1997; Lander, 2001; Smit, 1999; Jurka, 1998; Labrador, 1997; Smit, 1996]. Class (i) TEs proliferate through RNA intermediates. They use an RNA-dependent DNA polymerase, called reverse transcriptase, an enzyme that synthesizes a complementary DNA strand on an RNA template. Class (ii) representatives, called “DNA transposons”, transpose via a “cut and paste” mechanism using their DNA copies. These features make TEs, “natural born genomic transducers”, ideal candidates for vector construction [Buzdin, 2004]. This review is dedicated to the possible utilization of transposable elements as vectors for gene delivery.
A brief insight into the structure and life cycle mechanisms of TEs is given here to explain their potential usefulness for vector design.

DNA TRANSPOSONS

DNA transposons reside in the genomes of almost all living organisms. Autonomous DNA transposons have terminal inverted repeats and encode a specific protein called transposase, which mediates their mobility. Briefly, transposase binds sequence-specifically to the inverted repeats that flank the TE insertion and then catalyzes excision of its single- or double-stranded DNA sequence, with subsequent reintegration into a new genomic locus followed by DNA repair by host enzymes. When a single DNA strand is transferred, a new copy of the transposon appears in the host genome; this process is termed “replicative transposition”. Non-replicative transposition occurs when the double-stranded TE sequence moves to the new genomic location. In this case the number of TE copies remains essentially the same [Mahillon, 1998; Plasterk, 1999; Weil, 2000]. Non-autonomous DNA transposons have terminal inverted repeats but do not harbour a transposase gene. For their proliferation they use the enzyme encoded by autonomous elements. DNA transposons are relatively popular objects of vector design (843 PubMed citations to date), being used as


vectors for gene delivery in a wide range of host organisms, including mammals. Although mammalian genomes lack active DNA transposons of their own [Consortium, 2002; Lander, 2001], artificial genetically modified TEs successfully transpose in mammalian DNA in model experiments [Izsvak, 2004]. Most DNA transposons show no target sequence preference. For example, the Sleeping Beauty transposon [Kaminski, 2002; Zayed, 2004], the most frequently [Heggestad, 2004] and probably the most successfully used for vertebrate cell transformation, inserts depending primarily on the chromatin structure of the target site rather than sequence-specifically, despite the previously published consensus sequence of Sleeping Beauty target sites [Vigdal, 2002]. Other DNA transposons efficiently used for transgene delivery into the host cell genome may also be mentioned: the Tol2 transposon for vertebrates [Kawakami, 2004], the P element for insects [O'Brochta, 2004; Ryder, 2003], the piggyBac transposon for insects and worms [Gonzalez-Estevez, 2003], and Tn916 for bacteria [Roberts, 2003]. All these TEs integrate fairly randomly, except for the piggyBac element, which has a very simple insertion site preference, TTAA [Bonin, 2004]. Such a short recognition sequence certainly does not allow a vector for sequence-specific gene targeting to be made, as millions of potential target sites reside in a typical large eukaryotic genome. However, a few DNA transposons integrate in a highly sequence-specific manner. One example is the bacterial transposon Tn7, whose transposase recognizes specific 36 bp long sequences. Surprisingly, in in vitro experiments this bacterial transposon jumps efficiently and sequence-specifically into the human genome, which has three similar target sites [Kuduvalli, 2005]. Another bacterial transposon, IS903, prefers a 21 bp palindromic pattern that is used in 84% of insertions [Hu, 2001].
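The effect of recognition sequence length on the number of potential target sites can be illustrated with a rough estimate. The sketch below assumes, purely for illustration, a random genome with equal base frequencies and a round genome size of 3 × 10^9 bp; neither assumption comes from the studies cited above. Under that model a fixed site of length k is expected about G/4^k times in a genome of G bp. The function name `expected_sites` is hypothetical.

```python
def expected_sites(genome_size_bp: int, site_length_bp: int) -> float:
    """Expected number of exact matches to one fixed site on one strand of a
    random sequence with equal base frequencies (a deliberately crude model)."""
    positions = genome_size_bp - site_length_bp + 1  # possible start positions
    return positions / 4 ** site_length_bp


# Assumed round figure for a mammalian-sized genome (not taken from the text).
GENOME_BP = 3_000_000_000

# Site lengths quoted in the text: TTAA (4 bp), IS903 (21 bp), Tn7 (36 bp).
for name, length in [("4 bp site (TTAA)", 4),
                     ("21 bp site", 21),
                     ("36 bp site", 36)]:
    print(f"{name}: about {expected_sites(GENOME_BP, length):.3g} expected by chance")
```

Under these assumptions a 4 bp site is expected on the order of 10^7 times, whereas a fixed 21 bp or 36 bp site is expected far less than once by chance. Real genomes are not random sequences, so the estimate indicates the order of magnitude only.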
Finally, it was recently shown that the transposase of the widely distributed eukaryotic transposon Harbinger is also characterized by strong DNA target specificity [Kapitonov, 2004].

RETROELEMENTS

The term “retroelement” is applied to a vast class of nucleic acid sequences whose appearance and/or proliferation in the host genome depends, in one way or another, on the direct transfer of genetic information from RNA to DNA, called reverse transcription. This phenomenon was first described in 1970 by Howard Temin and David Baltimore, who purified and characterized the retroviral RNA-dependent DNA polymerase (reverse transcriptase, RT), which catalyzes cDNA synthesis on an RNA template [Baltimore, 1970; Temin, 1970]. Afterwards, RT sequences were found in many diverse genetic elements. Not only viruses, such as retroviruses, hepadnaviruses and caulimoviruses, but also many eukaryotic transposable elements, mitochondrial group II introns, bacterial retrointrons and some plasmids encode reverse transcriptase. Retroelements (REs) that have their own RT genes are referred to as autonomous REs. They can be subdivided into two major groups: long terminal repeat (LTR)-containing elements and non-LTR retrotransposons. LTRs are usually 1000-1500 bp long sequences flanking the retroelement “body” in genomic DNA [Urnovitz, 1996; Leib-Mosch, 1995]. However, LTR size may vary from a few


hundred bp to over 3 kb in some plant retrotransposons. Autonomous non-LTR REs are generally assigned to LINEs, long interspersed nuclear elements. LINEs are 3.5-8 kb long sequences harbouring an RT gene and frequently other genes encoding proteins necessary for their functioning [Volff, 2001; Finnegan, 1997]. Non-autonomous REs lack RT genes and are classified as either SINEs (short interspersed nuclear elements) or processed pseudogenes. SINEs are 50-3000 bp long sequences that have, as a rule, an internal RNA polymerase III promoter. The well-known Alu repeats, SVA, SINE-R, B1, B2, MIR and many other elements belong to the SINEs [Leib-Mosch, 1995; Schmid, 1998; Brosius, 2005]. Thus, retroelements are subdivided into three major systematic classes: LTR-containing elements, LINEs and SINEs. There is one more, rather distinct RE group, called retrointrons or mobile group II introns.

The most important hypothesis explaining the origin of REs was proposed by Howard Temin [Temin, 1993]. It states that autonomous REs co-evolved with the reverse transcriptase gene. The putative pathway of RE evolution started with the RT gene, first gave rise to non-LTR retrotransposons, and finally to LTR-containing elements and retroviruses [Leib-Mosch, 1995; Craig, 2002]. Indeed, detailed sequence analysis clearly demonstrates a progressive increase in structural complexity from non-LTR to LTR-containing retroelements, such as the recruitment of new regulatory proteins and additional enzymatic activities (RNase H, integrase, protease). Thus, retrointrons and LINEs are more ancient RE forms than retroviruses and LTR elements [Malik, 2001]. The mechanism of LINE and retrointron retroposition is also much simpler than that of LTR REs (see Fig. (1)). It is noteworthy that, according to one hypothesis [Eickbush, 1997], another descendant of an ancestral RT sequence is the gene for telomerase, the cellular enzyme that builds telomere ends and has RNA-dependent DNA polymerase activity.
Interestingly, in certain species the telomerase gene is inactivated, and telomere length is maintained by retroelement integrations at chromosome termini [Pardue, 2003; Danilevskaya, 1999]. Below I summarise the basic information on the structures and life cycles of the main retroelement groups, along with data on their possible use as tools for site-specific gene delivery.

RETROELEMENTS: RETROINTRONS (GROUP II INTRONS)

Retrointrons, or group II introns, form one of the two classes of self-splicing introns, which exist in the genomes of prokaryotes and in eukaryotic organelles [Zimmerly, 2001; Martinez-Abarca, 2000]. Phylogenetic analysis based on RT sequences revealed that group II introns are probably the oldest group among retroelements [Boeke, 1997]. It is likely that mobile group II introns first appeared in the genomes of bacteria. This is not surprising, as the mitochondria and plastids of modern eukaryotes, where retrointrons were found, are descendants of bacterial cells according to the endosymbiotic theory [Zimmerly, 2001]. Group II introns encode a single protein that has both RT and endonuclease activities. Retrointrons are transcribed as parts


of the genes into which they have inserted. Retrointron RNA has ribozyme activity and self-splices from the pre-mRNA. The spliced intronic RNA can then be translated to give a functional protein. Group II introns integrate target sequence-specifically. The target site specificity is achieved after binding to the host DNA, which occurs non-specifically. The retrointron ribonucleoprotein searches the bound DNA before undergoing a conformational change that is associated with identification of its specific binding site [Aizawa, 2003]. During retroposition, retrointron RNA again serves as a ribozyme, making a sequence-specific single-strand DNA break within the host genome. At the same time the RNA binds covalently to the 5’ terminus of the DNA break. The second DNA strand is cleaved by the retrointron protein, which next reverse transcribes the RNA (DNA synthesis is initiated at the newly formed 3’-hydroxyl of the target DNA). Subsequent replacement of the RNA by DNA, followed by repair of the single-strand breaks, completes mobile intron integration [Martinez-Abarca, 2000] (Fig. (1)).

Retrointrons most probably gave rise to a very important group of eukaryotic genes, those of the small nuclear (sn) RNAs, which take part in pre-mRNA splicing [Boeke, 1997]. Indeed, the folding of spliceosomal snRNA reveals considerable structural similarities with one of the RNA domains of group II self-splicing introns [Sashital, 2004]. Moreover, in vitro this RNA domain can functionally replace the U6 snRNA in the working spliceosome [Shukla, 2002] and, vice versa, U5 snRNA may substitute for another retrointron RNA domain, thus giving an active ribozyme [Hetzer, 1997].

As group II introns insert into genomic DNA sequence-specifically, cleaving specific DNA target sites 10-25 bp long, they might be perfect candidates for the design of integrative vectors with target DNA selectivity. Furthermore, this sequence specificity is based on base-pairing interactions between the intron RNA and the DNA target [Jimenez-Zurdo, 2003].
Retrointrons, therefore, can be genetically engineered to alter their target site recognition preference in order to direct their integration into genomic loci of special interest to the researcher [Karberg, 2001]. Based on mobile group II intron sequences, Lambowitz and colleagues have developed a novel class of gene targeting vectors, called “targetrons”, that can be programmed to insert into virtually any desired target DNA [Zhong, 2003]. It was demonstrated that such retargeted group II introns can be used for rapid, highly specific and very efficient chromosomal gene disruption in bacteria [Karberg, 2001; Perutka, 2004]. Moreover, retrointrons are able to move to their target sites in heterologous hosts [Martinez-Abarca, 2000], or to plasmid targets [Ichiyanagi, 2003]. Undoubtedly useful for prokaryotic DNA engineering and, probably, for fighting infectious bacteria, the targetron approach also looks promising for the modification of mitochondrial DNA and, thus, for the treatment of mitochondrial DNA disorders [Taylor, 2005]. Although group II introns have so far been used mostly for gene disruption, it seems probable that several retrointron RNA domains that play an important role in self-splicing, but are not relevant to target-specific integration,


Fig. (1). Schematic representation of the life cycles of the three classes of autonomous retroelements: retrointrons (group II introns), LINEs and LTR elements. At the first stage, genomic copies of the retroelements are transcribed by the cellular RNA polymerase. In the case of retrointrons, the RNA then self-splices from the transcribed pre-mRNA. Retroelement RNA is then translated to give the proteins for reverse transcription/integration. At the second stage, ribonucleoprotein particles are formed to mediate reverse transcription/integration of the retroelement RNA copy. The major distinction between the retrointron, LINE and LTR element life cycles is that the last group is reverse transcribed in the cytoplasm, using cellular tRNA as the primer for initiation of reverse transcription. This multi-step process leads to the formation of a double-stranded DNA copy of the retroelement flanked by two full-size long terminal repeats (LTRs), whereas the RNA copy, which serves as the template for cDNA synthesis, lacks complete LTRs, having only their fragments: “R” and “U5” at the 5’ terminus, “U3” and “R” at the 3’ end. The LTR element cDNA is then transferred to the nucleus to integrate rather randomly into the host DNA, in contrast to retrointron/LINE cDNAs, whose synthesis is primed by sequence-specific genomic DNA nicks using base-pair interactions between the retroelement 3’-terminal nucleotides and the nicked genomic DNA. This provides target sequence specificity for retrointron/LINE insertions (see the text).

could be replaced by a protein-coding transgene sequence, thus making targetrons universal vectors. Also, the use of targetrons will probably not be restricted to bacterial or organelle genomes, as it was demonstrated that retargeted retrointrons fully retain their activity in human cells [Guo, 2000]. In line with this observation, remnants of retrointron retrotransposition were recently found in some eukaryotic genomes [Mohr, 2003].
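Because retrointron target recognition rests on base pairing between the intron RNA and the DNA target, candidate insertion sites for a given recognition segment can in principle be located by a simple complementarity search. The following is a toy sketch under strong assumptions: a single contiguous RNA recognition segment, Watson-Crick pairing only, one DNA strand scanned, and no contribution from the intron-encoded protein. The sequences and the function names `pairing_dna_site` and `find_target_sites` are made up for illustration and are not taken from any published targetron design procedure.

```python
# Watson-Crick partners of RNA bases, written as DNA letters.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def pairing_dna_site(rna_segment: str) -> str:
    """DNA sequence (5'->3') that base-pairs with an RNA segment (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_segment.upper()))


def find_target_sites(dna: str, rna_segment: str, max_mismatches: int = 0) -> list[int]:
    """0-based start positions on one DNA strand where the RNA segment could
    pair with at most `max_mismatches` mismatched positions."""
    site = pairing_dna_site(rna_segment)
    dna = dna.upper()
    hits = []
    for start in range(len(dna) - len(site) + 1):
        window = dna[start:start + len(site)]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


# Hypothetical recognition segment and hypothetical target DNA.
recognition_rna = "GUGCAUCC"
target_dna = "TTGGATGCACAAGGATGGACTT"

print(find_target_sites(target_dna, recognition_rna))                    # perfect matches
print(find_target_sites(target_dna, recognition_rna, max_mismatches=1))  # one mismatch allowed
```

In this simplified picture, retargeting amounts to changing `recognition_rna` so that its pairing site occurs at the desired locus and, ideally, nowhere else in the genome.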


RETROELEMENTS: LINES

LINEs are widely distributed in eukaryotes. These retroelements have been found in the genomes of fungi, plants, and vertebrate and invertebrate animals. The number of copies per genome varies dramatically between different LINE families. For example, it is believed that mammalian DNA contains retroelements of only one LINE family, L1 [Smit, 1995], but this family is represented by a great number of copies: for instance, about 5 × 10^5 human L1 elements occupy a total of 17% of human genomic DNA [Lander, 2001]. In contrast, invertebrate genomes usually harbour several coexisting LINE families (e.g., about ten different LINE families were found in Drosophila: F, Doc, G, R1, R2, HeT, jockey [Priimagi, 1988], BS [Udomkit, 1995], TART [Levis, 1993], etc.), but each of them is represented by a few thousand members [Pimpinelli, 1995]. A possible explanation for this phenomenon is the hypothesis proposed by Petrov et al. [Petrov, 1996], which attributes the low retroelement copy number, in particular in Drosophila, to rampant deletion of DNA in unconstrained regions. In this case all “unnecessary” DNA, including REs, is quickly eliminated, and only essential sequences, whose loss leads to lethal mutations, survive in the genome. In mammals DNA is deleted at a much slower rate, and numerous “junk” sequences therefore accumulate continuously [Kapitonov, 2003].

LINEs are transcribed by cellular RNA polymerase II from an internal promoter located in their 5’-untranslated region [Mizrokhi, 1988]. LINE 3’-terminal sequences generally contain the 3’ processing signal AATAAA [Birnstiel, 1985] and an oligo(A) sequence, which serves as a polyadenylation enhancer [Sewell, 1996; Agamalian, 1996; Perepelitsa-Belancio, 2003]. The full-size LINE (+) RNA is, as in the case of retrointrons, both the template for protein synthesis and the transpositional RNA intermediate [Eickbush, 1992].
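The L1 figures quoted above (about 5 × 10^5 copies occupying 17% of human DNA) imply an average copy length that can be checked with simple arithmetic. The sketch below is a back-of-the-envelope calculation only; the human genome size of 3.2 × 10^9 bp is an assumed round value that does not appear in the text, and the function name is hypothetical.

```python
def mean_copy_length_bp(genome_size_bp: float, genome_fraction: float,
                        copy_number: float) -> float:
    """Average length of one copy, given the share of the genome the family
    occupies and the number of copies."""
    return genome_size_bp * genome_fraction / copy_number


HUMAN_GENOME_BP = 3.2e9   # assumption: round figure, not taken from the text
L1_FRACTION = 0.17        # from the text: 17% of human genomic DNA
L1_COPIES = 5e5           # from the text: about 5 x 10^5 elements

mean_length = mean_copy_length_bp(HUMAN_GENOME_BP, L1_FRACTION, L1_COPIES)
print(f"Average L1 copy length: about {mean_length:.0f} bp")
```

Under this assumption the average comes out at roughly 1 kb, well below the 3.5-8 kb quoted earlier for full-length LINEs, which suggests that the quoted copy number is made up largely of incomplete copies.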
LINE transposition, which proceeds via the “target site-primed reverse transcription” mechanism (Fig. (1)), is coupled with reverse transcription, as is the case for retrointrons. In this process, a LINE-encoded nuclease nicks genomic DNA, and the newly formed free single-stranded DNA end serves as the primer for reverse transcription of LINE RNA. At the moment of reverse transcription initiation, the single-stranded DNA base-pairs with a few complementary terminal nucleotides of the LINE RNA. The number of such nucleotides, as well as the endonuclease preference for certain DNA sequences, determines the target specificity of LINE integration. It is now known that different LINEs differ greatly in target specificity and recognition sequences. This is most probably due to their different LINE-encoded endonuclease domains. Members of the most ancient LINE families (NeSL-1, CRE, R2 and R4) use a site-specific endonuclease, termed REL, which is homologous to the endonuclease domain of mobile group II introns [Malik, 1999]. The “younger” LINE families encode, instead of REL, an AP endonuclease, which has greatly reduced sequence specificity (for example, the recognition sequence of the vertebrate L1 element AP endonuclease is the hexanucleotide TTAAAA, in contrast to the 10-30 bp long sequences recognized by REL endonucleases). From the point of view of retrotransposon


evolution, the lack of target site specificity was probably beneficial, as more potential target sites became available for TE insertions. It was demonstrated, however, that in some cases AP endonucleases have evolved into target sequence-specific enzymes (e.g. the R1 family; Dre and Tx1 elements from the L1 superfamily) [Burke, 1993; Malik, 1999].

Although extensively studied by molecular geneticists, LINEs still remain underestimated as tools for new vector construction. However, some efforts to recruit LINEs for gene delivery should be mentioned. Soifer and Kasahara developed a hybrid L1 element-adenovirus vector system [Soifer, 2004], in which a helper-dependent adenovirus serves as a carrier for efficient delivery and transient expression of its encoded L1/transgene cassette; at the second stage, the L1 and its associated transgene permanently integrate into the genome of the adenovirus-transduced cells. This system was rather efficient as a vehicle for stable transgene delivery but lacked target site specificity, as L1 elements have a short recognition sequence [Soifer, 2001]. Another attempt was made by the Fujiwara team, which created a target site-specific vector based on the insect telomere-specific LINEs TRAS1 and SART1, which integrate into the chromosomal (TTAGG)n sequences [Matsumoto, 2004; Takahashi, 2002]. Many other sequence-specific LINEs remain to be recruited for vector design: Zebulon from the pufferfish Tetraodon nigroviridis [Bouneau, 2003], Tx1L from Xenopus laevis [Christensen, 2000], NeSL-1 from C. elegans [Malik, 2000], the elements R2, CRE1 and 2, SLACS, CZAR, Dong, R4 [Yang, 1999], R1Bm [Feng, 1998] and others [Kojima, 2004].

LTR RETROTRANSPOSONS, ENDOGENOUS RETROVIRUSES AND SINES

LTR retrotransposons and endogenous retroviruses (LR/ERVs) are the retroelements with the most complex organization. Their long terminal repeats (LTRs), which are found only in the DNA copies of these elements, contain multiple regulatory sequences. LTRs arise through a rather complicated mechanism of LR/ERV reverse transcription, reviewed repeatedly in the literature [Wilhelm, 2001; Sverdlov, 2000]. The most important peculiarity of the LR/ERV life cycle is that reverse transcription occurs in the cytoplasm; the completed DNA copy of the element is then transferred into the nucleus and integrated into the host genomic DNA (Fig. (1)). To my knowledge, LR/ERV insertions do not display any target sequence specificity, although their integration events are not random. For example, HIV-1 has some preference for local hotspots a few kb in size [Schroder, 2002]. Hematti and colleagues [Hematti, 2004] have shown that MLV and SIV display distinct integration preferences in primate cells. In line with these data, HIV, MLV, and ASLV have been reported to integrate preferentially into specific regions of the genome, probably depending on chromatin structure [Mitchell, 2004]. Although retroviral vectors are now of considerable use for transgene delivery (as mentioned above), no target sequence-specific vectors can be constructed using LR/ERV sequences.

SINEs and processed pseudogenes are a very heterogeneous group of retroelements. Unlike autonomous REs, which share common ancestry of at least the reverse


transcriptase gene, SINEs arose many times in evolution independently of each other. They usually lack any protein-coding sequences and therefore use for retroposition an “exogenous” reverse transcriptase provided by LINEs [Eickbush, 1992; Dewannieux, 2003]. SINEs are thus “parasites of parasites”. SINEs are widely distributed in eukaryotes and exist, as LINEs do, in plants, fungi, and vertebrate and invertebrate animals (reviewed in [Jagadeeswaran, 1981]). Being non-autonomous retroelements, SINEs seem unsuitable for the design of gene delivery vectors.

CONCLUDING REMARKS

In this review I have tried to identify, in the world of transposable elements, suitable candidates for the development of a new generation of gene delivery vectors. Although several research teams successfully use this approach, their efforts appear not to be widely recognized. I would therefore like both to draw attention to this activity and to encourage the scientific community to make further use of the unique features of transposable elements for the construction of better, target-specific vectors that could greatly benefit gene therapy. Special attention ought to be paid to vector safety: as any transposable element engineering is a kind of rapid evolution in vitro, it should be kept in mind that at least three groups of active eukaryotic viruses (i.e. retroviruses, hepadnaviruses and caulimoviruses) originated from different TE groups in the course of natural evolution.

ACKNOWLEDGEMENTS

The work was supported by Russian Foundation for Basic Research grants Nos. 05-04-48682-a, 2006.20034, and by the Molecular and Cellular Biology Program of the Presidium of the Russian Academy of Sciences.

REFERENCES

Agamalian, N. S.; Arkhipova, I. R.; Surkov, S. A. and Il'in Iu, V. (1996) Regulating polyadenylation of jockey mobile genetic element transcripts belonging to the LINE class, in Drosophila cell culture. Mol. Biol. (Moscow) 30, 818-828.
Aizawa, Y.; Xiang, Q.; Lambowitz, A. M. and Pyle, A. M. (2003) The pathway for DNA recognition and RNA integration by a group II intron retrotransposon. Mol. Cell 11, 795-805.
Baltimore, D. (1970) RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature 226, 1209-1211.
Birnstiel, M. L.; Busslinger, M. and Strub, K. (1985) Transcription termination and 3' processing: the end is in site! Cell 41, 349-359.
Boeke, J. D. and Stoye, J. P. (1997) Retrotransposons, endogenous retroviruses, and the evolution of retroelements. In Retroviruses; Coffin, J. M.; Hughes, S. H.; Varmus, H. E.; Eds.; Cold Spring Harbor Laboratory Press: New York, pp. 343-345.
Bonin, C. P. and Mann, R. S. (2004) A piggyBac transposon gene trap for the analysis of gene expression and function in Drosophila. Genetics 167, 1801-1811.
Bouneau, L.; Fischer, C.; Ozouf-Costaz, C.; Froschauer, A.; Jaillon, O.; Coutanceau, J. P.; Korting, C.; Weissenbach, J.; Bernot, A. and Volff, J. N. (2003) An active non-LTR retrotransposon with tandem structure in the compact genome of the pufferfish Tetraodon nigroviridis. Genome Res. 13, 1686-1695.
Brosius, J. (2005) Waste not, want not--transcript excess in multicellular eukaryotes. Trends Genet. 21, 287-288.
Burke, W. D.; Eickbush, D. G.; Xiong, Y.; Jakubczak, J. and Eickbush, T. H. (1993) Sequence relationship of retrotransposable elements R1 and R2 within and between divergent insect species. Mol. Biol. Evol. 10, 163-185.
Burt, A. (2003) Site-specific selfish genes as tools for the control and genetic engineering of natural populations. Proc. R. Soc. Lond. B. Biol. Sci. 270, 921-928.
Buzdin, A. A. (2004) Retroelements and formation of chimeric retrogenes. Cell. Mol. Life Sci. 61, 2046-2059.
Chernov, I. P.; Akopov, S. B. and Nikolaev, L. G. (2004) Structure and function of nuclear matrix associated regions (S/MARs). Bioorg. Khim. 30, 3-14.
Christensen, S.; Pont-Kingdon, G. and Carroll, D. (2000) Target specificity of the endonuclease from the Xenopus laevis non-long terminal repeat retrotransposon, Tx1L. Mol. Cell. Biol. 20, 1219-1226.
Craig, N. L.; Craigie, R. C.; Gellert, M. and Lambowitz, A.; Eds. (2002) Mobile DNA II; ASM Press: New York.
Danilevskaya, O. N.; Traverse, K. L.; Hogan, N. C.; DeBaryshe, P. G. and Pardue, M. L. (1999) The two Drosophila telomeric transposable elements have very different patterns of transcription. Mol. Cell. Biol. 19, 873-881.
Dewannieux, M.; Esnault, C. and Heidmann, T. (2003) LINE-mediated retrotransposition of marked Alu sequences. Nat. Genet. 35, 41-48.
Eickbush, T. H. (1992) Transposing without ends: the non-LTR retrotransposable elements. New Biol. 4, 430-440.
Eickbush, T. H. (1997) Telomerase and retrotransposons: which came first? Science 277(5328), 911-912.
Feng, Q.; Schumann, G. and Boeke, J. D. (1998) Retrotransposon R1Bm endonuclease cleaves the target sequence. Proc. Natl. Acad. Sci. USA 95, 2083-2088.
Finnegan, D. J. (1997) Transposable elements: how non-LTR retrotransposons do it. Curr. Biol. 7, R245-R248.
Goetze, S.; Baer, A.; Winkelmann, S.; Nehlsen, K.; Seibler, J.; Maass, K. and Bode, J. (2005) Performance of genomic bordering elements at predefined genomic loci. Mol. Cell. Biol. 25, 2260-2272.
Goncalves, M. A.; van Nierop, G. P.; Tijssen, M. R.; Lefesvre, P.; Knaan-Shanzer, S.; van der Velde, I.; van Bekkum, D. W.; Valerio, D. and de Vries, A. A. (2005) Transfer of the full-length dystrophin-coding sequence into muscle cells by a dual high-capacity hybrid viral vector with site-specific integration ability. J. Virol. 79, 3146-3162.
Gonzalez-Estevez, C.; Momose, T.; Gehring, W. J. and Salo, E. (2003) Transgenic planarian lines obtained by electroporation using transposon-derived vectors and an eye-specific GFP marker. Proc. Natl. Acad. Sci. USA 100, 14046-14051.
Guo, H.; Karberg, M.; Long, M.; Jones, J. P., 3rd; Sullenger, B. and Lambowitz, A. M. (2000) Group II introns designed to insert into therapeutically relevant DNA target sites in human cells. Science 289, 452-457.
Heggestad, A. D.; Notterpek, L. and Fletcher, B. S. (2004) Transposon-based RNAi delivery system for generating knockdown cell lines. Biochem. Biophys. Res. Commun. 316, 643-650.
Hematti, P.; Hong, B. K.; Ferguson, C.; Adler, R.; Hanawa, H.; Sellers, S.; Holt, I. E.; Eckfeldt, C. E.; Sharma, Y.; Schmidt, M.; von Kalle, C.; Persons, D. A.; Billings, E. M.; Verfaillie, C. M.; Nienhuis, A. W.; Wolfsberg, T. G.; Dunbar, C. E. and Calmels, B. (2004) Distinct genomic integration of MLV and SIV vectors in primate hematopoietic stem and progenitor cells. PLoS Biol. 2, e423.
Hetzer, M.; Wurzer, G.; Schweyen, R. J. and Mueller, M. W. (1997) Transactivation of group II intron splicing by nuclear U5 snRNA. Nature 386, 417-420.
Hu, W. Y.; Thompson, W.; Lawrence, C. E. and Derbyshire, K. M. (2001) Anatomy of a preferred target site for the bacterial insertion sequence IS903. J. Mol. Biol. 306, 403-416.
Ichiyanagi, K.; Beauregard, A. and Belfort, M. (2003) A bacterial group II intron favors retrotransposition into plasmid targets. Proc. Natl. Acad. Sci. USA 100, 15742-15747.
International human genome sequencing consortium (2001) Initial sequencing and analysis of the human genome. Nature 409, 860-921.
International mouse genome sequencing consortium (2002) Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520-562.
Izsvak, Z. and Ivics, Z. (2004) Sleeping beauty transposition: biology and applications for molecular therapy. Mol. Ther. 9, 147-156.
Jagadeeswaran, P.; Forget, B. G. and Weissman, S. M. (1981) Short interspersed repetitive DNA elements in eucaryotes: transposable DNA elements generated by reverse transcription of RNA pol III transcripts? Cell 26, 141-142.

Jimenez-Zurdo, J. I.; Garcia-Rodriguez, F. M.; Barrientos-Duran, A. and Toro, N. (2003) DNA target site requirements for homing in vivo of a bacterial group II intron encoding a protein lacking the DNA endonuclease domain. J. Mol. Biol. 326, 413-423.
Jurka, J. (1998) Repeats in genomic DNA: mining and meaning. Curr. Opin. Struct. Biol. 8, 333-337.
Kaminski, J. M.; Huber, M. R.; Summers, J. B. and Ward, M. B. (2002) Design of a nonviral vector for site-selective, efficient integration into the human genome. FASEB J. 16, 1242-1247.
Kapitonov, V. V. and Jurka, J. (2003) Molecular paleontology of transposable elements in the Drosophila melanogaster genome. Proc. Natl. Acad. Sci. USA 100, 6569-6574.
Kapitonov, V. V. and Jurka, J. (2004) Harbinger transposons and an ancient HARBI1 gene derived from a transposase. DNA Cell Biol. 23, 311-324.
Karberg, M.; Guo, H.; Zhong, J.; Coon, R.; Perutka, J. and Lambowitz, A. M. (2001) Group II introns as controllable gene targeting vectors for genetic manipulation of bacteria. Nat. Biotechnol. 19, 1162-1167.
Kawakami, K.; Takeda, H.; Kawakami, N.; Kobayashi, M.; Matsuda, N. and Mishina, M. (2004) A transposon-mediated gene trap approach identifies developmentally regulated genes in zebrafish. Dev. Cell 7, 133-144.
Kidwell, M. G. and Lisch, D. (1997) Transposable elements as sources of variation in animals and plants. Proc. Natl. Acad. Sci. USA 94, 7704-7711.
Kojima, K. K. and Fujiwara, H. (2004) Cross-genome screening of novel sequence-specific non-LTR retrotransposons: various multicopy RNA genes and microsatellites are selected as targets. Mol. Biol. Evol. 21, 207-217.
Kuduvalli, P. N.; Mitra, R. and Craig, N. L. (2005) Site-specific Tn7 transposition into the human genome. Nucleic Acids Res. 33, 857-863.
Labrador, M. and Corces, V. G. (1997) Transposable element-host interactions: regulation of insertion and excision. Ann. Rev. Genet. 31, 381-404.
Leib-Mosch, C. and Seifarth, W. (1995) Evolution and biological significance of human retroelements. Virus Genes 11, 133-145.
Levis, R. W.; Ganesan, R.; Houtchens, K.; Tolar, L. A. and Sheen, F. M. (1993) Transposons in place of telomeric repeats at a Drosophila telomere. Cell 75, 1083-1093.
Mahillon, J. and Chandler, M. (1998) Insertion sequences. Microbiol. Mol. Biol. Rev. 62, 725-774.
Malik, H. S.; Burke, W. D. and Eickbush, T. H. (1999) The age and evolution of non-LTR retrotransposable elements. Mol. Biol. Evol. 16, 793-805.
Malik, H. S. and Eickbush, T. H. (2000) NeSL-1, an ancient lineage of site-specific non-LTR retrotransposons from Caenorhabditis elegans. Genetics 154, 193-203.
Malik, H. S. and Eickbush, T. H. (2001) Phylogenetic analysis of ribonuclease H domains suggests a late, chimeric origin of LTR retrotransposable elements and retroviruses. Genome Res. 11, 1187-1197.
Martinez-Abarca, F.; Garcia-Rodriguez, F. M. and Toro, N. (2000) Homing of a bacterial group II intron with an intron-encoded protein lacking a recognizable endonuclease domain. Mol. Microbiol. 35, 1405-1412.
Martinez-Abarca, F. and Toro, N. (2000) Group II introns in the bacterial world. Mol. Microbiol. 38, 917-926.
Matsumoto, T.; Takahashi, H. and Fujiwara, H. (2004) Targeted nuclear import of open reading frame 1 protein is required for in vivo retrotransposition of a telomere-specific non-long terminal repeat retrotransposon, SART1. Mol. Cell. Biol. 24, 105-122.
McClintock, B. (1956) Controlling elements and the gene. Cold Spring Harb. Symp. Quant. Biol. 21, 197-216.
Mitchell, R. S.; Beitzel, B. F.; Schroder, A. R.; Shinn, P.; Chen, H.; Berry, C. C.; Ecker, J. R. and Bushman, F. D. (2004) Retroviral DNA integration: ASLV, HIV, and MLV show distinct target site preferences. PLoS Biol. 2, E234.
Mizrokhi, L. J.; Georgieva, S. G. and Ilyin, Y. V. (1988) jockey, a mobile Drosophila element similar to mammalian LINEs, is transcribed from the internal promoter by RNA polymerase II. Cell 54, 685-691.

Current Pharmacogenomics, 2006, Vol. 4, No. 1


Mohr, G. and Lambowitz, A. M. (2003) Putative proteins related to group II intron reverse transcriptase/maturases are encoded by nuclear genes in higher plants. Nucleic Acids Res. 31, 647-652.
O'Brochta, D. A. and Atkinson, P. W. (2004) Transformation systems in insects. Methods Mol. Biol. 260, 227-254.
Pardue, M. L. and DeBaryshe, P. G. (2003) Retrotransposons provide an evolutionarily robust non-telomerase mechanism to maintain telomeres. Ann. Rev. Genet. 37, 485-511.
Perepelitsa-Belancio, V. and Deininger, P. (2003) RNA truncation by premature polyadenylation attenuates human mobile element activity. Nat. Genet. 35, 363-366.
Perutka, J.; Wang, W.; Goerlitz, D. and Lambowitz, A. M. (2004) Use of computer-designed group II introns to disrupt Escherichia coli DExH/D-box protein and DNA helicase genes. J. Mol. Biol. 336, 421-439.
Petrov, D. A.; Lozovskaya, E. R. and Hartl, D. L. (1996) High intrinsic rate of DNA loss in Drosophila. Nature 384(6607), 346-349.
Pimpinelli, S.; Berloco, M.; Fanti, L.; Dimitri, P.; Bonaccorsi, S.; Marchetti, E.; Caizzi, R.; Caggese, C. and Gatti, M. (1995) Transposable elements are stable structural components of Drosophila melanogaster heterochromatin. Proc. Natl. Acad. Sci. USA 92, 3804-3808.
Plasterk, R. H.; Izsvak, Z. and Ivics, Z. (1999) Resident aliens: the Tc1/mariner superfamily of transposable elements. Trends Genet. 15, 326-332.
Priimagi, A. F.; Mizrokhi, L. J. and Ilyin, Y. V. (1988) The Drosophila mobile element jockey belongs to LINEs and contains coding sequences homologous to some retroviral proteins. Gene 70, 253-262.
Raper, S. E.; Chirmule, N.; Lee, F. S.; Wivel, N. A.; Bagg, A.; Gao, G. P.; Wilson, J. M. and Batshaw, M. L. (2003) Fatal systemic inflammatory response syndrome in a ornithine transcarbamylase deficient patient following adenoviral gene transfer. Mol. Genet. Metab. 80(1-2), 148-158.
Recchia, A.; Perani, L.; Sartori, D.; Olgiati, C. and Mavilio, F. (2004) Site-specific integration of functional transgenes into the human genome by adeno/AAV hybrid vectors. Mol. Ther. 10, 660-670.
Reid, T.; Warren, R. and Kirn, D. (2002) Intravascular adenoviral agents in cancer patients: lessons from clinical trials. Cancer Gene Ther. 9, 979-986.
Roberts, A. P.; Hennequin, C.; Elmore, M.; Collignon, A.; Karjalainen, T.; Minton, N. and Mullany, P. (2003) Development of an integrative vector for the expression of antisense RNA in Clostridium difficile. J. Microbiol. Methods 55, 617-624.
Ryder, E. and Russell, S. (2003) Transposable elements as tools for genomics and genetics in Drosophila. Brief Funct. Genomic. Proteomic. 2, 57-71.
Sashital, D. G.; Cornilescu, G.; McManus, C. J.; Brow, D. A. and Butcher, S. E. (2004) U2-U6 RNA folding reveals a group II intron-like domain and a four-helix junction. Nat. Struct. Mol. Biol. 11, 1237-1242.
Schmid, C. W. (1998) Does SINE evolution preclude Alu function? Nucleic Acids Res. 26, 4541-4550.
Schroder, A. R.; Shinn, P.; Chen, H.; Berry, C.; Ecker, J. R. and Bushman, F. (2002) HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521-529.
Sewell, E. and Kinsey, J. A. (1996) Tad, a Neurospora LINE-like retrotransposon exhibits a complex pattern of transcription. Mol. Gen. Genet. 252(1-2), 137-145.
Shukla, G. C. and Padgett, R. A. (2002) A catalytically active group II intron domain 5 can function in the U12-dependent spliceosome. Mol. Cell 9, 1145-1150.
Smit, A. F. (1999) Interspersed repeats and other mementos of transposable elements in mammalian genomes. Curr. Opin. Genet. Dev. 9, 657-663.
Smit, A. F.; Toth, G.; Riggs, A. D. and Jurka, J. (1995) Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences. J. Mol. Biol. 246, 401-417.
Smit, A. F. A. (1996) The origin of interspersed repeats in the human genome. Curr. Opin. Genet. Dev. 6, 743-748.
Soifer, H.; Higo, C.; Kazazian, H. H., Jr.; Moran, J. V.; Mitani, K. and Kasahara, N. (2001) Stable integration of transgenes delivered by a retrotransposon-adenovirus hybrid vector. Hum. Gene Ther. 12, 1417-1428.


Soifer, H. S. and Kasahara, N. (2004) Retrotransposon-adenovirus hybrid vectors: efficient delivery and stable integration of transgenes via a two-stage mechanism. Curr. Gene Ther. 4, 373-384.
Sverdlov, E. D. (2000) Retroviruses and primate evolution. Bioessays 22, 161-171.
Takahashi, H. and Fujiwara, H. (2002) Transplantation of target site specificity by swapping the endonuclease domains of two LINEs. EMBO J. 21, 408-417.
Taylor, R. W. (2005) Gene therapy for the treatment of mitochondrial DNA disorders. Expert Opin. Biol. Ther. 5, 183-194.
te Riele, H.; Maandag, E. R. and Berns, A. (1992) Highly efficient gene targeting in embryonic stem cells through homologous recombination with isogenic DNA constructs. Proc. Natl. Acad. Sci. USA 89, 5128-5132.
Temin, H. M. (1993) Retrovirus variation and reverse transcription: abnormal strand transfers result in retrovirus genetic variation. Proc. Natl. Acad. Sci. USA 90, 6900-6903.
Temin, H. M. and Mizutani, S. (1970) RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature 226, 1211-1213.
Tomanin, R. and Scarpa, M. (2004) Why do we need new gene therapy viral vectors? Characteristics, limitations and future perspectives of viral vector transduction. Curr. Gene Ther. 4, 357-372.
Udomkit, A.; Forbes, S.; Dalgleish, G. and Finnegan, D. J. (1995) BS a novel LINE-like element in Drosophila melanogaster. Nucleic Acids Res. 23, 1354-1358.
Urnovitz, H. B. and Murphy, W. H. (1996) Human endogenous retroviruses: nature, occurrence, and clinical implications in human disease. Clin. Microbiol. Rev. 9, 72-99.
Vigdal, T. J.; Kaufman, C. D.; Izsvak, Z.; Voytas, D. F. and Ivics, Z. (2002) Common physical properties of DNA affecting target site selection of sleeping beauty and other Tc1/mariner transposable elements. J. Mol. Biol. 323, 441-452.
Volff, J. N.; Korting, C.; Froschauer, A.; Sweeney, K. and Schartl, M. (2001) Non-LTR retrotransposons encoding a restriction enzyme-like endonuclease in vertebrates. J. Mol. Evol. 52, 351-360.
Weil, C. F. and Kunze, R. (2000) Transposition of maize Ac/Ds transposable elements in the yeast Saccharomyces cerevisiae. Nat. Genet. 26, 187-190.
Wessler, S. R. (1998) Transposable elements and the evolution of gene expression. Symp. Soc. Exp. Biol. 51, 115-122.
Wilhelm, M. and Wilhelm, F. X. (2001) Reverse transcription of retroviruses and LTR retrotransposons. Cell. Mol. Life Sci. 58, 1246-1262.
Yang, J.; Malik, H. S. and Eickbush, T. H. (1999) Identification of the endonuclease domain encoded by R2 and other site-specific, non-long terminal repeat retrotransposable elements. Proc. Natl. Acad. Sci. USA 96, 7847-7852.
Zayed, H.; Izsvak, Z.; Walisko, O. and Ivics, Z. (2004) Development of hyperactive sleeping beauty transposon vectors by mutational analysis. Mol. Ther. 9, 292-304.
Zhong, J.; Karberg, M. and Lambowitz, A. M. (2003) Targeted and random bacterial gene disruption using a group II intron (targetron) vector containing a retrotransposition-activated selectable marker. Nucleic Acids Res. 31, 1656-1664.
Zimmerly, S.; Hausner, G. and Wu, X. (2001) Phylogenetic relationships among group II intron ORFs. Nucleic Acids Res. 29, 1238-1250.

Received: 25 March, 2005

Revised: 27 June, 2005

Accepted: 08 July, 2005