J Mol Evol (2005) 60:196–206 DOI: 10.1007/s00239-004-0098-4

A Novel Additional Group II Intron Distinguishes the Mitochondrial rps3 Gene in Gymnosperms

Teresa M.R. Regina,1 Ernesto Picardi,1 Loredana Lopez,1 Graziano Pesole,2 Carla Quagliariello1

1 Dipartimento di Biologia Cellulare, Università degli Studi della Calabria, 87036 Arcavacata di Rende, Italy
2 Dipartimento di Scienze Biomolecolari e Biotecnologie, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy

Received: 29 March 2004 / Accepted: 9 September 2004 [Reviewing Editor: Dr. Rafael Zardoya]

Abstract. Comparative analysis of the ribosomal protein S3 gene (rps3) in the mitochondrial genome of Cycas with newly sequenced counterparts from Magnolia and Helianthus and available sequences from higher plants revealed that the positional clustering with the genes for ribosomal protein S19 (rps19) and L16 (rpl16) is preserved in gymnosperms. However, in contrast to the other land plant species, the rps3 gene in Cycas mitochondria is unique in possessing a second intron: rps3i2. Reverse transcription–polymerase chain reaction (RT-PCR) analysis of the transcripts generated from the rps19–rps3–rpl16 cluster in Cycas mitochondria demonstrated that the genes are cotranscribed and extensively modified by RNA editing and that both introns are efficiently spliced. Despite remarkable size heterogeneity, the Cycas rps3i1 can be shown to be homologous to the group IIA introns present within the rps3 gene of algae and land plants, including Magnolia and Helianthus. Conversely, sequences similar to the rps3i2 have not been reported previously. On the basis of conserved primary and secondary structure the second intervening sequence interrupting the Cycas rps3 gene has been classified as a group II intron. The close relationship of the rps3i2 to a group of different plant mitochondrial introns is intriguing and suggestive of a mitochondrial derivation for this novel intervening sequence. Interestingly, the rps3i2 appears to be conserved at the same gene location in other gymnosperms. Furthermore, the pattern of the rps3i2 distribution among algae and land plants provides evidence for the evolutionary acquisition of this novel intron in gymnosperms via intragenomic transposition or retrotransposition.

Key words: Gymnosperm mitochondrial genome; Ribosomal protein rps19–rps3–rpl16 gene cluster; Sequence analysis; mRNA editing; Group II intron structure model; Novel group II intron gain

Correspondence to: Carla Quagliariello; email: c.quagliariello@unical.it

Introduction

The complete mitochondrial DNA (mtDNA) sequences of one bryophyte, Marchantia polymorpha (Oda et al. 1992), and four angiosperms, Arabidopsis thaliana (Unseld et al. 1997), Beta vulgaris (Kubo et al. 2000), Oryza sativa L. (Notsu et al. 2002), and Brassica napus (Handa 2003), are currently our blueprint for the comparative analysis of land plant mitochondrial genomes. This representation of land plants is clearly skewed towards angiosperms and model systems, and a broader perspective is needed to enhance our comparative understanding of the evolutionary dynamics of land plant mitochondrial genomes. Comparisons between the liverwort and the aforementioned flowering plant mitochondrial genomes suggest an evolutionary trend towards a decrease in coding capacity (Oda et al. 1992; Handa 2003). Therefore, the basal groups of land plants are more likely to have retained a complement of mitochondrially encoded genes similar to that of their green algal ancestor (Turmel et al. 2003). These mitochondrial genes encode components of respiratory complexes (nad, sdh, cob, and cox), rRNAs and tRNAs, ribosomal proteins (rps and rpl), a group of proteins involved in cytochrome c biogenesis (ccb), and a number of other open reading frames (orfs) (Handa 2003 and references therein).

During angiosperm evolution, distinct lineages have experienced frequent independent transfers of mitochondrial genes to the nucleus. This has led, particularly in the case of ribosomal proteins, to a variable distribution of mitochondrially encoded genes among extant angiosperms (Adams and Palmer 2003; Sandoval et al. 2004). Intriguingly, it has recently been shown that genes lost early during evolution can be reacquired by the mtDNA in individual angiosperm lineages either via intracellular gene transfer (IGT) or via plant-to-plant horizontal gene transfer (HGT) (Bergthorsson et al. 2003).

In land plant mitochondria, protein-coding genes frequently contain introns of groups I and II (Lehmann and Schmidt 2003). In the Marchantia mtDNA 32 introns of both types are present (Oda et al. 1992), while most of the approximately 25 introns identified to date in flowering plant mitochondria (Unseld et al. 1997; Kubo et al. 2000; Notsu et al. 2002; Handa 2003) have been categorized as group II introns. The sole group I intron reported to date in the mtDNA of all vascular plants is located in the gene encoding subunit 1 of cytochrome oxidase (cox1) (Cho et al. 1998). Plant mitochondrial group II introns are found in either cis or trans configuration (Morawala-Patell et al. 1998) and can assume a canonical secondary structure with six helical domains (dI–dVI) radiating from a central core (Lehmann and Schmidt 2003 and references therein). Group II introns can be further subdivided into groups IIA and IIB according to anatomical and structural differences (Lehmann and Schmidt 2003).
The definition of functional domains of higher plant introns is based, by analogy, on data from yeast self-splicing intron models (Lehmann and Schmidt 2003). Plant mitochondrial group II introns are poorly catalytic RNAs and are incapable of self-splicing in vitro. Although many group II introns contain an orf, which encodes a multifunctional protein involved in splicing and mobility, plant mitochondrial group II introns are orf-less and most of them are derivatives of mobile introns (Lehmann and Schmidt 2003). To date, only nad1i4 has been found to contain a maturase-type (matR) gene (Wahleithner et al. 1990). It is intriguing that several group II introns evolved from cis- to trans-splicing during pteridophyte and angiosperm evolution (Malek et al. 1997; Qiu and Palmer 2004).

The pattern of intron composition and distribution among mtDNAs of green algae and land plants is consistent with the idea that the great majority of introns originated after the emergence of land plants and that the majority of the liverwort introns arose independently from their counterparts in angiosperms (Turmel et al. 2003). Although among angiosperms mitochondrial group II introns are mostly vertically inherited, the particular distribution of introns among land plants has been interpreted as suggesting plant-to-plant horizontal transfer (Won and Renner 2003). Uncovering the history of introns in plant mitochondria is thus complicated by their different modes of inheritance. Therefore, the evolutionary history of introns and the relative contributions of intron loss and acquisition in the evolution of mitochondrial genes remain, on the whole, poorly understood.

Regulation of land plant mitochondrial gene expression is posttranscriptionally mediated both at the level of splicing of intronic sequences and through an mRNA editing process that rectifies DNA genetic information so that functionally competent and evolutionarily conserved polypeptides are translated (reviewed by Gray 2003). These nucleotide amendments are enzymatic transformations of cytidines (Cs) to uridines (Us) and, occasionally, of Us to Cs in the mRNAs (Gray 2003). Editing of mitochondrial transcripts is widespread within the land plants, occurring in all major groups, including the Bryophyta (Gray 2003). Nonetheless, as studies of RNA editing have been extended to a broader selection of land plants, some intriguing differences have started to emerge (Perrotta et al. 1996; Lu et al. 1998).

To date, no extensive molecular studies have considered the mitochondrial genomes of extant gymnosperms. These land plants occupy an evolutionary position between ferns and angiosperms, a key node in plant evolution, particularly with respect to the study of seed plant origins and early divergence (Pryer et al. 2001).
Therefore, our analysis has been extended to one of the major, but understudied, land plant lineages: the cycads, historically treated as the most primitive extant seed plants and, thus, often termed living fossils (Rai et al. 2003). Here we focus on the rps3 gene in Cycas revoluta mitochondria and report its structure, its genomic organization in a cluster, and its cotranscription with the rps19 and rpl16 genes. Furthermore, RT-PCR analysis demonstrated that the Cycas rps19–rps3–rpl16 transcripts undergo accurate processing by RNA editing and splicing. To gain insights into evolutionary affiliations between land plants, we also report here, for the first time, the rps3 locus in the distantly related flowering plants Magnolia and Helianthus.

Table 1. Plant species sampled for this study and sizes and GenBank accession numbers of the rps3i1 sequences

Classification    Species                 rps3i1 size (bp)    GenBank accession No.
Angiosperms
  Monocots        Oryza sativa            1847                D21251
                  Zea mays                1843                U96618
  Eudicots        Petunia hybrida         1475                X67028
                  Oenothera berteriana    1649                X69140
                  Alnus maritima          1732                AF080077
                  Betula pendula          1599                AF080081
                  Carpinus caroliniana    1672                AF080083
                  Quercus rubra           1650                AF080088
                  Arabidopsis thaliana    1580                NC001284
                  Helianthus annuus       976                 AF319170
                  Magnolia liliiflora     1866                AF319171
Gymnosperms       Cycas revoluta          2984                AY345867

Unexpectedly, comparative analysis revealed that, in contrast to other plant species, the rps3 orf in Cycas mitochondria harbors a novel second cis-splicing intron in addition to the group II intron present at the same location in angiosperms. This extra rps3 intron seems to be positionally conserved in other representatives of gymnosperms. These original findings and their implications concerning the evolutionary history of the land plant mitochondrial genome and its group II introns are discussed.

Materials and Methods

Plant Material

Ovules and leaves of Cycas revoluta were mostly provided by the Dipartimento di Scienze Botaniche at the Università degli Studi of Pisa (Italy) and, partly, by the Orto Botanico of the Facoltà di Scienze at the Università degli Studi "Federico II" of Naples (Italy).

Isolation, Analysis, and Sequencing of Nucleic Acids

Total cellular DNA from Cycas leaves was extracted following the CTAB procedure according to Doyle and Doyle (1990). Mitochondrial nucleic acids were extracted from ovules of C. revoluta as already described (Perrotta et al. 1996). All DNA and RNA manipulations were performed following standard techniques (Sambrook and Russell 2001).

PCR Amplification, cDNA Synthesis, and Sequencing

About 0.1 µg of Cycas mtDNA was amplified with external primers specific for the rps19–rps3–rpl16 locus (see supplementary Fig. 1) and AmpliTaq Gold (Perkin Elmer). Amplification conditions were as follows: denaturation for 9 min at 94°C, followed by 30 cycles of 30 s at 94°C, 30 s at 58°C, and 3 min at 72°C, and a last elongation step of 7 min at 72°C. PCR amplification products of Cycas mtDNA were purified using a QIAquick PCR Purification Kit or QIAquick Gel Extraction Kit (QIAGEN) and sequenced on both strands with an automated DNA sequencer (ABI Prism 310, Applied Biosystems, USA).

First-strand cDNA synthesis was performed on 1 to 5 µg of CsCl-purified total mitochondrial RNA from Cycas, after extensive treatment with RNase-free DNase I (GIBCO-BRL), with random hexamers and according to the instructions of the SuperScript Preamplification System for First Strand cDNA Synthesis. The resulting cDNAs were amplified by PCR with the same primers (see supplementary Fig. 1) and conditions as described for mtDNA amplification, except that the number of amplification cycles was raised to 35. RT-PCR products were directly sequenced as described above.
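
As a rough planning aid, the cycling conditions given above can be written down as data and the programmed hold time summed. The sketch below is not part of the original protocol: the names Step and program_seconds are hypothetical, and ramping time between temperatures is ignored, so real run times will be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold of the thermal program (hypothetical helper type)."""
    name: str
    temp_c: int
    seconds: int


def program_seconds(initial, cycle, final, n_cycles):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Values taken from the amplification conditions stated in the text.
INITIAL = Step("initial denaturation", 94, 9 * 60)
CYCLE = [
    Step("denaturation", 94, 30),
    Step("annealing", 58, 30),
    Step("extension", 72, 3 * 60),
]
FINAL = Step("final elongation", 72, 7 * 60)

mtdna_minutes = program_seconds(INITIAL, CYCLE, FINAL, 30) / 60  # mtDNA: 30 cycles
cdna_minutes = program_seconds(INITIAL, CYCLE, FINAL, 35) / 60   # cDNA: 35 cycles
print(mtdna_minutes, cdna_minutes)
```

Under these assumptions the five extra cycles used for cDNA add 20 min of hold time to the program.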

Sequence Analysis

The novel Cycas revoluta, Helianthus annuus and Magnolia liliiflora DNA and cDNA sequences were deposited in the database under accession nos. AY345867, AF319170, and AF319171, respectively. The Cycas as well as the Helianthus and Magnolia mtDNAs, cDNAs, and deduced amino acid sequences were compared with the sequences available in the databases (EMBL, GenBank). To analyze the rps3i1 in land plants, nine sequences were retrieved from GenBank using the accession numbers listed in Table 1. The rps3i1 sequences were first analyzed pairwise with the DotPlot program of the GCG package version 9.1 to identify common regions. These results were used to manually refine ambiguous regions of a Clustal W multiple alignment using the sequence editor GeneDoc (Higgins et al. 1994; Nicholas et al. 1997). To determine the secondary structure model for the Cycas rps3i1, the "domain-by-domain" approach of Kelchner was adopted (Kelchner 2000). According to this approach, each domain was folded using the domain boundary sequences and specific structural elements available from the previously predicted secondary structure of the Alnus rps3 intron (Laroche and Bousquet 1999). In addition, to validate each folded domain and to identify new base pairing when domain boundaries could not be identified, the PFOLD program was used (Knudsen and Hein 1999). The complete sequences of the two rps3 introns from Cycas revoluta were submitted as independent queries to a BLAST (Altschul et al. 1997) search against a nonredundant primary database. In addition, FASTA3 (Pearson and Lipman 1988) and BLASTX (Altschul et al. 1997) were used to identify sequences related to rps3i2. Values of thermodynamic stability for some regions of the rps3i1 and for the Cycas rps3i2 dV were estimated by means of the energy minimization method implemented in Mfold by Zuker (2003). The clustering analysis for the same dV of the Cycas rps3i2 was performed with MEGA2 (Kumar et al. 2001).
Finally, direct and inverted repeats within both Cycas rps3 intron sequences were identified by means of the Repfind program (Betley et al. 2002).
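
To make the notion of direct and inverted repeats concrete, the following minimal sketch indexes every word of length k and reports exact repeats. It is only an illustration under stated assumptions: it does not reproduce the Repfind algorithm or its statistics, the function names are hypothetical, and the test sequence is a toy string, not intron data.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only are complemented)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_repeats(seq, k):
    """Return (direct, inverted) exact repeats of length k.

    direct   maps a k-mer to all 0-based starts where it occurs (two or more).
    inverted maps a k-mer to the starts where its reverse complement occurs.
    Palindromic k-mers (equal to their own reverse complement) are skipped
    in the inverted set to keep the sketch simple.
    """
    seq = seq.upper()
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    direct = {word: pos for word, pos in index.items() if len(pos) > 1}
    inverted = {}
    for word in index:
        rc = reverse_complement(word)
        if rc != word and rc in index:
            inverted[word] = list(index[rc])
    return direct, inverted


# Toy sequence: GATTACA occurs twice, and its reverse complement TGTAATC once.
toy = "GATTACAGGGATTACACCCTGTAATC"
direct, inverted = find_repeats(toy, 7)
print(direct, inverted["GATTACA"])
```

Exact word matching is the simplest possible choice; dedicated tools additionally tolerate mismatches and assess whether the clustering of repeats is statistically significant.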

Fig. 1. Comparison of the genomic organization of the rps19, rps3, and rpl16 gene cluster in Cycas, Magnolia, and Helianthus mitochondrial DNA. Coding regions for the three ribosomal protein genes are shown as open boxes, while grey triangles indicate the two rps3 intervening sequences (rps3i1 and rps3i2). Vertical lines below each coding region identify the editing sites found in the sequenced rps19–rps3–rpl16 cDNAs of Cycas, Magnolia, and Helianthus, while vertical bars with filled squares indicate the editing events found in the rps3–rpl16 overlap in Cycas and Helianthus. Arrows show Cs found edited in all compared transcripts.

Results and Discussion

Genomic Environment and Sequence Analysis of the rps3 Locus in Cycas Mitochondria

The complete sequence (about 7 kb) of the rps3 gene and its flanking regions in Cycas mtDNA was determined by a PCR-based strategy employing sets of rps3-specific primers (see supplementary Fig. 1). The sequences of the rps3 locus from Magnolia (more than 4 kb) and sunflower (6 kb) mitochondria were also determined for further comparison. Unexpectedly, sequence analysis revealed that, in contrast to the rps3 orf in Magnolia, sunflower, and other higher plants investigated to date, the Cycas rps3 coding region consists of three exons and two intervening sequences of 2984 and 1985 bp, respectively (Fig. 1). The three exons of 74 (exon 1), 193 (exon 2), and 1473 (exon 3) bp (Fig. 1) specify an S3 polypeptide of 579 amino acids. The two Cycas rps3 introns were named rps3i1 and rps3i2 by order of appearance in the orf (Pruchner et al. 2001). Exon–intron boundaries for each intron were determined by comparing Cycas genomic and cDNA sequences from the spliced rps3 transcripts (see below). Both Cycas introns are located near the 5′-end of the orf, but while the rps3i1 is a phase 2 intron, the rps3i2 is a phase 0 intron. The insertion site of this second rps3 intron is unique to the Cycas mitochondrial genome. The rps3i2 in Cycas thus contributes novel changes in gene structure, giving rise to genetic variation upon which natural selection may act.

As deduced by Southern hybridization analyses, parallel sequencing, and nucleotide and amino acid sequence comparisons, the Cycas, Magnolia, and Helianthus mitochondrial rps3 genes share the same genomic context, with an upstream rps19 and a downstream rpl16 gene, maintaining a pattern that is highly conserved among prokaryotic, chloroplast, and plant mitochondrial genomes (Fig. 1) (Kumar 1995; Turmel et al. 2003). It is noteworthy that the rps19 encoded in sunflower mitochondria is most likely a nonfunctional pseudogene, because of scattered deletions that disturb the reading frame, introducing a frame shift and creating several consecutive stop codons. Likewise, the rps19 gene is disrupted in Arabidopsis mitochondria, where the functional S19 protein is nuclear-encoded and imported into the mitochondrion to sustain the translation apparatus of the organelles (Sánchez et al. 1996). Altogether our findings suggest that the rps19–rps3–rpl16 cluster connects intact genes in the Cycas and Magnolia mitochondrial genomes but not in Helianthus.
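
The exon lengths reported above for Cycas rps3 are enough to check the stated intron phases and polypeptide length by simple arithmetic. The sketch below assumes that the exon 3 length includes the stop codon (an assumption, not stated in the text); the function names are hypothetical.

```python
EXON_LENGTHS_BP = [74, 193, 1473]  # exons 1-3 of Cycas rps3, as reported in the text


def intron_phases(exon_lengths):
    """Phase of each intron: number of coding bases upstream of it, modulo 3."""
    phases, upstream = [], 0
    for length in exon_lengths[:-1]:
        upstream += length
        phases.append(upstream % 3)
    return phases


def protein_length(exon_lengths):
    """Residues encoded, assuming the last codon of the orf is a stop codon."""
    cds = sum(exon_lengths)
    if cds % 3:
        raise ValueError("coding length is not a multiple of three")
    return cds // 3 - 1


phases = intron_phases(EXON_LENGTHS_BP)      # one entry per intron: rps3i1, rps3i2
residues = protein_length(EXON_LENGTHS_BP)
print(phases, residues)
```

With these inputs rps3i1 falls after the second base of a codon (phase 2), rps3i2 falls between codons (phase 0), and the 1740-bp coding sequence corresponds to 579 residues, in agreement with the values given in the text.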

Processing of rps19–rps3–rpl16 Transcripts in Cycas Mitochondria

Cycas rps19–rps3–rpl16 cDNAs covering the entire coding region were obtained by RT-PCR using a specific primer set (see supplementary Fig. 1), sequenced, and then compared to the genomically encoded rps19, rps3, and rpl16 orfs. RT-PCR analysis established that the rps19, rps3, and rpl16 genes were transcribed together as polycistronic mRNAs in Cycas, Magnolia, and Helianthus mitochondria (Fig. 1), as well as in other higher plants (Kumar 1995). In addition, our RT-PCR approach (see supplementary Fig. 1) did not yield any detectable PCR amplicon derived from partially spliced rps3 mRNA molecules (data not shown), indicating that rps3 transcripts are efficiently spliced in Cycas as well as in Magnolia and Helianthus mitochondria (Fig. 1). Correct excision of both Cycas rps3 group II introns, rps3i1 and rps3i2, from the primary transcript in vivo thus appears to play an important role in the evolution of the rps3 gene in mitochondria and the host organism. The insertion of an additional intron within the rps3 gene may be crucial for mRNA stability or may provide a new advantageous mechanism to control gene expression at the posttranscriptional level in Cycas mitochondria.

Sequence analysis of the cDNA population derived from the Cycas rps19–rps3–rpl16 transcripts established that the transcripts are accurately processed by mRNA editing. Pronounced variations in the frequency of RNA edited sites were observed between the rps19–rps3–rpl16 transcripts from Cycas, Magnolia, and sunflower (Fig. 1). Forty-seven edited positions occur in the rps19–rps3–rpl16 transcript of Cycas, versus the 30 and 18 sites found in Magnolia and Helianthus, respectively (Fig. 1). Interestingly, the Cycas rps3 gene is the most extensively edited, involving 28 C-to-U nucleotide transitions with a rather uneven distribution along the three rps3 exons (Fig. 1).
Most of the edits observed are nonsynonymous substitutions restoring phylogenetically conserved amino acid residues. As a result, the overall similarity of the Cycas encoded S19, S3, and L16 polypeptides increases in the comparison with their counterparts in other plant and nonplant mitochondria and in eubacteria (Bock et al. 1994). This overall high degree of nucleotide conservation of the Cycas rps19–rps3–rpl16 transcripts reflects strong functional constraints on the encoded amino acid sequences. As a corollary, the Cycas rps19–rps3–rpl16 gene cluster, whose transcript undergoes processing in the form of splicing and RNA editing, is likely to be functional.

Our results confirm that mRNA editing occurs more frequently in the Cycas rps19, rps3, and rpl16 transcripts than in the counterparts of several flowering plants so far investigated, further supporting previous observations on the high frequency of RNA editing in gymnosperm mitochondria (Perrotta et al. 1996; Lu et al. 1998). All of the aforementioned evidence suggests that, in gymnosperms, mitochondrial gene expression is regulated much more efficiently at the RNA than at the DNA level. It appears that the mitochondrial rps19–rps3–rpl16 gene cluster is evolving much faster in the long-lived Cycas than in the investigated angiosperms. The editing-related RNA sequence evolution in gymnosperm mitochondria might induce an accelerated evolution of the Cycas rps19–rps3–rpl16 genomic locus by allowing accumulation of T-to-C transitions at the mtDNA level, compensating for the slowdown caused by the long generation time of these land plants (Perrotta et al. 1996; Lu et al. 1998).
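
The comparison of genomic and cDNA sequences used above to call editing sites can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the two sequences are already aligned without gaps, in frame, and of equal length, and it translates with the standard genetic code. The function name and the nine-base example are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def editing_sites(genomic, cdna):
    """List C-to-U and U-to-C differences between a genomic orf and its cDNA.

    cDNA is read as DNA, so a C-to-U edit appears as C in genomic and T in cDNA.
    Positions are 1-based; 'silent' is True when the codon still encodes the
    same amino acid after editing.
    """
    genomic, cdna = genomic.upper(), cdna.upper()
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("expected aligned, in-frame sequences of equal length")
    sites = []
    for i, (g, c) in enumerate(zip(genomic, cdna)):
        if (g, c) not in (("C", "T"), ("T", "C")):
            continue
        start = i - i % 3  # first base of the codon containing position i
        before = CODON_TABLE[genomic[start:start + 3]]
        after = CODON_TABLE[cdna[start:start + 3]]
        sites.append({
            "pos": i + 1,
            "change": "C-to-U" if g == "C" else "U-to-C",
            "aa": f"{before}->{after}",
            "silent": before == after,
        })
    return sites


# Hypothetical toy orf: one nonsynonymous edit (CCA -> CTA) and one silent edit (TTC -> TTT).
sites = editing_sites("ATGCCATTC", "ATGCTATTT")
for site in sites:
    print(site)
```

Counting the entries with "silent" set to False gives the number of nonsynonymous edits in this simplified scheme; ambiguous bases would raise a KeyError and need separate handling.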

Structural and Molecular Evolutionary Features of the Two rps3 Introns

A BLAST search (Altschul et al. 1997) indicated that the Cycas rps3i1 is most similar to the previously well-characterized group II intron present at the same position within the rps3 gene of several angiosperms (Fig. 1) (Laroche and Bousquet 1999). Interestingly, comparison among a set of 12 mitochondrial rps3 introns from different monocots and dicots, including the Magnolia and sunflower introns reported here, revealed that the rps3i1 of the gymnosperm Cycas, with a size of 2984 bp, is the largest rps3i1 discovered to date (Table 1). Conversely, the rps3 intron sequence from the mitochondrial genome of Helianthus is only 976 bp long and appears to be the most reduced in size among the analyzed eudicots (Table 1). The remarkable size heterogeneity between the rps3i1 sequences is mainly attributable to large indels. Only a few differences in primary structure and a small number of differences in the arrangement of direct and inverted repeats were detected among the compared mitochondrial rps3i1 sequences. A secondary structure model for the rps3i1 intron sequence of Cycas was obtained by means of the "domain-by-domain" approach combined with the PFOLD program (Knudsen and Hein 1999) (see Materials and Methods). These combined approaches allowed us to identify for the Cycas mitochondrial rps3i1 a novel reliable secondary-structure model in complete accordance with the group IIA intron model proposed by Michel et al. (1989) (Fig. 2).

Fig. 2. RNA secondary structure model of the mitochondrial rps3i1 intron in Cycas revoluta. The model was predicted from the rps3 gene nucleotide sequence. Roman numerals (I–VI) indicate the conserved domains of group II introns (dI to dVI in the text). According to the accepted secondary structure model for group IIA introns (Michel et al. 1989), the rps3i1 shows six major structural helices radiating from a central wheel of single-stranded segments. External and internal binding sites (EBS and IBS), the dVI bulging adenine shown with an asterisk, and the γ–γ′ interactions are also depicted in the model. Large indels of 190, 155, 208, and 123 bp, distinguishing the Cycas rps3i1, were found in the loop of dIV, which appeared to be the largest and the most variable domain. In contrast, the dII was the smallest but the most conserved. Loops are not drawn to scale and numbers inside the loops indicate their size. With respect to the dIII of the Alnus rps3 intron (Laroche and Bousquet 1999), a more specific and reliable base-pairing was identified for the corresponding domain in Cycas, showing a better fit with the mitochondrial consensus as indicated by boldfaced nucleotides.

According to this model the dII was the smallest domain, while the dIV, comprising 70% of the total intronic sequence length, was the largest (Fig. 2). Interestingly, several differences between the predicted folding of the dIII of the Cycas rps3i1 and the structural model previously proposed for the Alnus rps3 intron (Laroche and Bousquet 1999) were noted. However, given that this different base-pairing was the only one feasible and the most conserved for the dIII of the Cycas rps3i1, we believe that the dIII folding of the Alnus rps3 intron reported earlier (Laroche and Bousquet 1999) can be considered an exception rather than the rule. Furthermore, a site similarity plot (Fig. 3) of the multiple structural alignment for rps3i1 from the plant species listed in Table 1 demonstrated that sites involved in domain base-pairing are among those most highly conserved. As a consequence, a different substitution pattern should be expected among the six domains of the rps3i1. A higher substitution rate was, indeed, detected by MEGA2 software (Kumar et al. 2001) in domains I, VI, and the large and variable dIV, whereas comparable evolutionary dynamics were detected for the II and III domains (Z-test, p < 0.05). On the whole, the overall number of substitutions per site calculated for the rps3i1 of the analyzed plant species (0.105 ± 0.006) (supplementary Table 1) was in agreement with the previously estimated rate values for several mitochondrial group II introns (Laroche et al. 1997), but it was also comparable to the rate of nonsynonymous nucleotide substitutions per site of different mitochondrial exons (Laroche et al. 1997). Relative rate tests, conducted with the RRTREE program (Robinson-Rechavi and Huchon 2000) with Cycas as reference taxon, revealed a higher substitution rate per site in monocots than in eudicots (p < 0.05) (see the matrix in supplementary Table 1).

In parallel, the primary and secondary structural features of the 1.985-kb additional intervening sequence within rps3 in the Cycas mitochondrial genome (rps3i2) were analyzed. Sequence analysis revealed an unexpectedly high similarity (97%) between a 38-bp stretch of rps3i2 and rps3i1, showing a secondary structure (Fig. 4A) consistent with those previously published for dV of plant mitochondrial group II introns (Knoop et al. 1994). Most surprisingly, FASTA results revealed that the 3′-portion of the Cycas rps3i2 was highly similar to other group II introns interrupting plant mitochondrial genes (Fig. 4A). The highest score was on a 233-bp stretch belonging to the cobi2 of Marchantia (Ohyama et al. 1993). Furthermore, matches higher than 70 and 80% were observed on a 120-bp stretch in the nad5i1 either in Vicia (Scheepers et al. 2001) or in Beta (Kubo et al. 2000) and on a 76-bp stretch of the nad7i2 of the moss Takakia mitochondria (Pruchner et al. 2001), respectively. Interestingly, secondary structure modeling of all the aforementioned sequences revealed the presence of integral and well-conserved dV of group II introns (Fig. 4A). This high fit with the consensus (only 3 mismatches on 34 compared base pairs), and the above-reported essential features of the dV in the newly identified Cycas rps3i2, demonstrate unequivocally that it belongs to the well-characterized group II introns. The significance of the identification of a dV is confirmed by the presence of an adjacent dVI (Fig. 4B) that, although less canonical, shows a six-nucleotide helix possessing a bulging adenosine which allows prediction of the 3′-splice site (Lehmann and Schmidt 2003). Indeed, the helical dV and dVI near the splice site are the most highly conserved structures of group II introns and have already been successfully used as a specific marker for group II intron identification (Knoop et al. 1994).

Unfortunately, despite a conserved sequence at the 5′- (GGGYG) and 3′- (AT) ends, an extensive search failed to identify other intron structural motifs or tertiary interactions such as EBS1–IBS1 and EBS2–IBS2, confirming the high structural flexibility existing among organellar group II intervening sequences (Lehmann and Schmidt 2003). As a result, the lack of similarity of the rps3i2 5′-portion to any well-known group II intron registered in databases prevented the application of standard comparative methods to obtain a complete secondary structure for the Cycas rps3i2. However, the FASTA expectation values (E values ranging from 2.6e-10 to 3.8e-05), which indicate nonrandom similarity, together with the conservative nature in primary and secondary RNA structure of the retrieved stretches, suggest a strong selective pressure on the rps3i2 dV due to functional constraints. Even in the absence of additional strong similarities with other intron domains, the significance of rps3i2 dV similarities could possibly be explained by a common origin for both rps3 introns and the other cis-splicing group II introns or, at least, for their 3′-region including dV (Fig. 4A). The evolutionary relationship between Cycas rps3i1 and rps3i2 dV as well as between these Cycas domains and their counterparts in Marchantia cobi2 (Ohyama et al. 1993) and Takakia nad7i2 (Pruchner et al. 2001) is depicted by the cluster analysis in Fig. 4A. Unlike the Cycas rps3i1, the second rps3 intervening sequence contains numerous repeats, mainly located in its 5′-region. Over a stretch of 123 nucleotides four direct repeats, each 28 bp long, have been found, suggesting a high potential for recombination for this intronic region.

Fig. 3. Similarity plot of mitochondrial rps3i1 sequences from land plants. The mitochondrial rps3i1 sequences listed in Table 1 were aligned according to the combined DotPlot and Clustal W approach (Higgins et al. 1994; Nicholas et al. 1997). The highest identity scores (>0.8) were found for the 3′-portion of the dI (P1), the dII and dIII (P2), and the dV and dVI (P3) within the multiple alignment of rps3i1 sequences from the species listed in Table 1. Reduced similarity scores were detected in the dIV, where two hypervariable regions (Hp1–2) with identity less than 0.2 were identified. The unique large insertions (I1–4) of Cycas rps3i1 were also located in the large dIV and they corresponded to four regions with no identity in the similarity plot. A bar indicating nucleotide positions and domain boundaries is depicted above the plot.
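
A site similarity plot of the kind shown in Fig. 3 can be approximated by scoring each alignment column and smoothing with a sliding window. The text does not state how the identity score was computed, so the definition used below (fraction of sequences sharing the most common residue, with gaps counted as mismatches) and the window size are assumptions, and the three-sequence alignment is a toy example.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences sharing the most common residue in one column.

    Gaps ('-') never count as matches, so insertions unique to one sequence
    score low. This scoring rule is an assumption made for the sketch.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    return Counter(residues).most_common(1)[0][1] / len(column)


def similarity_profile(alignment, window):
    """Sliding-window mean of per-column identity over a multiple alignment."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = [column_identity(col) for col in zip(*alignment)]
    return [sum(scores[i:i + window]) / window
            for i in range(length - window + 1)]


# Toy alignment of three sequences; real input would be the rps3i1 alignment.
toy_alignment = ["ACGTAC", "ACGTTC", "ACG-GC"]
profile = similarity_profile(toy_alignment, window=3)
print([round(value, 3) for value in profile])
```

Regions whose windowed score stays above a chosen threshold (0.8 in Fig. 3) would be reported as conserved, and stretches near zero would correspond to lineage-specific insertions.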

Fig. 4. A Similarity relationships between the dV regions of homologous land plant cis-group II introns detected by FASTA. B The secondary structure model of Cycas rps3i2 dV and dVI. The dV consists of one proximal and one distal stem, of 9 and 5 bp, respectively, joined by a dinucleotide insertion (AC), with the distal stem closed by a four-base loop. The strictly conserved positions (nucleotides 2–4), the AGC triads, and nucleotides 18 (A) and 31 (G), important for dV function, have also been detected (Qin and Pyle 1998). An asterisk indicates the bulged A residue in dVI.

In addition, an advanced search in protein databases, and further investigations by means of InterPro, showed a high similarity between a stretch of 214 bp upstream of the dV of the Cycas rps3i2 and the ORF760 present on the Chara mtDNA (Turmel et al. 2003), which harbors two functional domains for a maturase and a reverse transcriptase. It thus appears that the ancestral group II intron that gave rise to the Cycas rps3i2 likely encoded a multifunctional orf that is now partly degenerated. Therefore, the Cycas rps3i2 could belong to a family of retroelements of still unknown origin (Lehmann and Schmidt 2003).

Evolutionary Considerations on the rps3 Intron Composition and Distribution

During evolution of land plants the rps3 locus has undergone multiple changes in its intron content. In the liverwort Marchantia (Oda et al. 1992) the rps3 gene is devoid of introns, while the rps3 genes from Chara and the angiosperms investigated to date (Handa 2003; Turmel et al. 2003), including Magnolia and Helianthus reported here, harbor one positionally conserved intron. Only in the Beta mtDNA is rps3 without introns (Kubo et al. 2000). Surprisingly, the rps3 orf in Cycas harbors two group II introns, rps3i1 and rps3i2. To the best of our knowledge this is the first report of an additional group II intron within the mitochondrial rps3 gene in plants. According to a PCR assay with rps3i2-specific primers (see supplementary Fig. 1 and data not shown), the presence of rps3i2 appears to be a shared feature of the mitochondrial rps3 gene in Cycas and Ginkgo and, thus, a distinctive intron signature in gymnosperms. An upcoming larger-scale survey of additional gymnosperms, closest relatives to seed plants, and basal lineages of vascular plants for the presence of rps3i2 will allow us to determine the distribution pattern and to identify the point of acquisition of this intron.

Consistent with the results reported in this study, the currently known distribution of the rps3 introns raises several alternative evolutionary scenarios, although the lack of data for extant relatives of seed plants does not allow a selection between them. The presence of closely related rps3i1 at identical positions in the Chara and land plant rps3 genes suggests an earlier gain of this intron in a common ancestor of algae and land plants (Turmel et al. 2003). Although an intron positionally and structurally homologous with rps3i1 is not present in the Marchantia mtDNA (Oda et al. 1992), it has been suggested that the charalean rps3 intron might have given rise to its seed plant homolog via vertical descent (Turmel et al. 2003). To complete this picture, we have to consider a subsequent step of complete loss of the rps3i1 at least once during angiosperm evolution, within the time frame of the evolutionary diversification of the lineage leading to Beta. Several cases of intron loss have, indeed, been documented for land plant species (Qiu et al. 1998; Beckert et al. 1999; Pruchner et al. 2002).
The serendipitous finding of a second stable intron (rps3i2) at a novel insertion site of the rps3 gene in gymnosperms seems to suggest that this intron was independently gained in gymnosperms, likely at the time of, or soon after, the divergence of the angiosperms. The absence of a rps3i2 homolog from relatives in either algal lineages or Marchantia also seems to be consistent with a relatively recent acquisition of the rps3i2 by gymnosperms. Another possible evolutionary scenario is an ancestral gain of the rps3i2 in the last common progenitor of seed plants followed by its subsequent loss in distinct lineages at the time of the evolutionary appearance of angiosperms. Therefore, since cycads and angiosperms seem not to share a direct common ancestor (Rai et al. 2003), both rps3i1 and rps3i2 would have been retained exclusively throughout gymnosperm evolution. Lineage-specific selective pressure or dependence on specific host-encoded splicing factors would have contributed to the loss of rps3i2 in angiosperms and other land plant lineages. Therefore, our current knowledge about the distribution of the rps3i2 in land plants does not allow us to infer how this intron was inherited. Thus, the origin of the second rps3 intron present in the gymnosperm mtDNA remains enigmatic. However, the lack of similarity of Cycas rps3i2 to any known sequences registered so far in databases does not provide direct support for horizontal transfer and points to a vertical descent from an ancestral mitochondrial genome containing it during the evolution of land plants. Interestingly, a mitochondrial ancestry of the rps3i2 is also suggested by the fact that its dV is closely related to the corresponding domain of a group of other introns, some of which are located in different genes, or in the same mitochondrial gene at a different insertion site, in mitochondria of bryophytes such as Marchantia (Oda et al. 1992) and Takakia (Pruchner et al. 2001), as well as in angiosperms (Fig. 4A). This result, in addition to the similarity to the Chara ORF760 with maturase and reverse transcriptase functions (Turmel et al. 2003), suggests a common evolutionary origin of all those group II introns and also suggests that they have characteristics of elements mobile between different mitochondrial genes (Ohyama et al. 1993; Zanlungo et al. 1995). Limiting the number of speculations and considering the best-known mechanisms cited to explain intron gain or loss in plant mitochondria (Lehmann and Schmidt 2003), we believe that the simplest explanation is that rps3i2 has been acquired in gymnosperms via intragenomic transposition or, alternatively, through reverse transcriptase–mediated movement into novel mtDNA sites (retrotransposition). The origins of most of the group II introns in Chara mtDNA and several group II introns in Marchantia mtDNA (Ohyama et al. 1993) have also been attributed to intragenomic transposition events (Turmel et al. 2003).
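The presence/absence pattern summarized in this section can be tabulated and the minimum number of gain or loss events counted with a small Fitch-parsimony sketch in Python. This is an illustration only, not an analysis performed in this study. The tree topology below is an assumption introduced for the example (Chara as outgroup, Marchantia as the earliest-branching land plant, Helianthus and Beta as sister taxa relative to Magnolia), and Ginkgo is omitted because its rps3i1 status is not stated in the text.

```python
def fitch(tree, states):
    """Fitch parsimony on a nested-tuple binary tree.

    Returns (possible_states_at_node, minimum_number_of_state_changes).
    Leaves are taxon names (str); internal nodes are 2-tuples.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# ASSUMED topology, used only for this illustration.
TREE = ("Chara", ("Marchantia", ("Cycas", ("Magnolia", ("Helianthus", "Beta")))))

# 1 = intron present, 0 = intron absent, as described in the text.
RPS3I1 = {"Chara": 1, "Marchantia": 0, "Cycas": 1,
          "Magnolia": 1, "Helianthus": 1, "Beta": 0}
RPS3I2 = {"Chara": 0, "Marchantia": 0, "Cycas": 1,
          "Magnolia": 0, "Helianthus": 0, "Beta": 0}

min_changes_i1 = fitch(TREE, RPS3I1)[1]
min_changes_i2 = fitch(TREE, RPS3I2)[1]
```

Under these assumptions the sketch needs at least two changes for rps3i1 and one for rps3i2. Fitch parsimony treats gains and losses as equally costly and does not say which of the two occurred, so it cannot by itself discriminate between the scenarios discussed above; wider taxon sampling would still be required.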
The evolutionary dynamics of the rps3 gene and its structural changes are likely to provide us with a broader evolutionary perspective for new mitochondrial genomic endeavors and diverse molecular innovations that have characterized the rich botanical diversity that dominates our terrestrial ecosystem.

Acknowledgments. T.M.R.R. is the recipient of a postdoctoral fellowship from the Italian Ministero dell'Istruzione, Università e Ricerca (MIUR). E.P. is the recipient of a fellowship from the Plant Biology Ph.D. Programme of the Università degli Studi della Calabria. The financial support of the Università degli Studi della Calabria and the Italian MIUR (Progetti di Rilevante Interesse Nazionale/1999) is gratefully acknowledged. The authors thank the anonymous reviewers for their helpful comments on the manuscript.

References

Adams KL, Palmer JD (2003) Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol 29:380–395
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Bergthorsson U, Adams KL, Thomason B, Palmer JD (2003) Widespread horizontal transfer of mitochondrial genes in flowering plants. Nature 424:197–201
Betley JN, Frith MC, Graber JH, Choo S, Deshler JO (2002) A ubiquitous and conserved signal for RNA localization in chordates. Curr Biol 12:1756–1761
Beckert S, Steinhauser S, Muhle H, Knoop V (1999) A molecular phylogeny of bryophytes based on nucleotide sequences of the mitochondrial nad5 gene. Plant Syst Evol 218:179–192
Bock H, Brennicke A, Schuster W (1994) Rps3 and rpl16 genes do not overlap in Oenothera mitochondria: GTG as a potential translation initiation codon in plant mitochondria? Plant Mol Biol 24:811–818
Cho Y, Qiu YL, Kuhlman P, Palmer JD (1998) Explosive invasion of plant mitochondria by a group I intron. Proc Natl Acad Sci USA 95:14244–14249
Doyle JJ, Doyle JL (1990) Isolation of plant DNA from fresh tissue. Focus 12:13–15
Gray MW (2003) Diversity and evolution of mitochondrial RNA editing systems. IUBMB Life 55:227–233
Handa H (2003) The complete nucleotide sequence and RNA editing content of the mitochondrial genome of rapeseed (Brassica napus L.): comparative analysis of the mitochondrial genomes of rapeseed and Arabidopsis thaliana. Nucleic Acids Res 31:5907–5916
Higgins D, Thompson J, Gibson T, Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
Kelchner SA (2000) The evolution of noncoding chloroplast DNA and its application in plant systematics. Ann Missouri Bot Gard 87:482–498
Knoop V, Kloska S, Brennicke A (1994) On the identification of group II introns in nucleotide sequence data. J Mol Biol 242:389–396
Knudsen B, Hein JJ (1999) Using stochastic context free grammars and molecular evolution to predict RNA secondary structure. Bioinformatics 15:446–454
Kubo T, Nishizawa S, Sugawara A, Itchoda N, Estiati A, Mikami T (2000) The complete nucleotide sequence of the mitochondrial genome of sugar beet (Beta vulgaris L.) reveals a novel gene for tRNACys (GCA). Nucleic Acids Res 28:2571–2576
Kumar R (1995) Mitochondrial ribosomes and their proteins. In: Levings CS III, Vasil IK (eds) The molecular biology of plant mitochondria. Kluwer Academic, Dordrecht, pp 131–138
Kumar S, Tamura K, Jakobsen IB, Nei M (2001) MEGA2: Molecular Evolutionary Genetics Analysis software. Bioinformatics 17:1244–1245
Laroche J, Bousquet J (1999) Evolution of the mitochondrial rps3 intron in perennial and annual angiosperms and homology to nad5 intron 1. Mol Biol Evol 16:441–452
Laroche J, Li P, Maggia L, Bousquet J (1997) Molecular evolution of angiosperm mitochondrial introns and exons. Proc Natl Acad Sci USA 94:5722–5727
Lehmann K, Schmidt U (2003) Group II introns: structure and catalytic versatility of large natural ribozymes. Crit Rev Biochem Mol Biol 38:249–303
Lu MZ, Szmidt AE, Wang XR (1998) RNA editing in gymnosperms and its impact on the evolution of the mitochondrial cox1 gene. Plant Mol Biol 37:225–234
Malek O, Brennicke A, Knoop V (1997) Evolution of trans-splicing plant mitochondrial introns in pre-Permian times. Proc Natl Acad Sci USA 94:553–558
Michel F, Umesono K, Ozeki H (1989) Comparative and functional anatomy of group II catalytic introns—A review. Gene 82:5–30
Morawala-Patell V, Gualberto JM, Lamattina L, Grienenberger JM, Bonnard G (1998) Cis- and trans-splicing and RNA editing are required for the expression of nad2 in wheat mitochondria. Mol Gen Genet 258:503–511
Nicholas KB, Nicholas HB Jr, Deerfield DW II (1997) GeneDoc: analysis and visualization of genetic variation. EMBNEW.NEWS 4:14
Notsu Y, Masood S, Nishikawa N, Kubo N, Akiduki G, Nakazono M, Hirai A, Kadowaki K (2002) The complete sequence of the rice (Oryza sativa L.) mitochondrial genome: frequent DNA sequence acquisition and loss during the evolution of flowering plants. Mol Genet Genomics 268:434–445
Oda K, Yamato K, Ohta E, Nakamura Y, Takemura M, Nozato N, Akashi K, Kanegae T, Ogura Y, Kohchi T, Ohyama K (1992) Gene organization deduced from the complete sequence of liverwort Marchantia polymorpha mitochondrial DNA. J Mol Biol 223:1–7
Ohyama K, Oda K, Ohta E, Takemura M (1993) Gene organization and evolution of introns of a liverwort, Marchantia polymorpha, mitochondrial genome. In: Brennicke A, Kuck U (eds) Plant mitochondria. Verlag Chemie, Weinheim, Germany, pp 115–129
Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 85:2444–2448
Perrotta G, Regina TMR, Ceci LR, Quagliariello C (1996) Conserved organization of the mitochondrial nad3 and rps12 genes over evolutionarily distant angiosperms. Mol Gen Genet 251:326–337
Pryer KM, Schneider H, Smith AR, Cranfill R, Wolf PG, Hunt JS, Sipes SD (2001) Horsetails and ferns are a monophyletic group and the closest living relatives to seed plants. Nature 409:618–622
Pruchner D, Beckert S, Muhle H, Knoop V (2002) Divergent intron conservation in the mitochondrial nad2 gene: signatures for the three bryophyte classes (mosses, liverworts, and hornworts) and the lycophytes. J Mol Evol 55:265–271
Pruchner D, Nassal B, Schindler M, Knoop V (2001) Mosses share mitochondrial group II introns with flowering plants, not with liverworts. Mol Genet Genomics 266:608–613
Qin PZ, Pyle AM (1998) The architectural organization and mechanistic function of group II intron structural elements. Curr Opin Struct Biol 8:301–308
Qiu Y-L, Palmer JD (2004) Many independent origins of trans splicing of a plant mitochondrial group II intron. J Mol Evol 59:80–89
Qiu Y-L, Cho Y, Cox JC, Palmer JD (1998) The gain of three mitochondrial introns identifies liverworts as the earliest land plants. Nature 394:671–674
Rai HS, O'Brien HE, Reeves PA, Olmstead RG, Graham SW (2003) Inference of higher-order relationships in the cycads from a large chloroplast data set. Mol Phylogenet Evol 29:350–359
Robinson-Rechavi M, Huchon D (2000) RRTree: relative-rate tests between groups of sequences on a phylogenetic tree. Bioinformatics 16:296–297
Sambrook J, Russell DW (2001) Molecular cloning. A laboratory manual, 3rd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
Sánchez H, Fester T, Kloska S, Schroder W, Schuster W (1996) Transfer of rps19 to the nucleus involves the gain of an RNP-binding motif which may functionally replace RPS13 in Arabidopsis mitochondria. EMBO J 15:2138–2149
Sandoval P, Leon G, Gomez I, Carmona R, Figueroa P, Holuigue L, Araya A, Jordana X (2004) Transfer of RPS14 and RPL5 from the mitochondrion to the nucleus in grasses. Gene 324:139–147
Scheepers DGJM, Luo H, Boutry M (2001) Variant mitochondrial transcripts of a broad bean line are associated with two point mutations located upstream of nad5 exon c. Plant Sci 129:203–212
Turmel M, Otis C, Lemieux C (2003) The mitochondrial genome of Chara vulgaris: insights into the mitochondrial DNA architecture of the last common ancestor of green algae and land plants. Plant Cell 15:1888–1903
Unseld M, Marienfeld JR, Bret P, Brennicke A (1997) The mitochondrial genome of Arabidopsis thaliana contains 57 genes in 366,924 nucleotides. Nature Genet 15:57–61
Wahleithner JA, MacFarlane JL, Wolstenholme DR (1990) A sequence encoding a maturase-related protein in a group II intron of a plant mitochondrial nad1 gene. Proc Natl Acad Sci USA 87:548–552
Won H, Renner S (2003) Horizontal gene transfer from flowering plants to Gnetum. Proc Natl Acad Sci USA 100:10824–10829
Zanlungo S, Quinones V, Moenne A, Holuigue L, Jordana X (1995) Splicing and editing of rps10 transcripts in potato mitochondria. Curr Genet 27:565–571
Zuker M (2003) Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res 31:3406–3415