Yin et al. BMC Evolutionary Biology 2010, 10:379 http://www.biomedcentral.com/1471-2148/10/379
RESEARCH ARTICLE
Open Access
Evolution of plant phage-type RNA polymerases: the genome of the basal angiosperm Nuphar advena encodes two mitochondrial and one plastid phage-type RNA polymerases
Chang Yin2, Uwe Richter3, Thomas Börner1, Andreas Weihe1*
Abstract
Background: In mono- and eudicotyledonous plants, a small nuclear gene family (RpoT, RNA polymerase of the T3/T7 type) encodes mitochondrial as well as chloroplast RNA polymerases homologous to the T-odd bacteriophage enzymes. RpoT genes from angiosperms are well characterized, whereas data from deeper branching plant species are limited to the moss Physcomitrella and the spikemoss Selaginella. To further elucidate the molecular evolution of the RpoT polymerases in the plant kingdom and to gain more insight into the potential importance of having more than one phage-type RNA polymerase (RNAP) available, we searched for the respective genes in the basal angiosperm Nuphar advena.
Results: By screening a set of BAC library filters, three RpoT genes were identified. Both genomic gene sequences and full-length cDNAs were determined. The NaRpoT mRNAs specify putative polypeptides of 996, 990 and 985 amino acids, respectively. All three genes comprise 19 exons and 18 introns, conserved in their positions with those known from RpoT genes of other land plants. The encoded proteins show a high degree of conservation at the amino acid sequence level, including all functionally crucial regions and residues known from the phage T7 RNAP. The N-terminal transit peptides of two of the encoded polymerases, NaRpoTm1 and NaRpoTm2, conferred targeting of green fluorescent protein (GFP) exclusively to mitochondria, whereas the third polymerase, NaRpoTp, was targeted to chloroplasts. Remarkably, translation of NaRpoTp mRNA has to be initiated at a CUG codon to generate a functional plastid transit peptide. Thus, besides AGAMOUS in Arabidopsis and the Nicotiana RpoTp gene, N. advena RpoTp provides another example of a plant mRNA that is exclusively translated from a non-AUG codon.
In contrast to the RpoT of the lycophyte Selaginella and those of the moss Physcomitrella, which, according to phylogenetic analyses, are in sister positions to all other phage-type polymerases of angiosperms, the Nuphar RpoTs clustered with the well separated clades of mitochondrial (NaRpoTm1 and NaRpoTm2) and plastid (NaRpoTp) polymerases.
Conclusions: Nuphar advena encodes two mitochondrial and one plastid phage-type RNAP. Identification of a plastid-localized phage-type RNAP in this basal angiosperm, orthologous to all other RpoTp enzymes of flowering plants, suggests that the duplication event giving rise to a nuclear gene-encoded plastid RNA polymerase, not present in lycopods, took place after the split of lycopods from all other tracheophytes. A dual-targeted mitochondrial and plastid RNA polymerase (RpoTmp), as present in eudicots but not monocots, was not detected in Nuphar, suggesting that its occurrence is an evolutionary novelty of eudicotyledonous plants like Arabidopsis.
* Correspondence: [email protected]
1 Institut für Biologie, Humboldt-Universität zu Berlin, Chausseestr. 117, 10115 Berlin, Germany
Full list of author information is available at the end of the article
© 2010 Yin et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background
In the mitochondria of all eukaryotes, with the exception of jakobids, the bacterial-type RNA polymerase of the former endosymbiont has been replaced by a T-odd phage-type RNA polymerase (for review, see [1]). The mitochondrial genome of the jakobid Reclinomonas americana encodes a bacterial-type RNAP [2,3], whose expression has yet to be demonstrated. Likewise, chloroplast genomes have retained the rpoA, B, and C genes of their cyanobacterial ancestor, which encode the core subunits of the plastid-encoded plastid RNAP (PEP). Additionally, mono- and eudicotyledonous plants were found to require a second, nuclear gene-encoded plastid RNAP activity (NEP) to transcribe their chloroplast genes [1,4,5]. Phage-type RNA polymerases were identified as representing this NEP activity [6-8]. Thus, in mono- and eudicots, nuclear gene-encoded phage-type RNA polymerases (RpoT polymerases) not only transcribe the mitochondrial genome but are also involved in the transcription of the plastid genome [1,5,9]. Genes encoding phage-type RNA polymerases have been identified in the nuclear genomes of various flowering plants, such as Chenopodium album [10], Arabidopsis thaliana [7,11], Nicotiana spp. [12-14], Zea mays [15], wheat [16], barley [17], and rice [18]. The moss Physcomitrella patens contains three RpoT genes ([19,20]; genome project data, http://www.phytozome.net/physcomitrella). Two of the Physcomitrella RpoTs are potentially capable of being targeted to both mitochondria and chloroplasts [19], whereas the third gene encodes an RNAP of exclusively mitochondrial localization (U. Richter, unpublished data). Eudicots like Arabidopsis and Nicotiana harbor three phage-type RNA polymerases as well, but their localization within the cell differs from that of the Physcomitrella enzymes.
Eudicots possess a mitochondrial (RpoTm), a plastid (RpoTp) and a dual-targeted phage-type RNA polymerase (RpoTmp; [11,13,14]), the latter involved in the transcription of mitochondrial and plastid genes [21-24]. No phage-type NEP has been detected in algae thus far. In Chlamydomonas, only one RpoT gene was identified (Weihe et al., unpublished data; genome project data, http://genome.jgi-psf.org/Chlre4/Chlre4.home.html), presumably encoding a mitochondrial-localized RNAP. The single-copy RpoT genes identified in the genomes of other green algae (Ostreococcus, Micromonas) most likely encode mitochondrial RNA polymerases. Multiple phage-type RNA polymerases are found only in land plant species. Maier and colleagues [25] proposed that this feature could either be a prerequisite for the spatio-temporal regulatory needs of embryophytes and an adaptation to the peculiar requirements of a terrestrial lifestyle, or it might be the mere result of the specifics of the plant organelle genetic
systems in interaction with the nuclear genome (transgenomic suppression of point mutations). In this context it is interesting to note that the lycophyte Selaginella moellendorffii also possesses only a single RpoT polymerase, which is likely active exclusively in mitochondria [26]. Thus, there seems to be no NEP activity in the lycophytes. Like the Physcomitrella RpoTs, the Selaginella polymerase is separated in phylogenetic trees from the angiosperm clade, which forms two groups: plastid-localized enzymes on the one hand, and mitochondrial and dual-targeted polymerases on the other [1,5]. The origin of the NEP activity found in mono- and eudicots and of the dual-targeted RpoT polymerases observed in eudicots remains unclear. To gain deeper insight into the evolution of phage-type RNA polymerases in the plant lineage and to better understand the significance of multiple phage-type RNAP activities in both mitochondria and plastids, we have investigated the waterlily Nuphar advena. Together with Amborella, Liriodendron and Acorus, Nuphar is one of the most studied basal angiosperms. As one of the deepest branching angiosperms, Nuphar has become an important model plant for understanding the origin of key angiosperm innovations. Here, we report the identification and characterization of three RpoT genes from Nuphar advena. Our data indicate that Nuphar advena (and possibly other basal angiosperms) possesses two mitochondrial-localized phage-type RNAPs and already has a plastid-localized polymerase.
Results
Nuphar advena possesses three RpoT genes
Screening of a BAC library identified three different RpoT genes in N. advena. Twenty-four BAC clones hybridized with an RpoT cDNA fragment from Selaginella used as probe. PCR and sequencing suggested that they represented three similar, yet individual genes. Two of these genes have been sequenced completely, the third one in large portions, including all exons (see Figure 1). The genes were named according to the subcellular localization of their gene products (see below): NaRpoTm1, NaRpoTm2, and NaRpoTp. The sequences of the three NaRpoT genes were deposited in the EMBL database under accession numbers FN811768 (NaRpoTm1), FN820498 (NaRpoTm2) and FN811769 (NaRpoTp), respectively. The lengths of the three genes were 28.5 kb for NaRpoTm1, >16.2 kb for NaRpoTm2, and 13.6 kb for NaRpoTp.
Isolation of Nuphar RpoT cDNAs
Full-length cDNAs were obtained by RACE (rapid amplification of cDNA ends) reactions using specific primers
[Figure 1 schematic: NaRpoTm1 [FN811768], start ATG, stop TAA, 996 aa; NaRpoTm2 [FN820498], start ATG, stop TGA, 990 aa; NaRpoTp [FN811769], start CTG, stop TGA, 985 aa. Scale marks: 0 and 1000.]
Figure 1 Nuphar advena encodes three phage-type RNA polymerases. Schematic representation of the three NaRpoT genes. Coding (black) and non-coding (gray) regions are specified on the genomic sequences. Corresponding cDNA sequences, comprising the complete RpoT reading frames, are shown next to the genomic sequences. Positions of start (ATG, CTG, see text) and stop codons (TAA, TGA), as well as the length of derived polypeptides are indicated for cDNAs.
(for primer sequences, see Additional file 1) derived from the genomic sequences as shown in Figure 1. All angiosperm nuclear RpoT genes identified thus far comprise 18 introns at conserved positions [1]. Comparison of genomic and cDNA sequences (see Figure 1) shows that these 18 introns are also present, at the same insertion sites (see Figure 2), in the three Nuphar RpoT genes. None of the additional introns found in the 5’ part of the Physcomitrella and Selaginella RpoT genes were found in the Nuphar genes. The lengths of the introns vary considerably among the three Nuphar RpoTs, and most of the introns are much longer than those of other land plant RpoT genes. All exon-intron junctions contain conserved GT and AG sequences at the 5’ and 3’ ends of the introns, respectively. Remarkably, NaRpoTp did not exhibit the canonical translation start codon ATG (AUG). Instead, a CTG (CUG) codon was found at position +148, from which translation could be initiated. The following findings indicate that translation starts from this position. Stop codons in the 5’ region exclude translation initiation sites further upstream. The methionine encoded by the most upstream in-frame ATG (nt 466 of NaRpoTp) aligns to amino acid residue 125 of Arabidopsis RpoTp, and the amino terminus derived from this position displayed neither plastid nor mitochondrial targeting
properties (see below). On the other hand, the deduced amino acid sequence starting at +148 is enriched in hydroxylated amino acids but virtually lacks acidic residues, thus exhibiting features of stroma-targeting plastid transit peptides [27]. Interestingly, a translational start from a CUG codon has also been found in the RpoTp gene of tobacco [12]. Thus, we assume that translation of NaRpoTp starts from a non-canonical CUG at position +148. The predicted NaRpoT proteins comprise 996 (NaRpoTm1), 990 (NaRpoTm2) and 985 (NaRpoTp) amino acids, respectively. NaRpoTm1 and NaRpoTm2 exhibit a remarkably high identity of 96.8%; NaRpoTp shares 63.1% and 64.6% identical residues with NaRpoTm1 and NaRpoTm2, respectively. The alignment of the RpoT polymerases from N. advena with those from Arabidopsis, Physcomitrella and Selaginella (see Figure 2) demonstrates a high degree of conservation at the amino acid sequence level, most striking in the C-terminal part, including all functionally crucial regions and residues known from the phage T7 RNA polymerase [28,29].
Targeting of the N. advena RpoTm1 and RpoTm2 polymerases
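The case made in the preceding subsection for a CUG start in NaRpoTp rests on three checks that are easy to express in code: in-frame stop codons upstream of the candidate start, the position of the first in-frame ATG, and the residue composition of the deduced N-terminus. The following Python sketch illustrates these checks on a toy sequence; it is not the procedure used by the authors. The function names, the toy sequence and the choice to count only serine and threonine as hydroxylated residues are assumptions made for illustration, and positions are 0-based rather than the 1-based coordinates quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
HYDROXYLATED = set("ST")  # assumption: serine and threonine only
ACIDIC = set("DE")        # aspartate and glutamate


def upstream_inframe_stops(seq, start):
    """Return 0-based positions of stop codons upstream of `start`
    that lie in the same reading frame as `start`."""
    seq = seq.upper()
    first = start % 3  # first codon boundary in the frame of `start`
    return [i for i in range(first, start, 3) if seq[i:i + 3] in STOP_CODONS]


def first_inframe_atg(seq, start):
    """Return the 0-based position of the first in-frame ATG at or after
    `start`, or None if a stop codon or the sequence end comes first."""
    seq = seq.upper()
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon == "ATG":
            return i
        if codon in STOP_CODONS:
            return None
    return None


def residue_fractions(peptide):
    """Return (hydroxylated fraction, acidic fraction) of a peptide."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    n = len(peptide)
    hydroxylated = sum(aa in HYDROXYLATED for aa in peptide) / n
    acidic = sum(aa in ACIDIC for aa in peptide) / n
    return hydroxylated, acidic


# Toy cDNA (hypothetical, not the NaRpoTp sequence):
# TAA GCC CTG TCT TCA ACT ATG GCA TAA
toy_cdna = "TAAGCCCTGTCTTCAACTATGGCATAA"
ctg_start = 6  # 0-based position of the candidate CTG start codon

print(upstream_inframe_stops(toy_cdna, ctg_start))  # [0]
print(first_inframe_atg(toy_cdna, ctg_start))       # 18
print(residue_fractions("SSTTAALK"))                # (0.5, 0.0)
```

An in-frame stop codon upstream of the candidate start argues against initiation further 5’, and a high hydroxylated fraction together with a near-zero acidic fraction matches the transit-peptide features described above. If the CTG at +148 and the ATG at nt 466 are given in the same coordinate system, they lie 466 - 148 = 318 nt (106 codons) apart, which is consistent with both codons sharing one reading frame.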
Subcellular localization of the Nuphar RpoT gene products was predicted using the algorithms TargetP [30] http://
[Figure 2: amino acid sequence alignment of AtRpoTm, AtRpoTmp, AtRpoTp, PpRpoT1mp, PpRpoT2mp, PpRpoT3, SmRpoTm, NaRpoTm1, NaRpoTm2 and NaRpoTp, with sequence blocks I to XII labeled above the alignment.]
I H S L D G S H M M M TA V A C N R A G L S FA G V H D S F W T HA C D V D V M N T IL R E K F V E L Y E K P IL E N L L E S F Q K S F P D I S F P P L P E R G D F D L R K V L E S T Y F F N I H S L D G S H M M M TA V A C K R A G V C FA G V H D S F W T HA C D V D K L N I IL R E K F V E L Y S Q P IL E N L L E S F E Q S F P H L D F P P L P E R G D L D L K V V L D S P Y F F N V H S L D G T H M M M TA V A C R EA G L N FA G V H D S Y W T HA C D V D T M N R IL R E K F V E L Y N T P IL E D L L Q S F Q E S Y P N L V F P P V P K R G D F D L K E V L K S Q Y F F N V H S L D S T H M M M TA L A C Q EA G L T FA G V H D S Y W T HA G D V E Q M N S L L R E K F V E L Y S Q P V L E N L L K S F Q E R F P T L V F P E V PA R G D L D L K E V L R A P Y F F N V H S L D S S H M M M TA L A C S K A G L T FA G V H D S Y W T HA G D V E N M N V IL R K N F V K L Y K Q P IL E N L L L D F Q T Q F P D L V F P E V PA R G D L D L K E V L K S P Y F F N I H S L D S T H M M L TA L A S N Q A G I S FA G V H D S F W T HA G D V D V L N K L T R E K F V E L Y S Y P IL E N L L L G F Q R R YA D L T F P P V P E R G V L D IR E V L K A P Y F F N V H S M D G S H M M M TA V A C Q EA G L T FA G V H D S F W T HA G D V E R M N V IL R E K F V E L Y E Q P IL E N L L E S F Q K R W P K L K F P D L P V R G D L D L K E V L S S P Y F F N V H S L D G S H M M M TA V A C K K A G L K FA G V H D S Y W T HA C D V D E I N R IL R E K F V E L Y E Q P IL E N L L E G F Q K S F P K F S F P P L P D R G D F D L K E V L Q S T Y F F N - H S L D G S H M M M T A IA C K K I G L N F A G V H D S Y W T H A C D V D E M N R I L R K K F V E L Y E Q P I L E N L L E - S Q K S F P K L S F P P L P D R G D F D L K E V L K S P Y F F Q V H S L D G S H M M M TA V A C K L A G L N FA G V H D S Y W T HA C D V D D M SR IL R L K F V E L Y S M P IL E N L L E S F Q T S F P T L V F P P L P D R G D F D L Q E V L E S P Y F F N
976 1011 993 1087 1065 1010 1002 993 978 985
Figure 2 Comparison of the deduced amino acid sequences of RpoT polymerases. Sequences from Nuphar (NaRpoTm1, NaRpoTm2 and NaRpoTp), Selaginella (SmRpoTm), Arabidopsis (AtRpoTm, AtRpoTp and AtRpoTmp) and Physcomitrella (PpRpoT1mp, PpRpoT2mp and PpRpoT3) were aligned using ClustalW. Accession numbers are as follows: AtRpoTm, P92969; AtRpoTmp, CAC17120; AtRpoTp, O24600; PpRpoTmp1, CAC95163; and PpRpoTmp2, CAC95164. PpRpoT3 is an RpoT amino acid sequence derived from the database of the Physcomitrella patens genome project http://www.phytozome.net/physcomitrella. In silico analysis of the genome as well as expressed sequence tag (EST) data strongly suggest that the sequence, designated as PpRpoT3, is a product of an RpoT gene with the conserved intron-exon structure of land plants that encodes a functional RNA polymerase (U. Richter, unpublished data). Black lines indicate conserved blocks in the RpoT polymerase family; functionally crucial residues [28,29] are indicated by asterisks. The position of common introns is designated by filled triangles and PpRpoT2mp-specific introns by open triangles. Conserved amino acid positions (60%) are shaded.
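The 60% shading rule mentioned in the caption can be illustrated with a short sketch. It assumes that a column counts as conserved when its most common residue occurs in at least 60% of the aligned rows and that gaps never count as residues; the exact rule used by the shading program is not stated in the text. The function name `conserved_columns` and the five toy rows are hypothetical and are not taken from the alignment.

```python
from collections import Counter


def conserved_columns(rows, threshold=0.6):
    """Return one boolean per alignment column.

    A column is flagged True when its most common residue occurs in at
    least `threshold` of all rows. Gap characters ('-') are never counted
    as a residue (an assumption of this sketch).
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    flags = []
    for column in zip(*rows):
        counts = Counter(res for res in column if res != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        flags.append(top / len(rows) >= threshold)
    return flags


# Hypothetical toy alignment (five rows, seven columns).
toy_alignment = [
    "HNVDFRG",
    "HNMDFRG",
    "HNLDFRG",
    "HNIDF-G",
    "HNLEFRG",
]
flags = conserved_columns(toy_alignment)
shading = "".join("#" if flag else "." for flag in flags)
print(shading)
```

In the toy rows only the third column falls below the threshold, because its most common residue appears in two of five rows.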
www.cbs.dtu.dk/services/TargetP and Predotar [31] http://urgi.versailles.inra.fr/predotar/predotar.html. For NaRpoTm1 and NaRpoTm2 both algorithms predicted mitochondrial import of the proteins, whereas analysis of NaRpoTp clearly indicated plastid targeting properties. To verify the subcellular localization, the amino termini of the Nuphar RpoT sequences were translationally fused to GFP (Figure 3). Assuming that translation starts from the first encoded methionine, the following constructs were generated: Na-RpoTm1met-GFP and Na-RpoTm2met-GFP, with the first encoded methionine cloned immediately downstream of the 35S promoter for forced translation initiation; Na-RpoTm1utr-GFP and Na-RpoTm2utr-GFP, containing the whole 5’ untranslated region; and Na-RpoTm1mut-GFP and Na-RpoTm2mut-GFP, in which the encoded methionine had been substituted by isoleucine (see Figure 3). The fusion proteins were expressed in Arabidopsis protoplasts. The results of the subcellular import studies are presented in Figure 4. Transformation with the mitochondrial control CoxIV-GFP [32] resulted in accumulation of GFP in punctate structures of about 1 μm in size (Figure 4A), identified as mitochondria [7,11]. A GFP fusion of the amino terminus of Arabidopsis RecA [32] was employed as a plastid control (Figure 4B). In accordance with the targeting predictions, both Na-RpoTm1-GFP and Na-RpoTm2-GFP constructs exhibited the same characteristic subcellular localization: in the case of Na-RpoTm1met-GFP (Figure 4D) and Na-RpoTm2met-GFP (Figure 4G), with forced translation from the first encoded methionine, GFP fluorescence was observed exclusively in mitochondria. The constructs containing the full-length 5’ untranslated leader sequence, Na-RpoTm1utr-GFP (Figure 4E) and Na-RpoTm2utr-GFP (Figure 4H), showed exclusive mitochondrial targeting as well. When the mutated transit peptides Na-RpoTm1mut (Figure 4F) and Na-RpoTm2mut (Figure 4I), in which recognition of the AUG codon is prevented, were used, GFP fluorescence was detectable neither in mitochondria nor in chloroplasts. It was concluded that the AUG codons at positions +177 (NaRpoTm1) and +253 (NaRpoTm2), respectively, are the only available RpoT start codons from which translation of polypeptides with mitochondrial targeting properties is initiated.

[Figure 3 diagram: met, utr and mut GFP fusion constructs (5’ UTR, RpoT amino terminus, GFP S65C) for NaRpoTm1 (labels as printed: Met 177, 295), NaRpoTm2 (Met 253, 368) and NaRpoTp (Met* 88, 236).]
Figure 3 GFP fusion constructs for targeting experiments. Amino-terminal RpoT sequences (white bars) were translationally fused to GFP S65C (green bars) in plasmid pOL (see “Methods”). The lengths of the fragments are given by nucleotide numbers (+1 is the 5’ end of the 5’UTR). The translation start is indicated by Met or Met* (CUG-coded start codon); the crossed Met (Met*) position designates the mutation introduced at that position to prevent initiation of translation.

Nuphar RpoTp translation is efficiently initiated at a CUG codon
Examination of NaRpoTp upstream sequences revealed a CTG triplet at nucleotide position +148 (see above). Translation initiation at this CUG codon would give rise to an RpoTp protein of 985 residues, the amino terminus of which was predicted in silico to possess plastid targeting properties. To experimentally test whether translation indeed initiates at this non-canonical codon, the following three Na-RpoTp-GFP constructs were generated (see Figure 3): Na-RpoTpmet*-GFP, with the wild-type CUG (+148) cloned immediately downstream of the 35S promoter for forced translation; Na-RpoTputr-GFP, containing the whole 5’ untranslated region of 236 nt and thus preserving the sequence context, known to be crucial for initiation at non-AUG codons in plants [33]; and Na-RpoTpmut-GFP, in which the CUG was modified to CAC to prevent its recognition as a start codon. The Na-RpoTpmet*-GFP construct gave rise to green GFP fluorescence in chloroplasts that overlapped with the red chlorophyll autofluorescence, confirming co-localization of red and green fluorescence in chloroplasts (Figure 4J). An identical fluorescence pattern was observed using construct Na-RpoTputr-GFP (Figure 4K), whereas expression of Na-RpoTpmut-GFP (Figure 4L) completely abolished import of GFP into the chloroplasts. These data provide convincing evidence that translation of NaRpoTp is solely initiated from the CUG codon at position +148.

Figure 4 Subcellular localization of NaRpoT gene products. Confocal laser scanning microscopy of transformed Arabidopsis protoplasts. The images depict fluorescence patterns (merged green and red channels) of control constructs targeting GFP to mitochondria (A), plastids (B), vector control containing no transit peptide (C), Na-RpoTm1met-GFP (D), Na-RpoTm1utr-GFP (E), Na-RpoTm1mut-GFP (F), Na-RpoTm2met-GFP (G), Na-RpoTm2utr-GFP (H), Na-RpoTm2mut-GFP (I), Na-RpoTpmet*-GFP (J), Na-RpoTputr-GFP (K) and Na-RpoTpmut-GFP (L). Scale bar = 10 μm.

Phylogenetic analysis
Using the Bayesian algorithm, maximum likelihood (ML) and maximum parsimony (MP), phylogenetic trees were reconstructed to elucidate the molecular phylogeny of the RpoT polymerases and to determine the evolutionary position of the polymerases identified and described in the present study. Tree reconstruction was based on a multiple alignment of 41 RpoT sequences (see “Methods”). Bayesian as well as ML and MP analyses resulted in essentially the same topology (not shown). Figure 5 shows the consensus tree of a Bayesian analysis in which angiosperm RpoT polymerases constitute two clearly discernible groups: one consisting of plastid-localized polymerases, and the other of mitochondrial-localized and dual-targeted enzymes. Whereas the Selaginella and Physcomitrella polymerases do not belong to the well-separated branches of plastid and mitochondrial (and dual-targeted) polymerases, the RpoT polymerases from the basal angiosperm N. advena cluster with the branches of plastid and mitochondrial/dual-targeted sequences: NaRpoTm1 and NaRpoTm2 within the mitochondrial branch, and NaRpoTp within the plastid branch.
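One way to make the statement that different methods gave "essentially the same topology" concrete is to compare the sets of clades in two rooted trees. The sketch below does this for trees written as nested tuples. The helper names and the three toy trees are hypothetical; the toy trees only reuse sequence names from the text and are not the published Bayesian, ML or MP trees. Comparing rooted clade sets is a simplification chosen for this illustration, not the procedure used in the study.

```python
def clades(tree):
    """Collect every clade of a rooted tree given as nested tuples.

    A leaf is a string; an internal node is a tuple of children. Each
    clade is returned as a frozenset of the leaf names below that node.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset()
        for child in node:
            leaves |= walk(child)
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def same_topology(tree_a, tree_b):
    """True when both rooted trees contain exactly the same clades."""
    return clades(tree_a) == clades(tree_b)


# Hypothetical toy trees (not the trees reconstructed in the study).
toy_bayes = (("NaRpoTp", "AtRpoTp"), (("NaRpoTm1", "NaRpoTm2"), "AtRpoTm"))
toy_ml = ((("NaRpoTm2", "NaRpoTm1"), "AtRpoTm"), ("AtRpoTp", "NaRpoTp"))
toy_other = (("NaRpoTp", "NaRpoTm1"), (("AtRpoTp", "NaRpoTm2"), "AtRpoTm"))

print(same_topology(toy_bayes, toy_ml))
print(same_topology(toy_bayes, toy_other))
```

The first two toy trees differ only in the order in which children are written, so they share all clades; the third groups the sequences differently.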
[Figure 5 tree graphic; node support values are not reproduced. Group labels: RpoT Proteins of Flowering Plants; Plastid RpoT Proteins; Mitochondrial and Dual RpoT Proteins. Scale bar: 0.2. Taxa, in the order shown: Ostreococcus tauri RpoT, Ostreococcus lucimarinus RpoT, Selaginella moellendorffii RpoTm, Physcomitrella patens RpoTmp2, Physcomitrella patens RpoTmp1, Physcomitrella patens RpoTm, Nuphar advena RpoTp, Oryza sativa RpoTp, Sorghum bicolor RpoTp, Zea mays RpoTp, Triticum aestivum RpoTp, Hordeum vulgare RpoTp, Arabidopsis thaliana RpoTp, Spinacia oleracea RpoTp, Vitis vinifera RpoTp, Nicotiana sylvestris RpoTp, Populus trichocarpa RpoTp2, Populus trichocarpa RpoTp1, Nuphar advena RpoTm2, Nuphar advena RpoTm1, Sorghum bicolor RpoT1, Zea mays RpoTm, Oryza sativa RpoTm, Hordeum vulgare RpoTm, Triticum aestivum RpoTm, Vitis vinifera RpoTmp, Nicotiana sylvestris RpoTmp, Spinacia oleracea RpoTmp, Populus trichocarpa RpoTmp, Arabidopsis thaliana RpoTmp, Cleome spinosa RpoTmp, Vitis vinifera RpoTm, Nicotiana sylvestris RpoTm, Chenopodium album RpoTm, Brassica oleracea RpoTm, Arabidopsis thaliana RpoTm, Ricinus communis RpoTm, Populus trichocarpa RpoTm2, Populus trichocarpa RpoTm1, Micromonas pusilla RpoT, Micromonas sp. RCC299 RpoT.]
Figure 5 Phylogenetic analysis of RpoT sequences. ML (Bayesian) tree of plant RpoT protein sequences based on an alignment of conserved blocks (see “Methods”). For accession numbers of the sequences, see Additional file 2.

Discussion
Genes encoding phage-type mitochondrial and plastid RNA polymerases have been identified from numerous monocotyledonous and eudicotyledonous angiosperm species (for review, see [1]). In contrast, knowledge of RpoT polymerases of deep-branching land plants is so far limited to the moss Physcomitrella patens [19,20] and the lycophyte Selaginella moellendorffii [26], and no information at all is available about phage-type RNA polymerases from the basal angiosperm lineages that precede the monocot-eudicot divergence. Here we show that the waterlily Nuphar advena, a basal angiosperm, encodes three RpoT polymerases. The encoded proteins of 996, 990, and 985 amino acids, respectively, exhibit the characteristic domains that are highly conserved between all RpoT polymerases, including the residues shown to be essential and located within the catalytic pocket of the polymerase (D537, K631, Y639, G640, D812, residue numbers as given for T7 RNA polymerase). The high conservation of amino acid sequences and the identical position of the introns in the RpoT genes of Selaginella, Physcomitrella, Nuphar and monocotyledonous and eudicotyledonous angiosperms (see Figure 2) suggest a common ancestral gene giving rise to all land plant RpoT genes. Phylogenetic analysis (see Figure 5) confirms this hypothesis. Although Physcomitrella (one mitochondrial and two dual-targeted) and eudicots (one mitochondrial, one plastid and one dual-targeted) also possess three phage-type RNA polymerases, the localization of the three Nuphar RpoT polymerases shows a new pattern. The N-termini encoded by two of the three RpoT genes of N. advena show properties of mitochondrial transit peptides. Using translational fusions of the putative NaRpoT transit peptides with GFP, we demonstrated that these transit peptides confer exclusively mitochondrial import.
Mitochondrial import of NaRpoTm1-GFP and NaRpoTm2-GFP was also maintained when the fusion constructs contained the full-length 5’-UTRs of the genes (Figure 4). We included these constructs in our study since the presence of the 5’-UTR may alter the targeting of proteins [34]. Thus, we conclude that N. advena encodes two phage-type mitochondrial RNA polymerases. Phylogenetic analysis (see Figure 5) indicates that the third RpoT gene of Nuphar, NaRpoTp, encodes a plastid phage-type RNA polymerase. In the 5’ part of the NaRpoTp cDNA no canonical start codon was identified, with the first ATG triplet occurring only at position 466. However, a potential non-AUG initiation codon (CUG) was found at position 148. Translation from this codon would yield an N-terminal leader peptide with genuine plastid targeting properties, as predicted by two algorithms (TargetP and Predotar). Three different GFP fusions were designed to test the translation initiation capacity of this CUG codon. The results demonstrated plastid import of the derived amino terminus (Figure 4J), as well as efficient translation initiation at the CUG within the context of the full-length 5’-UTR (Figure 4K), which could be abolished by modifying the codon to CAC (Figure 4L). Thus, Nuphar RpoTp belongs to the rare cases of non-viral plant genes [35-37] that initiate translation exclusively at a non-AUG codon. Interestingly, this is the second case of non-AUG translation initiation among RpoT genes specifying plastid-localized RNA polymerases: translation of the tobacco RpoTp gene also starts from a CUG codon [12].

Both mono- and eudicotyledonous plants possess a solely plastid-localized phage-type RNAP (RpoTp) together with a purely mitochondrial-localized RpoT enzyme (RpoTm) and, in the case of eudicots, a third phage-type RNAP with dual localization in both organelles is found. The data presented here suggest that all RpoTp proteins descend from a common duplication event that took place in a common ancestor of all flowering plants. Thus far it is unknown whether ferns or gymnosperms contain nuclear genes encoding plastid-localized phage-type RNAPs as well.
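The start-codon reasoning for NaRpoTp given above (first ATG at position 466, candidate CUG at position 148) can be sketched in a few lines. The two positions quoted in the text differ by 318 nt, that is 106 codons, so they lie in the same reading frame. The code below is a minimal illustration on a hypothetical 23-nt toy sequence, not on the NaRpoTp cDNA; the function names are assumptions of this sketch, and the standard stop codons are assumed.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, cDNA alphabet


def find_first(cdna, triplet):
    """Return the 1-based position of the first `triplet`, or None."""
    index = cdna.upper().find(triplet)
    return None if index < 0 else index + 1


def codons_before_stop(cdna, start):
    """Count codons read from 1-based `start` up to, but not including,
    the first in-frame stop codon. Returns None if no stop is reached."""
    seq = cdna.upper()
    count = 0
    for i in range(start - 1, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return None


# Hypothetical toy cDNA: an upstream CTG in frame with the first ATG.
toy_cdna = "GGACTGGCTTCCATGAAATAAGG"
ctg_pos = find_first(toy_cdna, "CTG")
atg_pos = find_first(toy_cdna, "ATG")
extra_codons = (atg_pos - ctg_pos) // 3

print(ctg_pos, atg_pos, extra_codons)
print(codons_before_stop(toy_cdna, ctg_pos), codons_before_stop(toy_cdna, atg_pos))

# Positions quoted in the text for NaRpoTp (CUG at 148, first ATG at 466).
frame_offset = (466 - 148) % 3
codons_between = (466 - 148) // 3
```

In the toy sequence, initiation at the upstream CTG lengthens the reading frame by three codons relative to the first ATG, which is the kind of amino-terminal extension a transit peptide would require.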
Since the duplication event giving rise to the second NEP activity in eudicots is clearly more recent, identification of a purely plastid-localized phage-type RNAP in the basal angiosperm Nuphar advena, orthologous to all other purely plastid-targeted enzymes (RpoTp) of flowering plants, suggests that the acquisition of a nuclear gene-encoded transcriptional activity for plastids, not present in lycopods, took place after the split of lycopods from all other tracheophytes, with or before the rise of flowering plants. Moreover, the lack of a dual-targeted RpoTmp both in Nuphar and in monocots suggests that the RpoTmp enzyme detected in eudicots is an ‘invention’ due to an RpoTm gene duplication that might have occurred only after the separation of monocots and eudicots. The putative plastid targeting sequences present in two of the three Physcomitrella RpoT proteins are therefore clearly species- or lineage-specific convergent inventions. Interestingly, multiple mitochondrial RNA polymerases, as found in Physcomitrella and eudicots, are identified in Nuphar as well. The fixation of duplicated RpoT genes leads to a convergent multiplicity of mitochondrial RNAPs in Nuphar, Physcomitrella and eudicots that is not found in any other eukaryotic lineage. Recently it was shown that in Arabidopsis RpoTmp null mutants transcription of a specific set of mitochondrial genes is strongly reduced. Moreover, accumulation of respiratory complexes was affected to very different levels, suggesting that the presence of multiple transcriptional activities in mitochondria may allow plants to regulate mitochondrial gene expression in a complex-specific manner [24]. Further investigations will be necessary to show whether a similar division of labor evolved in the case of the two mitochondrial RNA polymerases in Nuphar, and to address the specific impact of NEP and PEP transcriptional activities on gene expression in Nuphar chloroplasts.
Conclusions
Identification of three RpoT genes in Nuphar advena, specifying two mitochondrial and one plastid-localized polymerases, suggests that multiple phage-type organellar RNAPs already exist among basal angiosperms. From the high similarity of the encoded amino acid sequences, the conservation of intron positions and phylogenetic analysis we conclude that the RpoT genes of Nuphar, like those of Selaginella, Physcomitrella and monocotyledonous and eudicotyledonous angiosperms, trace back to a common ancestral gene giving rise to all land plant RpoT genes. The presence of a plastid-localized phage-type RNAP in this basal angiosperm, orthologous to all other RpoTp enzymes of flowering plants, suggests that the duplication event giving rise to a nuclear gene-encoded plastid RNA polymerase, not present in lycopods, took place after the split of lycopods from all other tracheophytes. A dual-targeted mitochondrial and plastid RNA polymerase (RpoTmp), as present in eudicots but not monocots, was not detected in Nuphar, suggesting that this additional NEP activity (RpoTmp) is an evolutionary novelty of eudicotyledonous plants like Arabidopsis. Our results support the idea that RpoT gene duplications occurred independently of each other several times during the evolution of plants and led to different subcellular localization patterns of organellar RNA polymerases. These data substantially extend our knowledge about the evolution of the transcriptional machineries in plant organelles.

Methods
Plant material and growth conditions
Nuphar advena plants were purchased from a commercial supplier (Seerosen Shop, Eschede, Germany). The plants were grown in a growth chamber at 23°C with a light/dark regime of 8/16 hr. The intensity of light in all experiments was 210 μmol photons s⁻¹ m⁻².

DNA and RNA isolation
Leaves of N. advena were ground to a fine powder under liquid nitrogen and incubated in three volumes of CTAB buffer (2% CTAB, 1.4 M NaCl, 20 mM EDTA, 100 mM Tris-HCl, pH 8.0, 2% β-mercaptoethanol) for 1 hour with agitation at 60°C. The lysate was extracted two times with chloroform-isoamyl alcohol (24:1), and the nucleic acids were precipitated with ethanol. The DNA pellet was washed with 70% ethanol and dissolved in TE buffer (10 mM Tris-HCl, 1 mM EDTA). RNA was extracted and purified using the Concert Plant RNA Reagent (Invitrogen, Karlsruhe, Germany) and RNA Cleanup Kit (Qiagen, Hilden, Germany) according to the manufacturers’ instructions.

Isolation of cDNA and genomic cloning
cDNA cloning, screening of an N. advena BAC library (Nuphar_HindIII BAC; Arizona Genomics Institute, Tucson, AZ) and subcloning were performed according to standard methods [38]. A 1.5 kb cDNA fragment amplified from the 3’ part of Selaginella RpoT [26] was used as a 32P-labelled hybridization probe to screen the Nuphar BAC library, containing 165,888 independent clones on nine individual filters, under non-stringent conditions (58°C). Identified positive clones were purchased from the Arizona Genomics Institute. BAC DNA was isolated using the QIAGEN plasmid midi kit according to the protocol of the manufacturer. Sanger dideoxy sequencing of subclones, or directly of the BAC DNA by primer walking, was performed on an ABI3130xl sequencer (Applied Biosystems, Darmstadt, Germany). From the genomic sequences obtained, primers were designed (for a list of all primers used in the present study, see Additional file 1) for rapid amplification of cDNA ends (RACE). 3’- and 5’-RACE reactions were performed with the RACE primers listed in Additional file 1 using the CapFishing kit (Seegene, Rockville, USA) and Phusion hot start DNA polymerase (Finnzymes, Espoo, Finland) following the protocols of the manufacturers.

Generation of targeting constructs and transient expression
The amino-terminal sequences were amplified from cDNA of the three N. advena RpoT genes using the primers listed in Additional file 1. Products were ligated into vector pDRIVE (Qiagen) and excised using XbaI and SalI. The fragments were inserted into pOL-GFP [39] opened with SpeI and SalI, to give the constructs shown in Figure 3. coxIV- and recA-GFP constructs were employed as mitochondrial and plastid control constructs [12]. All constructs were used to transfect Arabidopsis protoplasts, isolated from leaves of 3- to 5-week-old Arabidopsis plants grown under long-day conditions (23°C, 16/8 hr light/dark), essentially as described [40]. Cell density was adjusted to 2 × 10⁶/ml. Protoplasts (100 μl) were transfected with 20 μg plasmid DNA in 40% polyethylene glycol 4000, 0.8 M mannitol, 1 mM CaCl2. Transformed protoplasts were examined two days after transfection by confocal laser scanning microscopy with a Leica TCS SP2 using 488 nm excitation and two-channel measurement of emission from 510 to 580 nm (green/GFP) and > 590 nm (red/chlorophyll).

Phylogenetic analysis
Deduced protein sequences were aligned using ClustalW [41]. Conserved blocks were cut out and merged as described earlier [19] (see Additional file 2) and subjected to Bayesian, maximum-likelihood and maximum parsimony analysis as implemented in the Geneious program package [42,43]. The Bayesian inference method employed the Mixed amino acid replacement model with a gamma distribution to represent among-site rate heterogeneity (mixed +g). MCMC was performed with 1 million generations and four independent chains and two runs. The Markov chain was sampled every 100 generations. Convergence was observed by plots of maximum likelihood (ML) scores and by using the run statistics. The first 20% of all trees generated were discarded; the remaining trees were used to construct a consensus tree and to calculate the posterior branch support values. In addition, maximum likelihood analysis with 1000 and maximum parsimony analysis with 1000 bootstrap replicates were conducted.
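The MCMC settings quoted above imply a simple tree count, sketched below. The sketch assumes that exactly one tree is stored per sampling interval (some programs also store the starting tree, which would add one per run) and that the retained trees of both runs are pooled for the consensus; neither detail is stated in the text. The function name `mcmc_tree_counts` is hypothetical.

```python
def mcmc_tree_counts(generations, sample_every, burn_in_percent, runs):
    """Return (sampled, discarded, retained) trees per run and the pooled
    number of retained trees, using integer arithmetic throughout."""
    sampled = generations // sample_every
    discarded = sampled * burn_in_percent // 100
    retained = sampled - discarded
    return sampled, discarded, retained, retained * runs


# Values stated in the Methods: 1 million generations, sampling every
# 100 generations, first 20% discarded, two runs.
sampled, discarded, retained, pooled = mcmc_tree_counts(
    generations=1_000_000, sample_every=100, burn_in_percent=20, runs=2
)
print(sampled, discarded, retained, pooled)
```

Under these assumptions each run contributes 10,000 sampled trees, of which 2,000 are discarded as burn-in and 8,000 are retained.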
Additional material
Additional file 1: Oligonucleotide primers used in the experiments.
Additional file 2: Merged conserved blocks of 41 RpoT sequences used for reconstruction of phylogeny.
Acknowledgements
We thank Susanne Beick and Björn Richter for their help during the early stage of this study. The excellent technical assistance of C. Stock is gratefully acknowledged. CY was supported by NaFög, Berlin. Part of this work was supported by a grant from the Deutsche Forschungsgemeinschaft (WE 1595/6-2, SFB 429).

Author details
1Institut für Biologie, Humboldt-Universität zu Berlin, Chausseestr. 117, 10115 Berlin, Germany. 2Dept. of Genetics and Biotechnology, Faculty of Agricultural Sciences, Aarhus University, Forsøgsvej 1, DK-4200 Slagelse, Denmark. 3FinMIT, Mol. Neurology, Biomedicum, University of Helsinki, Haartmaninkatu 8, 00290 Helsinki, Finland.

Authors’ contributions
AW and TB designed the research and outlined the manuscript. CY performed the experimental research. UR participated in the experimental work and performed computational phylogenetic analyses. CY, UR, AW and TB interpreted the data. AW and TB wrote the paper. All authors have read and approved the final manuscript.

Received: 20 July 2010 Accepted: 6 December 2010 Published: 6 December 2010
References
1. Weihe A: The transcription of plant organelle genomes. In Molecular biology and biotechnology of plant organelles. Edited by: Daniell H, Chase CD. Berlin Heidelberg New York, Springer; 2004:213-237.
2. Lang BF, Burger G, O’Kelly CJ, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Gray MW: An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 1997, 387:493-497.
3. Gray MW, Burger G, Lang BF: The origin and early evolution of mitochondria. Genome Biol 2001, 2:1018.1-1018.5.
4. Hess WR, Börner T: Organellar RNA polymerases of higher plants. Int Rev Cytol 1999, 190:1-59.
5. Liere K, Börner T: Transcription of plastid genes. In Regulation of Transcription in Plants. Edited by: Grasser KD. Oxford, Blackwell Publishing; 2007:184-224.
6. Lerbs-Mache S: The 110-kDa polypeptide of spinach plastid DNA-dependent RNA polymerase: single-subunit enzyme or catalytic core of multimeric enzyme complexes? Proc Natl Acad Sci USA 1993, 90:5509-5513.
7. Hedtke B, Börner T, Weihe A: Mitochondrial and chloroplast phage-type RNA polymerases in Arabidopsis. Science 1997, 277:809-811.
8. Liere K, Kaden D, Maliga P, Börner T: Overexpression of phage-type RNA polymerase RpoTp in tobacco demonstrates its role in chloroplast transcription by recognizing a distinct promoter type. Nucleic Acids Res 2004, 32:1159-1165.
9. Shiina T, Tsunoyama Y, Nakahira Y, Khan MS: Plastid RNA polymerases, promoters, and transcription regulators in higher plants. Int Rev Cytol 2005, 244:1-68.
10. Weihe A, Hedtke B, Börner T: Cloning and characterization of a cDNA encoding a bacteriophage-type RNA polymerase from the higher plant Chenopodium album. Nucleic Acids Res 1997, 25:2319-2325.
11. Hedtke B, Börner T, Weihe A: One RNA polymerase serving two genomes. EMBO Rep 2000, 1:435-440.
12. Hedtke B, Legen J, Weihe A, Herrmann RG, Börner T: Six active phage-type RNA polymerase genes in Nicotiana tabacum. Plant J 2002, 30:625-637.
13. Kobayashi Y, Dokiya Y, Sugiura M, Niwa Y, Sugita M: Genomic organization and organ-specific expression of a nuclear gene encoding phage-type RNA polymerase in Nicotiana sylvestris. Gene 2001, 279:33-40.
14. Kobayashi Y, Dokiya Y, Kumazawa Y, Sugita M: Non-AUG translation initiation of mRNA encoding plastid-targeted phage-type RNA polymerase in Nicotiana sylvestris. Biochem Biophys Res Commun 2002, 299:57-61.
15. Chang CC, Sheen J, Bligny M, Niwa Y, Lerbs-Mache S, Stern DB: Functional analysis of two maize cDNAs encoding T7-like RNA polymerases. Plant Cell 1999, 11:911-926.
16. Ikeda TM, Gray MW: Identification and characterization of T3/T7 bacteriophage-like RNA polymerase sequences in wheat. Plant Mol Biol 1999, 40:567-578.
17. Emanuel C, Weihe A, Graner A, Hess WR, Börner T: Chloroplast development affects expression of phage-type RNA polymerases in barley leaves. Plant J 2004, 38:460-472.
18. Kusumi K, Yara A, Mitsui N, Tozawa Y, Iba K: Characterization of a rice nuclear-encoded plastid RNA polymerase gene OsRpoTp. Plant Cell Physiol 2004, 45:1194-1201.
19. Richter U, Kiessling J, Hedtke B, Decker E, Reski R, Börner T, Weihe A: Two RpoT genes of Physcomitrella patens encode phage-type RNA polymerases with dual targeting to mitochondria and plastids. Gene 2002, 290:95-105.
20. Kabeya Y, Hashimoto K, Sato N: Identification and characterization of two phage-type RNA polymerase cDNAs in the moss Physcomitrella patens: implication of recent evolution of nuclear-encoded RNA polymerase of plastids in plants. Plant Cell Physiol 2002, 43:245-255.
21. Baba K, Schmidt J, Espinosa-Ruiz A, Villarejo A, Shiina T, Gardestrom P, Sane AP, Bhalerao RP: Organellar gene transcription and early seedling development are affected in the rpoT;2 mutant of Arabidopsis. Plant J 2004, 38:38-48.
22. Courtois F, Merendino L, Demarsy E, Mache R, Lerbs-Mache S: Phage-type RNA polymerase RPOTmp transcribes the rrn operon from the PC promoter at early developmental stages in Arabidopsis. Plant Physiol 2007, 145:712-721.
23. Swiatecka-Hagenbruch M, Emanuel C, Hedtke B, Liere K, Börner T: Impaired function of the phage-type RNA polymerase RpoTp in transcription of chloroplast genes is compensated by a second phage-type RNA polymerase. Nucleic Acids Res 2008, 36:785-792.
24. Kühn K, Richter U, Meyer EH, Delannoy E, de Longevialle AF, O’Toole N, Börner T, Millar AH, Small ID, Whelan J: Phage-type RNA polymerase RPOTmp performs gene-specific transcription in mitochondria of Arabidopsis thaliana. Plant Cell 2009, 21:2762-2779.
25. Maier UG, Bozarth A, Funk HT, Zauner S, Rensing SA, Schmitz-Linneweber C, Börner T, Tillich M: Complex chloroplast RNA metabolism: just debugging the genetic programme? BMC Biol 2008, 6:36.
26. Yin C, Richter U, Börner T, Weihe A: Evolution of phage-type RNA polymerases in higher plants: characterization of the single phage-type RNA polymerase gene from Selaginella moellendorffii. J Mol Evol 2009, 68:528-538.
27. von Heijne G, Steppuhn J, Herrmann RG: Domain structure of mitochondrial and chloroplast targeting peptides. Eur J Biochem 1989, 180:535-545.
28. McAllister WT, Raskin CA: The phage RNA polymerases are related to DNA polymerases and reverse transcriptases. Mol Microbiol 1993, 10:1-6.
29. Sousa R, Chung YJ, Rose JP, Wang BC: Crystal structure of bacteriophage T7 RNA polymerase at 3.3 Å resolution. Nature 1993, 364:593-599.
30. Emanuelsson O, Nielsen H, Brunak S, von Heijne G: Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 2000, 300:1005-1016.
31. Small I, Peeters N, Legeai F, Lurin C: Predotar: A tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 2004, 4:1581-1590.
32. Akashi K, Grandjean O, Small I: Potential dual targeting of an Arabidopsis archaebacterial-like histidyl-tRNA synthetase to mitochondria and chloroplasts. FEBS Lett 1998, 431:39-44.
33. Gordon K, Fütterer J, Hohn T: Efficient initiation of translation at non-AUG triplets in plant cells. Plant J 1992, 2:809-813.
34. Kabeya Y, Sato N: Unique translation initiation at the second AUG codon determines mitochondrial localization of the phage-type RNA polymerases in the moss Physcomitrella patens. Plant Physiol 2005, 138:369-382.
35. Riechmann JL, Ito T, Meyerowitz EM: Non-AUG initiation of AGAMOUS mRNA translation in Arabidopsis thaliana. Mol Cell Biol 1999, 19:8505-8512.
36. Depeiges A, Degroote F, Espagnol MC, Picard G: Translation initiation by non-AUG codons in Arabidopsis thaliana transgenic plants. Plant Cell Rep 2006, 25:55-61.
37. Medveczky P, Nemeth A, Graf L, Szilagyi L: Methionine-independent translation initiation from naturally occuring non-AUG codon. Curr Chem Biol 2007, 1:129-139.
38. Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, Cold Spring Harbor Press; 1989.
39. Peeters NM, Chapron A, Giritch A, Grandjean O, Lancelin D, Lhomme T, Vivrel A, Small I: Duplication and quadruplication of Arabidopsis thaliana cysteinyl- and asparaginyl-tRNA synthetase genes of organellar origin. J Mol Evol 2000, 50:413-423.
40. Yoo S-D, Cho Y-H, Sheen J: Arabidopsis mesophyll protoplasts: a versatile cell system for transient gene expression analysis. Nature Protocols 2007, 2:1565-1572.
41. Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22:4673-4680.
42. Drummond A, Ashton B, Cheung M, Heled J, Kearse M, Moir R, Stones-Havas S, Thierer T, Wilson A: Geneious v4.0. 2008 [http://www.geneious.com].
43. Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003, 52:696-704.
doi:10.1186/1471-2148-10-379 Cite this article as: Yin et al.: Evolution of plant phage-type RNA polymerases: the genome of the basal angiosperm Nuphar advena encodes two mitochondrial and one plastid phage-type RNA polymerases. BMC Evolutionary Biology 2010 10:379.