Mol Genet Genomics (2009) 281:525–538 DOI 10.1007/s00438-009-0429-7
ORIGINAL PAPER
Conservation of dual-targeted proteins in Arabidopsis and rice points to a similar pattern of gene-family evolution
Carolina V. Morgante · Ricardo A. O. Rodrigues · Phellippe A. S. Marbach · Camila M. Borgonovi · Daniel S. Moura · Marcio C. Silva-Filho
Received: 15 December 2007 / Accepted: 25 January 2009 / Published online: 13 February 2009 © Springer-Verlag 2009
Abstract
Gene duplication followed by acquisition of specific targeting information and dual targeting were evolutionary strategies enabling organelles to cope with overlapping functions. We examined the evolutionary trend of dual-targeted single-gene products in Arabidopsis and rice genomes. The number of paralogous proteins encoded by gene families and the dual-targeted orthologous proteins were analysed. The number of dual-targeted proteins and the corresponding gene-family sizes were similar in Arabidopsis and rice irrespective of genome sizes. We show that dual targeting of methionine aminopeptidase, monodehydroascorbate reductase, glutamyl-tRNA synthetase, and tyrosyl-tRNA synthetase was maintained despite occurrence of whole-genome duplications in Arabidopsis and rice as well as a polyploidization followed by a diploidization event (gene loss) in the latter.
Keywords Gene duplication · Dual targeting · Genome evolution
C. V. Morgante and R. A. O. Rodrigues contributed equally to this work.
Communicated by S. Hohmann.
Electronic supplementary material The online version of this article (doi:10.1007/s00438-009-0429-7) contains supplementary material, which is available to authorized users.
C. V. Morgante · R. A. O. Rodrigues · P. A. S. Marbach · C. M. Borgonovi · D. S. Moura · M. C. Silva-Filho
Departamento de Genética, Escola Superior de Agricultura Luiz de Queiroz, Universidade de São Paulo, Av. Pádua Dias, 11, C.P. 83, Piracicaba, SP 13400-970, Brazil
Introduction
Evolution of eukaryotic cells was driven by the segregation of distinct metabolic pathways through the development of a complex system of membrane-bound sub-compartments, the nature of each compartment being predominantly determined by the set of proteins that it contains. Although the multi-compartmentalization of eukaryotic cells enabled distinct roles for the organelles, functional overlaps have been observed in many processes in plants, indicating that some enzymes are required in more than one compartment. The acquisition of both mitochondria and chloroplasts was decisive in the evolution of plant cells, supplying them with energy and with essential metabolic and biosynthetic pathways that are common to both organelles (Dyal et al. 2004). Most proteins that are required in more than one compartment are encoded by distinct genes and equipped with specific targeting information within the protein sequence (Schatz and Dobberstein 1996). It is likely that gene duplication events coincided with the establishment of efficient and effective protein translocation systems capable of targeting duplicated gene products to the different organelles in which they were required. Preservation of duplicate genes appears to be most common among genes that are expressed in many tissues or that encode multidomain proteins (Davies and Petrov 2004). Exceptions to this distinct-gene strategy have been observed in an increasing number of studies that show single gene products targeted simultaneously to more than one compartment by a wide and sophisticated range of mechanisms (Silva-Filho 2003; Mackenzie 2005; Millar et al. 2006). Important insights into genome evolution have emerged from multiple genome sequence analyses. Identification of evolutionary forces that shaped genomes and impinged
principles upon gene organization is one of the major challenges of comparative genomics. Gene duplications are a primary force in evolution, and more than a third of higher eukaryotic genomes consist of duplicate genes and gene families (Lynch and Conery 2000). Thus, large-scale gene or even entire-genome duplications have played major roles in the evolutionary history of eukaryotes (Vandepoele et al. 2003). Plants show about a 20-fold variation in DNA content, and how this is reflected in gene content is yet to be fully understood. It has also been shown that either developmental constraints or alternative splicing may reduce the level of persistent gene duplications (Castillo-Davis and Hartl 2002; Yang and Li 2004; Su et al. 2006). We hypothesised that conservation of dual targeting of proteins in plant species may have had important implications for macroevolutionary changes. To address this issue, we conducted a comparative analysis of the number of dual-targeted (DT) proteins and their corresponding gene-family sizes in Arabidopsis thaliana and Oryza sativa. Taken together, our data point to a similar pattern of gene-family evolution in Arabidopsis and rice.
Materials and methods
Protein sequence analyses
For genes encoding Arabidopsis isoforms, amino acid sequences of previously described DT proteins were used as primary queries in BLAST (Altschul et al. 1997) searches at “The Arabidopsis Information Resource database” (TAIR, http://www.arabidopsis.org). The Arabidopsis isoforms recovered were used to search the rice genome annotated database at “The Institute for Genomic Research” (TIGR, http://tigrblast.tigr.org). Hits with an e value above 1.0e-65 and protein sequences lacking a complete functional domain were excluded from our analysis. Exceptions were made for the aminoacyl-tRNA synthetase, ascorbate peroxidase and peptide deformylase families (Duchêne et al. 2005; Chew et al. 2003; Teixeira et al. 2006; Giglione et al. 2000; Dinkins et al. 2003). To verify that the recovered isoforms were expressed, we used the TAIR Microarray Expression Search (http://www.arabidopsis.org) and TIGR, and performed TBLASTN searches against the EST databases from NCBI (Altschul et al. 1997). Predictions of targeting sequences were carried out with Predotar (Small et al. 2004), TargetP (Emanuelsson et al. 2000), and iPSORT (Bannai et al. 2002). The entire amino acid sequences were aligned using CLUSTALW (Thompson et al. 1994). Phylogenetic trees were built using the SplitsTree program (Huson and Bryant 2006) and the neighbour joining method (Saitou and Nei 1987), with 100 bootstrap replicates.
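The hit-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the `Hit` record, the `keep_hit` helper and the example loci are hypothetical, and only the 1.0e-65 cutoff and the complete-domain requirement come from the text. The family-specific exceptions mentioned above are not modelled.

```python
from dataclasses import dataclass

E_VALUE_CUTOFF = 1.0e-65  # cutoff stated in the text


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit (not a real BLAST output format)."""
    locus: str
    e_value: float
    has_complete_domain: bool


def keep_hit(hit: Hit, cutoff: float = E_VALUE_CUTOFF) -> bool:
    """Keep a hit only if its e value is not above the cutoff
    and the protein carries a complete functional domain."""
    return hit.e_value <= cutoff and hit.has_complete_domain


# Made-up hits, for illustration only.
hits = [
    Hit("locus_A", 1e-120, True),   # passes both criteria
    Hit("locus_B", 1e-40, True),    # e value above the cutoff
    Hit("locus_C", 1e-90, False),   # incomplete functional domain
]

kept = [h.locus for h in hits if keep_hit(h)]
print(kept)
```

Keeping the cutoff as a named constant makes it easy to relax it for families that were treated as exceptions.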
Cloning of rice dual-targeting sequences
N-terminal targeting sequences of methionine aminopeptidase (Os02g52420), monodehydroascorbate reductase (Os08g05570), glutamyl-tRNA synthetase (Os10g22380), and tyrosyl-tRNA synthetase (Os01g31610) from O. sativa were amplified by RT-PCR. The reaction was carried out on RNA from leaves of three-month-old rice plants. The cDNA was synthesised with the ImProm-II™ Reverse Transcription System (Promega), using oligo(dT). The following forward and reverse primers were used: AAGGTACCACACACACACGCAGTCAGGGAGT and TAGGATCCCCAGGTCTTTGACGAGCATTGACATATGG for methionine aminopeptidase; ATGGTACCATGACCTCAGAGGTGGCCGTAGCTCTC and TTGGATCCGTCAGTGCTGGTCGCTCATATGG for monodehydroascorbate reductase; AAGGTACCCAATCTCCGGAGATGATGGCG and TAGGATCCGCGAAGAGGTAGTTGAAGAGC for glutamyl-tRNA synthetase; and CAGGTACCCGCACAGCAAATTTCCTCCTAATCCGTTTC and ATGGATCCCCGATGAGCGATTTGATAGCGTTGGAG for tyrosyl-tRNA synthetase. Forward primers had a KpnI site at the 5′ end and reverse primers had a BamHI site. The PCR products included the first 331, 341, 233 and 455 bp of the methionine aminopeptidase, monodehydroascorbate reductase, glutamyl-tRNA synthetase and tyrosyl-tRNA synthetase cDNA sequences, respectively. Because of the high GC content after the first ATG, and the difficulty in designing primers in these regions, 12 and 41 bp of the 5′ untranslated region were included for glutamyl-tRNA synthetase and tyrosyl-tRNA synthetase, respectively. PCR products were digested with BamHI and KpnI restriction enzymes and ligated in frame with the GFP coding sequence of the pUCAP-GFP vector (pUCAP containing the expression cassette of pCK-GFP3; Menand et al. 1998), previously digested with the same enzymes. The expression cassettes, including double 35S promoter, targeting sequence, GFP sequence and terminator, were transferred to the pBinPLUS binary vector (van Engelen et al. 1995) by digestion with AscI and PacI restriction enzymes, followed by ligation.
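As a quick consistency check on the primer design described above, the following sketch looks for the KpnI recognition sequence (GGTACC) in each forward primer and the BamHI recognition sequence (GGATCC) in each reverse primer. It is an illustration rather than part of the original protocol: the dictionary layout, the short family labels and the `site_offset` helper are assumptions, and the spaces that appear inside some primers in the printed text are treated as line-break artefacts and removed.

```python
KPNI_SITE = "GGTACC"   # KpnI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

# (forward, reverse) primers as listed in the text, written 5'->3' without spaces.
PRIMERS = {
    "MetAP": ("AAGGTACCACACACACACGCAGTCAGGGAGT",
              "TAGGATCCCCAGGTCTTTGACGAGCATTGACATATGG"),
    "MDHAR": ("ATGGTACCATGACCTCAGAGGTGGCCGTAGCTCTC",
              "TTGGATCCGTCAGTGCTGGTCGCTCATATGG"),
    "GluRS": ("AAGGTACCCAATCTCCGGAGATGATGGCG",
              "TAGGATCCGCGAAGAGGTAGTTGAAGAGC"),
    "TyrRS": ("CAGGTACCCGCACAGCAAATTTCCTCCTAATCCGTTTC",
              "ATGGATCCCCGATGAGCGATTTGATAGCGTTGGAG"),
}


def site_offset(primer: str, site: str) -> int:
    """Return the 0-based position of the first occurrence of `site`
    in `primer`, or -1 if the site is absent."""
    return primer.upper().find(site)


# For each gene: (offset of KpnI site in forward, offset of BamHI site in reverse).
report = {
    name: (site_offset(fwd, KPNI_SITE), site_offset(rev, BAMHI_SITE))
    for name, (fwd, rev) in PRIMERS.items()
}

for name, (kpn_pos, bam_pos) in report.items():
    print(f"{name}: KpnI at {kpn_pos}, BamHI at {bam_pos}")
```

In every primer the site follows two leading nucleotides, which is consistent with the statement that the sites sit at the 5′ end.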
The constructs were used to transform Agrobacterium tumefaciens GV3101 cells.
Transient expression in Nicotiana tabacum protoplasts and confocal microscopy analysis
Agrobacterium-mediated transient transformation of N. tabacum cv. SR1 leaves was carried out as follows: a single colony of Agrobacterium was inoculated into 5 ml of LB medium containing antibiotics, 10 mM MES (pH 5.6) and 50 µM acetosyringone. After incubation at 28°C for 16 h, 1.5 ml of culture was centrifuged at 13,000 rpm for 1 min, the pellet was resuspended in 1 ml of 10 mM MgCl2 and the OD600 of
Table 1 A. thaliana dual-targeted proteins encoded by single genes and their homologues in O. sativa
| Protein | Locus | Predotar | TargetP | iPSORT | Experimental localization | References |
| Alanyl-tRNA synthetase | At1g50200 | M | NC | M | M/C/Cyt | Mireau et al. (1996) |
| | At5g22800 | E | C | M | M/C | |
| | Os10g10244 | M | M | M | – | |
| | Os06g13660 | C | C | M | – | |
| | Os07g15440 | E | E | M | – | |
| Asparaginyl-tRNA synthetase | At4g17300 | C | C | M | M/C | Duchêne et al. (2005) |
| | At1g70980 | E | E | E | – | |
| | At5g56680 | E | E | E | – | |
| | Os07g30200 | M | NC | M | – | |
| | Os01g27520 | E | E | E | – | |
| | Os12g22600 | E | C | E | – | |
| Aspartyl-tRNA synthetase | At4g33760 | M | NC | M | M/C | Duchêne et al. (2005) |
| | At4g31180 | E | C | E | – | |
| | At4g26870 | E | E | E | – | |
| | Os01g06020 | E | E | E | – | |
| | Os02g46130 | E | NC | E | – | |
| | Os02g04700 | E | E | E | – | |
| Cysteinyl-tRNA synthetase | At2g31170 | C | C | M | M/C | Duchêne et al. (2005) |
| | At3g56300 | E | E | E | – | |
| | At5g38830 | E | E | E | – | |
| | Os09g38420 | ER | M | M | – | |
| | Os10g32570 | E | E | E | – | |
| | Os03g04960 | E | E | E | – | |
| Glycyl-tRNA synthetase | At1g29880 | E | M | M | Cyt/M | Duchêne et al. (2005) |
| | At3g48110 | ER | C | C | M/C | |
| | Os08g42560 | E | E | E | – | |
| | Os04g32650 | M | M | M | – | |
| Glutamyl-tRNA synthetase | At5g64050 | E | M | M | M/C | Duchêne et al. (2005) |
| | At5g26710 | E | C | S | – | |
| | Os10g22380 | C | C | E | – | |
| | Os01g16520 | E | E | E | – | |
| | Os02g02860 | E | NC | M | M/C | |
| Histidyl-tRNA synthetase | At3g46100 | M | M | S | M/C | Duchêne et al. (2005) |
| | At3g02760 | E | E | E | – | |
| | Os02g51830 | M | NC | M | – | |
| | Os05g05840 | C | E | S | – | |
| Lysyl-tRNA synthetase | At3g13490 | C | M | M | M/C | Duchêne et al. (2005) |
| | At3g11710 | E | E | E | – | |
| | Os02g41470 | E | NC | M | – | |
| | Os03g38980 | E | E | E | – | |
| Methionyl-tRNA synthetase | At3g55400 | C | C | C | M/C | Duchêne et al. (2005) |
| | At4g13780 | E | E | E | – | |
| | Os03g11120 | M | M | M | – | |
| | Os10g26050 | E | E | M | – | |
| | Os06g31210 | E | E | C | – | |
| Phenylalanyl-tRNA synthetase | At3g58140 | E | C | C | M/C | Duchêne et al. (2005) |
| | At4g39280 | E | E | E | – | |
| | At1g72550 | E | E | E | – | |
| | Os12g34860 | M | NC | M | – | |
| | Os10g26130 | E | E | E | – | |
| | Os05g48510 | E | E | E | – | |
| Prolyl-tRNA synthetase | At5g52520 | E | NC | M | M/C | Duchêne et al. (2005) |
| | At3g62120 | E | E | E | – | |
| | Os07g07060 | E | M | M | – | |
| | Os12g25710 | E | E | E | – | |
| Seryl-tRNA synthetase | At1g11870 | E | M | S | M/C | Duchêne et al. (2005) |
| | At5g27470 | E | E | E | – | |
| | Os03g10190 | E | E | E | – | |
| | Os11g39670 | E | NC | M | – | |
| | Os01g37837 | E | E | E | – | |
| Threonyl-tRNA synthetase | At2g04842 | C | C | C | M/C | Duchêne et al. (2005) |
| | At5g26830 | M | M | M | – | |
| | At1g17960 | E | E | E | – | |
| | Os08g19850 | M | M | M | – | |
| | Os02g33500 | NC | NC | M | – | |
| Tryptophanyl-tRNA synthetase | At2g25840 | E | C | C | M/C | Duchêne et al. (2005) |
| | At3g04600 | E | E | E | – | |
| | Os01g54020 | E | NC | M | – | |
| | Os12g35570 | E | NC | E | – | |
| | Os07g17770 | E | M | E | | |
| Tyrosyl-tRNA synthetase | At3g02660 | C | C | M | M/C | Duchêne et al. (2005) |
| | At2g33840 | E | C | E | – | |
| | At1g28350 | E | E | E | – | |
| | Os01g31610 | M | NC | M | M/C | |
| | Os08g05490 | E | C | E | – | |
| | Os08g23110 | E | C | E | – | |
| | Os08g09260 | E | E | E | – | |
| tRNA-dependent amidotransferase subunit A | At3g25660 | M | C | C | M/C | Pujol et al. (2008) |
| | Os04g55050 | M | M | M | – | |
| tRNA-dependent amidotransferase subunit B | At1g48520 | C | M | M | M/C | Pujol et al. (2008) |
| | Os11g34210 | M | M | M | | |
| tRNA-dependent amidotransferase subunit C | At4g32915 | M | C | M | M/C | Pujol et al. (2008) |
| | Os03g44820 | M | C | C | | |
| tRNA nucleotidyltransferases | At1g22660 | M | C | ER | M/C/Cyt | von Braun et al. (2007) |
| | Os12g07350 | ER | M | M | – | |
| AtPreP metalloprotease | At3g19170 | M | C | M | M/C | Bhushan et al. (2005) |
| | At1g49630 | M | C | M | M/C | |
| | Os02g52390 | M | M | M | – | |
| AtSufE | At4g26500 | C | C | C | M/C | Xu and Moller (2006) |
| | Os09g09790 | C | C | M | – | |
| ATXR methyltransferase | At5g09790 | C | C | M | C/Nucleus | Raynaud et al. (2006) |
| | At5g24330 | E | M | M | Nucleus | |
| | Os01g73460 | NC | C | M | – | |
| | Os02g03030 | E | M | C | – | |
| Phage-type RNA polymerase | At5g15700 | E | C | C | M/C | Hedtke et al. (1999, 2000) |
| | At2g24120 | C | C | C | C | |
| | At1g68990 | M | M | E | M | |
| | Os09g07120 | C | C | C | M | Kusumi et al. (2004) |
| | Os06g44230 | C | C | C | C | |
| γ-type DNA polymerase | At3g20540 | E | NC | M | M/C | Elo et al. (2003) and Christensen et al. (2005) |
| | At1g50840 | C | C | C | M/C | |
| | Os08g07840 | C | C | C | – | |
| | Os08g07850 | E | NC | C | – | |
| | Os04g54500 | C | C | C | – | |
| DNA gyrase | At3g10690 | C | C | C | M/C | Wall et al. (2004) |
| | Os03g59750 | M | M | M | – | |
| DNA ligase I | At1g08130 | M | C | M | M/Nucleus | Sunderland et al. (2006) |
| | At1g49250 | E | E | E | – | |
| | At1g66730 | E | C | E | – | |
| | Os10g34750 | E | E | E | – | |
| | Os01g49180 | C | C | M | – | |
| AP2 domain-containing transcription factor TINY | At2g44940 | C | C | M | C/Nucleus | Schwacke et al. (2007) |
| | Os10g41130 | | | | – | |
| OSB proteins | At5g44785 | E | M | M | M/C | Zaegel et al. (2006) |
| | At1g47720 | E | C | M | M | |
| | At4g20010 | E | E | C | C | |
| | At1g31010 | E | C | M | – | |
| | Os01g72049 | M | M | M | – | |
| | Os03g43420 | C | C | C | – | |
| | Os03g41530 | E | M | M | – | |
| Glutamine synthetase | At5g35630 | C | C | M | M/C | Taira et al. (2004) |
| | At5g37600 | E | E | E | Cyt | |
| | At1g66200 | E | E | E | Cyt | |
| | At3g17820 | E | E | E | Cyt | |
| | At5g16570 | E | E | E | Cyt | |
| | At1g48470 | E | E | E | Cyt | |
| | Os04g56400 | C | C | C | – | |
| | Os02g50240 | E | E | E | – | |
| | Os03g12290 | E | E | E | – | |
| | Os03g50490 | E | E | E | – | |
| Methionine aminopeptidase | At3g25740 | E | C | C | M/C | Giglione et al. (2000) |
| | At4g37040 | E | M | M | M/C | |
| | At2g45240 | E | NC | E | Cyt | |
| | At1g13270 | C | C | C | C | |
| | Os02g52420 | C | M | M | M/C | |
| | Os04g52100 | E | C | C | – | |
| | Os07g25410 | E | E | E | – | |
| | Os10g36470 | E | E | M | – | |
| | Os07g32590 | C | E | M | – | |
| Peptide deformylase | At1g15390 | E | C | E | M/C | |
| | At5g14660 | E | C | C | M/C | Giglione et al. (2000) and Dinkins et al. (2003) |
| | Os01g45070 | M | M | M | M/C | Moon et al. (2008) |
| | Os01g44980 | M | NC | M | – | |
| | Os01g37510 | ER | C | E | C | |
| Pseudouridine synthase | At2g30320 | E | M | C | M/C | Peeters and Small (2001) |
| | Os03g21980 | M | M | M | – | |
| Cryptochrome | At5g24850 | E | E | E | M/C | Kleine et al. (2003) |
| | Os06g45100 | E | E | M | – | |
| Preprotein and Amino Acid Transporter | At5g24650 | E | E | E | M/C | Murcha et al. (2007) |
| | At3g49560 | E | E | E | C | |
| | Os04g33220 | E | M | M | – | |
| Formate dehydrogenase | At5g14780 | E | M | M | M/C | Herman et al. (2002) |
| | Os06g29180 | M | M | M | – | |
| | Os06g29220 | M | M | M | | |
| Ascorbate peroxidase | At4g08390 | C | C | M | M/C | Chew et al. (2003) |
| | At1g77490 | C | C | C | – | |
| | At4g35000 | E | E | E | – | |
| | At4g35970 | E | E | E | – | |
| | At3g09640 | E | E | E | – | |
| | At1g07890 | E | E | E | – | |
| | At4g32320 | E | C | M | – | |
| | At4g09010 | C | C | C | – | |
| | Os04g35520 | C | C | M | – | |
| | Os02g34810 | C | C | C | – | |
| | Os12g07820 | M | M | M | M | Teixeira et al. (2006) |
| | Os12g07830 | M | M | M | – | |
| | Os04g14680 | E | E | E | P | |
| | Os08g43560 | E | NC | E | – | |
| | Os03g17690 | E | E | E | – | |
| | Os07g49400 | E | E | E | – | |
| | Os08g41090 | C | C | M | – | |
| Monodehydroascorbate reductase | At1g63940 | C | C | C | M/C | Obara et al. (2002) |
| | At3g52880 | E | E | E | – | |
| | At3g27820 | ER | E | S | – | |
| | At5g03630 | E | E | E | – | |
| | At3g09940 | E | E | E | – | |
| | Os08g05570 | C | NC | C | M/C | |
| | Os09g39380 | E | E | E | – | |
| | Os08g44340 | E | E | E | – | |
| | Os02g47800 | E | M | S | – | |
| | Os02g47790 | E | E | S | – | |
| Glutathione reductase | At3g54660 | C | C | C | M/C | Chew et al. (2003) |
| | At3g24170 | E | E | E | – | |
| | Os03g06740 | C | C | C | – | |
| | Os10g28000 | E | E | M | – | |
| | Os02g56850 | E | E | E | – | |
| Mercaptopyruvate sulfurtransferase | At1g79230 | NC | C | M | M/C | Nakamura et al. (2000) |
| | At1g16460 | E | E | M | Cyt | |
| | Os12g41500 | E | E | E | – | |
| | Os02g07040 | E | NC | E | – | |
| Phosphatidylglycerophosphate synthase I | At2g39290 | E | C | M | M/C | Babiychuk et al. (2003) |
| | At3g55030 | E | E | E | – | |
| | Os03g17520 | C | C | M | – | |
| NAD(P)H dehydrogenase A | At1g07180 | M | M | C | M/P | Carrie et al. (2008) |
| | At2g29990 | M | M | C | M/P | |
| | Os01g61410 | M | M | M | – | |
| | Os07g37730 | M | M | M | – | |
| NAD(P)H dehydrogenase B | At4g28220 | M | M | E | M/P | Carrie et al. (2008) |
| | At4g21490 | ER | ER | E | – | |
| | At4g05020 | E | E | E | – | |
| | At2g20800 | E | M | M | – | |
| | Os06g47000 | M | M | M | – | |
| | Os05g26660 | M | M | M | – | |
| | Os08g04630 | ER | M | M | – | |
| NAD(P)H dehydrogenase C | At5g08740 | C | C | C | M/C | Carrie et al. (2008) |
| | Os06g11140 | C | C | M | – | |
| Glutathione S-transferase F8 | At2g47730 | E | C | C | C/Cyt | Thatcher et al. (2007) |
| | At4g02520 | E | E | C | – | |
| | Os01g27390 | E | E | E | – | |
| Holocarboxylase synthetase 1 | At2g25710 | E | E | M | C/Cyt | Puyaubert et al. (2008) |
| | At1g37150 | E | E | E | – | |
| | Os02g07040 | M | M | E | – | |
| Ribosomal protein S16 | At4g34620 | M | M | M | C | Ueda et al. (2008) |
| | At5g56940 | NC | M | M | M/C | |
| | Os09g32274 | M | M | M | M/C | Ueda et al. (2008) |
| | Os08g40610 | M | M | M | M/C | |
| Aldolase | At3g52930 | E | E | E | M/Cyt | Giegé et al. (2003) |
| | At2g36460 | E | E | M | – | |
| | At5g03690 | M | M | M | – | |
| | At4g26530 | E | E | M | – | |
| | At4g26520 | E | E | E | – | |
| | At2g21330 | C | C | M | – | |
| | At4g38970 | C | C | M | – | |
| | At2g01140 | C | C | C | – | |
| | Os01g67860 | E | E | E | – | |
| | Os05g33380 | E | E | E | – | |
| | Os10g08022 | E | E | E | – | |
| | Os08g02700 | E | E | E | – | |
| | Os06g40640 | E | E | M | – | |
| | Os01g02880 | C | M | M | – | |
| | Os11g07020 | C | E | M | – | |
| Enolase | At2g36530 | E | E | E | M/Cyt | Giegé et al. (2003) |
| | At1g74030 | E | C | C | – | |
| | At2g29560 | E | E | E | – | |
| | Os03g14450 | E | E | E | – | |
| | Os10g08550 | E | C | C | – | |
| | Os06g04510 | E | E | E | – | |
| | Os09g20820 | C | M | M | – | |
| | Os03g15950 | E | E | E | – | |
| Thiazole biosynthetic enzyme | At5g54770 | C | C | C | M/C | Chabregas et al. (2003) |
| | Os07g34570 | C | C | M | – | |
| Vacuoleless 1 | At2g38020 | E | E | E | T/PVC | Rojo et al. (2003) |
| | Os01g47650 | E | E | E | – | |
| Vacuolar protein sorting 11 | At2g05170 | E | E | E | T/PVC | Rojo et al. (2003) |
| | Os04g31390 | E | E | E | – | |
| Vacuolar protein sorting 33 | At3g54860 | E | E | E | T/PVC | Rojo et al. (2003) |
| | Os04g14650 | E | E | E | – | |
C chloroplast, Cyt cytosol, E elsewhere (the programs are not able to assign a subcellular location with a value equal to or above 0.5), ER endoplasmic reticulum, M mitochondria, NC no consensus (the programs assign two subcellular locations with values equal to or above 0.5), P peroxisome, PVC prevacuolar compartment, S secretory pathway, T tonoplast
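The E and NC codes defined in the legend amount to a simple thresholding rule, which the following sketch makes explicit. It is a hedged illustration only: the `call_location` function and the example scores are hypothetical, the real programs report their scores in their own formats, and treating "two locations" as "two or more" is an assumption.

```python
from typing import Dict


def call_location(scores: Dict[str, float], threshold: float = 0.5) -> str:
    """Reduce per-compartment scores to a single code.

    Returns the compartment code if exactly one score reaches the threshold,
    "E" (elsewhere) if none does, and "NC" (no consensus) if two or more do.
    """
    above = [loc for loc, score in scores.items() if score >= threshold]
    if not above:
        return "E"
    if len(above) > 1:
        return "NC"
    return above[0]


# Made-up scores, for illustration only.
single = call_location({"M": 0.80, "C": 0.10})
nothing = call_location({"M": 0.30, "C": 0.20})
both = call_location({"M": 0.70, "C": 0.60})
print(single, nothing, both)
```

Under this reading, an NC call is itself a hint that a protein may carry an ambiguous, possibly dual, targeting signal.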
the mixture was adjusted to 0.2–0.3. Acetosyringone was added to a final concentration of 100 µM. Abaxial surfaces of leaves were infiltrated using a syringe without a needle. Mesophyll protoplasts were prepared from the infiltrated tobacco leaves as described by Carneiro et al. (1993). Samples were analyzed at 4 and 5 days after infiltration with an Olympus FV1000 confocal laser scanning microscope. Excitation and emission settings were as follows: GFP, 488 nm excitation, 510–550 nm emission; MitoTracker Red FM, 543 nm excitation, 590–640 nm emission; chlorophyll autofluorescence, 635 nm excitation, 670–700 nm emission. Images were captured using Olympus Fluoview FV10-ASW software. To specifically stain mitochondria, MitoTracker was added to the protoplast suspension, at a final concentration of 300 nM, for 40 min prior to observation.
Results
In order to understand the evolution of genes encoding DT proteins and how it affected gene duplication, we searched the complete annotated Arabidopsis and rice genomes for orthologues and paralogues of experimentally confirmed DT proteins. The identification of the closest rice orthologue of an Arabidopsis thaliana DT protein would indicate a candidate for a protein with a similar function
and probably with a similar localization. At present, there are 58 Arabidopsis proteins that have been shown, experimentally, to translocate to more than one subcellular compartment, mostly to mitochondria and chloroplasts. These 58 DT proteins can be arranged into 52 gene families (Table 1). All genes found in Arabidopsis have homologues in the rice genome. Combining sequence homology analysis, presence of functional domains and expression data allowed us to establish orthology between rice and Arabidopsis sequences. Twenty-seven of the 52 families displayed the same number of paralogous proteins in both Arabidopsis and rice, with a similar pattern of gene expansion. The remaining 25 families could be divided into two groups: 12 families with a higher number of paralogous proteins in Arabidopsis than in rice, and 13 families with a higher number of paralogous proteins in rice than in Arabidopsis. However, the numbers of paralogues found in rice and Arabidopsis did not differ greatly, despite the fact that the rice genome is estimated to be fourfold larger than the Arabidopsis genome and that rice has over 40% more unique expressed sequences than Arabidopsis (Plant Gene Indices at http://www.tigr.org). Prediction of the subcellular location of members of DT protein families using available software, as well as the experimentally confirmed localization of the paralogous
Fig. 1 Phylogenetic trees of Arabidopsis thaliana and Oryza sativa (a) monodehydroascorbate reductase, (b) methionine aminopeptidase, (c) glutamyl-tRNA synthetase, and (d) tyrosyl-tRNA synthetase. The trees are based on amino acid sequence alignments and generated by neighbour joining. Numbers on the branches indicate the bootstrap values
proteins are indicated in Table 1. Comparison of the output of these programs shows that they often disagree on the final location of the protein. Therefore, a combination of bioinformatics targeting algorithms, comparative genomic analysis and experimental approaches is required to validate protein location (Heazlewood et al. 2005).
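The family-size comparison reported above (same size, larger in Arabidopsis, or larger in rice) can be illustrated with three families whose loci are taken from Table 1 as read here. This is a sketch under stated assumptions rather than the authors' procedure: the grouping of loci into families follows our reading of the table, species are inferred from the `At`/`Os` locus prefixes, and the helper names are hypothetical.

```python
from collections import Counter

# Loci per family as read from Table 1 (three families only, for illustration).
FAMILIES = {
    "Monodehydroascorbate reductase": [
        "At1g63940", "At3g52880", "At3g27820", "At5g03630", "At3g09940",
        "Os08g05570", "Os09g39380", "Os08g44340", "Os02g47800", "Os02g47790",
    ],
    "Methionine aminopeptidase": [
        "At3g25740", "At4g37040", "At2g45240", "At1g13270",
        "Os02g52420", "Os04g52100", "Os07g25410", "Os10g36470", "Os07g32590",
    ],
    "Glutamine synthetase": [
        "At5g35630", "At5g37600", "At1g66200", "At3g17820", "At5g16570",
        "At1g48470", "Os04g56400", "Os02g50240", "Os03g12290", "Os03g50490",
    ],
}


def family_sizes(loci):
    """Count Arabidopsis (At...) and rice (Os...) loci in one family."""
    n_at = sum(1 for locus in loci if locus.startswith("At"))
    n_os = sum(1 for locus in loci if locus.startswith("Os"))
    return n_at, n_os


def compare(n_at, n_os):
    """Classify a family by which species has more paralogues."""
    if n_at == n_os:
        return "same"
    return "larger_in_arabidopsis" if n_at > n_os else "larger_in_rice"


sizes = {name: family_sizes(loci) for name, loci in FAMILIES.items()}
summary = Counter(compare(*pair) for pair in sizes.values())

for name, (n_at, n_os) in sizes.items():
    print(f"{name}: Arabidopsis={n_at}, rice={n_os}")
print(dict(summary))
```

Applied to all 52 families, the same tally would be expected to reproduce the 27/12/13 split described in the text, provided the family boundaries match those used by the authors.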
To ascertain the accuracy of paralogue determination (see “Materials and methods” for details), we reconstructed the phylogeny for all families listed in the Supplementary material. Phylogenetic distribution of the randomly selected monodehydroascorbate reductase, methionine aminopeptidase, glutamyl-tRNA synthetase and tyrosyl-tRNA synthetase
gene families (Table 1) revealed that Arabidopsis DT proteins form unique clusters with the rice isoforms (Fig. 1). Arabidopsis monodehydroascorbate reductase, encoded by At1g63940, translocates to both mitochondria and chloroplasts (Obara et al. 2002). It belongs to a gene family that encodes five paralogous proteins in both Arabidopsis and rice (Fig. 1a). Phylogenetic reconstruction showed that rice Os08g05570 clustered with Arabidopsis At1g63940, which suggests that the rice orthologue also translocates to mitochondria and chloroplasts. Arabidopsis methionine aminopeptidases are also dual-targeted to both mitochondria and chloroplasts (Giglione et al. 2000). As observed for the monodehydroascorbate reductase family, the rice methionine aminopeptidase Os02g52420 gene product branches specifically with an orthologous Arabidopsis protein, encoded by At4g37040 (Fig. 1b). The same holds for Arabidopsis tyrosyl-tRNA synthetase At3g02660, dual-targeted to mitochondria and chloroplasts (Duchêne et al. 2005), and its homologue Os01g31610 in rice (Fig. 1c), and for Arabidopsis glutamyl-tRNA synthetase At5g64050, also imported into these two organelles (Duchêne et al. 2005), and its homologue Os02g02860 in rice (Fig. 1d). The intracellular localization of the rice orthologues suggested by the phylogenetic trees (Fig. 1) was confirmed by using recombinant fusion proteins in which the N-terminal coding sequences of the rice cDNAs were fused to GFP
reporter genes. The subcellular localization of the chimeric proteins was determined by transient transformation of N. tabacum protoplasts. This in vivo approach avoids the problems associated with in vitro targeting systems, particularly when considering proteins simultaneously translocated to mitochondria and chloroplasts (Silva-Filho et al. 1997; Cleary et al. 2002). As predicted by the phylogenetic reconstruction, the rice orthologues were also transported to both mitochondria and chloroplasts (Fig. 2).
Discussion
Despite the asymmetric evolution of gene families by increased gene duplications of otherwise paralogous genes, our data demonstrated conservation of a dual-targeting strategy between orthologous proteins (methionine aminopeptidase, monodehydroascorbate reductase, glutamyl-tRNA synthetase, and tyrosyl-tRNA synthetase) of Arabidopsis and rice. The methionine aminopeptidase gene family was shown to be composed of four paralogous genes in Arabidopsis, whereas the rice gene family was organized into five paralogues, indicating that it contains more gene duplications than its Arabidopsis counterpart (Fig. 1b). This is supported by studies that showed a more extensive diploidization in rice than in Arabidopsis and also that gene
Fig. 2 Dual localization of rice glutamyl-tRNA synthetase-GFP (GluRS Os02g02860), tyrosyl-tRNA synthetase-GFP (TyrRS Os01g31610), methionine aminopeptidase-GFP (MetAP Os02g52420), and monodehydroascorbate reductase-GFP (MDHAR Os08g05570) fusion proteins. Chimeric proteins were introduced by Agrobacterium tumefaciens-mediated transfection in tobacco protoplasts. Fluorescence images were taken using a confocal laser scanning microscope. GFP corresponds to the fluorescence detected in the green channel; Chlorophyll, detected in the far-red channel, corresponds to the chloroplast autofluorescence signal; the Mitotracker signal was detected in the red channel; Mitotracker + GFP corresponds to the merging of red and green channels; Chlorophyll + GFP corresponds to the merging of far-red and green channels; yellow represents the co-localization of green and red signals. All scale bars represent 5 µm
loss is not a random event (Bowers et al. 2003; Paterson et al. 2004). In the DT data recorded in the literature, examples supporting conservation of the DT of homologous proteins are evident among dicots and also between monocots and dicots. The Arabidopsis γ-type DNA polymerases (At3g20540, At1g50840) (Elo et al. 2003; Christensen et al. 2005) and their two homologues in N. tabacum (Ono et al. 2007) are dual-targeted to mitochondria and chloroplasts. The same holds for DNA gyrase A, which is dual-targeted to these two organelles in A. thaliana (Wall et al. 2004) and N. benthamiana (Cho et al. 2004). Glutathione reductase is also dual-targeted to mitochondria and chloroplasts in Arabidopsis (Chew et al. 2003) and pea (Creissen et al. 1995). The phage-type RNA polymerase (At5g15700) (Hedtke et al. 2000) and its homologues in N. tabacum (Hedtke et al. 2002) and N. sylvestris (Kobayashi et al. 2001) are dual-targeted to mitochondria and chloroplasts. Interestingly, DT of this polymerase was also reported in the moss Physcomitrella patens (Richter et al. 2002). DT to mitochondria and chloroplasts has been reported for homologous proteins in monocots and dicots: ribosomal protein S16 of Arabidopsis (At5g56940) and rice (Os09g32274) (Ueda et al. 2008); peptide deformylase 1B in Arabidopsis (At5g14660) (Dinkins et al. 2003) and rice (Os01g45070) (Moon et al. 2008); and seryl-tRNA synthetase in Arabidopsis (Duchêne et al. 2005) and maize (Rokov-Plavec et al. 2008). The fact that the Arabidopsis and rice genomes, despite the fourfold difference in size, are both small among plant genomes prompted us to search for homologues of DT proteins in maize, which has a larger genome (2,365 Mb) and a representative amount of sequence in public databases. The maize genome evolved by complete genome duplication due to allotetraploidy and by segmental duplications, and expanded rapidly by retrotransposition events (Jorgensen 2004). A complete survey of the maize genome was not possible given that some sequence information was still missing, but for some of the gene families analysed we could identify homologues for all the Arabidopsis and rice paralogues. In spite of its larger genome size, the number of maize isoforms was nearly the same as observed for
Arabidopsis and rice (data not shown). It is thus likely that the prevalence of dual-targeted proteins is not increased in expanded genomes. The understanding of the evolutionary forces that shape gene family sizes is far from complete. Based on the comparison of currently known gene families containing DT members, on the experimentally confirmed conservation of DT for methionine aminopeptidase, monodehydroascorbate reductase, glutamyl-tRNA synthetase, tyrosyl-tRNA synthetase, peptide deformylase, and ribosomal protein S16 in rice and Arabidopsis, and on the double role of DT proteins in a cell, we propose that DT proteins may be considered one of the evolutionary forces shaping gene family size. Although the relatively small number of 50 gene families currently known to have DT members limits the strength of this conclusion, the data presented here indicate an evolutionary tendency for gene families containing dual-targeting members. Detailed analysis of each known family of DT genes, and certainly the characterization of novel and still unknown DT proteins, will further clarify what is proposed here.
Acknowledgments This research was supported by FAPESP and CNPq, Brazil. C.V.M. was supported by a fellowship from CNPq. R.A.O.R., P.A.S.M. and C.M.B. were supported by fellowships from FAPESP. M.C.S.F. is a research fellow of CNPq. The authors thank Carlos Menck and Christine Stock for helpful suggestions and critical reading of the manuscript.
References
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Babiychuk E, Muller F, Eubel H, Braun HP, Frentzen M, Kushnir S (2003) Arabidopsis phosphatidylglycerophosphate synthase 1 is essential for chloroplast differentiation, but is dispensable for mitochondrial function. Plant J 33:899–909
Bannai H, Tamada Y, Maruyama O, Nakai K, Miyano S (2002) Extensive feature detection of N-terminal protein sorting signals. Bioinformatics 18:298–305
Bhushan S, Stahl A, Nilsson S, Lefebvre B, Seki M, Roth C, McWilliam D, Wright SJ, Liberles DA, Shinozaki K, Bruce BD, Boutry M, Glaser E (2005) Catalysis, subcellular localization, expression and evolution of the targeting peptides degrading protease, AtPreP2. Plant Cell Physiol 46:985–996
Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433–438
Carneiro VT, Pelletier G, Small I (1993) Transfer RNA-mediated suppression of stop codons in protoplasts and transgenic plants. Plant Mol Biol 22:681–690
Carrie C, Murcha MW, Kuehn K, Duncan O, Barthet M, Smith PM, Eubel H, Meyer E, Day DA, Millar AH, Whelan J (2008) Type II NAD(P)H dehydrogenases are targeted to mitochondria and chloroplasts or peroxisomes in Arabidopsis thaliana. FEBS Lett 582:3073–3079
Castillo-Davis CI, Hartl DL (2002) Genome evolution and developmental constraint in Caenorhabditis elegans. Mol Biol Evol 19:728–735
Chabregas SM, Luche DD, Van Sluys MA, Menck CFM, Silva-Filho MC (2003) Differential usage of two in-frame translational start codons regulates subcellular localization of Arabidopsis thaliana THI1. J Cell Sci 116:285–291
Chew O, Whelan J, Millar AH (2003) Molecular definition of the ascorbate-glutathione cycle in Arabidopsis mitochondria reveals dual targeting of antioxidant defenses in plants. J Biol Chem 278:46869–46877
Cho HS, Lee SS, Kim KD, Hwang I, Lim JS, Park YI, Pai HS (2004) DNA gyrase is involved in chloroplast nucleoid partitioning. Plant Cell 16:2665–2682
Christensen AC, Lyznik A, Mohammed S, Elowsky CG, Elo A, Yule R, Mackenzie SA (2005) Dual-domain, dual-targeting organellar protein presequences in Arabidopsis can use non-AUG start codons. Plant Cell 17:2805–2816
Cleary SP, Tan FC, Nakrieko KA, Thompson SJ, Mullineaux PM, Creissen GP, von Stedingk E, Glaser E, Smith AG, Robinson C (2002) Isolated plant mitochondria import chloroplast precursor proteins in vitro with the same efficiency as chloroplasts. J Biol Chem 277:5562–5569
Creissen G, Reynolds H, Xue Y, Mullineaux P (1995) Simultaneous targeting of pea glutathione reductase and of a bacterial fusion protein to chloroplasts and mitochondria in transgenic tobacco. Plant J 8:167–175
Davies JC, Petrov DA (2004) Preferential duplication of conserved proteins in eukaryotic genomes. PLoS Biol 2:318–326
Dinkins RD, Conn HM, Dirk LMA, Williams MA, Houtz RL (2003) The Arabidopsis thaliana peptide deformylase 1 protein is localized to both mitochondria and chloroplasts. Plant Sci 165:751–758
Duchêne AM, Giritch A, Hoffmann B, Cognat V, Lancelin D, Peeters NM, Zaepfel M, Marechal-Drouard L, Small ID (2005) Dual targeting is the rule for organellar aminoacyl-tRNA synthetases in Arabidopsis thaliana. Proc Natl Acad Sci USA 102:16484–16489
Dyal SD, Brown MT, Johnson PJ (2004) Ancient invasions: from endosymbionts to organelles. Science 304:253–257
Elo A, Lyznik A, Gonzalez DO, Kachman SD, Mackenzie S (2003) Nuclear genes encoding mitochondrial proteins for DNA and RNA metabolism are clustered in the Arabidopsis genome. Plant Cell 15:1619–1631
Emanuelsson O, Nielsen H, Brunak S, von Heijne G (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300:1005–1016
Giegé P, Heazlewood JL, Roessner-Tunali U, Millar AH, Fernie AR, Leaver CJ, Sweetlove LJ (2003) Enzymes of glycolysis are functionally associated with the mitochondrion in Arabidopsis cells. Plant Cell 15:2140–2151
Giglione C, Serero A, Pierre M, Boisson B, Meinnel T (2000) Identification of eukaryotic peptide deformylases reveals universality of N-terminal protein processing mechanisms. EMBO J 19:5916–5929
Heazlewood JL, Tonti-Filippini J, Verboom RE, Millar AH (2005) Combining experimental and predicted datasets for determination of the subcellular location of proteins in Arabidopsis. Plant Physiol 139:598–609
Hedtke B, Meixner M, Gillandt S, Richter E, Borner T, Weihe A (1999) Green fluorescent protein as a marker to investigate targeting of organellar RNA polymerases of higher plants in vivo. Plant J 17:557–561
Hedtke B, Borner T, Weihe A (2000) One RNA polymerase serving two genomes. EMBO Rep 1:435–440
Hedtke B, Legen J, Weihe A, Herrmann RG, Börner T (2002) Six active phage-type RNA polymerase genes in Nicotiana tabacum. Plant J 30:625–637
Herman PL, Ramberg H, Baack RD, Markwell J, Osterman JC (2002) Formate dehydrogenase in Arabidopsis thaliana: overexpression and subcellular localization in leaves. Plant Sci 163:1137–1145
Huson DH, Bryant D (2006) Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23:254–267
Jorgensen R (2004) Sequencing maize: just sample the salsa or go for the whole enchilada. Plant Cell 16:787–788
Kleine T, Lockhart P, Batschauer A (2003) An Arabidopsis protein closely related to Synechocystis cryptochrome is targeted to organelles. Plant J 35:93–103
Kobayashi Y, Dokiya Y, Sugita M (2001) Dual targeting of phage-type RNA polymerase to both mitochondria and plastids is due to alternative translation initiation in single transcripts. Biochem Biophys Res Commun 289:1106–1113
Kusumi K, Yara A, Mitsui N, Tozawa Y, Iba K (2004) Characterization of a rice nuclear-encoded plastid RNA polymerase gene OsRpoTp. Plant Cell Physiol 45:1194–1201
Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155
Mackenzie SA (2005) Plant organellar protein targeting: a traffic plan still under construction. Trends Cell Biol 15:548–554
Menand B, Maréchal-Drouard L, Sakamoto W, Dietrich A, Wintz H (1998) A single gene of chloroplast origin codes for mitochondrial and chloroplastic methionyl-tRNA synthetase in Arabidopsis thaliana. Proc Natl Acad Sci USA 95:11014–11019
Millar AH, Whelan J, Small I (2006) Recent surprises in protein targeting to mitochondria and plastids. Curr Opin Plant Biol 9:610–615
Mireau H, Lancelin D, Small ID (1996) The same Arabidopsis gene encodes both cytosolic and mitochondrial alanyl-tRNA synthetase. Plant Cell 8:1027–1039
Moon S, Giglione C, Lee D-Y, An S, Jeong D-H, Meinnel T, An G (2008) Rice peptide deformylase PDF1B is crucial for development of chloroplasts. Plant Cell Physiol 49:1536–1546
Murcha MW, Elhafez D, Lister R, Tonti-Filippini J, Baumgartner M, Philippar K, Carrie C, Mokranjac D, Soll J, Whelan J (2007) Characterization of the preprotein and amino acid transporter gene family in Arabidopsis. Plant Physiol 143:199–212
Nakamura T, Yamaguchi Y, Sano H (2000) Plant mercaptopyruvate sulfurtransferases: molecular cloning, subcellular localization and enzymatic activities. Eur J Biochem 267:5621–5630
Obara K, Sumi K, Fukuda H (2002) The use of multiple transcription starts causes the dual targeting of Arabidopsis putative monodehydroascorbate reductase to both mitochondria and chloroplasts. Plant Cell Physiol 43:697–705
Ono Y, Sakai A, Takechi K, Takio S, Takusagawa M, Takano H (2007) NtPolI-like1 and NtPolI-like2, bacterial DNA polymerase I homologues isolated from BY-2 cultured tobacco cells, encode DNA polymerases engaged in DNA replication in both plastids and mitochondria. Plant Cell Physiol 48:1679–1692
Paterson AH, Bowers JE, Chapman BA, Peterson DG, Rong J, Wicker TM (2004) Comparative genome analysis of monocots and dicots, toward characterization of angiosperm diversity. Curr Opin Biotechnol 15:120–125
Peeters N, Small I (2001) Dual targeting to mitochondria and chloroplasts. Biochim Biophys Acta 1541:54–63
Pujol C, Bailly M, Kern D, Maréchal-Drouard L, Becker H, Duchêne AM (2008) Dual-targeted tRNA-dependent amidotransferase ensures both mitochondrial and chloroplastic Gln-tRNAGln synthesis in plants. Proc Natl Acad Sci USA 105:6481–6485
Puyaubert J, Denis L, Alban C (2008) Dual targeting of Arabidopsis holocarboxylase synthetase1: a small upstream open reading frame regulates translation initiation and protein targeting. Plant Physiol 146:478–491
Raynaud C, Sozzani R, Glab N, Domenichini S, Perennes C, Cella R, Kondorosi E, Bergounioux C (2006) Two cell-cycle regulated SET-domain proteins interact with proliferating cell nuclear antigen (PCNA) in Arabidopsis. Plant J 47:395–407
Richter U, Kiessling J, Hedtke B, Decker E, Reski R, Börner T, Weihe A (2002) Two RpoT genes of Physcomitrella patens encode phage-type RNA polymerases with dual targeting to mitochondria and plastids. Gene 290:95–105
Rojo E, Zouhar J, Kovaleva V, Hong S, Raikhel NV (2003) The AtCVPS protein complex is localized to the tonoplast and the prevacuolar compartment in Arabidopsis. Mol Biol Cell 14:361–369
Rokov-Plavec J, Dulic M, Duchêne AM, Weygand-Durasevic I (2008) Dual targeting of organellar seryl-tRNA synthetase to maize mitochondria and chloroplasts. Plant Cell Rep 27:1157–1168
Saitou N, Nei M (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406–425
Schatz G, Dobberstein B (1996) Common principles of protein translocation across membranes. Science 271:1519–1526
Schwacke R, Fischer K, Ketelsen B, Krupinska K, Krause K (2007) Comparative survey of plastid and mitochondrial targeting properties of transcription factors in Arabidopsis and rice. Mol Genet Genomics 277:631–646
Silva-Filho MC (2003) One ticket for multiple destinations: dual targeting of proteins to distinct subcellular locations. Curr Opin Plant Biol 6:589–595
Silva-Filho MC, Wieers M-C, Chaumont F, Flugge U-I, Boutry M (1997) Different in vitro and in vivo targeting properties of the transit peptide of a chloroplast envelope inner membrane protein. J Biol Chem 272:15264–15269
Small I, Peeters N, Legeai F, Lurin C (2004) Predotar: a tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 4:1581–1590
Su Z, Wang J, Yu J, Huang X, Gu X (2006) Evolution of alternative splicing after gene duplication. Genome Res 16:182–189
Sunderland PA, West CE, Waterworth WM, Bray CM (2006) An evolutionarily conserved translation initiation mechanism regulates nuclear or mitochondrial targeting of DNA ligase 1 in Arabidopsis thaliana. Plant J 47:356–367
Taira M, Valtersson U, Burkhardt B, Ludwig RA (2004) Arabidopsis thaliana GLN2-encoded glutamine synthetase is dual targeted to leaf mitochondria and chloroplasts. Plant Cell 16:2048–2058
Teixeira FK, Menezes-Benavente L, Galvão VC, Margis R, Margis-Pinheiro M (2006) Rice ascorbate peroxidase gene family encodes functionally diverse isoforms localized in different subcellular compartments. Planta 6:1–15
Thatcher LF, Carrie C, Andersson CR, Sivasithamparam K, Whelan J, Singh KB (2007) Differential gene expression and subcellular targeting of Arabidopsis glutathione S-transferase F8 is achieved through alternative transcription start sites. J Biol Chem 282:28915–28928
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673–4680
Ueda M, Nishikawa T, Fujimoto M, Takanashi H, Arimura S, Tsutsumi N, Kadowaki K (2008) Substitution of the gene for chloroplast RPS16 was assisted by generation of a dual targeting signal. Mol Biol Evol 25:1566–1575
van Engelen FA, Molthoff JW, Conner AJ, Nap JP, Pereira A, Stiekema WJ (1995) pBINPLUS: an improved plant transformation vector based on pBIN19. Transgenic Res 4:288–290
Vandepoele K, Simillion C, Van de Peer Y (2003) Evidence that rice and other cereals are ancient aneuploids. Plant Cell 15:2192–2202
von Braun SS, Sabetti A, Hanic-Joyce PJ, Gu J, Schleiff E, Joyce PB (2007) Dual targeting of the tRNA nucleotidyltransferase in plants: not just the signal. J Exp Bot 58:4083–4093
Wall MK, Mitchenall LA, Maxwell A (2004) Arabidopsis thaliana DNA gyrase is targeted to chloroplasts and mitochondria. Proc Natl Acad Sci USA 101:7821–7826
Xu XM, Moller SG (2006) AtSufE is an essential activator of plastidic and mitochondrial desulfurases in Arabidopsis. EMBO J 25:900–909
Yang J, Li WH (2004) Developmental constraint on gene duplicability in fruit flies and nematodes. Gene 340:237–240
Zaegel V, Guermann B, Le Ret M, Andrés C, Meyer D, Erhardt M, Canaday J, Gualberto JM, Imbault P (2006) The plant-specific ssDNA binding protein OSB1 is involved in the stoichiometric transmission of mitochondrial DNA in Arabidopsis. Plant Cell 18:3548–3563