
Research

Methods

An efficient reverse genetics platform in the model legume Medicago truncatula

Xiaofei Cheng1, Mingyi Wang1, Hee-Kyung Lee1, Million Tadege1, Pascal Ratet2, Michael Udvardi1, Kirankumar S. Mysore1 and Jiangqi Wen1

1Division of Plant Biology, The Samuel Roberts Noble Foundation, 2510 Sam Noble Parkway, Ardmore, OK 73401, USA; 2Institut des Sciences du Vegetal, CNRS, Avenue de la Terrasse 91198, Gif sur Yvette Cedex, France

Author for correspondence: Jiangqi Wen
Tel: +1 580 224 6680
Email: [email protected]

Received: 23 May 2013
Accepted: 1 October 2013

New Phytologist (2014) 201: 1065–1076
doi: 10.1111/nph.12575

Key words: database searching, flanking sequence tags (FSTs), floral homeotic genes, Medicago truncatula, reverse genetics, Tnt1 mutant population.

Summary

Medicago truncatula is one of the model species for legume studies. In an effort to develop legume genetics resources, > 21 700 Tnt1 retrotransposon insertion lines have been generated. To meet fast-growing needs in functional genomics, two reverse genetics approaches have been established: web-based database searching and PCR-based reverse screening. More than 840 genes have been reverse screened using the PCR-based approach over the past 6 yr to identify mutants in these genes. Overall, a success rate of c. 84% (705 genes) was achieved in identifying mutants with at least one Tnt1 insertion, of which c. 50% (358 genes) had three or more alleles. To demonstrate the utility of the two reverse genetics platforms, two mutant alleles were isolated for each of the two floral homeotic MADS-box genes, MtPISTILATA and MtAGAMOUS. Molecular and genetic analyses indicate that Tnt1 insertions in exons of both genes are responsible for the defects in floral organ development. In summary, we have developed two efficient reverse genetics platforms to facilitate functional characterization of M. truncatula genes.

© 2013 The Authors. New Phytologist © 2013 New Phytologist Trust. www.newphytologist.com

Introduction

Medicago truncatula is one of the model species for legume genetics, genomics and functional genomics studies. Over the last two decades, M. truncatula has been used for studies in many research areas, including plant nutrition, metabolism, growth and development, plant–microbe interactions, and plant–environment interactions. With the completion of sequencing and annotation of the gene-rich regions of the genome (Young et al., 2011), functional characterization of thousands of M. truncatula genes is needed. The function(s) of the vast majority of plant genes may not be predicted correctly from DNA and/or protein sequence homology alone. Because of the ambiguity associated with homology-based function assessment, high-throughput approaches are needed to identify gene function(s). Genome-wide transcript profiling studies, protein–protein interaction studies, and comprehensive metabolomic and proteomic studies provide major correlative data and help to gain knowledge of physiological processes, but they are insufficient to assign exact function(s) to individual genes. An efficient reverse genetics platform is needed for dissecting M. truncatula gene function(s).

Utilization of mutants has long been a reliable approach to decipher gene functions through both forward and reverse genetics strategies (Wu et al., 2005). The availability of large-scale mutant populations, generated by EMS, T-DNA insertion, transposon tagging, gamma radiation or fast-neutron bombardment, has greatly accelerated functional genomics studies in model plant species (Bevan & Walsh, 2005). Phenotype-driven forward genetics, which historically has played an irreplaceable role in the development of modern genetics, is limited for high-throughput gene function analyses. Genetic map-based gene cloning is still a time-consuming and challenging task even in model plants such as Arabidopsis and rice. In the post-genomic era, with the large amount of sequence information generated by genome projects for Arabidopsis and other plant species, reverse genetics with high-throughput sequence-based screening strategies has enabled researchers to functionally characterize hundreds of genes in a relatively short timespan. Although the recently developed virus-induced gene silencing (VIGS) provides a novel alternative tool for reverse genetics, it is not applicable or efficient in many plant species, including M. truncatula.

Insertional mutagenesis and EMS mutagenesis-based targeting induced local lesions in genomes (TILLING) dominate the reverse genetics platforms in many plant species. Of all the publicly available mutant collections, a large portion was generated by T-DNA or transposons, while a relatively small portion was generated by chemical mutagenesis and analyzed by TILLING (Martienssen, 1998; Krysan et al., 1999; Bennetzen, 2000; Okamoto & Hirochika, 2000; Parinov & Sundaresan, 2000; Sussman et al., 2000; Courtial et al., 2001; Yamazaki et al., 2001; Fladung et al., 2004; Stepanova & Alonso, 2006). T-DNA has been successfully used as a powerful insertional mutagen in generating gain-of-function and loss-of-function mutants in Arabidopsis (Weigel et al., 2000; Alonso et al., 2003) and rice (An et al., 2005; Larmande et al., 2008; Fu et al., 2009). The introduction of known sequences in T-DNA mutagenesis makes it easy to recover the flanking sequences and thus to identify insertions in genes. However, because T-DNA insertions occur at low frequency, a large population is required to achieve saturation mutagenesis. For plant species with relatively large genomes and lacking high-throughput in planta transformation, large-scale mutant generation by T-DNA insertion mutagenesis is not practical. Medicago truncatula is one of these species (Brocard et al., 2006, 2008; Scholte et al., 2002).

In the past 6 yr, we have regenerated > 21 700 insertional mutant lines in M. truncatula using Tnt1, a well-characterized tobacco retrotransposon (Grandbastien et al., 1989; d’Erfurth et al., 2003; Tadege et al., 2008). The high efficiency of Tnt1 transposition during tissue culture resulted in multiple Tnt1 inserts in single regenerated M. truncatula lines (from 4 to 50, with an average of 25 insertions per genome; Tadege et al., 2008). This feature of Tnt1 enables us to reach near-saturation mutagenesis of the M. truncatula genome with a relatively small number of mutant lines. The value of the Tnt1-tagged mutant population has already been proven in forward genetics screening (d’Erfurth et al., 2003; Benlloch et al., 2006; Tadege et al., 2008; Wang et al., 2008). To make the most efficient use of these Tnt1 mutants, we developed two reverse genetics approaches. One is PCR-based DNA pool screening and the other is a direct database BLAST search. Both methods were successful in identifying insertion mutants in M. truncatula.

Materials and Methods

Seed scarification and germination

Medicago truncatula Gaertn. seeds (all M. truncatula mutant lines were generated at the Samuel Roberts Noble Foundation) were treated in concentrated sulfuric acid for 8 min, washed thoroughly with H2O, sterilized for 8–10 min in 30% commercial bleach with 0.01% Tween-20, and then washed three to five times with autoclaved H2O. The sterilized seeds were placed on wet filter paper or 0.5× MS medium and kept at 4°C in the dark for 7–10 d, and then transferred to a 25°C culture room for germination. Seedlings were transferred into 1-gallon pots and grown to maturity in the glasshouse.

Genomic DNA pooling

Approximately 21 700 Tnt1 insertion lines in M. truncatula were generated. Genomic DNA from individual lines was extracted as described previously (Tadege et al., 2008; Cheng et al., 2011). A simple one-dimensional pooling strategy was used to pool the gDNAs into three levels (Fig. 1). The pooling strategy was as follows: mix 100 µl of genomic DNA from each of 10 individual samples to make a 1-ml mini pool (M-pool); mix 500 µl of genomic DNA from each of 10 M-pools to make a 5-ml pool (P-pool); mix 2 ml of genomic DNA from each of five P-pools to make a 10-ml super pool (S-pool). Therefore, one S-pool contains five P-pools, 50 M-pools and 500 individual lines. To date, we have pooled 42 S-pools, 210 P-pools and 2100 M-pools, which contain 21 000 individual lines. Because the S-pool and P-pool DNA samples are the most frequently used, they were further purified to make them more stable during storage, by extracting twice with phenol : chloroform : isoamyl alcohol (25 : 24 : 1) followed by chloroform : isoamyl alcohol (24 : 1). The DNAs were precipitated with 1/10 volume of sodium acetate and an equal volume of 2-propanol, washed with 75% ethanol, dried and dissolved in the original volume of water. P-pool and S-pool DNA samples were divided into 500-µl aliquots. One aliquot of each P-pool and S-pool was kept at 4°C for routine use, and the remaining aliquots of P-pools and S-pools, M-pools and individual DNA samples were stored at −20°C.

Fig. 1 (a) Genomic DNA pooling chart of Medicago truncatula Tnt1 insertion lines. NF lines: Tnt1 lines were numbered starting with NF for Noble Foundation. (b) Location and direction of Tnt1 primers and gene-specific primers (GSP). The box represents the region of a gene to be screened orientated from 5′ (left) to 3′ (right). LTR, long terminal repeat.

Primer design

Tnt1 primer design

Three forward primers and three reverse primers were designed for Tnt1 (Fig. 1b; Supporting Information Table S1). Primers Tnt1-F, Tnt1-F1 and Tnt1-R, Tnt1-R1 were used for PCR screening, whereas primers Tnt1-F2 and Tnt1-R2 were used for PCR product sequencing.

Gene-specific primer design

Two pairs of gene-specific primers (GSPs) were designed based on the genomic sequence of the gene (Fig. 1b; Table S1). If a genomic sequence was not available, the GSPs were designed based on the cDNA or EST sequences. In this case, primers designed from cDNA or EST may span an exon/intron junction and fail to amplify the genomic fragment during the screening. We used these basic guidelines for GSP design: primer length is 22–24 bp with 9–11 G/C to match the melting temperatures of Tnt1 primers; there are no G or C clusters longer than 4 in the primers; there is one G or C in the last two nucleotides at the 3′ end of each primer. If a gene is larger than 4 kb, it can be split into two or more fragments, each no larger than 3.5 kb, and two pairs of primers are designed for each fragment. The primers for multi-fragment genes are designed in such a way that the first fragment overlaps at least 50 bp with the second fragment, the second fragment overlaps at least 50 bp with the third fragment, and so on. The amplification efficiency of gene-specific primers was tested by PCR using both wild-type A17 (the reference genome) and R108 (the ecotype in which Tnt1 lines were generated) genomic DNA as templates before proceeding to the reverse screening of DNA pools.

PCR reactions and product analysis

Primary PCR (1st PCR)

ExTaq (Takara Bio Inc., Shiga, Japan) was used for all PCR reactions. The PCR master mixture was prepared according to the manufacturer’s protocol with 1 µM of GSP primer (GSP-F or GSP-R) and 0.25 µM of Tnt1 primer (Tnt1-F or Tnt1-R). Aliquots of 37 µl of the mixture were placed into each PCR tube and 3 µl of super-pool DNA was added to each. PCR was run using a touch-down program as follows: 95°C for 5 min; 94°C for 30 s, 60°C for 30 s and 72°C for 2.5 min, five cycles; 94°C for 30 s, 57.5°C for 30 s and 72°C for 2.5 min, five cycles; 94°C for 30 s, 55°C for 30 s and 72°C for 2.5 min, 25 cycles; 72°C for 5 min; and hold at 10°C.

Secondary PCR (nested-PCR)

After the first PCR, the PCR products were diluted 50 times and 2-µl aliquots of the diluted 1st PCR products were used as the template for the nested-PCR. The nested-PCR reaction mixture was prepared with 0.25 µM of GSP (GSP-F1 or GSP-R1) and Tnt1 nested primers (Tnt1-F1 or Tnt1-R1); 38-µl aliquots of the PCR mixture were put in each tube. The PCR was run as for the primary PCR. Then 10-µl aliquots of PCR products from individual nested-PCRs were separated side-by-side on 1% agarose gels. The remaining 30 µl of nested-PCR products from an S-pool showing positive bands were purified using the QIAquick PCR purification kit (Qiagen) following the manufacturer’s protocol, except that the products were eluted in 30 µl of H2O. The concentration of the PCR product was measured using a Nanodrop spectrometer (Nanodrop Technologies Inc., Wilmington, DE, USA). Purified PCR products were sequenced using primer Tnt1-F2 or Tnt1-R2, depending on the primers used for the nested-PCR reaction. The sequences of PCR products were aligned with the sequences of genes-of-interest to confirm the gene-specific insertion(s) and to reveal the exact insertion site(s). If only cDNA/EST/TC sequences are available for a target gene and the insertion is in an intron region, the insertion-flanking sequences will not form a contig with the cDNA/EST/TC sequences. In this case, the insertion will be missed.

Database architecture and content

The mutant database was constructed using a Fedora Linux system and maintained using MySQL, a relational database management system. The data collected in the current database include 10 884 Tnt1 mutant lines, which were developed by the Noble Foundation and collaborators in European groups. Currently, 44 238 flanking sequence tags (FSTs) from 3436 mutant lines have been generated and are BLAST-searchable in this database. In order to identify the positions of the FSTs in the Medicago genome, all FSTs were compared against the Medicago genomic sequence (Mt v3.5) using BLASTN. The Medicago genomic sequences used for BLAST were downloaded from the Medicago resources website (http://www.medicagohapmap.org/?genome). FSTs were mapped to Medicago pseudo-chromosomes if the BLASTN results met the following criteria: (1) > 80% identity; (2) ratio of the high scoring segment pair (HSP) length to the FST length > 90%; (3) expected value < 0.01; (4) HSP starting at < 15 bp from the beginning of an FST. If an FST had several hits in the Medicago genome, we chose the hit with the highest bit score value. The BLAST results for these mapped FSTs can be viewed as a hyperlinked HTML page. The criteria for identifying insertion sites are not very stringent: the aim is to provide a useful reference of insertion positions in the Medicago genome, and users need to verify the exact insertion sites.
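The four FST mapping criteria and the best-hit rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database's actual code: the `Hit` record, its field names, and the choice to apply the criteria before picking the highest bit score are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hit:
    """One BLASTN HSP for an FST (hypothetical record layout)."""
    fst_id: str
    chromosome: str
    percent_identity: float  # e.g. 97.5
    hsp_length: int          # aligned length on the FST, in bp
    fst_length: int          # full FST length, in bp
    evalue: float
    fst_offset: int          # distance in bp from the FST start to the HSP start
    bit_score: float


def passes_criteria(hit: Hit) -> bool:
    """Apply mapping criteria (1)-(4) as stated in the text."""
    return (
        hit.percent_identity > 80.0                     # (1) > 80% identity
        and hit.hsp_length / hit.fst_length > 0.90      # (2) HSP/FST length > 90%
        and hit.evalue < 0.01                           # (3) expected value < 0.01
        and hit.fst_offset < 15                         # (4) HSP starts < 15 bp into the FST
    )


def best_mapping(hits: Iterable[Hit]) -> Optional[Hit]:
    """Return the qualifying hit with the highest bit score, or None.

    Assumption: filtering happens before the bit-score comparison.
    """
    kept = [h for h in hits if passes_criteria(h)]
    return max(kept, key=lambda h: h.bit_score) if kept else None
```

A hit that covers only half of the FST is rejected even if its bit score is the highest, which is the behaviour criterion (2) is meant to enforce.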
In addition to the FSTs, photos and phenotype descriptions for most of those mutant lines are also included in the database after phenotypic screening. In total, 11 544 pictures for 10 158 lines and 1175 pieces of phenotypic description information are available and can be searched by line numbers or key words.

GBrowse viewer

In order for researchers to view alternate annotations in nearby genomic regions of mapped mutant insertions, we incorporated a platform-independent web application, GBrowse, developed by Stein et al. (2002) as part of the Generic Model Organism Database project (GMOD; http://www.gmod.org). The GMOD GBrowse viewer in combination with the MySQL database is used to store, search and display annotations and other features of the genome aligned to the genomic sequence. We downloaded all predicted gene models from IMGAG and mapped FSTs, Tentative Consensus (TC) sequences and Affymetrix probesets to the Medicago genome by BLAST search for constructing GBrowse in this database. The information for all these mapped features was imported into GBrowse for visualization. Thus, the GBrowse web pages are organized alongside features including predicted gene models, TCs, FSTs and Affymetrix probesets. To select a more specific region of the genome, users can enter either a precise sequence range or any valid Medicago identifier (e.g. gene model name, TC number, FST or probeset ID) in the ‘Landmark or Region’ box. GBrowse then fetches a highlighted feature or a region of the genome specified by the user’s search criteria. In addition, we have generated a gene expression atlas (Benedito et al., 2008) that provides a global view of gene expression in all major organ systems of Medicago, with special emphasis on nodule and seed development. To allow researchers to view the expression profile of a gene-of-interest in a region where mutant insertions are located, the gene expression information collected in the gene expression atlas was also integrated into our website through GBrowse. Users may examine the gene expression levels and patterns in plots via the link on an Affymetrix probeset bar presented on the GBrowse web pages.

BLAST search

The BLAST programs BLASTN, TBLASTN and TBLASTX are integrated into this mutant database. Users can use either nucleotide sequences or protein sequences as queries to search the FST database for mutant identification. The search results are displayed with the corresponding IDs of the mutant lines as well as the alignments of homologous sequences. If one or more FSTs match the gene sequence, seeds can be ordered online from our website (http://medicago-mutant.noble.org/mutant/).

RNA extraction and RT-PCR

Inflorescence shoots were collected from both heterozygous and homozygous plants of mtpi and mtag, and from wild-type R108 plants. Total RNA was extracted using Tri-Reagent (Gibco-BRL Life Technologies, Grand Island, NY, USA) and treated with Turbo DNase I (Ambion, Austin, TX, USA). For RT-PCR, 3-µg samples of total RNA were used for reverse transcription using SuperScript III Reverse Transcriptase (Invitrogen) with oligo(dT)20 primer. Two microlitres of 1 : 20 diluted cDNA were used as template for each 30-µl PCR reaction. MtACTIN2 was used as a standard control. Gene-specific primers used for RT-PCR are listed in Table S1.

Results

Searching the flanking sequence tags (FSTs) database

Based on the genome size and average gene size, it has been estimated that 20 000 Tnt1 insertion lines with an average of 25 insertions per line are required to reach c. 90% saturation of the Medicago truncatula genome (Tadege et al., 2008). From 2003 to 2011, 21 700 Tnt1 lines were generated at the Samuel Roberts Noble Foundation. As described earlier (Tadege et al., 2008), we have been recovering the flanking sequence tags (FSTs) from the regenerated Tnt1 lines using thermal asymmetric interlaced (TAIL)-PCR (Liu et al., 1995, 2005) to create an FST database (http://medicago-mutant.noble.org/mutant). Currently, the database includes c. 44 000 FSTs from c. 3400 regenerated Tnt1 lines. This FST database can be searched using the sequence of the gene to be studied. Ideally, the genomic sequence of the gene is used to search the FST database. If only the coding sequence is available, it can also be used, but in that case insertions in intronic regions will not be detected. Even though we estimate an average of c. 25 Tnt1 insertions per line (Tadege et al., 2008), the FSTs recovered by TAIL-PCR are far fewer than 25 per line because of the limitations of the method itself and the high cost of Sanger sequencing. We are exploring a high-throughput sequencing approach to recover FSTs more efficiently. Once the approach becomes practical, we will recover FSTs from all 21 700 lines. The total number of FSTs will then increase dramatically, and the probability of finding an insertion in a given gene by BLAST search in the FST database will be greatly enhanced. Even with the current low-coverage FST database, we tested the FST BLAST search using genomic sequences of 50 candidate genes and found at least one insertion in 16 genes (data not shown).

PCR-based reverse screening in Tnt1 insertion population

The genomic DNA (g-DNA) from 21 700 Tnt1 lines was extracted and pooled into three-level pools using a one-dimensional pooling strategy (Fig. 1a; see the Materials and Methods section). The pooled g-DNA system, representing 42 super pools (S-pools), 210 pools (P-pools), 2100 mini-pools (M-pools) and 21 000 individual genomic DNAs, was used for reverse screening. Before the PCR-based screening, two pairs of gene-specific primers (GSP) were designed for every gene of interest (Fig. 1b). The specificity and efficiency of the primers were tested using wild-type (R108) genomic DNA. Three primers were designed on both ends of Tnt1 (Fig. 1b). To begin the PCR screening, combinations of gene-specific primers with Tnt1 primers were used to selectively amplify the Tnt1-tagged gene-of-interest amplicons from the S-pools. A schematic screening procedure is shown in Fig. 2.

Fig. 2 PCR-based reverse screening flow chart: (a) screening for insertions in super pools; (b) screening for individual lines.

There are four primer combinations for screening each gene of interest: GSP-F with Tnt1-F, GSP-F with Tnt1-R, GSP-R with Tnt1-F, and GSP-R with Tnt1-R. First, combinations of GSP-F with Tnt1-F and GSP-F with Tnt1-R were used to screen the S-pools. Theoretically, any insertions within the region covered by GSP-F and GSP-R should be recovered using these two primer combinations. However, in some circumstances specific amplification fails, either because the insertion site is far from GSP-F and a large amplicon is not efficiently amplified, or because GSP-F does not closely match the Tnt1 primers under the given PCR conditions. Second, if no specific amplicon was detected, combinations of GSP-R with Tnt1-F and GSP-R with Tnt1-R were used to re-screen the S-pools.

For each screening of the S-pools, two rounds of PCR amplification were carried out with a touch-down PCR program (see the Materials and Methods section for details). The genomic DNA from S-pools was used as template for the first round of PCR amplification, and 50-fold diluted first-round PCR products were used as templates for the second round of PCR (nested-PCR) amplification. Normally, no specific amplicons from the first-round PCR were visible in agarose gels; the true positives, which give bright bands in agarose gels, were observed in the nested-PCR amplification (Fig. 3a). Positive PCR products were purified and sequenced for identity confirmation and insertion site identification. If the gene-specific insertion(s) from the S-pool screening were confirmed, further screening was performed sequentially in the corresponding lower-level pools using the same primer combination as in the S-pool screening. There are five P-pools in one S-pool. When P-pools were screened, the same touch-down program was used for two rounds of PCR amplification, except that the extension time was adjusted according to the size of the confirmed S-pool PCR products.
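The GSP design guidelines given in the Materials and Methods section (22–24 nt, 9–11 G/C, no long G/C clusters, a G or C near the 3′ end, and overlapping fragments for genes larger than 4 kb) lend themselves to a small checker. The sketch below is illustrative only; the function names are hypothetical, and two readings of the text are assumptions: a "cluster" is taken to be any run of consecutive G or C bases, and "one G or C in the last two nucleotides" is taken to mean at least one.

```python
import re
from typing import List, Tuple


def check_gsp(primer: str) -> List[str]:
    """Return the guideline violations for one gene-specific primer (empty if none)."""
    seq = primer.upper()
    problems = []
    if not 22 <= len(seq) <= 24:
        problems.append("length outside 22-24 nt")
    gc_count = sum(base in "GC" for base in seq)
    if not 9 <= gc_count <= 11:
        problems.append("G/C count outside 9-11")
    # Assumption: a cluster is a run of consecutive G/C bases, mixed or not.
    if re.search(r"[GC]{5,}", seq):
        problems.append("G/C cluster longer than 4")
    # Assumption: "one G or C" is read as "at least one".
    if not any(base in "GC" for base in seq[-2:]):
        problems.append("no G or C in the last two 3' nucleotides")
    return problems


def tile_gene(gene_length: int, max_fragment: int = 3500,
              overlap: int = 50, split_above: int = 4000) -> List[Tuple[int, int]]:
    """Split a gene into 1-based inclusive fragments that overlap by `overlap` bp.

    Genes up to `split_above` bp are screened as a single fragment.
    """
    if gene_length <= split_above:
        return [(1, gene_length)]
    fragments = []
    start = 1
    while True:
        end = min(start + max_fragment - 1, gene_length)
        fragments.append((start, end))
        if end == gene_length:
            return fragments
        start = end - overlap + 1  # next fragment re-covers the last `overlap` bp


# Example: an 8 kb gene is covered by three overlapping fragments.
print(tile_gene(8000))  # [(1, 3500), (3451, 6950), (6901, 8000)]
```

The tiling uses the minimum 50-bp overlap; in practice the fragment boundaries would be moved to wherever acceptable primers can be placed.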

Fig. 3 Representative results of PCR screening for Tnt1 insertions using the standard single-gene method. (a) PCR results in 10 super pools (S-pools). Two lanes for each S-pool: the first lane for the primary PCR and the next lane for the nested-PCR. Upper panel: results with primer pair GSP-F and Tnt1-F. Lower panel: results of the primer combination of GSP-F and Tnt1-R. One clear band was obtained in S-pool S4 with primers GSP-F and Tnt1-F and three significant bands were obtained in S-pools S7, S8 and S10 with primers GSP-F and Tnt1-R. Circled bands represent those inserts that are further shown in the next panels. (b) PCR results from corresponding P-pools. S7 includes P31–P35, S8 includes P36–P40. The products in P32 show the same size as that in S7, and so does the PCR product in P39 as in S8. Circled bands represent those inserts that are further shown in the next panels. (c) PCR results from corresponding mini-pools. P32 includes M311–M320 and P39 includes M381–M390. The expected bands of P32 and P39 were obtained in M316 and M384, respectively. Circled bands represent those inserts that are further shown in the next panels. (d) PCR results from selected individual lines. M316 includes NF3485–NF3494 and M384 includes NF4273–NF4282. The expected bands of M316 and M384 were obtained in individual line NF3492 and NF4276, respectively.
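The pool hierarchy behind Fig. 3 (10 lines per M-pool, 10 M-pools per P-pool, five P-pools per S-pool) can be expressed as simple integer arithmetic. The sketch below assumes that pools are numbered consecutively from 1 and that `sample_index` is the 1-based position of a line in pooling order; it is not the NF line number, since Fig. 3 shows that NF numbers do not follow pool numbering directly. All function names are hypothetical.

```python
from typing import NamedTuple

LINES_PER_M_POOL = 10
M_POOLS_PER_P_POOL = 10
P_POOLS_PER_S_POOL = 5


class PoolAddress(NamedTuple):
    m_pool: int
    p_pool: int
    s_pool: int


def _ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for positive integers."""
    return -(-a // b)


def pools_for_sample(sample_index: int) -> PoolAddress:
    """Map the 1-based pooling position of a line to its M-, P- and S-pool."""
    if sample_index < 1:
        raise ValueError("sample_index is 1-based")
    m_pool = _ceil_div(sample_index, LINES_PER_M_POOL)
    p_pool = _ceil_div(m_pool, M_POOLS_PER_P_POOL)
    s_pool = _ceil_div(p_pool, P_POOLS_PER_S_POOL)
    return PoolAddress(m_pool, p_pool, s_pool)


def p_pools_in_s_pool(s_pool: int) -> range:
    """P-pool numbers to screen after a positive S-pool."""
    first = P_POOLS_PER_S_POOL * (s_pool - 1) + 1
    return range(first, first + P_POOLS_PER_S_POOL)


def m_pools_in_p_pool(p_pool: int) -> range:
    """M-pool numbers to screen after a positive P-pool."""
    first = M_POOLS_PER_P_POOL * (p_pool - 1) + 1
    return range(first, first + M_POOLS_PER_P_POOL)


# Reproduces the pool memberships listed in the Fig. 3 legend.
print(list(p_pools_in_s_pool(7)))   # [31, 32, 33, 34, 35]
print(list(m_pools_in_p_pool(39)))  # [381, 382, ..., 390]
```

With this numbering, the 21 000th pooled sample falls in M2100, P210 and S42, matching the pool totals reported in the text.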

True positive products from the P-pool screening, which showed the same size as the one from the S-pool screening, were found in one of the five P-pools (Fig. 3b). Usually it is not necessary to sequence the PCR products from P-pools. After the product was obtained in a specific P-pool, the screening was continued in the corresponding 10 M-pools. PCR results from the first round of amplification were checked to see whether a PCR product of the same size was found in one of the 10 M-pools (Fig. 3c). If no clear PCR product of the expected size was obtained, nested-PCR was carried out to obtain the positive product in a particular M-pool. The screening was continued in the corresponding 10 individual lines as described above (Fig. 3d). The final PCR product was purified and sequenced to confirm that it was identical to the original product from the S-pool. This allows identification of one or more Tnt1 insertion lines for a single gene-of-interest.

Modification of the standard screening procedure to identify insertions in multiple genes

Because individual genomic DNA was extracted from the original regenerated lines (R0), the amount of g-DNA available is limited and cannot be replenished. To maximize the utilization of the pooled DNA, we developed a multiple-gene screening method. In this modified method, three GSP primers from three different genes, in combination with Tnt1-F or Tnt1-R primers, were used in each PCR reaction in the S-pool screening; that is, three genes were screened at the same time (Fig. 4). Positive PCR products were purified and sequenced to identify Tnt1 insertions in a specific gene and to specify insertion sites in given genes. If mixed products were detected in any specific S-pool, individual nested-PCR was performed using the single nested GSP primer with Tnt1-F1 or Tnt1-R1 in the reaction. The screening procedure in lower pools was the same as in the standard method.

In order to compare the screening efficiency for target genes between the two methods, three genes (Gene A, B and C) were screened in 24 super pools as described above (Fig. 4). Using the standard screening method, 18 positive products for three genes were detected in 14 out of the 24 S-pools with the individual GSP-F and Tnt1-F primer combination (FF; Fig. 4a, lower panels), and nine products in nine out of the 24 S-pools with GSP-F and Tnt1-R primers (FR; Fig. 4b, lower panels). Using the multiple-gene screening method, 11 single products and one mixed product (S36FF) were detected in 12 out of the 24 S-pools with FF primers (Fig. 4a, upper panel), and eight products in eight out of 24 S-pools with FR primers (Fig. 4b, upper panel). The nested-PCR was re-run in S36 with individual primer pairs to obtain two single products. Most of the products (21 out of 27) that were amplified by the single-gene method were detected in the corresponding pools by the multiple-gene screening method; conversely, all positive products resulting from the multiple-gene screening method were detected by the standard method. Three products, two from FF primers and one from FR, were missed in S17, S30 and S23 pools using the multiple-gene screening method. Products for more than one gene were detected in four S-pools (S23, S25, S34 and S36) using the single-gene method, whereas mixed products were detected in one pool (S36) and only small-sized products were amplified in the other three pools (S23, S25, and S34). Sequence analysis showed that 20 flanking sequences from the multiple-gene method were identical to corresponding products from the standard method, and one sequence (S25FF) had mixed readings, indicating that a mixture of products, B-S25FF and C-S25FF, was amplified in the same S-pool. The nested-PCR was re-run in S25 with individual primer pairs and two single products were obtained.

Fig. 4 Comparison of the three-gene screening method with the single-gene screening method in super pools 13–36. (a) PCR results with primer pairs GSP-Fs and Tnt1-F. Upper panel shows nested-PCR results of mixed forward primers A-F, B-F and C-F with Tnt1-F; lower three panels show the results of individual forward primers A-F, B-F or C-F with Tnt1-F. (b) PCR results with primer pairs GSP-Fs and Tnt1-R. Upper panel shows nested-PCR results of mixed forward primers A-F, B-F and C-F with Tnt1-R; lower three panels show the results of individual forward primers A-F, B-F or C-F with Tnt1-R.

We further tested several additional sets of three genes using both methods and similar results were obtained, indicating that most Tnt1 insertions were detected by the multiple-gene screening method, which significantly conserves the S-pool DNA and also saves time and resources. Therefore, the multiple-gene screening method has been used as the default method for the PCR-based reverse screening. Furthermore, we compared the screening success rates of the two methods. Eighty-one genes were screened in 14–18 S-pools by the single-gene screening method. One or more Tnt1 insertions were identified in 69 genes, a success rate of c. 85%. In the same 14–18 S-pools, 280 genes were screened by the multiple-gene screening method, and one or more Tnt1 insertions were identified in 226 genes, a success rate of c. 80% (Table 1). However, the success rate of the multiple-gene method increased as the total number of S-pools increased (Table 1). In 2011, 235 genes were screened in 36 S-pools using the multiple-gene method. Two hundred and three genes were detected with one or more Tnt1 insertions, a success rate of 86%. More recently, 89 genes were screened in 42 S-pools. Seventy-eight genes were detected with one or more Tnt1 insertions, an 87.6% success rate (Table 1).
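To make the saving in S-pool DNA concrete, the following sketch counts PCR reactions at the S-pool stage only. It is a back-of-the-envelope illustration, not a figure from the study: it assumes that only the first-pass primer combinations (GSP-F with Tnt1-F and GSP-F with Tnt1-R) are run, that every S-pool receives both a primary and a nested reaction, and that follow-up screening in P-pools, M-pools and individual lines is excluded. The function name and parameters are hypothetical.

```python
from math import ceil


def s_pool_reactions(n_genes: int, n_s_pools: int, genes_per_reaction: int = 1,
                     primer_combinations: int = 2, pcr_rounds: int = 2) -> int:
    """Count PCR reactions needed to screen all S-pools (S-pool stage only).

    genes_per_reaction=1 models the standard single-gene method;
    genes_per_reaction=3 models the multiple-gene method.
    """
    batches = ceil(n_genes / genes_per_reaction)
    return batches * n_s_pools * primer_combinations * pcr_rounds


single = s_pool_reactions(n_genes=3, n_s_pools=42)
multiple = s_pool_reactions(n_genes=3, n_s_pools=42, genes_per_reaction=3)
print(single, multiple)  # 504 168
```

Under these assumptions, multiplexing three genes uses one-third of the S-pool template consumed by three separate single-gene screens.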
In summary, the multiple-gene screening strategy described here allowed us to screen Tnt1 insertion lines for three genes in 42 S-pools with 500 or fewer PCR reactions and achieve a success rate of 87% (Table 1).

Table 1 Effect of Medicago truncatula Tnt1 insertion population size on screening success rates

Method     S-pools screened   Total genes   Genes w/Tnt1   Genes w/o Tnt1   Success rate (%)
Single     14–18              81            69             10               85.2
Multiple   14–18              280           226            54               80.8
Multiple   24–28              155           129            26               83.2
Multiple   36                 235           203            32               86.8
Multiple   42                 89            78             11               87.6
Total                         840           705            133              84

Analysis of Tnt1 insertion frequency with the screened genes

So far > 840 genes, requested from 75 different laboratories in 15 different countries, have been screened for Tnt1 insertions. Of these genes, 655 have a gene identification number (ID) and are distributed across the eight pseudo-chromosomes (Mt3.5; Fig. S1). Of the remaining 185 genes, some have genomic sequences without gene IDs, whereas others have only cDNA or EST sequences. These genes were not mapped to the pseudo-chromosomes.

Tnt1 insertion preference

The smallest gene we have screened for an insertion thus far was 486 bp, whereas the largest gene was 25.4 kb. We analyzed the correlation between gene size and screening success rate. All 840 screened genes were classified into five groups based on their sizes: < 1.0, 1.0–4.0, 4.0–7.0, 7.0–10.0 and > 10.0 kb. The screening success rates for the five groups were 77.9%, 86.3%, 80.8%, 79.1% and 54.8%, respectively (Table 2). The success rates for the first four groups (gene sizes up to 10.0 kb) were similar. Surprisingly, when the gene size was > 10.0 kb the screening success rate dropped dramatically, even though large genes were divided into multiple 3–4 kb fragments to cover all the gene regions. For small genes, Tnt1 insertions were sometimes identified in the promoter or 3′ UTR regions.

In order to determine whether the transcription rate can influence the insertion frequency, we searched the Medicago Gene Expression Atlas and compared the overall expression level of four sets of ten genes in the following groups: (1) c. 3.0 kb with eight or more insertions, (2) > 7.0 kb with eight or more insertions, (3) c. 3.0 kb without insertions, and (4) > 7.0 kb without insertions. No apparent correlations between the expression levels and the insertion frequency were observed among the four groups.

Miyao et al. (2003) showed that the rice retrotransposon Tos17 inserts more frequently in kinases and resistance genes.
The 840 genes screened for Tnt1 insertions fall into different functional categories: nucleotide binding, protein binding, transcription factors, metabolic enzymes, transporters, kinases, hypothetical proteins, etc. The screening success rates for individual functional categories were analyzed. The success rates ranged from the highest (88.6%) in kinase genes to the lowest (82.7%) in nucleotide-binding genes (Table S2). All were close to the overall success rate of 84%, indicating that Tnt1 insertion may not favor any specific category of gene function. However, a larger number of insertions needs to be analyzed to confirm this.

Table 2 Effect of gene size on the screening success rate of Medicago truncatula Tnt1 mutants

Gene size      Total genes   Genes w/Tnt1   Genes w/o Tnt1   Success rate, % (P-value*)
< 1.0 kb       68            53             15               77.9 (0.81)
1.0–4.0 kb     526           454            76               86.3 (0.66)
4.0–7.0 kb     172           139            35               80.8 (0.88)
7.0–10.0 kb    43            34             9                79.1 (0.93)
> 10.0 kb      31            17             8                54.8 (0.22)

*χ² test with Yates' continuity correction.
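The footnote to Table 2 names a χ² test with Yates' continuity correction, but the text does not state which counts were compared. The following is therefore only a generic sketch of that test for a 2 × 2 table, using the Python standard library. The function names are hypothetical, the counts in the usage example are made up, and the sketch is not expected to reproduce the P-values in Table 2.

```python
import math


def yates_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Chi-square statistic with Yates' continuity correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    # Yates' correction subtracts n/2 from |ad - bc|, never going below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


def chi2_p_value_1df(statistic: float) -> float:
    """Upper-tail P-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Made-up counts: rows are two groups of genes, columns are
    # genes with and without an insertion.
    stat = yates_chi2_2x2(10, 20, 20, 10)
    print(f"chi2 = {stat:.2f}, P = {chi2_p_value_1df(stat):.3f}")
```

For one degree of freedom the P-value has a closed form through the complementary error function, so no statistics package is needed for a 2 × 2 comparison.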

Multiple Tnt1 insertions

Multiple Tnt1 insertion alleles for a single gene were frequently detected during the screening. To examine whether screening for mutants in larger genes yields more Tnt1 insertion alleles, we analyzed the correlation between gene size and the number of alleles obtained (Table 3). More than 50% of the screened genes with sizes of 1.0–4.0 kb had three or more Tnt1 insertion alleles recovered. The rates were 40.2% and 37% for genes smaller than 1.0 kb and genes between 4.0 and 10.0 kb, respectively, indicating that the highest rate of multiple insertions occurred in medium-sized genes. The rate was only 7% for genes larger than 10 kb. Taken together, the screening results for 840 genes indicate that Tnt1 may prefer to insert in medium-sized genes and does not favor large genes.

The difference in the number of Tnt1 insertions among the 840 genes is noteworthy. No Tnt1 insertion alleles were identified for 142 genes, whereas eight or more insertion alleles were found for 68 genes, some of which had as many as 20 Tnt1 insertion alleles. Analysis of genes with no insertion alleles and genes with multiple insertion alleles indicates that both classes are distributed across all eight pseudochromosomes and fall into different functional categories. To examine whether Tnt1 insertion cold spots or hot spots exist, we mapped the genes with no insertions and the genes with multiple insertion alleles, represented by green and red dots, respectively, to the eight pseudochromosomes; no clusters of green or red dots were observed (Fig. S1). The size of most genes with multiple insertion alleles ranged between 1.0 and 7.0 kb, and none was larger than 9.0 kb. However, the screening results do not exclude the possibility that some large genes have eight or more Tnt1 insertion alleles in the whole insertion population, because the screening was initially carried out in 24 S-pools with primer combinations of GSP-F and Tnt1-F or Tnt1-R. Furthermore, if multiple insertion alleles were detected for a specific gene, no further screening with other primer combinations was carried out in the remaining S-pools. Therefore, our default screening procedure does not necessarily maximize the recovery of Tnt1 insertions in a gene across the entire Tnt1 insertion population.

Among the genes with multiple Tnt1 insertion alleles, 10–20 alleles were obtained for some genes. To check whether the Tnt1 insertions were clustered in certain region(s) of individual genes, we examined the Tnt1 insertion sites of the alleles in 10 individual genes. The results indicated that the Tnt1 insertions were randomly distributed within all 10 genes examined (data not shown).

Table 3 Effect of gene size on the insertion number in Medicago truncatula Tnt1 mutants

                                        < 1.0 kb   1.0–4.0 kb   4.0–7.0 kb   7.0–10.0 kb   > 10.0 kb
Genes w/more than 3 Tnt1 insertions     22         235          55           15            1
Total genes screened                    56         468          147          40            15
Multiple insertion percentage           40         50.2         37.4         37.5          7

A case study

Screening for insertion mutants for genes-of-interest

To compare the two reverse genetics approaches and demonstrate the utility of our Tnt1 mutant collection for gene function analyses, we selected two homeotic MADS-box genes, MtPISTILLATA (MtPI) and MtAGAMOUS (MtAG). MtPI is a class B MADS-box gene required for specifying petal and stamen identity during flower development. Mutation of PISTILLATA leads to loss of petal and stamen identity in M. truncatula and Arabidopsis (Goto & Meyerowitz, 1994; Kramer et al., 1998; Benlloch et al., 2009). Arabidopsis AGAMOUS (AtAG) is a class C MADS-box gene controlling stamen and carpel development (Gomez-Mena et al., 2005). Although MtAG has not been characterized before, an EST (accession number: CX539597) for MtAG, whose deduced protein shares 70% sequence identity with Arabidopsis AGAMOUS, was obtained from the M. truncatula EST database.

We first searched the FST database for Tnt1 insertion lines in MtPI and MtAG. We found no true hit for MtAG but found one insertion line (mtpi-3, NF5477) for MtPI. In mtpi-3, Tnt1 is inserted in the first intron of MtPI, and homozygous plants did not exhibit visible phenotypes. We then designed two sets of primers for each gene (MtPI-F and MtPI-F1, MtPI-R and MtPI-R1, MtAG-F and MtAG-F1, MtAG-gR and MtAG-gR1) based on the MtPI genomic sequence (2181 bp) and the MtAG EST sequence (1150 bp). Combined with Tnt1 forward and reverse primers, PCR-based reverse screening was carried out in the Tnt1 insertion population as described above. We identified two insertion lines for MtPI, NF13337 (mtpi-1) and NF5318 (mtpi-2), and two for MtAG, NF10148 (mtag-1) and NF13380 (mtag-2). Sequence alignment revealed that NF5318 has a Tnt1 insertion at the 1935th bp (in the 5th exon) and NF13337 has an insertion at the 1112th bp (in the 4th exon) from the ATG of MtPI (Fig. 5a). NF10148 and NF13380 harbor Tnt1 insertions at the 272nd bp and the 260th bp from the ATG of MtAG, respectively (Fig. 6a).

Molecular characterization of the insertion mutants

For the MtPI insertion lines, four individual plants (plant numbers 1, 2, 3 and 5) from NF13337 and three plants (plant numbers 2, 3 and 5) from NF5318 were found to contain the Tnt1 insertion (Fig. 5b). Of these, two plants (plant numbers 2 and 5) from NF13337 and one (plant number 3) from NF5318 were homozygous for the Tnt1 insertion (Fig. 5c). Gene expression analyses indicated that no MtPI transcript was detected in the homozygous Tnt1 mutant plant of NF5318 (plant number 3), whereas a slightly smaller MtPI transcript was detected in homozygous mutant plants of NF13337 (plant numbers 2 and 5; Fig. 5d). Sequence analysis of the smaller transcript revealed a 45-bp deletion between the 4th and 5th exons of MtPI in the mtpi-1 mutant. The 45-bp fragment, which is located between the Tnt1 insertion and the 4th intron, was spliced out together with the inserted Tnt1 and the 4th intron, leaving a smaller MtPI transcript. Because the deletion is in frame, the deduced protein of the truncated MtPI transcript is 15 aa shorter than the wild-type MtPI protein. For the MtAG Tnt1 insertion lines NF10148 and NF13380, the progenies in which the Tnt1 insertion was detected by PCR are shown in Fig. 6(b), and the identification of homozygous mutant plants is shown in Fig. 6(c). The MtAG transcript was undetectable in homozygous Tnt1 insertion plants of both alleles (Fig. 6d).

Fig. 5 Molecular characterization of Medicago truncatula Tnt1 insertion mutants mtpi-1 and mtpi-2. (a) Three Tnt1 insertion locations in the MtPI genomic sequence. Filled boxes represent exons and solid line segments represent introns. (b) PCR amplification with primers MtPI-F or MtPI-F2 and Tnt1-R. Results show that plants 1, 2, 3 and 5 harbor the Tnt1 insertion in NF13337 progenies and plants 2, 3 and 5 have the Tnt1 insertion in NF5318 progenies. (c) PCR results with primers MtPI-F and MtPI-R showing that plants 2 and 5 from NF13337 and plant 3 from NF5318 are homozygous for Tnt1 insertions. (d) RT-PCR results with primers MtPI-F and MtPI-R show one slightly smaller PCR product in plants 2 and 5 of NF13337 and no PI expression in plant 3 of NF5318 compared to the wild-type R108 plant and Tnt1 heterozygous plant 1.

Fig. 6 Molecular characterization of Medicago truncatula Tnt1 insertion mutants mtag-1 and mtag-2. (a) Two Tnt1 insertion sites in the MtAG cDNA sequence. The filled box represents MtAG cDNA. (b) PCR results with primers MtAG-F and Tnt1-R, showing that nine plants were detected with the Tnt1 insertion in NF10148 and six progenies were detected with the Tnt1 insertion in NF13380. (c) PCR results with primers MtAG-F and MtAG-R2 showing that plants 2, 7 and 13 were homozygous for the Tnt1 insertion in NF10148 and plants 2 and 4 were homozygous for the Tnt1 insertion in NF13380. (d) RT-PCR results with primers MtAG-F and MtAG-R3 show no detectable MtAG expression in Tnt1 insertion homozygous plants 2 and 13 of NF10148 and plants 2 and 4 of NF13380.
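The genotype calls described for Figs 5(b,c) and 6(b,c) combine two PCRs per plant: one with a gene-specific primer and a Tnt1 primer, and one with the two gene-specific primers. The sketch below encodes that logic under an assumption the text does not state explicitly, namely that the gene-specific primer pair yields a product only from the wild-type allele because the inserted Tnt1 makes the mutant allele too long to amplify. The function names are hypothetical, and the band patterns in the usage example are those implied by the text for NF13337. A second helper checks the frame arithmetic behind the 45-bp deletion.

```python
from typing import Optional


def call_genotype(insertion_band: bool, wild_type_band: bool) -> str:
    """Call a plant's genotype from two PCR results.

    insertion_band: product from gene-specific primer + Tnt1 primer.
    wild_type_band: product from the two gene-specific primers
                    (assumed to amplify only the wild-type allele).
    """
    if insertion_band and wild_type_band:
        return "heterozygous"
    if insertion_band:
        return "homozygous insertion"
    if wild_type_band:
        return "wild type"
    return "no call (both PCRs failed)"


def residues_removed(deletion_bp: int) -> Optional[int]:
    """Amino acids lost to an in-frame deletion, or None if it shifts the frame."""
    return deletion_bp // 3 if deletion_bp % 3 == 0 else None


if __name__ == "__main__":
    # Band patterns implied by the text for NF13337 progeny:
    # plants 1, 2, 3 and 5 carry the insertion; plants 2 and 5 are homozygous.
    nf13337 = {1: (True, True), 2: (True, False), 3: (True, True), 5: (True, False)}
    for plant, (ins, wt) in nf13337.items():
        print(f"NF13337 plant {plant}: {call_genotype(ins, wt)}")
    print(f"45-bp deletion removes {residues_removed(45)} aa")
```

Returning an explicit "no call" when neither PCR gives a product keeps a failed reaction from being read as a genotype.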

Morphological characterization of the insertion mutants

Both alleles of the mtpi and mtag Tnt1 insertion mutants displayed defects in flower development (Fig. 7). Other siblings from the segregating populations of both alleles were either wild-type or heterozygous at the PI or AG locus but carried all the other unlinked insertions (homozygous or heterozygous) of the parent lines; these siblings showed no flower phenotypes. The mutant mtpi-1, which has a smaller MtPI transcript, exhibited a weak flower phenotype. Petals in the second whorl were partially transformed into yellow-greenish sepal-like structures, whereas stamens looked normal and sepals and carpels remained normal (Fig. 7b,c). By contrast, mutant mtpi-2, in which no MtPI transcript was detected, showed severe flower defects with loss of floral organ identity in the second and third whorls. Petals were completely transformed into five green sepal-like structures. Stamens in the third whorl developed into carpel-like structures with ovules inside (Fig. 7d,e). No seeds were produced in either mutant.

The MtAG mutants mtag-1 and mtag-2 exhibited similar but weak phenotypes in carpel development. Sepals and petals in the first and second whorls were the same as in wild-type flowers, and stamens in the third whorl looked normal. Only carpels in the inner whorl were defective, with two stigmas, unfused carpel edges and exposed ovules (Fig. 7g–j). Flowers were generally sterile, although occasionally a few seeds were produced in mtag-2. This preliminary phenotypic analysis of these two flower mutants is in agreement with the roles of these genes in controlling floral identity in other plant species and demonstrates the usefulness of our Tnt1 mutant collection.

Fig. 7 Morphological characterization of Medicago truncatula Tnt1 insertion mutants mtpi and mtag. (a) Wild-type flower; (b, c) mtpi-1 flowers, showing five sepal-like petals (b) and normal-looking stamens and carpel (c). (d, e) mtpi-2 flowers, showing five sepal-like petals (d) and five carpel-like stamens (e). (f–h) mtag-1 flowers, showing normal stamens (f) and defective carpels with two stigmas (g) and unfused carpels (h); (i, j) mtag-2 flowers, showing normal stamens and defective carpels with two stigmas (i), unfused carpels and exposed ovules (j); (k) wild-type sepals and carpel. Bars, 1 mm.

Discussion

In this paper we demonstrated two successful and efficient reverse genetics approaches for utilizing the M. truncatula mutant population: PCR-based reverse screening and web-based FST database searching. Combined, these approaches enable us to quickly find Tnt1 insertions in genes of interest. The PCR-based screening approach allows researchers to focus on a small number of genes of interest. The modified multiple-gene screening method enables us to screen three genes in 42 super pools, which include 21 000 individual Tnt1 insertion lines, with c. 500 PCR reactions in 3–4 wk. The success rate of the PCR screening method is c. 86%, which is comparable to the success rate in Arabidopsis (Krysan et al., 1999; Stepanova & Alonso, 2006). Furthermore, multiple insertion alleles, which are randomly distributed across the gene coding regions, can be obtained for c. 50% of medium-sized genes. The availability of multiple insertion alleles makes it possible to choose insertion lines with suitable insertion sites (e.g. in exons). It also allows loss-of-function phenotypes of genes of interest to be confirmed quickly, without time-consuming transformation experiments for gene complementation. The existence of insertion bias, in which insertions prefer genic regions and the insertion density is closely correlated with the gene density, has been reported by several authors (Alonso et al., 2003; Miyao et al., 2007; Tadege et al., 2008). It is assumed that if insertions are randomly distributed in gene-rich regions, gene size should be positively correlated with insertion frequency; that is, the larger the gene, the higher the probability of insertion (Krysan et al., 1999; Li et al., 2006; Tadege et al., 2008). Our screening data indicated that Tnt1 prefers medium-sized genes, as evidenced by the high insertion frequency and the larger number of insertion alleles recovered in medium-sized genes.
For large-sized genes (> 4.0 kb), however, a negative correlation was observed between gene size and Tnt1 insertion frequency. How gene size affects the insertion probability is largely unknown. It has been proposed that transcriptional activity is a determinant of target site preference, whereas gene length is negatively correlated with the overall expression level. Highly expressed genes are small in size due to selective forces that favor minimizing the energy and time spent in transcription (Castillo-Davis et al., 2002; Camiolo et al., 2009). On this view, large genes have low transcriptional activity and a low insertion rate. Similar observations were reported for T-DNA insertions in Arabidopsis and Tos17 insertions in rice (Alonso et al., 2003; Miyao et al., 2003). However, one study on the analysis of T-DNA insertion site distribution patterns in Arabidopsis indicated that lack of detectable transcriptional activity is one of the reasons for no insertions in genes (Li et al., 2006). In our analysis, no correlation between expression level and insertion frequency was observed. Therefore, the effect of transcriptional activity on insertion frequency remains elusive.

In summary, the Tnt1-tagged M. truncatula mutant population generated at the Noble Foundation is an invaluable resource for the research community. This mutant collection has already been widely used and will be as irreplaceable for researchers working on legume biology as the SALK T-DNA collections are for Arabidopsis. The currently available 21 700 lines represent over 525 000 independently distributed Tnt1 insertions in the M. truncatula genome. From the efficiency of reverse screening and the genome saturation probability, the current 21 700 lines may have insertions in c. 85–90% of the M. truncatula genome. However, the FST database currently hosts only c. 44 000 FSTs from c. 3400 lines. Until a high-throughput FST sequencing approach is developed and the number of FSTs is dramatically increased, direct BLAST searching of the FST database has only a small chance of finding an insertion line in genes of interest. The reverse screening of DNA pools will therefore still play a significant role in reverse genetics of M. truncatula. When large-scale high-throughput FST sequencing is performed on the mutant collections, it will be possible to map most (if not all) Tnt1 insertions on the Medicago genome. Finding an insertional mutant in one’s favourite gene of interest will then be a matter of checking the website (http://medicago-mutant.noble.org/mutant/) and ordering the seeds from a system analogous to that for the SALK T-DNA lines. The development of this FST database, corresponding to the majority of the Tnt1 insertions in the population, will thus represent a very valuable tool for the research community.
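The expectation discussed above, that under random insertion a larger gene should be hit more often, can be made concrete with the usual saturation formula for uniformly random insertions, P = 1 − (1 − x/G)^n, where x is the target size, G the genome size and n the number of insertions. The sketch below is an illustration only: the genome size of 500 Mb is an assumed round figure that is not taken from this paper, the function name is hypothetical, and the model ignores the preference for genic regions described in the text. The numbers of lines and insertions are those given above.

```python
import math

N_LINES = 21_700           # Tnt1 insertion lines (from the text)
N_INSERTIONS = 525_000     # independent Tnt1 insertions (from the text)
ASSUMED_GENOME_MB = 500.0  # assumption for illustration, not from the paper


def prob_at_least_one_hit(gene_kb: float, n_insertions: int, genome_mb: float) -> float:
    """P(at least one insertion in a target) if insertions were uniformly random."""
    fraction = (gene_kb * 1e3) / (genome_mb * 1e6)  # share of the genome occupied by the target
    # (1 - fraction) ** n computed via log1p for numerical stability.
    return 1.0 - math.exp(n_insertions * math.log1p(-fraction))


if __name__ == "__main__":
    print(f"{N_INSERTIONS / N_LINES:.1f} insertions per line on average")
    for kb in (1.0, 3.0, 10.0):
        p = prob_at_least_one_hit(kb, N_INSERTIONS, ASSUMED_GENOME_MB)
        print(f"{kb:>4.1f} kb gene: P(hit) = {p:.3f}")
```

Under this uniform model the probability rises monotonically with gene size, which is the expectation that the screening results for genes > 10.0 kb do not match.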

Acknowledgements

We would like to thank Kuihua Zhang for plant care and seed curation, Shulan Zhang for assistance with flanking sequence recovery and Janie Gallaway for organizing forward screening. This work was supported by the Samuel Roberts Noble Foundation and, in part, by NSF plant genome grants (DBI 0703285 and IOS 1127155) and by the European Union (EU FP6-GLIP project FOOD-CT-2004-506223).

References

Alonso JM, Stepanova AN, Leisse TJ, Kim CJ, Chen H, Shinn P, Stevenson DK, Zimmerman J, Barajas P, Cheuk R et al. 2003. Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science 301: 653–657.
An GH, Lee S, Kim SH, Kim SR. 2005. Molecular genetics using T-DNA in rice. Plant and Cell Physiology 46: 14–22.
Benedito VA, Torres-Jerez I, Murray JD, Andriankaja A, Allen S, Kakar K, Wandrey M, Verdier J, Zuber H, Ott T et al. 2008. A gene expression atlas of the model legume Medicago truncatula. Plant Journal 55: 504–513.
Benlloch R, d’Erfurth I, Ferrandiz C, Cosson V, Pio Beltran J, Antonio Canas L, Kondorosi A, Madueno F, Ratet P. 2006. Isolation of mtpim proves Tnt1 a useful reverse genetics tool in Medicago truncatula and uncovers new aspects of AP1-like functions in legumes. Plant Physiology 142: 972–983.
Benlloch R, Roque E, Ferrandiz C, Cosson V, Caballero T, Penmetsa RV, Pio Beltran J, Antonio Canas L, Ratet P, Madueno F. 2009. Analysis of B function in legumes: PISTILLATA proteins do not require the PI motif for floral organ development in Medicago truncatula. Plant Journal 60: 102–111.
Bennetzen JL. 2000. Transposable element contributions to plant gene and genome evolution. Plant Molecular Biology 42: 251–269.
Bevan M, Walsh S. 2005. The Arabidopsis genome: a foundation for plant research. Genome Research 15: 1632–1642.
Brocard L, d’Erfurth I, Kondorosi A, Ratet P. 2008. Reverse genetic approaches in Medicago truncatula. In: Kirti PB, ed. Handbook of new technologies for genetic improvement of legumes. Boca Raton, FL, USA: CRC Press, 353–369.
Brocard L, Schultze M, Kondorosi A, Ratet P. 2006. T-DNA mutagenesis in the model plant Medicago truncatula: is it efficient enough for legume molecular genetics? CAB Reviews: Perspectives in Agriculture, Veterinary Science, Nutrition and Natural Resources 1: 7.
Camiolo S, Rau D, Porceddu A. 2009. Mutational biases and selective forces shaping the structure of Arabidopsis genes. PLoS ONE 4: e6356.
Castillo-Davis CI, Mekhedov SL, Hartl DL, Koonin EV, Kondrashov FA. 2002. Selection for short introns in highly expressed genes. Nature Genetics 31: 415–418.
Cheng X, Wen J, Tadege M, Ratet P, Mysore KS. 2011. Reverse genetics in Medicago truncatula using Tnt1 insertion mutants. Methods in Molecular Biology (Clifton, NJ) 678: 179–190.
Courtial B, Feuerbach F, Eberhard S, Rohmer L, Chiapello H, Camilleri C, Lucas H. 2001. Tnt1 transposition events are induced by in vitro transformation of Arabidopsis thaliana, and transposed copies integrate into genes. MGG Molecular Genetics and Genomics 265: 32–42.
d’Erfurth I, Cosson V, Eschstruth A, Lucas H, Kondorosi A, Ratet P. 2003. Efficient transposition of the Tnt1 tobacco retrotransposon in the model legume Medicago truncatula. Plant Journal 34: 95–106.
Fladung M, Deutsch F, Honicka H, Kumar S. 2004. T-DNA and transposon tagging in aspen. Plant Biology 6: 5–11.
Fu F-F, Ye R, Xu S-P, Xue H-W. 2009. Studies on rice seed quality through analysis of a large-scale T-DNA insertion population. Cell Research 19: 380–391.
Gomez-Mena C, de Folter S, Costa MMR, Angenent GC, Sablowski R. 2005. Transcriptional program controlled by the floral homeotic gene AGAMOUS during early organogenesis. Development 132: 429–438.
Goto K, Meyerowitz EM. 1994. Function and regulation of the Arabidopsis floral homeotic gene Pistillata. Genes and Development 8: 1548–1560.
Grandbastien MA, Spielmann A, Caboche M. 1989. Tnt1, a mobile retroviral-like transposable element of tobacco isolated by plant cell genetics. Nature 337: 376–380.
Kramer EM, Dorit RL, Irish VF. 1998. Molecular evolution of genes controlling petal and stamen development: duplication and divergence within the APETALA3 and PISTILLATA MADS-box gene lineages. Genetics 149: 765–783.
Krysan PJ, Young JC, Sussman MR. 1999. T-DNA as an insertional mutagen in Arabidopsis. Plant Cell 11: 2283–2290.
Larmande P, Gay C, Lorieux M, Perin C, Bouniol M, Droc G, Sallaud C, Perez P, Barnola I, Biderre-Petit C et al. 2008. Oryza Tag Line, a phenotypic mutant database for the Genoplante rice insertion line library. Nucleic Acids Research 36: D1022–D1027.
Li Y, Rosso MG, Uelker B, Weisshaar B. 2006. Analysis of T-DNA insertion site distribution patterns in Arabidopsis thaliana reveals special features of genes without insertions. Genomics 87: 645–652.
Liu Y-G, Chen Y, Zhang Q. 2005. Amplification of genomic sequences flanking T-DNA insertions by thermal asymmetric interlaced polymerase chain reaction. Methods in Molecular Biology (Clifton, NJ) 286: 341–348.
Liu Y-G, Mitsukawa N, Oosumi T, Whittier RF. 1995. Efficient isolation and mapping of Arabidopsis thaliana T-DNA insert junctions by thermal asymmetric interlaced PCR. Plant Journal 8: 457–463.
Martienssen RA. 1998. Functional genomics: probing plant gene function and expression with transposons. Proceedings of the National Academy of Sciences, USA 95: 2021–2026.
Miyao A, Iwasaki Y, Kitano H, Itoh J-I, Maekawa M, Murata K, Yatou O, Nagato Y, Hirochika H. 2007. A large-scale collection of phenotypic data describing an insertional mutant population to facilitate functional analysis of rice genes. Plant Molecular Biology 63: 625–635.
Miyao A, Tanaka K, Murata K, Sawaki H, Takeda S, Abe K, Shinozuka Y, Onosato K, Hirochika H. 2003. Target site specificity of the Tos17 retrotransposon shows a preference for insertion within genes and against insertion in retrotransposon-rich regions of the genome. Plant Cell 15: 1771–1780.
Okamoto H, Hirochika H. 2000. Efficient insertion mutagenesis of Arabidopsis by tissue culture-induced activation of the tobacco retrotransposon Tto1. Plant Journal 23: 291–304.
Parinov S, Sundaresan V. 2000. Functional genomics in Arabidopsis: large-scale insertional mutagenesis complements the genome sequencing project. Current Opinion in Biotechnology 11: 157–161.
Scholte M, d’Erfurth I, Rippa S, Mondy S, Cosson V, Durand P, Breda C, Trinh H, Rodriguez-Llorente I, Kondorosi E et al. 2002. T-DNA tagging in the model legume Medicago truncatula allows efficient gene discovery. Molecular Breeding 10: 203–215.
Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A et al. 2002. The generic genome browser: a building block for a model organism system database. Genome Research 12: 1599–1610.
Stepanova AN, Alonso JM. 2006. PCR-based screening for insertional mutants. Methods in Molecular Biology (Clifton, NJ) 323: 163–172.
Sussman MR, Amasino RM, Young JC, Krysan PJ, Austin-Phillips S. 2000. The Arabidopsis knockout facility at the University of Wisconsin-Madison. Plant Physiology (Rockville) 124: 1465–1467.
Tadege M, Wen J, He J, Tu H, Kwak Y, Eschstruth A, Cayrel A, Endre G, Zhao PX, Chabaud M et al. 2008. Large-scale insertional mutagenesis using the Tnt1 retrotransposon in the model legume Medicago truncatula. Plant Journal 54: 335–347.
Wang H, Chen J, Wen J, Tadege M, Li G, Liu Y, Mysore KS, Ratet P, Chen R. 2008. Control of compound leaf development by FLORICAULA/LEAFY ortholog SINGLE LEAFLET1 in Medicago truncatula. Plant Physiology 146: 1759–1772.
Weigel D, Ahn JH, Blazquez MA, Borevitz JO, Christensen SK, Fankhauser C, Ferrandiz C, Kardailsky I, Malancharuvil EJ, Neff MM et al. 2000. Activation tagging in Arabidopsis. Plant Physiology (Rockville) 122: 1003–1013.
Wu JL, Wu CJ, Lei CL, Baraoidan M, Bordeos A, Madamba MRS, Ramos-Pamplona M, Mauleon R, Portugal A, Ulat VJ et al. 2005. Chemical- and irradiation-induced mutants of indica rice IR64 for forward and reverse genetics. Plant Molecular Biology 59: 85–97.
Yamazaki M, Tsugawa H, Miyao A, Yano M, Wu J, Yamamoto S, Matsumoto T, Sasaki T, Hirochika H. 2001. The rice retrotransposon Tos17 prefers low-copy-number sequences as integration targets. MGG Molecular Genetics and Genomics 265: 336–344.
Young ND, Debelle F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H et al. 2011. The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480: 520–524.

Supporting Information

Additional supporting information may be found in the online version of this article.

Fig. S1 Distribution of total screened genes across eight pseudochromosomes.
Table S1 Primer sequences used in the Tnt1 screening and gene characterization.
Table S2 Effect of gene function classification on Tnt1 screening success rate.

Please note: Wiley Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.