Mol Breeding DOI 10.1007/s11032-013-9921-1

Development and integration of EST–SSR markers into an established linkage map in switchgrass

Linglong Liu • Yalin Huang • Somashekhar Punnuri • Tim Samuels • Yanqi Wu • Ramamurthy Mahalingam

Received: 26 March 2013 / Accepted: 3 July 2013
© Springer Science+Business Media Dordrecht 2013

Abstract Switchgrass (Panicum virgatum L.) is a model cellulosic biofuel crop in the United States. Simple sequence repeat (SSR) markers are valuable resources for genetic mapping and molecular breeding. A large number of switchgrass expressed sequence tags (ESTs) recently became available through our sequencing project. The objectives of this study were to develop new SSR markers from the switchgrass EST sequences and to integrate them into an existing linkage map. More than 750 unique primer pairs (PPs) were designed from 243,600 EST contigs and tested for PCR amplification, resulting in 538 PPs that produced amplicons of the expected sizes. Of the effective PPs, 481 that amplified informative bands in NL94 were screened for polymorphisms in a panel consisting of NL94 and seven of its first-generation selfed (S1) progeny. This led to the selection of 117 polymorphic EST–SSRs to genotype a mapping population of 139 S1 individuals of NL94. Of 83 markers with clearly scorable alleles in the mapping population, 79 were integrated into a published linkage map, three were linked as accessory loci, and one was unlinked. The newly identified EST–SSR loci were distributed across 17 of the 18 linkage groups, and 27 (32.5 %) exhibited distorted segregation. Integrating the EST–SSRs reduced the average marker interval from 4.2 to 3.7 cM and the number of gaps (each >15 cM) from 23 to 10. Developing new EST–SSRs and constructing a higher density linkage map will facilitate quantitative trait locus mapping and provide a firm footing for marker-assisted breeding in switchgrass.

Linglong Liu and Yalin Huang have contributed equally to this work.

Electronic supplementary material The online version of this article (doi:10.1007/s11032-013-9921-1) contains supplementary material, which is available to authorized users.

L. Liu · Y. Huang · S. Punnuri · T. Samuels · Y. Wu (✉)
Department of Plant and Soil Sciences, Oklahoma State University, Stillwater, OK 74078, USA
e-mail: [email protected]

L. Liu
National Key Laboratory for Crop Genetics and Germplasm Enhancement, Jiangsu Plant Gene Engineering Research Center, Nanjing Agricultural University, Nanjing 210095, China

Y. Huang
Nanjing Forest Police College, Nanjing 210046, China

S. Punnuri
Agricultural Research Station, Fort Valley State University, Fort Valley, GA 31030, USA

R. Mahalingam (✉)
Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, OK 74078, USA
e-mail: [email protected]


Keywords Simple sequence repeat (SSR) · Expressed sequence tag (EST) · Linkage map · Switchgrass

Introduction

Switchgrass is a model cellulosic biofuel crop in the United States (Wright and Turhollow 2010). The reasons for selecting switchgrass for bioenergy feedstock production are its high biomass yield potential, low input requirements and wide adaptability to marginal land (Bouton 2007; Schmer et al. 2008). Switchgrass grown on marginal cropland produces more than five times as much renewable energy as is used in the production process, while the estimated greenhouse gas emission for producing ethanol from switchgrass is 94 % lower than that from gasoline (Schmer et al. 2008). It is expected that progress in genetics and genomics will contribute to increasing biomass yields in switchgrass through improved breeding protocols, which in turn will enhance its sustainability and biofuel economics (Tobias et al. 2008; Schmer et al. 2008).

Simple sequence repeat (SSR) markers, also known as microsatellites, are highly variable in the number of repeats at a specific locus and are distributed throughout eukaryotic genomes (Katti et al. 2001). SSR markers have been widely adopted in crop species for construction of linkage maps, quantitative trait locus (QTL) mapping, gene cloning, and marker-assisted selection (Korzun 2002). SSRs can be developed from genomic sequences, expressed sequence tags, or both. In switchgrass, the development of both types of SSR markers has been demonstrated. Two independent research groups have developed SSRs by sequencing SSR-enriched genomic libraries (Okada et al. 2010; Wang et al. 2011). Tobias and colleagues reported the primer sequences of several hundred SSR markers derived from switchgrass EST sequencing projects (Tobias et al. 2006; Tobias et al. 2008). Sharma et al. (2012) constructed two bacterial artificial chromosome (BAC) libraries (169 genome equivalents) of switchgrass, and a total of 50,206 SSRs were identified from 330,297 BAC-end sequences in silico.
Recently, we generated 980,000 reads using Roche 454 GS-FLX Titanium technology from four cDNA libraries. Our in silico analysis indicated that nearly 9 % of these ESTs contained SSRs (Wang et al. 2012). Although sequences with SSRs are abundant in the literature and public databases, most of them are not well characterized. Because the switchgrass genome is large, highly heterozygous and polyploid, more molecular markers, including SSRs, are needed to develop high-density linkage maps that facilitate marker-assisted breeding in switchgrass (Costich et al. 2010; Liu and Wu 2013).

Genetic maps are fundamental for marker-assisted breeding programs, QTL mapping and cloning genes for economically important traits. The first publicly available switchgrass genetic map comprised 102 restriction fragment length polymorphism markers (Missaoui et al. 2005). Recently, higher density male and female maps were constructed, with lengths of 1,645 and 1,376 cM, respectively (Okada et al. 2010). In contrast to the full-sib populations used in these earlier reports, we generated a mapping population by selfing a self-compatible genotype (Liu et al. 2012). Using this S1 population we reported a linkage map that spans 2,085 cM with an average marker interval of 4.2 cM (Liu et al. 2012). The latter map consists of 18 linkage groups (LGs) forming nine homeologous pairs, consistent with the A and B genomes proposed for allotetraploid switchgrass (Young et al. 2010; Liu et al. 2012; Triplett et al. 2012). However, there are 23 gaps, each >15 cM and collectively spanning 339.6 cM (Liu et al. 2012). These results strongly suggest that more molecular markers are needed to develop a linkage map with better coverage of the entire switchgrass genome. Accordingly, the objectives of this study were to: (1) experimentally test and validate SSR markers identified by informatics approaches from our recently published EST sequences; and (2) integrate newly developed and segregating EST–SSR markers into a pre-existing genetic map.

Materials and methods

Plant materials and mapping population

Two switchgrass genotypes, NL94 and SL93, were used to test the amplification of novel EST–SSRs. Both genotypes are tetraploid, based on the analysis of a marker gene in the chloroplast genome (Liu et al. 2012). NL94 was selfed to produce an S1 population (Liu et al. 2012). A total of 139 random S1 individuals constituted the mapping population used in this study.

DNA isolation and PCR amplification

Genomic DNA of SL93, NL94 and its progeny was isolated from healthy leaf tissue using the CTAB method of Doyle and Doyle (1990), with minor modifications as described by Liu and Wu (2012). DNA quality and concentration were measured using 1 % agarose gels and an ND1000 spectrophotometer (NanoDrop Products, Wilmington, DE, USA). The concentration of the DNA samples was adjusted to 10 ng/µL for PCR amplification. SSR markers were amplified on Applied Biosystems 2720 thermal cyclers (Applied Biosystems, CA, USA), using the PCR conditions described by Wang et al. (2011). Initially, an AdvanCE™ FS96 System (Advanced Analytical Technologies Inc., Ames, IA, USA) was used to test effective amplification of SSR primer pairs (PPs). Subsequently, the primers that amplified scorable bands were selected and used for genotyping the mapping population. In the genotyping stage, PCR products were separated using a 6.5 % KBPlus polyacrylamide gel solution on a LI-COR 4300 DNA Analyzer (LI-COR Biosciences, Lincoln, NE, USA).

Primer design, screening and segregated population genotyping

SSR locator V.1 software (da Maia et al. 2008) was used to identify SSRs in 243,600 contigs assembled from 980,000 EST reads (Wang et al. 2012) and to design primers. All software parameters were as described previously (Wang et al. 2011). The designed PPs were cross-checked for redundancy with published SSR PPs (Tobias et al. 2005, 2006, 2008; Okada et al. 2010; Wang et al. 2011). The NCBI BLAST program bl2seq was used to align and compare primer sequences (Tatusova and Madden 1999). The parameters for removing redundancy were as described previously (Wang et al. 2011). All SSR primers were synthesized at Integrated DNA Technologies, Inc. (Coralville, IA, USA).
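
The SSR search described above can be illustrated with a minimal sketch. It is not the SSR locator algorithm, and the minimum repeat counts are hypothetical placeholders (the settings actually used are those of Wang et al. 2011, which are not reproduced here). The function name `find_ssrs` and the example contig are likewise invented for illustration. The sketch detects only perfect di-, tri- and tetranucleotide repeats.

```python
import re

# Hypothetical minimum repeat counts per motif length; these are NOT the
# SSR locator settings used in the study (those are given in Wang et al. 2011).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_n in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_n - 1) exact copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. AA or AGAG).
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


# Made-up contig containing an (AG)8 and a (CAG)5 repeat.
contig = "TTGC" + "AG" * 8 + "CCT" + "CAG" * 5 + "TT"
print(find_ssrs(contig))  # [(4, 'AG', 8), (23, 'CAG', 5)]
```

A production tool would also handle imperfect and compound repeats and would check that enough flanking sequence remains for primer design, which is presumably why 69 of the loci reported in the Results could not be given primers.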
An M13 tag sequence (CACGACGTTGTAAAACGAC) was added to the 5′ end of each forward primer. The EST–SSR PPs were named with the prefix ‘PVE’, where ‘PV’ stands for Panicum virgatum and ‘E’ for EST. The PPs were initially screened to verify whether the amplicon sizes on the gels were in the range of expected fragment sizes. In the next stage, the verified amplicons were analyzed for polymorphisms between NL94 and SL93. The effective PPs with more than two amplified fragments in NL94 were selected and further evaluated for segregation using a small screening panel that contained the parent (NL94) and seven of its randomly selected S1 progeny. Subsequently, the polymorphic SSRs from this screening were used to genotype the entire population.

Marker scoring, linkage analysis and map construction

The fragment sizes of the amplified SSR markers were determined using Saga Generation 2 software, version 3.3 (LI-COR Biosciences). To identify each SSR polymorphism in the mapping population, three progeny genotypes, “hh”, “hk”, and “kk”, were manually scored as described (Liu et al. 2012). For each locus, the χ² test module in JoinMap® 4.0 (Van Ooijen 2006) was used to test goodness-of-fit between observed and expected Mendelian ratios. Linkage analysis was performed using JoinMap® 4.0. Before calculation of marker orders and distances, the entire genotyping data set from this analysis was combined with that of a previously published study (Liu et al. 2012). t tests were carried out using Microsoft® Excel 2007 to test for differences in the length of LGs between the present and the previous map (Liu et al. 2012). To compare map intervals between this map and the previous map constructed by Liu et al. (2012), a parameter called the difference ratio was computed using the formula described by Wu and Huang (2007).
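The goodness-of-fit test mentioned above can be sketched as follows. This is a generic χ² test against a 1:2:1 ratio with two degrees of freedom, not JoinMap's implementation, and the genotype counts in the example are hypothetical. The asterisk thresholds follow the Fig. 3 legend; the function names are invented for illustration.

```python
import math


def chi_square_1_2_1(hh, hk, kk):
    """Chi-square goodness-of-fit of observed genotype counts to a 1:2:1 ratio."""
    n = hh + hk + kk
    expected = (n / 4, n / 2, n / 4)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((hh, hk, kk), expected))
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


def distortion_stars(p_value):
    """Map a P value to the asterisk code used in the Fig. 3 legend."""
    if p_value < 0.0001:
        return "***"
    if p_value < 0.001:
        return "**"
    if p_value < 0.01:
        return "*"
    return ""


# Hypothetical counts for two loci scored in 139 S1 progeny.
for counts in [(35, 69, 35), (60, 60, 19)]:
    chi2, p = chi_square_1_2_1(*counts)
    print(counts, round(chi2, 2), distortion_stars(p) or "ns")
```

The closed-form P value holds only for two degrees of freedom, which is the case for three genotype classes; other marker types would need a general χ² distribution function.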

Results

EST–SSR marker development and validation

Based on our previous transcriptome analysis of four different switchgrass tissues using 454/FLX (Wang et al. 2012), 243,600 contigs were searched for SSR loci. A total of 1,240 SSR loci were identified and 1,171 PPs were designed, while the remaining 69 SSR sequences were too short for primer design. After comparing the sequences of the PPs against 2,265 published SSR PPs (Tobias et al. 2005, 2006, 2008; Okada et al. 2010; Wang et al. 2011), 413 were identified as redundant, and the remaining 758 PPs were selected for primer synthesis. Of these, 538 PPs gave scorable amplifications of expected sizes in NL94 or SL93, whereas 220 were non-amplifiable. Among the 538 amplifiable PPs, 437 (81.3 %) demonstrated effective amplification in both NL94 and SL93, and the other 101 (18.7 %) only in NL94 or SL93. SSR marker names for the 538 PPs, along with the SSR motif, primer sequences, melting temperature (Tm), and expected fragment sizes, are given in Electronic Supplementary Material (ESM) 1. In total, 481 PPs amplified the expected bands in NL94.
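The primer-pair counts in this paragraph follow from one another by subtraction. The short check below reproduces that bookkeeping from the numbers stated in the text; the variable names are illustrative only.

```python
# Counts stated in the text.
ssr_loci = 1240         # SSR loci identified in the contigs
too_short = 69          # loci with sequence too short for primer design
redundant = 413         # PPs redundant with published SSR PPs
non_amplifiable = 220   # synthesized PPs that gave no scorable product

designed = ssr_loci - too_short                # PPs designed
synthesized = designed - redundant             # PPs sent for synthesis
amplifiable = synthesized - non_amplifiable    # PPs with scorable amplicons
success_rate = 100 * amplifiable / synthesized

# The both-genotype and single-genotype classes add up to the amplifiable set.
assert 437 + 101 == amplifiable

print(designed, synthesized, amplifiable, round(success_rate, 1))
# 1171 758 538 71.0
```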


Fig. 2 Frequency distribution of repeat sizes in polymorphic EST–SSR markers used for linkage mapping

Polymorphisms of SSR markers

Markers for genotyping the selfed population were identified by testing the 481 PPs for polymorphic amplification in a panel containing NL94 and seven selfed progeny (Fig. 1). One hundred and seventeen (24.3 %) PPs were polymorphic. Later, 34 (29 %) of the 117 PPs were discarded due to unclear amplification in the population. Finally, genotyping data for 83 EST–SSRs were collected from the entire population. Among these 83 EST–SSRs, trinucleotide repeats showed the highest polymorphism rates (38; 46 %), followed by dinucleotide repeats (27; 33 %) (Fig. 2). Each of the 83 EST–SSRs amplified two fragments within the expected size ranges, indicating two alleles for each locus.

Fig. 1 Screening SSR primer pairs for polymorphism and reliability on a panel of NL94 (first lane from left side per panel) and seven selfed progeny. Polymorphic and segregating markers are indicated in boxes. The last lane is DNA ladder 50–350 size standards (LI-COR Biosciences). Panel labels: PVE1417/8, PVE1431/2, PVE1425/6, PVE1429/30, PVE1411/2 and PVE1423/4

Linkage mapping

The mapping analysis was carried out by combining the current data for 83 EST–SSR markers with data for 506 genomic SSR loci from the same mapping population published previously (Liu et al. 2012). Two markers, sww-1969 and PVCA-347/348, resided at one end of each of two LGs and showed a recombination frequency of 0.38, and thus were used to facilitate the merger of these two separate LGs into LG 6b.

Fig. 3 Integrated linkage map of switchgrass constructed by joining data from genomic SSRs (Liu et al. 2012) and EST–SSRs from this study. Linkage group nomenclature follows Liu et al. (2012). Newly added EST–SSR markers are shown in bold. The gray segment in LG 6b indicates linkage identified with the maximum linkage function in JoinMap® 4.0 at a LOD value of 2.0. The accessory loci are listed next to mapped loci. The linkage groups are combined into homeologous pairs based on multi-allele markers. A scale of the map distances in Kosambi map units is displayed on the extreme left. EST–SSRs showing segregation distortion are indicated by asterisks. The number of asterisks corresponds to the statistical significance level of the distortion from the expected Mendelian 1:2:1 ratio. *P < 0.01; **P < 0.001; ***P < 0.0001

The integrated map was composed of 18 LGs with 578 markers, 536 on the framework map and 42 as accessory loci; a further 11 markers remained unmapped. The total length of the linkage map was 1,956.5 cM, and there were no significant differences in length between the present and previous corresponding LGs (Liu et al. 2012) (t test, P > 0.05). The average distance between two adjacent markers on the current map was 3.7 cM. The length of the LGs varied from 3.8 (LG 7b) to 170.0 cM (LG 9b). Ten gaps, each ≥15.0 cM, remained and collectively spanned 229.3 cM (Fig. 3). Of the 83 new EST–SSR markers, 79 were placed on the framework map, and three (PVE-893/894, 301/302, and 725/726) were assigned as accessory loci on the map. Only one marker (PVE-1131/1132) was unmapped (Fig. 3). The new EST–SSR loci were distributed across all LGs except LG 7b, with two (on LG 1b and 5a) to 11 (on LG 2b) per LG (Fig. 3). There was no change in LG 7b (difference ratio = 0), which was the shortest LG identified in our previous experiment (Liu et al. 2012). Because the shared intervals being compared were small for LG 6a and 8a, the difference ratios of these two LGs (0.17 and 0.19, respectively) were larger than those of the other LGs (Table 1).
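The formula of Wu and Huang (2007) is not reproduced in this article. The column layout of Table 1 suggests that the difference ratio is the summed absolute difference of the shared intervals divided by the combined length of those intervals on both maps. The sketch below adopts that reading as an assumption and checks it against the values transcribed from Table 1; the function name is illustrative.

```python
# (LG, Ai, Bi, summed |Aik - Bik|, reported ratio), transcribed from Table 1.
TABLE_1 = [
    ("1a", 88.8, 99.3, 10.5, 0.06), ("1b", 91.3, 106.8, 15.5, 0.08),
    ("2a", 113.7, 98.5, 15.2, 0.07), ("2b", 82.3, 69.4, 12.9, 0.09),
    ("3a", 116.4, 132.2, 15.8, 0.06), ("3b", 95.3, 80.0, 15.3, 0.09),
    ("4a", 31.5, 30.3, 1.2, 0.02), ("4b", 74.2, 68.4, 5.8, 0.04),
    ("5a", 79.3, 104.2, 24.9, 0.14), ("5b", 135.2, 115.2, 20.0, 0.08),
    ("6a", 44.4, 62.5, 18.1, 0.17), ("6b", 130.9, 120.3, 10.6, 0.04),
    ("7a", 122.4, 124.3, 1.9, 0.01), ("7b", 3.8, 3.8, 0.0, 0.0),
    ("8a", 35.4, 52.5, 17.1, 0.19), ("8b", 86.7, 85.6, 1.1, 0.01),
    ("9a", 40.0, 50.0, 10.0, 0.11), ("9b", 140.2, 127.4, 12.8, 0.05),
]


def difference_ratio(sum_abs_diff, a_len, b_len):
    """Assumed form: summed absolute interval differences / (Ai + Bi)."""
    return sum_abs_diff / (a_len + b_len)


# Every per-LG value in Table 1 is reproduced under this assumption.
for lg, a, b, d, reported in TABLE_1:
    assert round(difference_ratio(d, a, b), 2) == reported, lg

total_a = sum(row[1] for row in TABLE_1)
total_b = sum(row[2] for row in TABLE_1)
total_d = sum(row[3] for row in TABLE_1)
overall = difference_ratio(total_d, total_a, total_b)
print(round(overall, 2))  # 0.07, matching the Total row
```

Because the denominator is the combined length of the shared intervals, an LG with only a short shared region can show a large ratio from a modest absolute difference, as seen for LG 6a and 8a.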


Distribution of segregation distorted loci

Of the 83 new SSR loci, 27 (32.5 %) showed segregation that deviated from the 1:2:1 ratio expected for homozygous “hh”, heterozygous “hk” and homozygous “kk” genotypes. Most of the segregation-distorted loci (SDL) from EST–SSRs (24/27; 88.9 %) were placed on the final map, two (2/83; 2.4 %) were assigned as accessory loci, and only one (1/83) was unmapped. There was no evidence of clustering of the SDL within the map except for the two SDL clusters observed in our previous experiment (Liu et al. 2012) (Fig. 3).
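The marker counts and percentages reported across the Results can be cross-checked with a few lines of arithmetic. All inputs below are taken from the text. The one assumption is the denominator for the average marker interval, which the text does not state: dividing the total map length by the number of framework markers reproduces the reported 3.7 cM.

```python
# Counts stated in the text.
genomic_loci, est_loci = 506, 83
framework, accessory, unmapped = 536, 42, 11
assert genomic_loci + est_loci == framework + accessory + unmapped == 589

est_framework, est_accessory, est_unmapped = 79, 3, 1
assert est_framework + est_accessory + est_unmapped == est_loci

sdl_total, sdl_framework, sdl_accessory, sdl_unmapped = 27, 24, 2, 1
assert sdl_framework + sdl_accessory + sdl_unmapped == sdl_total


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


sdl_share = pct(sdl_total, est_loci)               # 32.5
sdl_on_framework = pct(sdl_framework, sdl_total)   # 88.9
sdl_as_accessory = pct(sdl_accessory, est_loci)    # 2.4

# Assumption: average interval = total map length / framework markers.
avg_interval = round(1956.5 / framework, 1)        # 3.7 cM
# Gaps on the previous map (23) versus the integrated map (10).
gap_reduction = round(100 * (23 - 10) / 23)        # 57 %

print(sdl_share, sdl_on_framework, sdl_as_accessory, avg_interval, gap_reduction)
```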

Table 1 The difference ratio of genetic distance in the common marker intervals between map “A”, developed in this study, and map “B”, previously constructed by Liu et al. (2012)

Linkage group ID  No. of shared marker intervals  Ai (cM)ᵃ  Bi (cM)ᵇ  Σ|Aik − Bik|ᶜ  (Ai + Bi) (cM)  Difference ratioᵈ
1a                16                              88.8      99.3      10.5           188.1           0.06
1b                29                              91.3      106.8     15.5           198.1           0.08
2a                18                              113.7     98.5      15.2           212.2           0.07
2b                20                              82.3      69.4      12.9           151.7           0.09
3a                21                              116.4     132.2     15.8           248.6           0.06
3b                16                              95.3      80        15.3           175.3           0.09
4a                9                               31.5      30.3      1.2            61.8            0.02
4b                9                               74.2      68.4      5.8            142.6           0.04
5a                10                              79.3      104.2     24.9           183.5           0.14
5b                25                              135.2     115.2     20             250.4           0.08
6a                4                               44.4      62.5      18.1           106.9           0.17
6b                10                              130.9     120.3     10.6           251.2           0.04
7a                9                               122.4     124.3     1.9            246.7           0.01
7b                3                               3.8       3.8       0              7.6             0
8a                5                               35.4      52.5      17.1           87.9            0.19
8b                11                              86.7      85.6      1.1            172.3           0.01
9a                16                              40        50        10             90              0.11
9b                35                              140.2     127.4     12.8           267.6           0.05
Total             266                             1,511.8   1,530.7   208.7          3,042.5         0.07

ᵃ Ai is the length (cM) of the shared marker intervals on the ith chromosome of map A
ᵇ Bi is the length (cM) of the shared marker intervals on the ith chromosome of map B
ᶜ The sum of the absolute values of the length difference of each kth shared marker interval on the ith chromosome between maps A and B
ᵈ Difference ratio was computed based on the formula described by Wu and Huang (2007)

Discussion

Developing resources and tools for switchgrass genetics and genomics will potentially accelerate the development of improved cultivars for sustainable biofuel production. Progress in switchgrass breeding and cultivar development is hindered by the large genome size, high heterozygosity and variable ploidy levels of the species (Costich et al. 2010). Switchgrass variety development programs could greatly benefit from the availability of informative markers. Developing a high-density linkage map for switchgrass is helpful for marker-assisted breeding. The use of 454/FLX technology for switchgrass transcriptome sequencing by our group led to the generation of a large number of EST sequences (Wang et al. 2012). Based on this recent resource, we identified and validated EST-derived microsatellite markers, genotyped these EST–SSRs in an S1 population, and integrated them into an existing linkage map that we had developed (Liu et al. 2012).

EST–SSR marker development

The success rate of SSR marker development varies widely between research groups. Tobias et al. (2005) identified 96 EST–SSR PPs and found that 46 (47.9 %) of them amplified well. In a later study the same group designed 1,780 PPs from 61,585 ESTs and reported that 830 (46.6 %) amplified reliably (Tobias et al. 2008). In the present study, 538 (71 %) of 758 PPs gave reliable amplifications. Using the same software (SSR locator) and criteria, Wang et al. (2011) reported a success rate of 73.7 % for PPs designed from genomic sequences, which is similar to the result of this study. The differences between this and previous studies may be attributable to the programs or software used for identifying SSRs and, more importantly, to the criteria set for identifying SSRs and the quality of the EST sequences. This study suggests that the identification of new EST–SSR markers can be accelerated by exploiting the next-generation sequencing-based transcriptome analysis studies reported for switchgrass (Palmer et al. 2012; Meyer et al. 2012; Wang et al. 2012; Zhang et al. 2013). The current investigation showed that trinucleotide repeats had higher polymorphism rates than other types of SSR repeats (Fig. 2). This is consistent with previous studies showing that trimers are the most abundant class of repeats in grasses (Kantety et al. 2002), including switchgrass (Tobias et al. 2005, 2008; Sharma et al. 2012; Wang et al. 2012).

Genetic linkage map

A high-density linkage map is important for estimating gene effects, for integrating genetic and physical maps, and for precise marker-assisted breeding. In this study, 82 new EST–SSRs were integrated into the existing linkage map (Liu et al. 2012). However, this addition did not cause any significant increase in the overall map length, which was corroborated by the low overall difference ratio between the two maps (0.07). The addition of the 82 EST–SSRs improved map resolution in two respects. First, it lowered the average marker interval by 0.5 cM. Second, the number of gaps spanning more than 15 cM was reduced by 57 %, from 23 gaps on the previous map to 10 gaps on the integrated map. Maps with fewer gaps, such as this one, are better suited to QTL analysis and marker-assisted applications (Beckmann and Soller 1983). Good colinearity of marker order was observed on most of the LGs, except for eight markers in a middle segment of LG 2a, six markers in a distal region of LG 2b, five markers in a distal part of LG 6a, and three markers in a middle region of LG 8a. Order rearrangements have been reported to occur frequently when more markers are added to existing linkage maps (Crooijmans et al. 1994; Harushima et al. 1998).

Segregation distortion

Segregation distortion is a common phenomenon in plants (Lyttle 1991) and has been reported in switchgrass (Missaoui et al. 2005; Okada et al. 2010; Liu and Wu 2012; Liu et al. 2012). In this study, 27 (32.5 %) of the 83 new EST–SSR loci deviated from the expected 1:2:1 ratio, which is the theoretical ratio of codominant markers for disomic inheritance in tetraploid switchgrass (Okada et al. 2010; Liu and Wu 2012). This level of distortion for the EST–SSRs is higher than that observed for genomic SSRs (18.7 %; Liu et al. 2012), even though the same mapping population was used. Segregation distortion is usually caused by linkage between molecular markers and distorting factors, such as sterile alleles and lethal genes (Lyttle 1991). We speculate that, since ESTs are derived from gene coding regions, SSRs developed from ESTs have a higher probability of being linked to such distorting factors than genomic SSRs, which are distributed preferentially in noncoding regions (Metzgar et al. 2000; Wang et al. 2011). In addition, two SDL clusters were identified on LG 5b and 9b in our previous study (Liu et al. 2012), and the EST–SSRs mapped in the vicinity of these genomic SDL markers showed significant distortion, suggesting that these genomic regions harbor genes that contribute to this phenomenon. Knowledge of segregation distortion regions (SDRs) is useful for developing switchgrass inbred lines, and the use of inbred lines to explore heterosis was proposed previously (Casler 2012; Liu and Wu 2012). Using homozygosity information on molecular markers surrounding SDRs together with plant vigor observed in the field, a breeder could more easily select desirable genotypes and speed up the development of stable inbred lines (Anhalt et al. 2008).

In conclusion, the present study developed 538 effective EST–SSR markers from a public database. Of these, 82 polymorphic markers were integrated into an established linkage map and considerably improved the marker density. The new markers and this higher density linkage map will be a valuable asset for QTL mapping and marker-assisted breeding in switchgrass. All genotyping data, including the 83 new EST–SSRs (highlighted) and 506 previously mapped loci for the NL94 S1 population, are provided in ESM 2.

Acknowledgments The authors thank the following funding sources and individuals for sponsoring and helping in this research: National Science Foundation award EPS 0814361; Oklahoma Agricultural Experiment Station; Yiwen Xiang, Yunwen Wang and Pu Feng.

References

Anhalt UC, Heslop-Harrison PJ, Byrne S, Guillard A, Barth S (2008) Segregation distortion in Lolium: evidence for genetic effects. Theor Appl Genet 117:297–306
Beckmann JS, Soller M (1983) Restriction fragment length polymorphisms in genetic improvement: methodologies, mapping and costs. Theor Appl Genet 67:35–43
Bouton JH (2007) Molecular breeding of switchgrass for use as a biofuel crop. Curr Opin Genet Dev 17:553–558
Casler M (2012) Switchgrass breeding, genetics, and genomic. In: Monti A (ed) Switchgrass: a valuable biomass crop for energy. Springer, London, pp 29–53
Costich DE, Friebe B, Sheehan MJ, Casler MD, Buckler ES (2010) Genome-size variation in switchgrass (Panicum virgatum): flow cytometry and cytology reveal rampant aneuploidy. Plant Gen 3:130–141
Crooijmans RP, van Kampen AJ, van der Poel JJ, Groenen MA (1994) New microsatellite markers on the linkage map of the chicken genome. J Hered 85:410–413
da Maia LC, Palmieri DA, de Souza VQ, Kopp MM, de Carvalho FI, Costa de Oliveira A (2008) SSR locator: tool for simple sequence repeat discovery integrated with primer design and PCR simulation. Int J Plant Genomics. doi:10.1155/2008/412696
Doyle JJ, Doyle JK (1990) Isolation of plant DNA from fresh tissue. Focus 12:13–15
Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, Yamamoto T, Lin SY, Antonio BA, Parco A, Kajiya H, Huang N, Yamamoto K, Nagamura Y, Kurata N, Khush GS, Sasaki T (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148:479–494
Kantety RV, La Rota M, Matthews DE, Sorrells ME (2002) Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Mol Biol 48:501–510
Katti MV, Ranjekar PK, Gupta VS (2001) Differential distribution of simple sequence repeats in eukaryotic genome sequences. Mol Biol Evol 18:1161–1167
Korzun V (2002) Use of molecular markers in cereal breeding. Cell Mol Biol Lett 7:811–820
Liu LL, Wu YQ (2012) Identification of a selfing compatible genotype and mode of inheritance in switchgrass. Bioenergy Res 5:662–668
Liu LL, Wu YQ (2013) Molecular genetics and molecular breeding for bioenergy traits. In: Luo H, Wu YQ (eds) Compendium of bioenergy plants: switchgrass. CRC Press, FL (in press)
Liu L, Wu Y, Wang Y, Samuels T (2012) A high-density simple sequence repeat-based genetic linkage map of switchgrass. G3-Genes Genomes Genet 2:357–370
Lyttle TW (1991) Segregation distorters. Annu Rev Genet 25:511–557
Metzgar D, Bytof J, Wills C (2000) Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res 10:72–80
Meyer E, Logan TL, Juenger TE (2012) Transcriptome analysis and gene expression atlas for Panicum hallii var. filipes, a diploid model for biofuel research. Plant J 70:879–890
Missaoui AM, Paterson AH, Bouton JH (2005) Investigation of genomic organization in switchgrass (Panicum virgatum L.) using DNA markers. Theor Appl Genet 110:1372–1383
Okada M, Lanzatella C, Saha MC, Bouton J, Wu R, Tobias CM (2010) Complete switchgrass genetic maps reveal subgenome collinearity, preferential pairing and multilocus interactions. Genetics 185:745–760
Palmer NA, Saathoff AJ, Kim J, Benson A, Tobias CM, Twigg P, Vogel KP, Madhavan S, Sarath G (2012) Next generation sequencing of crown and rhizome transcriptome from an upland, tetraploid switchgrass. Bioenergy Res 5:649–661
Schmer MR, Vogel KP, Mitchell RB, Perrin RK (2008) Net energy of cellulosic ethanol from switchgrass. Proc Natl Acad Sci USA 105:464–469
Sharma MK, Sharma R, Cao P, Jenkins J, Bartley LE, Qualls M, Grimwood J, Schmutz J, Rokhsar D, Ronald PC (2012) A genome-wide survey of switchgrass genome structure and organization. PLoS One 7:e33892
Tatusova TA, Madden TL (1999) BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. FEMS Microbiol Lett 174:247–250
Tobias CM, Twigg P, Hayden DM, Vogel KP, Michell RM, Lazo GR, Chow EK, Sarath G (2005) Analysis of expressed sequence tags and the identification of associated short tandem repeats in switchgrass. Theor Appl Genet 111:956–964
Tobias CM, Hayden DM, Twigg P, Gautam S (2006) Genic microsatellite markers derived from EST sequences of switchgrass (Panicum virgatum L.). Mol Ecol Notes 1:185–187
Tobias CM, Gautam S, Twigg P, Lindquist E, Pangilinan J, Penning BW, Barry K, McCann MC, Carpita NC, Lazo GR (2008) Comparative genomics in switchgrass using 61,585 high-quality expressed sequence tags. Plant Genome 1:111–124
Triplett JK, Wang Y, Zhong J, Kellogg EA (2012) Five nuclear loci resolve the polyploid history of switchgrass (Panicum virgatum L.) and relatives. PLoS One 7:e38702
Van Ooijen JW (2006) JoinMap 4, software for the calculation of genetic linkage maps in experimental populations. Kyazma BV, Wageningen
Wang YW, Samuels TD, Wu YQ (2011) Development of 1,030 genomic SSR markers in switchgrass. Theor Appl Genet 122:677–686
Wang Y, Zeng X, Iyer NJ, Bryant DW, Mockler TC, Mahalingam R (2012) Exploring the switchgrass transcriptome using second-generation sequencing technology. PLoS One 7:e34225
Wright L, Turhollow A (2010) Switchgrass selection as a “model” bioenergy crop: a history of the process. Biomass Bioenergy 34:851–868
Wu Y, Huang Y (2007) An SSR genetic map of Sorghum bicolor (L.) Moench and its comparison to a published genetic map. Genome 50:84–89
Young HA, Hernlem BJ, Anderton AL, Lanzatella-Craig C, Tobias CM (2010) Dihaploid stocks for switchgrass isolated by a screening approach. BioEnergy Res 3:305–313
Zhang JY, Lee YC, Torres-Jerez I, Wang M, Yin Y, Chou WC, He J, Shen H, Srivastava AC, Pennacchio C, Lindquist E, Grimwood J, Schmutz J, Xu Y, Sharma M, Sharma R, Bartley LE, Ronald PC, Saha MC, Dixon RA, Tang Y, Udvardi MK (2013) Development of an integrated transcript sequence database and a gene expression atlas for gene discovery and analysis in switchgrass (Panicum virgatum L.). Plant J 74:160–173. doi:10.1111/tpj.12104
