Genetics of Soybean Agronomic Traits: I. Comparison of Three Related Recombinant Inbred Populations J. H. Orf,* K. Chase, T. Jarvik, L. M. Mansur, P. B. Cregan, F. R. Adler, and K. G. Lark ABSTRACT
are homozygous and their genotypes can be reproduced by different research groups for repeated experiments in a variety of environments (Mather and Jinks, 1977). Availability of multiple RI populations makes it possible to extend such analyses to identify different segregating loci, and to compare the effects of these loci in populations with different genetic backgrounds. We have studied the effects of genetic background in three RI populations of soybean. In a previous publication, we presented a genetic analysis of agronomic traits using a RI population of segregants derived from a cross between the soybean cultivars Minsoy and Noir 1 (Mansur et al., 1996). Because this RI population was large (240 segregants), we also were able to demonstrate the existence of epistatic effects within that population (Lark et al., 1995; Chase et al., 1997). We now have analyzed two related large RI populations derived from crosses between Minsoy and ‘Archer’ and between Noir 1 and Archer. Whereas Minsoy and Noir 1 were plant introductions, Archer is an elite cultivar, the product of an extensive series of crosses (Cianzio et al., 1991). It presents a genetic background in which we have the opportunity to evaluate the distortion of phenotypes that result from the introduction of QTL alleles from genotypes not used in the northern U.S. breeding program. The objective of this study was to compare the three RI populations derived from Archer, Minsoy, and Noir 1 for linkage of QTLs and molecular markers. From such a comparison we can (i) analyze the role of genetic background in the expression of quantitative trait phenotypes, (ii) identify new QTLs in segregants from different genetic backgrounds, and (iii) confirm many of the previously identified QTLs by identifying them in other genetic backgrounds.
Molecular markers provide a rapid approach to breeding for desired agronomic traits. To use them, it is necessary to determine the linkage between quantitative trait loci (QTLs) and such markers. The objective of this research was to determine such linkage in recombinant inbred (RI) soybean [Glycine max (L.) Merrill] populations. To do this, RI soybean segregants were characterized for molecular genetic markers and traits measured in several different environments. QTLs then were identified by interval mapping. Agronomic traits were measured and compared in large (about 240 segregants) RI populations derived from crosses between the cultivars Minsoy and Noir 1 (MN population), Minsoy and Archer (MA population), and Noir 1 and Archer (NA population). The MA and NA populations were grown together as two replications in each of four environments. Measurements from the MN population were reported previously and were taken from three replications grown in four environments. Traits measured were plant height, lodging, date of flowering, reproductive period, maturity, yield, seed weight, seed oil, seed protein, leaf length, and leaf width. Additional traits were derived from these primary measurements. Each of the three RI populations was also characterized by a large (>400) number of molecular genetic markers including RFLP (restriction fragment length polymorphism) and SSR (simple sequence repeat polymorphisms). QTLs were identified for all of the primary and derived traits at a significance level of LOD ≥ 3 on 17 of the 20 linkage groups and tended to be clustered on three. QTLs with major effects (R² > 10%) were identified for all traits, and for many, these explained more than half of the heritable variation. Comparison of QTLs between the three RI populations established that for the majority of the traits, only two alleles could be identified. In only a few instances could a third allele be detected.
Many of the significant QTLs identified in one population were confirmed in another. However, an almost equal number were found in only one population, suggesting that a dependence on the genetic background for expression (epistasis) was common.
With the advent of molecular markers, it has become possible to analyze in detail the genetic basis for complex, polygenic traits (Dudley, 1993). Because quantitative traits are strongly influenced by environmental factors, deducing their genetic basis usually requires comparing mean trait values in different environments. For this purpose, recombinant inbred (RI) populations are particularly useful because segregants
MATERIALS AND METHODS
Germplasm and Design of Field Experiments
The genetic materials consisted of three recombinant inbred (RI) populations derived from crosses between Minsoy (PI 27890) and Noir 1 (PI 290136) (MN population), Archer (PI 546487) and Minsoy (MA population), and Noir 1 and Archer (NA population). They contained 240 MN RI lines, 233 MA RI lines and 240 NA RI lines, respectively. These F7-derived recombinant inbred lines were developed by single seed descent at Iowa State University (Mansur et al., 1993b), and genetically characterized at the University of Utah by molecular markers. Traits were measured in each of four environments with three replications in each location for the MN population and two replications each for the MA and NA populations. The MA and NA populations shared locations, whereas the MN
J.H. Orf, Dep. of Agronomy and Plant Genetics, Univ. of Minnesota, St. Paul, MN 55108; K. Chase, T. Jarvik, F.R. Adler and K.G. Lark, Dep. of Biology, Univ. of Utah, Salt Lake City, UT 84112; L.M. Mansur, Facultad de Agronomia, Universidad Catholica de Valpariso, Casilla 4-D, Quillota, Chile; and P.B. Cregan, USDA-ARS Soybean and Alfalfa Research Lab., Beltsville, MD 20705. Contribution of Minnesota Agric. Exp. Stn. Paper no. 981130074, Minnesota Scientific Journal Series. Work supported in part by grants from the United Soybean Board, Minnesota Soybean Research and Promotion Council and a grant to KGL from the National Institutes of Health Grant GM 42337. Received 15 July 1998. *Corresponding author (orfxx001@maroon.tc.umn.edu).
Abbreviations: cM, centimorgan; QTLs, quantitative trait loci; RI, recombinant inbred; RFLP, restriction fragment length polymorphism; SSR, simple sequence repeat.
Published in Crop Sci. 39:1642–1651 (1999).
ORF ET AL.: SOYBEAN GENETICS COMPARING THREE RI POPULATIONS
data had been collected in earlier experiments (Mansur and Orf, 1995). Some of the trait measurements were restricted to two or three environments (e.g., leaf length, leaf width, or leaf area), and traits affected by maturity measured in one environment, Minnesota 1995, were limited by an early frost. The data presented are averages of field data across environments and have not focused on the effects of individual environments. The parental cultivars were included in all experiments. Evaluation of the MN population has already been described (Mansur et al., 1996). Parents and RI lines from the MA and NA populations were evaluated at Rosemount, MN, (45°N 93°W) in 1995, Waseca, MN, (44°N 93°W) in 1996 and 1997 and in 1995-96 at Los Andes, Chile, (34°S 70°W). A randomized complete block design with two replications in each of the four environments was used. The Chile site was irrigated. Two plots of each parent were included in each block. The following traits were evaluated: flowering date as days from planting to flowering (R1); maturity as days from planting to maturity (R8); plant height in cm (HT); lodging scored from 1 = erect to 5 = prostrate (LDG); seed yield as kg/ha (YD); seed weight as mg/seed (SW); seed protein (PRO) and oil content (OIL) on a 130 g kg⁻¹ moisture basis as g/kg; and leaf length (LL) as well as leaf width (LW) in cm (measured on only 10 plants per replication). The leaf measurements were made on the center leaflet of a fully expanded trifoliate leaf four nodes from the top of the plant. A detailed description of each of these traits is presented in Mansur et al. (1993a).
In addition to these primary traits, several derived traits were analyzed: reproductive period (RP = R8 − R1); leaf area (AR = LL × LW); seed number as yield divided by seed weight (YD/SW); lodging per unit height (HDL), defined as its reciprocal, height divided by lodging (the ability of tall plants to remain upright); and yield per unit of height (YD/HT, for which high values are obtained from short plants with high yields). All derived traits were calculated using the primary trait data.
Creation of the Composite Genetic Map
Markers and their assays, as well as genotyping methods, have been reported previously by Mansur et al. (1996) and Cregan et al. (1999). Methods of mapping with Mapmaker (Lincoln and Lander, 1993) as well as mapping and Mapmaker parameters not described below are described in these publications. In order to assign each marker to a genome position that would be consistent in all three crosses, a composite genetic map was prepared (Fig. 1). Data from all three populations were combined into a single Mapmaker file containing 713 individuals, and mapping was carried out using all available markers. The marker order established in the MN population was used as a starting point. MA or NA markers not segregating in the MN cross were inserted one by one with the ‘try’ command, and the resulting map was rippled with a window size of three. All of the markers used to analyze the data presented below are shown in Fig. 1. However, for clarity of presentation, many additional markers have been omitted. These include 170 markers that were mapped only in the MN population and the following markers that segregate in two or three of the RI populations.
1. Mapped in MN and MA: L050_6; A975_2; BLT053_1; Satt564; Satt364; Satt073; A110_2; L194_1; Satt422; Satt170; L148_1; K365_1; G214_13; Satt145; K265_1; Satt127; G214_20; T010_1; R051_2; G214_15; Satt001; Sat_020; C009_1; BLT043_1; Satt179; Satt311; G214_9;
BLT007_1; Satt561; Satt513; Sat_089; Satt459; Satt020; G214_4; G214_5; A234_1; Scaa003; L154_1; K258_1; R092_3; Satt220; K644_2; C009_2; Satt605; and Satt126.
2. Mapped in MN and NA: Satt285; R189_1; L156_1; G214_14; A235_1; G214_1; Satt200; L063_1; Sct_188; BLT053_4; L204_3; BLT002_1; Sat_104; Satt440; Sat_132; A053_1; L194_2; A510_5; G214_11; R028_2; BLT049_5; G214_3; B162_1; G214_18; A235_2; Sat_110; A141_1; Satt386; B124_2; Satt296; Sct_186; Satt183; L144_1; BLT053_2; Satt316; A315_1; Satt237; Sat_125; Fr2; Sat_071; A459_1; and BLT057_2.
3. Mapped in MN, MA and NA: Satt343; Satt203; Satt543; Satt158; Satt419; Satt188; Satt147; and Sat_069.
All of these markers can be located on the published MN map (Cregan et al., 1999) or at the website www.larklab.4biz.net (verified 13 May 1999).
Statistical Methods and QTL Mapping
We used the “pearsn” routine from Numerical Recipes in C (Press et al., 1996) to calculate the correlation coefficients between traits. We estimated confidence intervals for the correlation coefficients by bootstrapping the populations (Press et al., 1996). For detecting QTLs, we used the simple interval mapping feature of the computer package PLABQTL (Utz and Melchinger, 1996). This program employs a multiple regression approach to interval mapping with marker order and distances determined by Mapmaker (Lincoln and Lander, 1993). Permutation tests established empirical LOD thresholds (Churchill and Doerge, 1994). The PLABQTL program carried out a simultaneous fit of all QTL detected above a threshold of 2.5 (Table 3). We used analysis of variance to partition the total variance in each population into genetic, environmental and genotype × environmental components (SAS, 1988). Heritability estimates (Hanson et al., 1956) were computed as $h^2 = \sigma_G / (\sigma_G + \sigma_e/r + \sigma_{ge}/r)$, where $h^2$ = heritability, $\sigma_G$ = genotypic variance, $\sigma_e$ = error variance, $\sigma_{ge}$ = genotype × environmental variance, and $r$ = the number of replications for the trait.
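As a minimal sketch of the heritability expression given above, the following Python function evaluates it from the three variance components. The function name, argument names, and numerical values are illustrative assumptions and are not taken from the study; the variance components themselves would come from the analysis of variance.

```python
def heritability(var_g: float, var_e: float, var_ge: float, r: int) -> float:
    """Heritability as written in the text:
    h2 = var_g / (var_g + var_e / r + var_ge / r),
    where var_g, var_e and var_ge are the genotypic, error and
    genotype x environment variances and r is the number of replications."""
    if r <= 0:
        raise ValueError("r must be a positive number of replications")
    denominator = var_g + var_e / r + var_ge / r
    if denominator <= 0:
        raise ValueError("variance components must give a positive denominator")
    return var_g / denominator


# Hypothetical variance components (not values from the paper).
h2 = heritability(var_g=10.0, var_e=4.0, var_ge=2.0, r=2)
print(round(h2, 3))  # 10 / (10 + 2 + 1) = 0.769
```

With these made-up components, doubling the number of replications from 2 to 4 would raise the estimate from about 0.77 to about 0.87, which illustrates why the number of replications enters the comparison between populations.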
RESULTS
Combined Genetic Map
Although linkage distances varied somewhat between the maps for the MN, MA, and NA populations, no major differences were found and a combined map was developed (Fig. 1). All of the markers used to identify QTLs (see below) are presented, as well as all the markers which segregate only in the two Archer RI populations. The distances between markers or within the genome are shown and the 20 linkage groups are denoted as in Cregan et al. (1999). Many Archer polymorphisms were useful for combining and integrating the map, but these were redundant for markers on the Iowa or Clark-Harosoy maps which served the same purpose (Cregan et al., 1999). However, Satt369b in linkage group U22 was unique. A rare case of an extra polymorphic PCR (polymerase chain reaction) product segregating in the MA population, this marker linked two portions of U22 with a LOD score of 6.7. In order to compare the locations of QTLs in different RI populations, the combined map in Fig. 1 was used for the scans presented in Fig.
CROP SCIENCE, VOL. 39, NOVEMBER–DECEMBER 1999
Fig. 1. Composite genetic map of the MN (Minsoy-Noir 1), MA (Minsoy-Archer), and NA (Noir 1-Archer) populations. The symbols to the left of the marker names indicate the mapping population(s) for which marker data are available (● = MN, ▲ = NA, ■ = MA). The first number to the right of the marker names indicates the map distance (in centimorgans) to the next marker, and the second number is the genome position of the marker. Linkage groups are labeled with both the Utah names and [in parentheses, e.g., (J)] the Iowa names (see Cregan et al., 1999, for comparison of the Utah and Iowa maps).
2 and for the data in Tables 1 and 2. The genome size differs from individual RI population maps because markers mapped in one population could not always be mapped in another (combined map in Fig. 1, 2585 cM; MN map, 2504 cM; MA map, 2334 cM; NA map, 2399 cM). Individual maps may be found at the website www.larklab.4biz.net (verified 13 May 1999).
Quantitative Trait Parameters
Table 1 compares the means, ranges, and standard deviations for the RI populations with those of their parents. The combined data from all replications in all environments are presented for 15 traits, including derived traits.
Among the parents, Archer, an elite line, had the most desirable values for agronomic traits. Although Archer did not have the greatest leaf width (LW), leaf area (AR), which could influence photosynthesis, was highest for Archer. Of the two plant introductions, Minsoy had the least desirable values for all of the traits but one (YD/HT). This may indicate that Noir 1 was better adapted to current agronomic conditions as a result of selection pressure during its development in Hungary prior to introduction into the USA (Nelson et al., 1987). The means of the RI populations for the most part fell between the means of their parents. Exceptions involved traits related to flowering and maturity (R1, R8,
Table 1. Means, standard deviations, and ranges of the 15 traits measured in the three RI populations. Parental means are shown on the right. RI populations: MA = Minsoy-Archer, NA = Noir 1-Archer, MN = Minsoy-Noir 1.

| Trait† | MA Mean | MA SD | MA Range | NA Mean | NA SD | NA Range | MN Mean | MN SD | MN Range | Archer | Minsoy | Noir 1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LDG | 3 | 0.6 | 1.3–4.2 | 2.6 | 0.5 | 1.6–3.8 | 2.6 | 0.7 | 1.0–4.5 | 2 | 3.3 | 2.7 |
| HT | 85 | 20 | 43–140 | 99 | 9.8 | 71–124 | 74 | 19 | 25–132 | 103 | 57 | 96 |
| R1 | 40 | 5.7 | 31–55 | 38 | 2.6 | 33–44 | 41 | 3.8 | 35–51 | 46 | 42 | 40 |
| R8 | 108 | 8.2 | 91–124 | 101 | 5.8 | 87–115 | 105 | 6.2 | 91–119 | 118 | 100 | 102 |
| OIL | 181 | 5.3 | 166–197 | 175 | 6.7 | 155–194 | 180 | 8.8 | 156–206 | 187 | 178 | 162 |
| PRO | 344 | 10 | 314–372 | 353 | 14 | 318–383 | 353 | 13 | 316–389 | 340 | 353 | 350 |
| SW | 135 | 14 | 101–172 | 147 | 15 | 108–189 | 141 | 14 | 105–183 | 161 | 123 | 138 |
| YD | 24 | 2.6 | 15–31 | 27 | 2.5 | 21–33 | 20 | 4 | 5.5–27 | 31 | 18 | 26 |
| LL | 12 | 1 | 9.9–15 | 13 | 0.9 | 11–16 | 12 | 0.8 | 8.8–14 | 14 | 11 | 13 |
| LW | 6.6 | 0.6 | 5.2–8.8 | 7.6 | 0.7 | 5.6–9.8 | 6.8 | 0.6 | 5–9.1 | 7.7 | 5.9 | 8.5 |
| AR | 81 | 13 | 51–134 | 101 | 15 | 67–152 | 78 | 11 | 46–119 | 109 | 67 | 107 |
| HDL | 29 | 5.7 | 16–45 | 40 | 7.1 | 27–66 | 29 | 4.2 | 18–43 | 53 | 18 | 36 |
| RP | 68 | 4.5 | 58–77 | 63 | 4.2 | 51–75 | 64 | 3.4 | 53–72 | 71 | 58 | 62 |
| YD/SW | 18 | 2.8 | 9.8–26 | 19 | 2.2 | 13–25 | 14 | 3.6 | 3.1–25 | 20 | 15 | 19 |
| YD/HT | 30 | 6.3 | 15–51 | 28 | 3.9 | 20–38 | 28 | 5.8 | 13–46 | 31 | 32 | 27 |
† Traits were: (LDG) lodging scored from 1 = erect to 5 = prostrate; (HT) plant height in cm; (R1) flowering date as days from planting to flowering; (R8) maturity as days from planting to maturity; (OIL) oil content on a 13% moisture basis as g/kg; (PRO) seed protein content on a 13% moisture basis as g/kg; (SW) seed weight as mg/seed; (YD) seed yield as quintals/ha; (LL) leaf length in cm; (LW) leaf width in cm; (AR) leaf area in cm²; (HDL) height divided by lodging in cm; (RP) reproductive period in days = R8 − R1; (YD/SW) seed number × 10⁶ seeds/hectare; (YD/HT) yield per unit height as kg/ha/cm.
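The derived traits listed in the footnote above are simple functions of the primary measurements. The sketch below computes them for a single plot; the function name, the argument names, and the input values are hypothetical, and yield is assumed to be expressed in kg/ha (as in the Materials and Methods) so that seed number comes out in millions of seeds per hectare and YD/HT in kg/ha/cm.

```python
def derive_traits(ht_cm, ldg, r1_days, r8_days, ll_cm, lw_cm, yd_kg_ha, sw_mg):
    """Return the five derived traits from the primary measurements."""
    mg_per_kg = 1_000_000
    seeds_per_ha = yd_kg_ha * mg_per_kg / sw_mg  # yield (mg/ha) / seed weight (mg/seed)
    return {
        "RP": r8_days - r1_days,        # reproductive period, days
        "AR": ll_cm * lw_cm,            # leaf area, cm2 (length x width)
        "HDL": ht_cm / ldg,             # height divided by lodging score
        "YD/SW": seeds_per_ha / 1e6,    # seed number, millions of seeds per hectare
        "YD/HT": yd_kg_ha / ht_cm,      # yield per unit height, kg/ha/cm
    }


# Hypothetical plot values, chosen only to make the arithmetic easy to follow.
plot = derive_traits(ht_cm=100, ldg=2, r1_days=40, r8_days=110,
                     ll_cm=12, lw_cm=7, yd_kg_ha=3000, sw_mg=150)
for name, value in plot.items():
    print(name, value)
```

Because HDL, YD/SW, and YD/HT are ratios, the mean of a derived trait over plots is generally not equal to the same ratio computed from the mean primary traits, which is one reason tabulated derived-trait means need not match values recomputed from the tabulated primary means.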
and RP), each of which was outside of the parental means in two of the three RI populations. Most importantly, as in the original MN population, the MA and NA populations showed transgressive segregation for most of the traits, the range of trait values for the RI lines exceeding the parental values, often by large amounts. This can only occur if the genotypes that underlie the quantitative trait values are quite different in the three parents.
Correlation coefficients between traits were calculated in each of the three RI populations and compared. Most of the values were either expected (such as positive correlations between height and lodging or between maturity and height) or uninformative because correlations, though significant, were low (r < 0.5). However, an unusual relationship was observed in the NA population between yield and other traits. Whereas the correlations found in the MA and MN populations are similar to values found for a variety of other cultivars, the segregants in the NA population (○) displayed no correlation between yield (YD) and maturity (R8) or reproductive period (RP) (Fig. 2). Similar differences between populations were observed for correlation with height (HT) and lodging (LDG) as well as for the various leaf parameters (length [LL], width [LW], and area [AR]). Finally, yield in the NA population showed a significant positive correlation with seed weight (SW), rather than the negative correlation observed in most cultivars and exhibited here by the segregants in the MN and MA populations. All of these correlations in the NA population suggest that one or more unusual QTLs for yield are segregating.
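The comparison of correlations between populations rests on bootstrapped confidence intervals. The following sketch shows one way such an interval could be obtained with a percentile bootstrap over RI lines; it is not the Numerical Recipes code used in the study, and the function names, the number of resamples, and the simulated data are assumptions made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation in a resample; treated as r = 0 in this sketch
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(x, y, level=0.975, n_boot=2000, seed=0):
    """Percentile bootstrap interval for r, resampling lines with replacement."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * (n_boot - 1))]
    upper = estimates[int((1.0 - tail) * (n_boot - 1))]
    return lower, upper


# Simulated example: 240 hypothetical lines with a positive trait correlation.
rng = random.Random(42)
maturity = [rng.gauss(105, 6) for _ in range(240)]
yield_q = [0.2 * m + rng.gauss(0, 2) for m in maturity]
print(pearson_r(maturity, yield_q), bootstrap_ci(maturity, yield_q))
```

Two populations can then be compared by checking whether their intervals for the same pair of traits overlap.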
Identification of QTLs in the Three Populations
Fig. 2. Correlation of yield with 14 other traits in the MN (Minsoy-Noir 1) (■), MA (Minsoy-Archer) (×), and NA (Noir 1-Archer) (○) populations. The vertical error bars represent a 97.5% confidence interval for the correlation. Wherever the error bars from two correlations do not overlap, the correlations are significantly different at the 0.05 threshold. Traits were (HDL) height divided by lodging; (HT) plant height; (LDG) lodging; (LL) leaf length; (AR) leaf area; (LW) leaf width; (OIL) seed oil content; (PRO) seed protein content; (R1) days to flowering; (R8) days to maturity; (RP) reproductive period; (SW) seed weight as mg/seed; (YD/HT) yield per unit height; (YD/SW) seed number.
Figure 3 compares genome scans from the MN, MA, or NA populations in which the maximum likelihood score of finding a QTL for yield and four other traits is determined by interval mapping across the genome. The QTLs that we have identified are ones that have maintained their significance across environments (since trait values were averaged over environments). In each population, QTLs were determined for yield (YD), maturity (R8), reproductive period (RP), yield per cm of height (YD/HT), and height (HT). We examined these scans to determine if there were yield QTLs
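Table 2. Complete listing of all QTLs detected above a LOD of 3.0.†

| Trait‡ | Cross | LG | I.D. | Marker | LOD | R² (%) | High allele |
|---|---|---|---|---|---|---|---|
| HT | MA | U9 | 1099 | Satt277 | 7.5 | 14 | M |
| HT | MN | U9 | 1100 | Satt489 | 8.7 | 15 | M |
| HT | MN | U11 | 1269 | Satt150 | 7.1 | 13 | N |
| HT | MA | U11 | 1272 | Satt150 | 8.9 | 16 | A |
| HT | MN | U14 | 1700 | Satt182 | 4.3 | 8 | N |
| HT | MA | U14 | 1771 | Satt527 | 11 | 19 | A |
| HT | MN | U14 | 1794 | G173_1 | 23 | 36 | N |
| HT | MA | U14 | 1796 | Satt166 | 6.9 | 13 | A |
| HT | NA | U14 | 1806 | A489_1 | 11 | 19 | A |
| HT | NA | U14 | 1817 | Satt373 | 5.6 | 11 | A |
| HT | NA | U19 | 1966 | Sat_096 | 4.6 | 9 | A |
| LDG | MN | U8 | 944 | Sat_036 | 3.2 | 6 | M |
| LDG | MA | U9 | 1099 | Satt277 | 12 | 21 | M |
| LDG | MN | U9 | 1100 | Satt489 | 8.1 | 15 | M |
| LDG | MA | U13 | 1602 | Satt335 | 9.2 | 17 | M |
| LDG | MN | U14 | 1700 | Satt182 | 4 | 8 | N |
| LDG | MN | U14 | 1796 | Dt1 | 16 | 27 | N |
| LDG | NA | U19 | 1960 | Sat_096 | 5 | 10 | A |
| HDL | NA | U5 | 579 | Satt324 | 3.2 | 6 | A |
| HDL | MA | U11 | 1274 | Satt150 | 4.2 | 8 | A |
| HDL | MN | U11 | 1285 | R079_1 | 5.8 | 11 | N |
| HDL | MA | U13 | 1579 | Sat_103 | 3.6 | 7 | A |
| HDL | MA | U13 | 1592 | K644_1 | 4.2 | 8 | A |
| HDL | MN | U13 | 1593 | R045_1 | 3.7 | 7 | N |
| HDL | MA | U14 | 1767 | Sat_113 | 9.5 | 17 | A |
| HDL | NA | U14 | 1801 | Satt006 | 7.2 | 13 | A |
| R1 | NA | U9 | 1100 | Satt489 | 19 | 31 | M |
| R1 | MA | U9 | 1101 | Satt365 | 19 | 31 | M |
| R1 | MA | U11 | 1272 | Satt150 | 15 | 26 | A |
| R1 | MN | U11 | 1275 | Satt567 | 25 | 39 | N |
| R1 | MN | U11 | 1304 | Sat_003 | 8.4 | 15 | N |
| R1 | MA | U14 | 1767 | Sat_113 | 5 | 9 | A |
| R1 | MN | U14 | 1794 | G173_1 | 5.3 | 10 | N |
| R1 | MA | U14 | 1798 | Satt166 | 3 | 6 | A |
| R1 | NA | U14 | 1806 | A489_1 | 15 | 25 | A |
| R1 | NA | U14 | 1817 | Satt373 | 8.7 | 16 | A |
| RP | NA | U1 | 6 | Satt287 | 3.4 | 7 | N |
| RP | MN | U9 | 1005 | L199_2 | 3.7 | 7 | M |
| RP | MN | U9 | 1100 | Satt489 | 3.1 | 6 | M |
| RP | MA | U11 | 1268 | Satt150 | 7.8 | 14 | A |
| RP | MN | U11 | 1277 | Satt567 | 7.6 | 14 | N |
| RP | MN | U13 | 1590 | K644_1 | 3.4 | 6 | M |
| RP | MN | U14 | 1784 | Sat_099 | 4.9 | 9 | N |
| RP | MN | U14 | 1796 | Dt1 | 4.9 | 9 | N |
| RP | NA | U14 | 1806 | A489_1 | 11 | 19 | A |
| RP | NA | U14 | 1817 | Satt373 | 7 | 13 | A |
| RP | MA | U22 | 2298 | G214_24 | 6.2 | 12 | A |
| RP | NA | U22 | 2304 | Sat_077 | 12 | 21 | A |
| LL | NA | U1 | 88 | K375_1 | 3.5 | 7 | N |
| LL | MA | U2 | 117 | Sat_112 | 3.5 | 7 | A |
| LL | MN | U6 | 685 | L2 | 3.3 | 6 | N |
| LL | MA | U11 | 1270 | Satt150 | 6.6 | 12 | A |
| LL | MA | U11 | 1303 | Sat_003 | 3.7 | 7 | A |
| LL | MN | U14 | 1699 | BLT010_2 | 4.5 | 8 | N |
| LL | MN | U14 | 1794 | G173_1 | 7.9 | 14 | M |
| LL | NA | U14 | 1806 | A489_1 | 5.5 | 10 | A |
| LL | MN | U14 | 1814 | K385_1 | 5.1 | 9 | M |
| LL | NA | U14 | 1817 | Satt373 | 3.4 | 7 | A |
| LW | NA | U1 | 6 | Satt287 | 4.1 | 9 | N |
| LW | MN | U5 | 659 | A378_1 | 3.8 | 8 | M |
| LW | MA | U9 | 1101 | Satt365 | 4.7 | 9 | M |
| LW | MN | U10 | 1160 | R249_2 | 3.6 | 7 | N |
| LW | MN | U10 | 1186 | Satt442 | 4.5 | 8 | N |
| LW | MA | U11 | 1272 | Satt150 | 7.2 | 13 | A |
| LW | MN | U11 | 1281 | R079_1 | 10 | 18 | N |
| LW | MA | U14 | 1773 | Satt527 | 4.2 | 8 | A |
| LW | NA | U14 | 1806 | A489_1 | 4.1 | 8 | A |
| AR | NA | U1 | 6 | Satt287 | 3.3 | 7 | N |
| AR | MA | U9 | 1101 | Satt365 | 3.1 | 6 | M |
| AR | MA | U11 | 1272 | Satt150 | 8.8 | 16 | A |
| AR | MN | U11 | 1279 | R079_1 | 6.3 | 11 | N |
| AR | MA | U11 | 1303 | Sat_003 | 4.2 | 8 | A |
| AR | MN | U14 | 1700 | Satt182 | 3.9 | 7 | N |
| AR | MA | U14 | 1764 | Satt076 | 3.2 | 6 | A |
| AR | MN | U14 | 1796 | Dt1 | 4.1 | 8 | M |
| R8 | NA | U1 | 6 | Satt287 | 3.9 | 8 | N |
| R8 | MN | U9 | 1100 | Satt489 | 13 | 23 | M |
| R8 | MA | U9 | 1101 | Satt365 | 7.2 | 13 | M |
| R8 | MA | U11 | 1270 | Satt150 | 19 | 31 | A |
| R8 | MN | U11 | 1275 | Satt567 | 22 | 34 | N |
| R8 | MN | U11 | 1304 | Sat_003 | 5.8 | 11 | N |
| R8 | MA | U14 | 1773 | Satt527 | 5.4 | 10 | A |
| R8 | MN | U14 | 1794 | G173b_1 | 7 | 13 | N |
| R8 | MA | U14 | 1796 | Satt166 | 4.2 | 8 | A |
| R8 | NA | U14 | 1806 | A489_1 | 17 | 29 | A |
| R8 | NA | U14 | 1817 | Satt373 | 11 | 20 | A |
| R8 | MA | U22 | 2295 | Satt578 | 3.2 | 6 | A |
| PRO | MN | U7 | 873 | T155_1 | 4.2 | 15 | N |
| PRO | MA | U11 | 1285 | R079_1 | 3.2 | 6 | M |
| PRO | NA | U14 | 1792 | Satt166 | 3.3 | 11 | A |
| PRO | MN | U22 | 2234 | SOYGPATR | 3.7 | 12 | N |
| PRO | MA | U22 | 2287 | Satt578 | 6.4 | 12 | M |
| SW | NA | U2 | 208 | G214_26 | 3.1 | 6 | N |
| SW | NA | U3 | 286 | Satt187 | 3.7 | 7 | A |
| SW | MN | U3 | 347 | Satt508 | 4.3 | 8 | M |
| SW | NA | U4 | 418 | T028_1 | 3.8 | 8 | N |
| SW | MN | U5 | 539 | Satt163 | 3.6 | 7 | N |
| SW | NA | U7 | 801 | Satt449 | 3.2 | 6 | A |
| SW | MN | U7 | 871 | Satt174 | 3.4 | 7 | N |
| SW | MN | U8 | 944 | Sat_036 | 3.2 | 6 | N |
| SW | MN | U9 | 1005 | L199_2 | 4.6 | 9 | M |
| SW | NA | U9 | 1096 | Satt277 | 3.7 | 7 | N |
| SW | MN | U11 | 1261 | Satt150 | 3.7 | 7 | M |
| SW | NA | U13 | 1594 | L050_14 | 3.6 | 7 | N |
| SW | MA | U14 | 1769 | Satt527 | 3.1 | 6 | A |
| SW | MN | U14 | 1774 | Sat_099 | 3.2 | 6 | N |
| SW | NA | U22 | 2266 | K001_1 | 5 | 9 | A |
| SW | MA | U22 | 2296 | L192_1 | 3.1 | 6 | A |
| YD | NA | U9 | 1096 | Satt277 | 6.1 | 11 | N |
| YD | MN | U9 | 1100 | Satt150 | 3.3 | 6 | M |
| YD | MN | U11 | 1271 | Satt150 | 11 | 19 | N |
| YD | NA | U12 | 1425 | Satt002 | 3.9 | 8 | A |
| YD | NA | U13 | 1637 | Satt144 | 7 | 13 | A |
| YD/SW | MN | U3 | 343 | Satt508 | 3 | 6 | |
| YD/SW | NA | U4 | 416 | T028_1 | 3.6 | 7 | A |
| YD/SW | NA | U9 | 1045 | Satt291 | 3.2 | 6 | N |
| YD/SW | MN | U9 | 1100 | Satt489 | 3.1 | 6 | M |
| YD/SW | MN | U11 | 1269 | Satt150 | 11 | 19 | N |
| YD/SW | MA | U11 | 1277 | Satt540 | 4.2 | 8 | A |
| YD/SW | NA | U13 | 1594 | L050_14 | 8.6 | 16 | A |
| YD/HT | MN | U1 | 10 | Satt405 | 3.8 | 9 | M |
| YD/HT | MA | U9 | 1099 | Satt277 | 6.8 | 13 | A |
| YD/HT | MN | U9 | 1109 | Satt079 | 3.2 | 6 | N |
| YD/HT | MA | U11 | 1272 | Satt150 | 6.3 | 12 | M |
| YD/HT | MA | U12 | 1427 | Satt582 | 4 | 8 | A |
| YD/HT | NA | U13 | 1639 | K014_2 | 3.1 | 6 | A |
| YD/HT | MA | U13 | 1641 | Satt144 | 8.5 | 16 | A |
| YD/HT | MA | U14 | 1773 | Satt527 | 9.2 | 17 | M |
| YD/HT | MN | U14 | 1792 | G173_1 | 30 | 44 | M |
| YD/HT | MA | U14 | 1796 | Satt166 | 6.5 | 12 | M |
| YD/HT | NA | U14 | 1806 | A489_1 | 3.6 | 7 | N |
| YD/HT | NA | U14 | 1817 | Satt373 | 3 | 6 | N |
| YD/HT | NA | U19 | 1974 | Satt095 | 5.4 | 10 | N |
| YD/HT | NA | U26 | 2552 | Satt066 | 3.7 | 7 | A |
| OIL | MA | U7 | 867 | Satt174 | 4.9 | 10 | A |
| OIL | MN | U7 | 873 | T155_1 | 3.4 | 13 | M |
| OIL | NA | U9 | 1023 | Satt432 | 3.3 | 11 | A |
| OIL | NA | U14 | 1806 | A489_1 | 6.1 | 19 | N |
| OIL | MA | U22 | 2234 | SOYGPATR | 3.3 | 11 | M |
| OIL | MA | U22 | 2264 | SOYGPATR | 3.4 | 7 | A |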
† The QTLs are grouped according to trait. Each QTL is characterized by RI population (Cross), MN (Minsoy-Noir 1), MA (Minsoy-Archer), or NA (Noir 1-Archer); the linkage group on which it is located (LG); its precise genome position (I.D., in cM); the nearest marker to the QTL at a lower I.D. (Marker); the peak LOD score in the interval (LOD); the amount of variation explained by the QTL (R²); and which allele has the higher trait value (N = Noir 1, A = Archer, and M = Minsoy). ‡ HT-plant height, LDG-lodging, HDL-height divided by lodging, R1-flowering date, RP-reproductive period, LL-leaf length, LW-leaf width, AR-leaf area, R8-maturity date, PRO-protein content, SW-seed weight, YD-yield, YD/SW-seed number, YD/HT-yield per unit height, OIL-oil content.
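The LOD and R² columns of Table 2 are closely related. For a single QTL fitted by regression in a population of n lines, the standard relationship is LOD = −(n/2) log10(1 − R²). The sketch below applies this textbook formula; it is an approximation offered for orientation, not the calculation performed by PLABQTL, whose reported values may differ because of missing data or adjustment of R². The function names are hypothetical.

```python
import math


def lod_from_r2(r2: float, n: int) -> float:
    """Approximate LOD for a single regression-fitted QTL explaining a
    fraction r2 of the phenotypic variance among n lines."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    return -(n / 2.0) * math.log10(1.0 - r2)


def r2_from_lod(lod: float, n: int) -> float:
    """Inverse of lod_from_r2."""
    return 1.0 - 10.0 ** (-2.0 * lod / n)


# The MN height QTL at position 1794 is listed with R2 = 36% and LOD = 23;
# with the 240 MN lines the approximation gives a similar value.
print(round(lod_from_r2(0.36, 240), 1))  # 23.3
```

The same relationship shows why, in populations of about 240 lines, a QTL at the LOD 3 reporting limit explains only about 6% of the variation.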
which were not attributable to maturity (R8) or reproductive period (RP). No major yield QTL was evident in the MA population, whereas one in linkage group (LG) U11 segregated in the MN population and yield QTLs in LG U9 and LG U13 segregated in the NA population. Although the yield locus in LG U11, segregating in the MN population, was associated with a major maturity QTL, the two in the NA population (LGs U9 and U13) were not; nor were they associated with QTLs for height or reproductive period. The QTLs for height and maturity in LG U14, which segregate in all three populations, were not associated with a yield phenotype. Similarly,
two other major maturity QTLs (in LGs U9 and U11), which segregate in the MA population, were not accompanied by major variations in yield. Clearly, yield and maturity can be dissociated at the level of the individual QTLs. All but one of the QTLs for yield, height, or both appear to regulate the amount of yield per unit height (YD/HT). The exception, a yield QTL in LG U9, which segregates in the NA population, may explain the difference observed between correlations involving yield and height on the one hand and yield and yield per unit of height (YD/HT) on the other (Fig. 2). We have compared the QTL parameters of the three RI populations. Table 2 describes in detail QTLs for
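Table 3. Summary statistics for the analysis of the 15 traits in the three RI populations.†

Minsoy-Noir 1

| Trait‡ | h² | VQTL | QTL LOD > 4 | QTL R² > 10% |
|---|---|---|---|---|
| YD | 0.88 | 0.25 | 1 | 1 |
| OIL | 0.69 | 0.27 | 0 | 2 |
| PRO | 0.61 | 0.35 | 1 | 2 |
| YD/SW | 0.92 | 0.37 | 1 | 1 |
| LL | 0.78 | 0.39 | 3 | 1 |
| AR | 0.77 | 0.40 | 2 | 1 |
| HDL | 0.65 | 0.43 | 1 | 1 |
| SW | 0.89 | 0.49 | 2 | 0 |
| LDG | 0.90 | 0.54 | 3 | 2 |
| RP | 0.78 | 0.57 | 4 | 1 |
| LW | 0.60 | 0.64 | 2 | 1 |
| YD/HT | 0.80 | 0.65 | 2 | 2 |
| R8 | 0.93 | 0.67 | 4 | 4 |
| HT | 0.94 | 0.67 | 4 | 3 |
| R1 | 0.87 | 0.81 | 4 | 3 |

Minsoy-Archer

| Trait | h² | VQTL | QTL LOD > 4 | QTL R² > 10% |
|---|---|---|---|---|
| SW | 0.89 | 0.13 | 0 | 0 |
| YD | 0.70 | 0.13 | 0 | 0 |
| YD/SW | 0.81 | 0.20 | 1 | 0 |
| OIL | 0.76 | 0.25 | 1 | 0 |
| LL | 0.74 | 0.27 | 1 | 1 |
| RP | 0.85 | 0.34 | 2 | 2 |
| AR | 0.81 | 0.35 | 2 | 1 |
| PRO | 0.75 | 0.36 | 1 | 1 |
| LDG | 0.83 | 0.41 | 2 | 2 |
| LW | 0.72 | 0.41 | 3 | 1 |
| YD/HT | 0.86 | 0.48 | 4 | 4 |
| HT | 0.94 | 0.48 | 4 | 4 |
| R8 | 0.97 | 0.57 | 4 | 3 |
| HDL | 0.64 | 0.58 | 3 | 1 |
| R1 | 0.95 | 0.61 | 3 | 2 |

Noir 1-Archer

| Trait | h² | VQTL | QTL LOD > 4 | QTL R² > 10% |
|---|---|---|---|---|
| PRO | 0.83 | 0.21 | 0 | 1 |
| AR | 0.75 | 0.22 | 1 | 1 |
| LW | 0.72 | 0.23 | 2 | 0 |
| LL | 0.75 | 0.24 | 1 | 1 |
| HDL | 0.74 | 0.25 | 1 | 1 |
| LDG | 0.78 | 0.28 | 1 | 1 |
| OIL | 0.81 | 0.34 | 1 | 2 |
| YD | 0.71 | 0.38 | 2 | 2 |
| HT | 0.89 | 0.42 | 3 | 2 |
| YD/HT | 0.84 | 0.46 | 2 | 2 |
| YD/SW | 0.74 | 0.47 | 2 | 2 |
| SW | 0.86 | 0.47 | 1 | 0 |
| R8 | 0.96 | 0.56 | 3 | 3 |
| R1 | 0.87 | 0.58 | 3 | 3 |
| RP | 0.89 | 0.59 | 3 | 3 |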
† For each trait the heritability (h²), the amount of heritable variation explained (VQTL), the number of QTL detected above a LOD of 4 (QTL LOD > 4), and the number of QTL explaining >10% of the variation (QTL R² > 10%) are listed. For each population the traits have been ordered according to the amount of heritable variation explained by the detected QTL. ‡ YD-yield, OIL-oil content, PRO-protein content, YD/SW-seed number, LL-leaf length, AR-leaf area, HDL-height divided by lodging, SW-seed weight, LDG-lodging, RP-reproductive period, LW-leaf width, YD/HT-yield per unit height, R8-maturity date, HT-plant height, R1-flowering date.
the 15 different traits. All QTLs with LOD scores >3.0 are presented. Whereas QTLs were identified on about half of the 20 linkage groups in any individual RI population, when taken together they were distributed over 17 of the 20 linkage groups. For 12 of the linkage groups the number of QTLs found using the three RI populations was modest, ranging from one on LG U26 (YD/HT segregating in NA) to six in LG U1 (representing six different traits, one segregating in MN and five in NA). Five linkage groups contained a majority of the
Fig. 3. QTL genome scans for five different traits in each of the three RI populations. Each graph displays the complete simple interval-mapping scan for a particular population and trait (see Materials and Methods). The genome position (x-axis) is graphed against the LOD score (y-axis). Vertical lines on the x-axis indicate the boundaries of linkage groups. For these populations a LOD score of 3.7 is significant at a threshold of 0.05 for the experiment (the probability of finding a QTL with a LOD of 3.7 by chance, after scanning all of the markers in the genome). The populations are MN = Minsoy-Noir 1, MA = Minsoy-Archer, and NA = Noir 1-Archer. The traits are (RP) reproductive period, (R8) maturity, (YD) seed yield, (YD/HT) yield per unit height, and (HT) height.
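An experiment-wise threshold such as the LOD of 3.7 quoted in the caption above can be estimated by permutation (Churchill and Doerge, 1994): trait values are shuffled relative to genotypes, the genome is rescanned, and the largest LOD of each shuffled scan is recorded. The sketch below illustrates the idea under strong simplifying assumptions that are not part of the study: single-marker regression stands in for interval mapping, markers are simulated as unlinked 0/1 genotypes, and all names and parameter values are hypothetical.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """Single-marker regression LOD: -(n/2) * log10(1 - r^2)."""
    n = len(phenotypes)
    mg, mp = sum(genotypes) / n, sum(phenotypes) / n
    sgp = sum((g - mg) * (p - mp) for g, p in zip(genotypes, phenotypes))
    sgg = sum((g - mg) ** 2 for g in genotypes)
    spp = sum((p - mp) ** 2 for p in phenotypes)
    if sgg == 0 or spp == 0:
        return 0.0  # monomorphic marker or constant trait carries no information
    r2 = min(sgp * sgp / (sgg * spp), 1.0 - 1e-12)
    return -(n / 2.0) * math.log10(1.0 - r2)


def permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD
    obtained after shuffling phenotypes relative to genotypes."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(math.ceil((1.0 - alpha) * n_perm) - 1, n_perm - 1)
    return max_lods[index]


# Simulated data: 100 hypothetical lines scored at 50 unlinked markers.
rng = random.Random(7)
markers = [[rng.randint(0, 1) for _ in range(100)] for _ in range(50)]
trait = [rng.gauss(0.0, 1.0) for _ in range(100)]
print(permutation_threshold(markers, trait))
```

In a real analysis the scan inside the loop would be the same interval-mapping procedure used for the observed data, so that the threshold reflects the actual marker density and linkage structure of the map.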
identified QTLs. These were linkage groups U9, U11, U13, U14, and U22, in which 23, 26, 10, 45, and 10 QTLs were segregating, respectively. The number of traits segregating varied from a low of seven in LGs U13 and U22 to as many as 13 or 14 in LGs U9, U11, or U14. In linkage groups U9, U13, U14, and U22, QTLs were found in all three RI populations. However, in LG U11, QTLs were only identified in the MA and MN populations. In many cases, QTLs found in one population also could be identified in the same location in another population (Table 4). Finally, the three linkage groups in which no QTLs could be found (LGs U17, U21 and U24) are each as large (120–145 cM), and contain as many markers, as linkage groups which have many QTLs (such as LG U14). Therefore, the absence of QTLs cannot simply be attributed to a lack of opportunity for establishing linkage to a marker locus. The three populations contained about the same number (22–25) of major QTLs which individually accounted for large amounts of trait variation (Table 3), most frequently related to height or maturity. However, values of “explained” heritable variation (VQTL) were generally higher in the MN than in the MA or NA populations. This is probably due to the greater number
of genetic markers which characterize the MN population as well as the larger number of field replications. For all three populations, traits involving development and maturity (R1 and R8) are best explained by the QTLs identified here. QTLs also account for the high heritability of height in the MN and MA populations, in which a major growth habit gene (Dt1) is segregating. In contrast, many other traits were only represented by small QTLs. For example, no QTLs could be found which explained 10% or more of the phenotypic variation in seed weight. This trait also varied greatly between the three populations with respect to the amount of heritable variation that could be explained by the several small QTLs identified (from 13% in the MA to 49% in the MN populations). In two populations, MN and MA, QTLs for yield did not explain much of the heritable variation (Table 3 and Fig. 3) and in the MN population the major QTL which affected yield was tightly linked to a very important maturity QTL, suggesting pleiotropy (Table 2 and Fig. 3). In contrast, QTLs accounted for more of the yield variation in the NA population and neither of the major QTLs affecting this trait was associated with maturity.
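Table 4. QTLs that could be identified in one, two, or three RI populations.†

A: QTLs in Three Populations

| Trait¶ | LG | I.D. | MA‡ LOD | MN LOD | NA LOD |
|---|---|---|---|---|---|
| HT | U14 | 1794 | 6.9 | 23.0 | 6.5 |
| HT | U14 | 1817 | 3.0 | 6.5 | 5.6 |
| R1 | U14 | 1794 | 3.0 | 5.3 | 11.0 |
| R8 | U14 | 1773 | 5.4 | 2.4 | 4.1 |
| R8 | U14 | 1794 | 4.2 | 7.0 | 11.0 |
| R8 | U14 | 1806 | 3.2 | 2.3 | 18.0 |
| RP | U14 | 1784 | 2.2 | 4.9 | 5.4 |
| RP | U14 | 1796 | 2.3 | 4.9 | 6.9 |
| YD/HT | U14 | 1796 | 6.5 | 28.0 | 2.4 |
| YD/HT | U14 | 1812 | 3.7 | 8.9 | 2.7 |
| R8 | U9 | 1100 | 7.2 | 13.0 | 1.9 |
| YD/HT | U9 | 1099 | 6.8 | 2.9 | 1.9 |

B: QTLs in Two Populations

| Trait | LG | I.D. | MA LOD | MN LOD | NA LOD |
|---|---|---|---|---|---|
| HT | U9 | 1099 | 7.5 | 8.7 | 0 |
| HT | U11 | 1269 | 8.7 | 7.1 | 0 |
| HT | U14 | 1771 | 11.0 | 6.0 | 1.0 |
| LDG | U9 | 1100 | 12.0 | 8.1 | 0 |
| LDG | U13 | 1602 | 9.2 | 1.5 | 2.0 |
| HDL | U11 | 1274 | 4.2 | 4.3 | 0 |
| HDL | U11 | 1285 | 4.2 | 5.8 | 1.0 |
| HDL | U13 | 1592 | 4.2 | 3.7 | 1.0 |
| HDL | U14 | 1767 | 9.5 | 2.3 | 1.5 |
| HDL | U14 | 1801 | 4.8 | 0 | 7.2 |
| R1 | U9 | 1100 | 19.0 | 19.0 | 0 |
| R1 | U11 | 1275 | 15.0 | 25.0 | 0 |
| R1 | U14 | 1806 | 2.5 | 1.0 | 15.0 |
| R8 | U11 | 1270 | 19.0 | 21.0 | 0 |
| R8 | U14 | 1817 | 1.9 | 1.0 | 11.0 |
| R8 | U22 | 2308 | 2.3 | 0 | 11.0 |
| RP | U11 | 1268 | 7.8 | 7.0 | 0 |
| RP | U14 | 1806 | 1.5 | 3.2 | 11.0 |
| RP | U22 | 2298 | 6.2 | 0 | 12.0 |
| SW | U9 | 1005 | 1.0 | 4.6 | 2.1 |
| PRO | U7 | 873 | 3.8 | 4.2 | 0 |
| OIL | U7 | 867 | 4.9 | 3.3 | 0 |
| LL | U14 | 1794 | 0 | 7.9 | 4.3 |
| LL | U14 | 1806 | 0 | 3.5 | 5.5 |
| LL | U14 | 1814 | 0 | 5.1 | 3.4 |
| LW | U11 | 1272 | 7.2 | 7.8 | 0 |
| LW | U11 | 1281 | 5.2 | 10.0 | 0 |
| AR | U11 | 1272 | 8.8 | 4.5 | 0 |
| AR | U11 | 1303 | 4.2 | 2.0 | 1.2 |
| AR | U14 | 1796 | 1.3 | 4.1 | 4.1 |
| YD | U9 | 1096 | 1.4 | 3.0 | 6.1 |
| YD/SW | U11 | 1277 | 4.2 | 9.2 | 1.6 |
| YD/HT | U13 | 1641 | 3.1 | 0 | 7.1 |
| YD/HT | U14 | 1773 | 9.2 | 16.0 | 0 |

C: QTLs in One Population

| Trait | LG | I.D. | MA LOD | MN LOD | NA LOD |
|---|---|---|---|---|---|
| HT | U14 | 1700 | 0 | 4.3 | 0 |
| HT | U19 | 1966 | 0 | 1.0 | 4.6 |
| LDG | U14 | 1700 | 0 | 4.0 | 0 |
| LDG | U14 | 1796 | 1.0 | 16.0 | 1.2 |
| LDG | U19 | 1960 | 1.0 | 1.0 | 5.0 |
| R1 | U5 | 663 | 1.4 | 1.0 | 12.0 |
| R1 | U11 | 1304 | 0 | 8.4 | 0 |
| R1 | U14 | 1767 | 5.0 | 1.0 | 1.7 |
| R1 | U14 | 1817 | 1.7 | 1.0 | 8.7 |
| R8 | U11 | 1304 | 1.0 | 5.8 | 1.0 |
| RP | U9 | 1024 | 0 | 4.3 | 1.0 |
| RP | U14 | 1817 | 1.0 | 1.0 | 7.0 |
| SW | U3 | 347 | 0 | 4.3 | 1.2 |
| SW | U22 | 2266 | 1.7 | 1.1 | 5.0 |
| PRO | U22 | 2287 | 6.4 | 1.0 | 0 |
| OIL | U14 | 1806 | 0 | 1.0 | 6.1 |
| LL | U11 | 1270 | 6.6 | 0 | 0 |
| LL | U14 | 1699 | 0 | 4.5 | 0 |
| LW | U1 | 6 | 0 | 1.0 | 4.1 |
| LW | U9 | 1101 | 4.7 | 1.8 | 0 |
| LW | U10 | 1186 | 0 | 4.5 | 1.0 |
| LW | U14 | 1773 | 4.2 | 0 | 1.0 |
| LW | U14 | 1806 | 0 | 0 | 4.1 |
| AR | U14 | 1806 | 0 | 1.9 | 5.7 |
| YD | U11 | 1271 | 1.6 | 11.0 | 0 |
| YD | U13 | 1637 | 0 | 0 | 5.5 |
| YD/SW | U12 | 1427 | 0 | 1.0 | 5.7 |
| YD/SW | U13 | 1594 | 0 | 0 | 5.3 |
| YD/SW | U11 | 1272 | 6.3 | 0 | 0 |
| YD/HT | U19 | 1974 | 0 | 0 | 5.4 |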
† To be included in the table, a QTL had to be identified with a LOD ≥ 4 in at least one RI population. (A) 12 QTLs that were significant in all three RI populations. (B) 34 QTLs that were significant in only two of the three RI populations. (C) 30 QTLs that were significant in only one of the three RI populations. For each QTL, the trait, linkage group (LG), genome position (I.D.), and LOD score in each population are indicated.
‡ MA = Minsoy-Archer; MN = Minsoy-Noir 1; NA = Noir 1-Archer.
¶ HT, plant height; R1, flowering date; R8, maturity date; RP, reproductive period; YD/HT, yield per unit height; LDG, lodging; HDL, height divided by lodging; SW, seed weight; PRO, protein content; OIL, oil content; LL, leaf length; LW, leaf width; AR, leaf area; YD, yield; YD/SW, seed number.
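The grouping used in Table 4 can be sketched in a few lines of Python. This is an illustration rather than the procedure actually run for the paper: it assumes the inclusion threshold given in the table footnote (LOD of 4 in at least one population) and the confirmation threshold used in the text (LOD of 2 or higher in a further population). The function and variable names are hypothetical, and the three example rows are copied from parts A, B, and C of the table.

```python
# Illustrative sketch only: names are hypothetical and do not come from the
# paper or from any QTL mapping software.
INCLUDE_LOD = 4.0   # a QTL enters the table if some population reaches this
CONFIRM_LOD = 2.0   # a QTL counts as present in a population at or above this


def populations_segregating(lods, include=INCLUDE_LOD, confirm=CONFIRM_LOD):
    """Return the populations in which a QTL is counted as segregating.

    lods maps a population name to the LOD score at the marker. An empty
    list is returned when no population reaches the inclusion threshold.
    """
    if max(lods.values()) < include:
        return []
    return [pop for pop, lod in lods.items() if lod >= confirm]


# Rows copied from Table 4: trait, linkage group, I.D., LOD per population.
rows = [
    ("HT", "U14", 1794, {"MA": 6.9, "MN": 23.0, "NA": 6.5}),   # part A
    ("HT", "U9", 1099, {"MA": 7.5, "MN": 8.7, "NA": 0.0}),     # part B
    ("LDG", "U14", 1796, {"MA": 1.0, "MN": 16.0, "NA": 1.2}),  # part C
]

for trait, lg, marker_id, lods in rows:
    pops = populations_segregating(lods)
    print(f"{trait} {lg} {marker_id}: {len(pops)} population(s) {pops}")
```

A strict cutoff of 2.0 would drop the two LG U9 entries in part A, whose NA LOD scores are 1.9. The table counts them as present in all three populations, so borderline rows of this kind are best inspected by hand rather than classified automatically.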
CROP SCIENCE, VOL. 39, NOVEMBER–DECEMBER 1999
Segregation of QTLs in Different RI Populations

In Fig. 3, QTLs for maturity that occurred in LG U14 were found to segregate in all three populations, suggesting the presence of three different alleles. In contrast, maturity QTLs in LG U11 segregated in only two of the populations, consistent with two alleles. The QTL for YD/HT in LG U11 segregated only in the MA population. This suggested to us that the failure to segregate in the other two populations might be determined by additional genomic information involving epistatic effects.

We therefore examined the three populations for segregation of QTLs linked to particular marker loci. In each case, a QTL was chosen if it was highly significant (LOD > 4) in one of the three populations. The other two populations then were examined for the presence of the same QTL at a LOD of two or higher. In this manner, it was possible to estimate the occurrence of two or three alleles, as well as the frequency of cases in which segregation occurred in only one population (Table 4).

Segregation of a particular QTL in all three of the RI populations constitutes evidence for three alleles of that QTL. There were only 12 cases in which segregation was observed in all three populations (Table 4A); 10 were found in LG U14 and two in LG U9. All involved height (HT), date of flowering (R1) or maturity (R8), and in each linkage group QTLs for the different traits were closely clustered. Thus there might be only two examples of loci with three alleles, both involving maturity QTLs with pleiotropic effects.

In contrast, there were many examples of segregation in two populations (two alleles, Table 4B) or of segregation in only one population (Table 4C). Examples of segregation in two populations involved QTLs for all of the traits, located on six linkage groups, whose spacing indicated at least 12 clusters of QTLs. Examples of segregation in only one population involved QTLs for all but one of the traits, located on 11 different linkage groups, whose spacing indicated at least 19 clusters. These data indicate that for most traits only two QTL alleles were segregating, and that for many QTLs effects of genetic background limited segregation of phenotypic variation to one population, indicating epistasis.
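The link between the number of alleles at a QTL and the number of populations in which it segregates can be checked by enumeration. The sketch below is an illustration and not part of the original analysis. It assumes a single locus whose effect does not depend on genetic background (no epistasis) and assumes that every segregating QTL is detected. All names in the code are hypothetical.

```python
from itertools import product

PARENTS = ("Minsoy", "Archer", "Noir 1")

# The three RI populations and the parents of each cross.
CROSSES = {
    "MA": ("Minsoy", "Archer"),
    "MN": ("Minsoy", "Noir 1"),
    "NA": ("Noir 1", "Archer"),
}


def segregating_crosses(alleles):
    """List the crosses whose two parents carry different alleles.

    alleles maps each parent to an allele label at one QTL.
    """
    return [name for name, (p1, p2) in CROSSES.items()
            if alleles[p1] != alleles[p2]]


# Try every assignment of the labels a, b, c to the three parents and record
# how many crosses segregate for each number of distinct alleles.
counts = {}
for labels in product("abc", repeat=3):
    alleles = dict(zip(PARENTS, labels))
    n_alleles = len(set(labels))
    n_crosses = len(segregating_crosses(alleles))
    counts.setdefault(n_alleles, set()).add(n_crosses)

for n_alleles in sorted(counts):
    print(n_alleles, "distinct allele(s):",
          sorted(counts[n_alleles]), "segregating cross(es)")
```

Under these assumptions a QTL never segregates in exactly one cross: one allele gives no segregating cross, two alleles always give two, and three alleles give three. A QTL detected in only one population therefore needs a further explanation, such as an effect of genetic background or a failure to detect it elsewhere.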
DISCUSSION

Our study has compared quantitative traits and genotypes of three RI populations related by their descent from three parents. The data have been averaged over different environments and therefore, the QTLs that we have identified have maintained their significance across environments.

Previous analysis of the MN population (Mansur et al., 1996) had led to three important conclusions: (i) the genomes of Minsoy and Noir 1 were quite different, leading to pronounced transgressive segregation of trait values in the progeny; (ii) for many traits, major QTLs (R² > 10%) were observed; and (iii) agronomic QTLs were clustered on three linkage groups, U9, U11, and U14. A primary objective of this research was to determine whether these conclusions remained valid as more molecular markers became available and if so, whether the conclusions were restricted to the MN population or could be extended to other genomes.

Segregation of traits was transgressive in both the MA and NA populations as had been observed in the MN population, but was more pronounced for some traits than for others (Table 3). We have concluded from this that the QTL genotypes that control these traits are different in all three parents, and in particular, that Archer is genotypically distinct from both Minsoy and Noir 1. Consistent with this, we have identified 50 new QTLs (LOD > 3) in the MA and NA populations that did not segregate in the MN population.

A similar number of major agronomic QTLs (22–25 QTLs, R² > 10%) were identified in each of the three RI populations (Tables 1 and 2). In this respect, the Minsoy, Archer, and Noir 1 genomes resembled each other. However, some traits seemed to be better represented by large QTLs than others. For example, height, lodging, flowering date, and maturity, as well as protein and oil all were represented by large QTLs, whereas seed weight, seed number, and leaf related traits were represented by QTLs that explained less variation.

As the number of available markers has increased, we have found that almost every linkage group (17 of 20) had one or more agronomic QTLs. However, clustering of QTLs on linkage groups U9, U11, and U14 continued to be observed (Table 2). The clustering on LG U14 is clearly common to all three parents since 12 or more QTLs were identified on this linkage group in each RI population. In contrast, the QTLs on LG U11 appear to be the result of particular Minsoy alleles, since no QTLs were identified on LG U11 in the NA population despite the fact that 12 were identified in the MA and MN populations (Table 2).
The almost equal frequency of QTLs identified in the three RI populations (Table 2) is misleading, since the maps of the three populations are not equally covered by markers. The map in Fig. 1 comprises about 2585 cM of linkage distributed among the 20 linkage groups of soybean and includes all available markers that are 5 cM or more from any other marker. Nevertheless, gaps of 20 to 40 cM exist in which QTLs cannot be identified. Moreover, the three populations differ in the numbers of these gaps found in their respective genetic maps. Gaps of 20 cM or more account for 22% of the MN linkage map, but similar gaps are much more frequent in the NA (36%) or MA (43%) maps. Thus, the 44 QTLs identified in the MA population are almost certainly an underestimate, as are the 37 identified in the NA population. As these gaps in the linkage map are filled, we can expect to identify many more QTLs from the existing trait data.

In general, large QTLs tended to explain much of the heritable variation for highly heritable traits such as height, flowering date, or maturity (Table 1). An exception was seed weight, for which a large number of small QTLs (Table 2) explained as much as 50% of the heritable variation in the MN or NA populations. For yet other traits such as yield, oil or protein, and leaf related
traits, QTLs often failed to account for much of the heritable variation. This varied from trait to trait and from one population to another, and may have resulted from regions in the genetic maps in which the absence of marker loci prevented identification of QTLs.

A major portion of the heritable variation for yield remains to be explained by individual QTLs in all three populations (Table 1). This is particularly apparent in the MA population (Fig. 3) and underlines the pressing need for more genetic markers in this population. These would allow identification of yield genes with small effects or would identify genes that lie within gaps in the genetic maps (Fig. 1).

The correlation between yield and other traits (such as height, seed weight, R1, or R8) differs between the NA population and the MN or MA populations; most other cultivars show correlations similar to those of the MA and MN populations. Moreover, the high correlation between yield and yield per unit of height in the NA population is an indication that in this population increases in yield might not be dependent on the overall size of the plant. These correlations suggested that the NA population might reveal new genetic information on yield. This was borne out by the QTL data in Fig. 3. The three most significant NA yield QTLs were on linkage groups in which there were no QTLs for height or maturity, whereas the major yield loci in the MN population corresponded to loci for maturity, height, or both. Moreover, two of the major NA loci for yield per unit height (YD/HT) corresponded only to loci for yield, in contrast to the MN population, in which the major YD/HT loci corresponded to either height or maturity. Finally, unlike the MN yield loci, the NA yield loci were not linked to loci for reproductive period.

We have compared RI populations to search for QTLs with three rather than two alleles and for QTLs that may be affected by the genotype in which they reside (epistasis).
We found that very few of the QTLs we observed (perhaps only two) have three alleles (Table 4A). All of the examples found could be explained by two maturity QTLs, on LGs U9 and U14, with pleiotropic effects. Indeed, the loci on LG U9 may not be tri-allelic because the NA LOD scores are borderline. There were 34 examples of QTLs in which a significant QTL in one population was confirmed in another (Table 4B). These included loci for all 15 traits located on 6 linkage groups. However, 30 QTLs with LOD > 4 could not be confirmed in another RI population (that is, no second QTL of LOD > 2 could be found). These included QTLs for 14 traits located on 11 linkage groups. (Seventeen of these had a significance greater than LOD = 5.) If other genotypic effects were not present, we would expect
each QTL to be observed in at least two of the three RI populations. Therefore, it would appear that a large proportion of the QTLs identified are subject to epistatic effects. This conclusion must be tempered by the fact that only two of the three populations were grown in common environments. However, it should be noted that despite the lack of common environments, 27 QTLs found in the MN population could be confirmed in either the MA or the NA population. It seems likely that by averaging over environments we have avoided specific environmental effects.

REFERENCES

Chase, K., F.R. Adler, and K.G. Lark. 1997. Epistat: A computer program for the analysis of epistasis. Theor. Appl. Genet. 89:435–440.
Churchill, G.A., and R.W. Doerge. 1994. Empirical threshold values for quantitative trait mapping. Genetics 138:963–971.
Cianzio, S.R., S.P. Schultz, W.R. Fehr, and H. Tachibana. 1991. Registration of ‘Archer’ soybean. Crop Sci. 31:1707.
Cregan, P.B., T. Jarvik, A.L. Bush, R.C. Shoemaker, K.G. Lark, A.L. Kahler, T.T. Van Toai, D.G. Lohnes, T. Chung, and J.E. Specht. 1999. An integrated genetic linkage map of the soybean. Crop Sci. 39(5):1464–1490.
Dudley, J.W. 1993. Molecular markers in plant improvements: manipulation of genes affecting quantitative traits. Crop Sci. 33:660–668.
Hanson, C.H., H.F. Robinson, and R.E. Comstock. 1956. Biometric studies of yield in segregating populations of Korean lespedeza. Agron. J. 48:268–272.
Lark, K.G., K. Chase, F.R. Adler, L.M. Mansur, and J.H. Orf. 1995. Interactions between quantitative trait loci in which trait variation at one locus is conditional upon a specific allele at another. Proc. Natl. Acad. Sci. (USA) 92:4656–4660.
Lincoln, S.E., and S.L. Lander. 1993. Mapmaker/exp 3.0 and Mapmaker/QTL 1.1. Whitehead Inst. of Med. Res. Tech. Report. Cambridge, MA.
Mansur, L.M., K.G. Lark, H. Kross, and A. Oliveira. 1993a. Interval mapping of quantitative trait loci for reproductive, morphological, and seed traits of soybean [Glycine max (L.) Merr.]. Theor. Appl. Genet. 86:907–913.
Mansur, L.M., and J.H. Orf. 1995. Agronomic performance of soybean recombinant inbreds in northern USA and Chile. Crop Sci. 35:422–425.
Mansur, L.M., J.H. Orf, K. Chase, T. Jarvik, P.B. Cregan, and K.G. Lark. 1996. Genetic mapping of agronomic traits using recombinant inbred lines of soybean. Crop Sci. 36:1327–1336.
Mansur, L.M., J.H. Orf, and K.G. Lark. 1993b. Determining linkage of quantitative trait loci to RFLP markers using extreme phenotypes of recombinant inbreds of soybean [Glycine max (L.) Merr.]. Theor. Appl. Genet. 86:914–918.
Mather, K., and J.L. Jinks. 1977. Introduction to biometrical genetics. Cornell University Press, Ithaca, NY.
Nelson, R.L., P.J. Amdor, J.H. Orf, J.W. Lambert, J.F. Cavins, R. Kleiman, F.A. Laviolette, and K.L. Athow. 1987. Evaluation of the USDA soybean germplasm collection: maturity groups 000 to IV (PI 273.483 to PI 427.107). USDA Tech. Bull. 1718.
Press, W.H., S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. 1996. Numerical recipes in C, the art of scientific computing. Cambridge University Press, New York.
SAS Institute. 1988. SAS procedures guide, release 6.03 ed. SAS Institute, Cary, NC.
Utz, H.F., and A.E. Melchinger. 1996. PLABQTL: A program for composite interval mapping of QTL. J. Quantitative Trait Loci (http://probe.nalusda.gov:8000/otherdocs/jqtl/jqtl1996-01/utz.html; verified 13 May 1999).