Mapping As You Go: An Effective Approach for Marker-Assisted Selection of Complex Traits Dean W. Podlich,* Christopher R. Winkler, and Mark Cooper
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
ABSTRACT
Many factors have contributed to the inability to successfully employ a marker-based selection scheme for complex traits. One major difficulty has been the effective detection, estimation, and utility of QTL and their effects. This is especially the case for traits governed by context-dependent gene effects (i.e., interaction with other genes and/or environment). Historically, most evaluations of mapping and MAS strategies have largely ignored the possibility of context dependency and have instead assumed that QTL act independently (i.e., no interaction with other genes and/or environment). Adoption of this assumption has led some to suggest that MAS has little if any power over traditional phenotypic selection (e.g., Bernardo, 2001). Increasing evidence suggests, however, that context-dependent factors are important in the determination of trait phenotypes (e.g., Cooper and Hammer, 1996; Schlichting and Pigliucci, 1998; Clark, 2000; Wolf et al., 2000; Holland, 2001; Mackay, 2001; McMullen et al., 2001; Tuberosa et al., 2002; de Visser et al., 2003). The presence of these factors has significant implications for the way mapping and MAS should be conducted for the improvement of complex traits in a breeding program during the short, medium, and long term. Analysis methods have been developed to accommodate for the effects of context dependency (e.g., Crossa et al., 1999; Jannink and Jansen, 2001; Nelson et al., 2001; Boer et al., 2002; van Eeuwijk et al., 2002). For example, in the case of epistasis, Holland (2001) outlined an approach that was based on the identification of preferred allele configurations across interacting genes. Similar approaches have been suggested by others (e.g., Jansen et al., 2003; Kuhnlein et al., 2003). Applications of this approach will be a challenge, even for well-studied gene networks (Peccoud et al., 2004). Other advances in methodology include the use of multiple line crosses among related individuals (Jannink et al., 2001; Yi and Xu, 2001; Bink et al., 2002) and/or haplotype information to increase the power to accurately estimate QTL and their effects (Meuwissen and Goddard, 2000; Jansen et al., 2003). In all cases, the analysis methods assume that the mapping studies can be conducted with sufficient power to adequately account for all, or at least the important, context dependencies that may exist. Regardless of what assumptions are made, a common outcome of all QTL analysis methods is the estimation of QTL allele effects, whether at an individual gene level or across multiple interacting gene complexes (Jansen, 1996). A target combination of marker alleles is defined from these estimates, forming the basis of selection in
The advent of high throughput molecular technologies has led to an expectation that breeding programs will use marker–trait associations to conduct marker-assisted selection (MAS) for traits. Many challenges exist with this molecular breeding approach for so-called complex traits. A major restriction to date has been the limited ability to detect and quantify marker–trait relationships, especially for traits influenced by the effects of gene-by-gene and gene-by-environment interactions. A further complication has been that estimates of quantitative trait loci (QTL) effects are biased by the necessity of working with a limited set of genotypes in a limited set of environments, and hence the applications of these estimates are not as effective as expected when used more broadly within a breeding program. The approach considered in this paper, referred to as the Mapping As You Go (MAYG) approach, continually revises estimates of QTL allele effects by remapping new elite germplasm generated over cycles of selection, thus ensuring that QTL estimates remain relevant to the current set of germplasm in the breeding program. Mapping As You Go is a mapping-MAS strategy that explicitly recognizes that alleles of QTL for complex traits can have different values as the current breeding material changes with time. Simulation was used to investigate the effectiveness of the MAYG approach applied to complex traits. The results indicated that greater levels of response were achieved and these responses were less variable when estimates were revised frequently compared with situations where estimates were revised infrequently or not at all.
R
ecent advances in molecular genetics have led to an enthusiasm for use of MAS to improve the performance of traits in plant breeding (Koornneef and Stam, 2001; Peleman and Rouppe van der Voort, 2003). The key components to the implementation of this approach are (i) the creation of a dense genetic map of molecular markers, (ii) the detection of QTL based on statistical associations between marker and phenotypic variability, (iii) the definition of a set of desirable marker alleles based on the results of the QTL analysis, and (iv) the use and/or extrapolation of this information to the current set of breeding germplasm to enable markerbased selection decisions to be made. To date, this approach has been effective for relatively simple traits that are controlled by a small number of genes (e.g., disease resistance; Flint-Garcia et al., 2003) but less effective for more complex traits controlled by many genes that are under the influence of epistasis (gene-by-gene interaction) and gene-by-environment interaction effects (e.g., grain yield; Openshaw and Frascaroli, 1997; Melchinger et al., 1998; Utz et al., 2000).
Pioneer Hi-Bred International, 7250 NW 62nd Ave., P.O. Box 552, Johnston, IA 50131-0552. Received 31 Oct. 2003. *Corresponding author (
[email protected]). Published in Crop Sci. 44:1560–1571 (2004). Crop Science Society of America 677 S. Segoe Rd., Madison, WI 53711 USA
Abbreviations: MAS, marker-assisted selection; MAYG, Mapping As You Go; MET, multienvironment trial; MSO, Mapping Start Only; QTL, quantitative trait loci.
1560
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
PODLICH ET AL.: MAPPING AS YOU GO
the application of MAS in a breeding program. More advanced applications of MAS may weight specific marker alleles based on the amount of genetic variation they explain in the analysis (Lande and Thompson, 1990). However, in essence, the approach to MAS in plant breeding has been to develop accurate estimates of QTL effects within a relatively narrow reference population and use those estimates in the application of marker-based selection. This approach assumes that the desirable QTL alleles, once identified, will remain relevant during many cycles of selection. That is, the estimates of QTL effects that are calculated at the beginning will still apply as new germplasm is created during the breeding process (e.g., Peleman and Rouppe van der Voort, 2003). Additional QTL analyses may be conducted on new germplasm, but in general the aim is to validate or refine the initial estimates by making them more accurate and precise (Beavis, 1998). The assumption that the value of QTL alleles should stay relatively fixed or static is appropriate for traits controlled solely by additive genes (e.g., Bernardo, 2001). In this way, the effects of QTL are consistent across all or most germplasm (both current and future) and hence MAS can be implemented by independently assembling or stacking desirable alleles. However, when context dependencies are present, the value of QTL alleles can differ depending on the genetic structure of the current set of germplasm in the breeding program (Wade, 2002). That is, the value of a given QTL allele can change over cycles of selection because of changes in the background (i.e., context-dependent) effects at any given time in the breeding process (Peccoud et al., 2004). Therefore, when these background effects are important, the stacking of desirable alleles by MAS becomes inadequate because it is possible that the initial target combination of alleles is no longer the best target, or even a relevant target, for increased trait performance in subsequent breeding cycles. Additional discussions of how gene-bygene and gene-by-environment interactions can give rise to context-dependent effects are given in Appendix 1 and by Cooper and Podlich (2002) and Wade (2002). The MAYG approach investigated in this paper is a mapping-MAS strategy to accommodate the presence of context dependencies due to epistasis and gene-by-environment interaction by implementing MAS such that the estimated values of QTL alleles can evolve as the current germplasm evolves over cycles of selection. The MAYG method operates by cyclically reestimating the value of QTL alleles each time a new set of germplasm (i.e., different set of genetic backgrounds) is created during the breeding process. It is important to note that the main motivation for the consideration of this approach is not to reestimate QTL effects as a means to refine initial estimates by making them more accurate and precise, although this is an obvious application that comes to mind when reestimation of effects are conducted. Instead, the main motivation for the consideration of this approach is to ensure that the estimates of QTL allele effects remain relevant to the current set of germplasm. In this way, because of the presence of context-dependent effects, the estimated value of a QTL
1561
allele may change in magnitude over cycles of the breeding program and in the extreme case, a different QTL allele may be identified as favorable. Thus, selection pressure on one allele type (or haplotype) may be interspersed with selection pressure on alternative allele types (or haplotypes) over the duration of the breeding program. Given convincing molecular evidence of the ubiquitous nature of context-dependent effects in genetics (de Visser et al., 2003), we seek a MAS strategy that is effective across a wide range of trait architectures and environmental conditions. Therefore, the objectives of this paper are to: (i) outline the general structure of the MAYG approach, and (ii) use simulation to investigate the effectiveness of the MAYG method for a range of genetic models. MATERIALS AND METHODS Mapping As You Go Approach The approach typically suggested for MAS for complex traits is to use a single estimation of QTL effects to define some form of desirable marker allele index, and then apply selection to individuals that best meet the target combination of marker alleles. In this paper, the single estimation approach to MAS will be referred to as the Mapping Start Only (MSO) approach (Fig. 1a). In practice, the single estimation or MSO approach may be based on the results of a single mapping study or the aggregation of multiple mapping studies. However, for purposes of discussion in this paper, the MSO approach adopted a single set of QTL estimates in the application of MAS over all cycles of the breeding program to enable forward selection for a fixed target genotype. An example of a MSO proposal is the so-called “breeding by design” concept described by Peleman and Rouppe van der Voort (2003). The MAYG approach to MAS suggested in this paper reestimates the value of QTL alleles as new germplasm is sequentially created over cycles of the breeding program (Fig. 1b). While many variants can be considered within the context of the MAYG approach, the key steps used in this paper are as follows: (i) Estimate the effects of QTL alleles
Fig. 1. A schematic representation of the (a) Mapping Start Only and (b) Mapping As You Go approaches to marker-assisted selection. A solid circle on the timeline indicates a mapping event. QTL ⫽ quantitative trait loci.
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
1562
CROP SCIENCE, VOL. 44, SEPTEMBER–OCTOBER 2004
from an initial set of breeding crosses. (ii) With the information from the initial QTL analysis, construct a target configuration of marker alleles and conduct marker selection or MAS on germplasm representative of that used in the QTL mapping study. (iii) Create a new set of crosses among the selected lines. (iv) Reestimate the effects of QTL alleles in the set of germplasm created from the new set of crosses. (v) Update the estimates of the QTL effects that will be used in the next cycle of selection. (vi) Select within the new set of crosses on the basis of the updated estimates of QTL effects. (vii) Continue this cyclical process by evolving the estimates of QTL effects as new germplasm is created over cycles of the breeding process. Again, many variants to this method are possible and we will not consider them all here. The central theme that is emphasized in this paper is that the MAYG approach insures that estimated QTL effects remain relevant to the genetic structure of the breeding germplasm at any given point in time.
An In Silico Investigation of the Mapping As You Go Approach Computer simulation was used to investigate the performance of the MAYG approach to MAS. The simulation experiments were implemented using the library of routines available in the QU-GENE software (Podlich and Cooper, 1998). The results of different implementations of MAYG were compared with the MSO method. The main objective of the experiment was to obtain a general assessment of how an approach based on the reestimation of QTL effects would perform in the application of MAS, and thus many aspects of the simulation experiment were simplified to focus on this central theme. A reciprocal recurrent selection breeding program was simulated, though the MAYG approach can be implemented in any breeding program. Here, two pools of inbreds were formed, each with 200 individuals. Pedigree breeding was conducted in each pool. Biparental crosses (100 crosses) and inbred development (500 lines) were performed independently in each of the pools. Hybrid combinations were formed using testers (10 testers per line) from the opposite pool of germplasm. Each of the hybrid combinations (500 ⫻ 10 ⫽ 5000 combinations) was evaluated in a simulated multienvironment trial (MET) with 10 locations. The environment types represented at each location were sampled at random from a simulated target population of environments (Comstock, 1977; Cooper and Hammer, 1996; Cooper and Podlich, 2002). The phenotypic values of the hybrids across the 10 locations were used to estimate the QTL allele effects. To reduce the run time of the simulation experiment, a simple approach to the estimation of the QTL effects was adopted in that a molecular marker was perfectly linked to each of the QTL contributing to trait variation, that is, perfect markers and complete linkage disequilibrium. Thus, each QTL allele could be uniquely identified by a marker allele within every genotype at every stage in the simulation experiment. Estimated QTL effects were obtained for each genotype (e.g., AA, Aa, aa) by averaging the phenotypic values of all individuals in the hybrid populations that contained that genotype. The effects were then contrasted for each genotype. For example, the average performances of all individuals with the AA genotype combination at a locus were compared with all individuals with Aa and aa genotype combinations. For each locus, the magnitude of the effect was estimated and the favorable genotype identified. It should be noted that the method used to estimate QTL effects in this experiment was one of many possible analysis methods that could be considered and was chosen because of its ease of implementation. It should also be noted that by
virtue of implementing a model of perfect linkage between the QTL and markers and by using a large number of individuals in the estimation process, relatively accurate estimates of QTL and their effects were obtained. For the purposes of this experiment, this was considered a desirable outcome and was constructed in this way to ensure the QTL estimates were of relatively high quality in any single mapping analysis. With the availability of high quality estimates, the focus of the experiment was on the efficiency of MAS as applied using “true” values of QTL effects rather than the efficiency of MAS using QTL estimates of questionable quality. For the latter, the MAYG strategy will have obvious advantages (i.e., refinement of the initial inaccurate estimates) and is of little interest here. Marker-assisted selection was implemented in the breeding program by evaluating hybrid performance based on an index of phenotypic and genotypic information. The phenotypic information used in the index was based on the average performance of the hybrid combinations across the 10 locations sampled in the MET. For the genotypic evaluation, a molecular score was assigned to each hybrid combination according to the genetic similarity of the hybrid with the target configuration of marker alleles as defined by the QTL analysis. Genotypic scores of individual loci were weighted based on the magnitude of the allele effect as defined by the QTL analysis. The top 100 inbreds in each germplasm pool were selected based on the combined index of hybrid phenotypic and genotypic information and retained for the next breeding cycle. The process of pedigree breeding, hybrid evaluation, and selection was conducted over 30 cycles of the breeding program. For the MSO approach (Fig. 1a), the QTL effects were estimated in Cycle 1 of the breeding program and used throughout the 30 cycles of selection. For the MAYG approach (Fig. 1b), the QTL effects were reestimated: (i) every cycle of the breeding program (i.e., update ⫽ every cycle), (ii) every five cycles of the breeding program (i.e., update ⫽ five cycles), and (iii) every 10 cycles of the breeding program (i.e., update ⫽ 10 cycles). In all cases, the older QTL estimates were completely replaced by the newer QTL estimates. Thus, no information was retained from one QTL mapping analysis to the next. The relative performance of the MSO and MAYG approaches to MAS were considered over a wide range of genetic models. The genetic models were created using the E(NK) model ensemble approach (Podlich and Cooper, 1998; Cooper and Podlich, 2002), where E refers to the number of different environment types conditioning gene-by-environment interactions in the target population of environments, N refers to the number of genes influencing the trait, and K is a measure of the level of epistasis. For a given number of N genes, different levels of context dependency due to gene-by-environment interaction and epistasis can be introduced by varying the E and K parameters. Increased levels of E indicate more gene-by-environment interaction and larger values of K indicate more epistasis. See Cooper and Podlich (2002) for a detailed description on the use of the E(NK) model and Appendix 1 for an example set of context dependencies that the model can generate. In this experiment, a total of nine general classes of genetic models were considered (Table 1). The first general class of genetic models had only additive effects, that is, E ⫽ 1, K ⫽ 0 (a classical finite locus additive model). Two of the general classes of genetic models had epistatic effects and no gene-by-environment interaction effects; i.e., E ⫽ 1, K ⫽ 1, 2. Two of the general classes of genetic models had gene-by-environment effects and no epistatic effects; i.e., K ⫽ 0, E ⫽ 5, 10. The remaining four general classes of genetic models had a combination of gene-by-environment and epistatic effects; i.e., all combinations of E ⫽ 5, 10; K ⫽ 1, 2. For
PODLICH ET AL.: MAPPING AS YOU GO
Table 1. The nine classes of genetic models used in the simulation experiment using the E(NK) ensemble model, where E refers to the number of environment types conditioning gene-by-environment interactions in the target population of environments, N refers to the number of genes influencing the trait, and K is a measure of the level of epistasis. E values
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
K values
1
0
additive models
1, 2
epistatic models only
5, 10 gene-by-environment models only gene-by-environment and epistatic models
each class of genetic model, four levels of N were considered; N ⫽ 12, 24, 48, 96. In all cases, the genetic effects for the QTL alleles were sampled at random from an underlying uniform distribution according to the description given in Cooper and Podlich (2002). For each general class of genetic model, a total of 500 different random parameterizations of the E(NK) model were considered. Each of the MAS strategies was independently replicated 25 times for each parameterization of the E(NK) model. Scenarios were run for heritability levels of 0.1 and 0.7 on a single-plant basis. In total, the breeding program was run 3.6 million times, encompassing 108 million cycles of selection.
RESULTS The progress from selection for the MSO and MAYG approaches to MAS was evaluated in terms of the average performance of the hybrids at each cycle of the breeding program. On average, across all genetic models, the MAYG method outperformed the MSO method to MAS (Fig. 2). The greatest response was observed for the strategy that updated the QTL allele effects the most frequently (i.e., update ⫽ every cycle). The next highest levels of response were achieved by the strategies that updated the QTL allele effects every five and 10 cycles, respectively (update ⫽ five and 10 cycles; Fig. 2). For these latter two MAYG strategies, there were large increases in relative response immediately after the QTL alleles were reestimated (e.g., for update ⫽ 10 cycles, a jump in performance occurred at
1563
Cycle 11 and then again at Cycle 21). In all cases, the MAYG method outperformed the MSO method. There was a substantial difference in the relative performances of the MSO and MAYG methods among the nine general classes of genetic models (Fig. 3). For the class of genetic models where only additive effects were present (i.e., E ⫽ 1, K ⫽ 0; top-left panel of Fig. 3), there were relatively small differences in performance for the different MAS strategies. This result indicates that the initial estimates of QTL effects were effective across a long period of breeding and hence the updated estimates provided by the MAYG methods offered little additional information to improve selection response. In contrast, for the classes of genetic models that contained epistasis but no gene-by-environment interaction (i.e., E ⫽ 1; K ⫽ 1, 2; top-middle, and top-right panels of Fig. 3), the MAYG methods had higher levels of response than the MSO method. Here, the cyclical reestimation of QTL effects across cycles of selection using the MAYG method provided a more effective estimation of the genetic architecture of the trait within the context of the current germplasm, enabling higher responses to be achieved in the medium to long term. The size of the advantage that was observed using the MAYG approach increased with K (i.e., more context dependency), or when the QTL effects were estimated more frequently. For genetic models with gene-by-environment interaction effects but no epistasis (i.e., K ⫽ 0; E ⫽ 5, 10; middle- and lower-left panels of Fig. 3), the MAYG methods generally achieved higher levels of response compared with the MSO method. The MAYG approach had the desirable aspect of using a new sample of environment types in each QTL analysis and thus outperformed the MSO approach because the QTL estimates were not fixed indefinitely based on a single sample of environment types from the target population of environments. However, it should be noted that the MSO method initially outperformed the MAYG methods for genetic models with gene-by-environment interaction effects only. This was due to the fact that the MAYG method was continually chasing a moving target
Fig. 2. The performance of the Mapping Start Only (MSO) and three versions of the Mapping As You Go approach averaged across all genetic models considered in the experiment. (a) The performance is shown in terms of trait value normalized between 0 and 100%. (b) The response shown in (a) is represented as the difference in response between a given breeding strategy and the MSO method. Positive values indicate the breeding strategy had a higher response than the MSO method, and negative values indicate the breeding strategy had a lower response than the MSO method. The performance differences are expressed in terms of normalized trait value.
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
1564
CROP SCIENCE, VOL. 44, SEPTEMBER–OCTOBER 2004
Fig. 3. The relative performance of the Mapping Start Only (MSO) and three versions of the Mapping As You Go approach for nine general classes of genetic models (Table 1). E ⫽ the number of different environment types conditioning gene-by-environment interactions in the target population of environments, and K ⫽ level of epistasis. Additive genetic models: E ⫽ 1, K ⫽ 0; Epistatic effects models: E ⫽ 1, K ⫽ 1, 2; Gene-by-environment effects models: K ⫽ 0, E ⫽ 5, 10; Epistasis and gene-by-environment effects models: E ⫽ 5, 10, K ⫽ 1, 2. In all cases, performance is represented as the difference in response between a given breeding strategy and the MSO method. Positive values indicate the breeding strategy had a higher response than the MSO method, and negative values indicate the breeding strategy had a lower response than the MSO method. The performance differences are expressed in terms of normalized trait value.
based on the set of environments sampled in any given cycle (i.e., the “yo-yo effect”; Rathjen, 1994), thus leading to an initial less-desirable response to selection in the target population of environments. When both epistasis and gene-by-environment interaction effects were present (i.e., K ⫽ 1, 2; E ⫽ 5, 10; lower-right four panels of Fig. 3), the MAYG method on average outperformed the MSO method. There were differences in the variation of response among the different breeding strategies for the different classes of genetic models (Fig. 4). For the class of genetic models where only additive effects were present (Fig. 4; top-left panel), the variation in the response was consistent between the MSO method and versions of the MAYG method. In contrast, when context dependencies were present (Fig. 4; all panels except top left), the two approaches show progressively different patterns of variation in the mean response, with the MSO method having the most variation, particularly at later cycles. Despite the advantages of the MAYG method as shown in Fig. 3 and 4, there were still individual occurrences where the MSO method outperformed the MAYG method. For example, Fig. 5 shows the performance of
individual runs of the MSO and MAYG (update ⫽ every cycle) methods, where each point on the figure represents an individual realization of the breeding program for a specific genetic model. Values above the 1:1 line indicate the MAYG method had a higher level of response compared with the MSO method. Values below the 1:1 line indicate the MAYG method had a lower level of response compared with the MSO method. When viewed from this perspective, there were individual occurrences where the MSO method outperformed the MAYG method. Thus, in any given realization of the breeding process and for any given genetic model, the relative performances of the MSO and MAYG method cannot be guaranteed. However, there is a significant advantage to the MAYG method on average, which was more consistently achieved across individual scenarios when long-term genetic gain was considered (Fig. 5; Cycle 10 cf. Cycle 20).
DISCUSSION Accommodating the effects of epistasis and geneby-environment interaction in the application of MAS
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
PODLICH ET AL.: MAPPING AS YOU GO
1565
Fig. 4. The standard deviation of performance (Fig. 3) for nine general classes of genetic models (Table 1). E ⫽ the number of different environment types conditioning gene-by-environment interactions in the target population of environments, and K ⫽ level of epistasis. Additive genetic models: E ⫽ 1, K ⫽ 0; Epistatic effects models: E ⫽ 1, K ⫽ 1, 2; Gene-by-environment effects models: K ⫽ 0, E ⫽ 5, 10; Epistasis and gene-by-environment effects models: E ⫽ 5, 10, K ⫽ 1, 2.
for quantitative traits continues to be a difficult problem. Conventional approaches to MAS have largely ignored this problem by assuming QTL behave in an independent manner. Furthermore, the estimation of QTL effects has traditionally focused on the results of a single or limited number of mapping studies, effectively producing estimates that are based on a snap-shot in genetic
space, giving a limited perspective of the “gene’s eye view” discussed by Wade (2002). Such approaches to MAS would be appropriate if the effects of QTL were consistent within and among crosses, across environment types, and over the course of selection (Appendix 1). This is not expected to be the case for many of the complex traits manipulated in a breeding program (Ras-
Fig. 5. The performance of the Mapping Start Only and Mapping As You Go (update ⫽ every cycle) approaches to marker-assisted selection at Cycles 10 and 20 of the breeding program. Each point represents the response for a single genetic model and a single run of the breeding program. All nine classes of the genetic models (Table 1) are shown (250 points per class).
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
1566
CROP SCIENCE, VOL. 44, SEPTEMBER–OCTOBER 2004
musson and Phillips, 1997; Holland, 2001) where context dependencies such as epistasis and gene-by-environment interaction are considered to be important complicating factors. The MAYG approach outlined in this paper accommodates these context dependencies by reestimating the effects of QTL in parallel with the selection process and by evolving the knowledge on which selection is based. We have shown that such a strategy is an effective method for improving complex trait phenotypes (Fig. 2 and 3) and for reducing the variability in possible outcomes of the selection process (Fig. 4). It is important to note that the MAYG approach is not a new statistical analysis method per se. Instead, it is a new approach to MAS that can be used in combination with any of the appropriate existing statistical analysis methods to estimate QTL effects within any given set of germplasm (e.g., interval mapping, composite interval mapping, or haploMQM; Jansen et al., 2003). In principle, any analysis method that estimates QTL allele effects can be used. However, the ability of the analysis method to accurately estimate QTL effects will influence the power of the MAYG approach. The contrast of different statistical analysis methods on the effectiveness of the MAYG approach was not the focus of this paper, although some preliminary considerations are given in Appendix 2. The MAYG method also provides an effective treatment for the types of errors that can be easily introduced in the estimation of QTL effects in mapping populations (Jansen, 1994; Beavis, 1998). Two common types of error that can occur in QTL mapping studies are: (i) the designation of significant QTL effects when in fact no QTL actually exists in that linkage position (i.e., Type I errors), and (ii) the nonidentification of significant QTL that do in fact exist (i.e., Type II errors). In both cases, the errors can compromise the definition of the favorable marker configuration and hence reduce the effectiveness of MAS. A third type of error can occur in mapping studies when a true QTL position is correctly identified but the wrong allele is designated as the favorable allele (i.e., Type III errors). In the application of the MAYG method, the impact of these types of errors is confined to a single cycle of the breeding program. That is, any selection pressure on non-QTL (Type I error), or lack there of on true QTL (Type II), or on the incorrect allele of true QTL (Type III error) will only apply until the next round of QTL estimates. Thus, errors generated in any given mapping study will have little impact across a longer period of time in the breeding program. The particular implementation of the MAYG method in the simulation experiment was one of many possible variants that can be considered. Refinements can be made to increase the advantage of the MAYG method, and some of these are described below. For example, the analysis methods that provide the best estimates of QTL allele effects for the current elite germplasm are the most desirable. In this case, the mapping analysis would be conducted on multiple line crosses of elite germplasm, enabling pedigree relationships to be used
(e.g., Jannink et al., 2001; Bink et al., 2002). The analysis methods do not need to be confined to single gene estimates, as was the case in the simulation experiment reported here, but could take into consideration the structure of gene-by-gene or gene-by-environment interactions (e.g., Jannink and Jansen, 2001; Nelson et al., 2001; van Eeuwijk et al., 2002). However, it should be noted that even with a single-gene analysis, the MAYG method accommodates for these interactions without requiring an understanding of the detail of their causal basis. For example, in the case of epistasis, the superior performance of the MAYG approach did not require any understanding of the structure of the gene networks responsible for the trait performance. The change in estimated allele values over cycles of selection was an emergent property of the cyclical reestimation process in combination with genetic changes in the reference population due to the effects of selection. Thus, with no knowledge of the gene network structure itself, the MAYG approach exploits the network by selecting on genetic components of the network that are most favorable in context with the reference germplasm that is available at a given point in time in the breeding process. Another refinement of the MAYG method that can be considered is the timing and type of updating procedure employed in the reestimation of the QTL effects. In the simulation experiment, the older estimates of QTL effects were completely replaced by the new estimates. This was done every cycle of the breeding program or on an intermittent basis (e.g., every 5 cycles). The success of this approach was in part because of the fact that high quality QTL estimates for each new set of germplasm were attainable in every mapping study. Factors such as a low trait heritability, small population sizes, or even limited resources may preclude an effective estimation of QTL effects in a mapping study. Therefore, for many situations, QTL information derived from multiple cycles of the breeding program may be needed to improve the quality of the QTL estimates. One approach would be to use the QTL estimates from all cycles of the breeding program and weight estimates from recent cycles more heavily than older cycles of the breeding program. This approach would only be effective if the older estimates were sufficiently relevant to the current germplasm and offered increased levels of precision. Alternatively, the process can be performed by updating the estimates selectively within a subset of the breeding cycles. The estimates of QTL effects would then be based on a moving window of information. Use of an approach that combines estimates over several cycles of the breeding program is also an effective way to account for the effects of gene-by-environment interactions (e.g., with this approach, higher levels of response can be obtained by the MAYG methods than are shown in Fig. 3). In this case, the MAYG method accumulates information on QTL effects in different types of environments that are sampled over cycles of the breeding program (i.e., year and location combinations). Thus, progress in the target population of environments defined by the scope of the breeding
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
PODLICH ET AL.: MAPPING AS YOU GO
program can be more efficient by taking into consideration the QTL effects for the different environment types and the associated environmental variables that condition gene-by-environment interactions. One way to implement this approach is to conduct selection on a weighted index of QTL information using estimates from previously sampled environments, where the weights that are used are based on the frequency of occurrence of environment types in the target population of environments (Podlich et al., 1999). The estimation of QTL effects does not necessarily need to be tied to the breeding population as a whole. For example, estimates of QTL allele effects can be considered on an individual cross basis, where each estimate is confined to a single cross between two elite lines (e.g., Openshaw and Frascaroli, 1997). Marker-assisted selection is then conducted within each cross separately, based on the QTL effects estimated from each individual cross. A new set of estimates is used when selected lines form the basis of the next round of crossing. With computer simulation we have found that for specific circumstances, the refinements outlined above offer advantages over the implementation of MAYG considered here (data not shown). The appropriateness of any of these variants to the MAYG approach depends largely on the extent to which epistasis and gene-by-environment interactions influence the genetic architecture of the trait of interest. Given their potential for impact on response to selection, empirical investigations to quantify the importance of epistasis and gene-by-environment interactions for trait phenotypes is considered to be an important component of the design and optimization of any MAS strategy. ACKNOWLEDGMENTS We would like to thank three of the four anonymous reviewers and the associate editor for their constructive comments in reviewing the manuscript.
REFERENCES Beavis, W.D. 1998. QTL analysis: Power, precision and accuracy. p. 145–161. In A.H. Paterson (ed.) Molecular analysis of complex traits. CRC Press, Boca Raton, FL. Bernardo, R. 2001. What if we knew all the genes for a quantitative trait in hybrid crops? Crop Sci. 41:1–4. Bink, M.C.A.M., P. Uimari, M.J. Sillanpa¨a¨, L.L.G. Janss, and R.C. Jansen. 2002. Multiple QTL mapping in related plant populations via a pedigree-analysis approach. Theor. Appl. Genet. 103:1243– 1253. Boer, M.P., C.J.F. ter Braak, and R.C. Jansen. 2002. A penalized likelihood method for mapping epistatic quantitative trait loci with one-dimensional genome searches. Genetics 162:951–960. Cheverud, J.M., and E.J. Routman. 1995. Epistasis and its contribution to genetic variance components. Genetics 139:1455–1461. Clark, A.G. 2000. Limits to prediction of phenotypes from knowledge of genotypes. Evol. Biol. 32:205–224. Comstock. R.E. 1977. Quantitative genetics and the design of breeding programs. p. 705–718. In E. Pollack et al. (ed.) Proc. of the Int. Conf. on Quantitative Genetics, Ames, IA. 16–21 Aug. 1976. The Iowa State University Press, Ames. Cooper, M., S.C. Chapman, D.W. Podlich, and G.L. Hammer. 2002. The GP problem: Quantifying gene-to-phenotype relationships. Silico Biol. 2:151–164. Cooper, M., and G.L. Hammer. (ed.) 1996. Plant adaptation and crop improvement. CAB International, Wallingford, UK. Cooper, M., and D.W. Podlich. 2002. The E(NK) model: Extending
1567
the NK model to incorporate gene-by-environment interactions and epistasis for diploid genomes. Complexity 7(6):31–47. Crossa, J., M. Vargas, F.A. van Eeuwijk, C. Jiang, G.O. Edmeades, and D. Hoisington. 1999. Interpreting genotype ⫻ environment interaction in tropical maize using linked molecular markers and environmental covariables. Theor. Appl. Genet. 99:611–625. de Visser, J.A.G.M., J. Hermisson, G.P. Wagner, L. Ancel Meyers, J. Bagheri-Chaichian, J.L. Blanchard, L. Chao, J.M. Cheverud, S.F. Elena, W. Fontana, G. Gibson, T.F. Hansen, D. Krakauer, R.C. Lewontin, C. Ofria, S.H. Rice, G. von Dassow, A. Wagner, and M.C. Whitlock. 2003. Perspective: Evolution and detection of genetic robustness. Evolution 57:1959–1972. Flint-Garcia, S.A., L.L. Darrah, M.D. McMullen, and B.E. Hibbard. 2003. Phenotypic versus marker-assisted selection for stalk strength and second-generation European corn borer resistance in maize. Theor. Appl. Genet. 107:1331–1336. Holland, J.B. 2001. Epistasis and plant breeding. Plant Breed. Rev. 21:27–92. Jannink, J.-L., M.C.A.M. Bink, and R.C. Jansen. 2001. Using complex plant pedigrees to map valuable genes. Trends Plant Sci. 6(8):337– 342. Jannink, J.-L., and R.C. Jansen. 2001. Mapping epistatic quantitative trait loci with one-dimensional genome searches. Genetics 157:445– 454. Jansen, R.C. 1994. Controlling the Type I and Type II errors in mapping quantitative trait loci. Genetics 138:871–881. Jansen, R.C. 1996. Complex plant traits: Time for polygenic analysis. Trends Plant Sci. 1:89–94. Jansen, R.C., J.-L. Jannink, and W.D. Beavis. 2003. Mapping quantitative trait loci in plant breeding populations: Use of parental haplotype sharing. Crop Sci. 43:829–834. Koornneef, M., and P. Stam. 2001. Changing paradigms in plant breeding. Plant Physiol. 125:156–159. Kuhnlein, U., R. Parsanejad, D. Zadworndy, and S.E. Aggrey. 2003. The dynamics of the genotype-phenotype association. Poult. Sci. 82:876–881. Lande, R., and R. Thompson. 1990. Efficiency of marker-assisted selection in the improvement of quantitative traits. Genetics 124:743– 756. Mackay, T.F.C. 2001. Quantitative trait loci in Drosophila. Nat. Rev. Genet. 2:11–20. McMullen, M.D., M. Snook, E.A. Lee, P.F. Byrne, H. Kross, T.A. Musket, K. Houchins, and E.H. Coe. 2001. The biological basis of epistasis between quantitative trait loci for flavone and 3-deoxyanthocyanin synthesis in maize (Zea mays L.). Genome 44:667–676. Melchinger, A.E., H.F. Utz, and C.C. Scho¨n. 1998. Quantitative trait locus (QTL) mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149:383–403. Meuwissen, T.H.E., and M.E. Goddard. 2000. Fine mapping of quantitative trait loci using linkage disequilibria with closely linked marker loci. Genetics 155:421–430. Nelson, M.R., S.L.R. Kardia, R.E. Ferrell, and C.F. Sing. 2001. A combinatorial partitioning method to identify multilocus genotypic partitions that predict quantitative trait variation. Genome Res. 11: 458–470. Openshaw, S., and E. Frascaroli. 1997. QTL detection and markerassisted selection for complex traits in maize. Proc. Annu. Corn Sorghum Res. Conf. 52:44–53. Peccoud, J., K. Vander Velden, D.W. Podlich, C.R. Winkler, W.L. Arthur, and M. Cooper. 2004. The selective values of alleles in a molecular network model are context dependent. Genetics 166:1715– 1725. Peleman, J.D., and J. Rouppe van der Voort. 2003. Breeding by design. Trends Plant Sci. 8(7):330–334. Podlich, D.W., and M. Cooper. 1998. QU-GENE: A platform for quantitative analysis of genetic models. Bioinformatics 14:632–653. Podlich, D.W., M. Cooper, and K.E. Basford. 1999. Computer simulation of a selection strategy to accommodate genotype-by-environment interactions in a wheat recurrent selection program. Plant Breed. 118:17–28. Rasmusson, D.C., and R.L. Phillips. 1997. Plant breeding programs and genetic diversity from de novo variation and elevated epistasis. Crop Sci. 37:303–310. Rathjen, A.J. 1994. The biological basis of genotype environment interaction—Its definition and management. p. 13–17. In J. Paull
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
1568
CROP SCIENCE, VOL. 44, SEPTEMBER–OCTOBER 2004
et al. (ed.) Proc. Seventh Assembly of the Wheat Breeding Society of Australia. Adelaide, Australia. Schlichting, C.D., and M. Pigliucci. 1998. Phenotypic evolution: A reaction norm perspective. Sinauer Associates, Sunderland, MA. Tuberosa, R., S. Salvi, M.C. Sanguineti, P. Landi, M. Maccaferri, and S. Conti. 2002. Mapping QTLs regulating morpho-physiological traits and yield: Case studies, shortcomings and perspectives in drought-stressed maize. Ann. Bot. (London) 89:941–963. Utz, H.F., A.E. Melchinger, and C.C. Scho¨n. 2000. Bias and sampling error of the estimated proportion of genotypic variance explained by quantitative trait loci determined from experimental data in maize using cross validation and validation with independent samples. Genetics 154:1839–1849. van Eeuwijk, F.A., J. Crossa, M. Vargas, and J.-M. Ribaut. 2002. Analysing QTL by environment interaction by factorial regression, with an application to the CIMMYT drought and low nitrogen stress programme in maize. p. 245–256. In M.S. Kang (ed.) Quantitative genetics, genomics and plant breeding. CAB International, Wallingford, UK. Wade, M.J. 2001. Epistasis, complex traits, and mapping genes. Genetica (The Hague) 112–113:59–69. Wade, M.J. 2002. A gene’s eye view of epistasis, selection, and speciation. J. Evol. Biol. 15:337–346. Wolf, J.B., E.E. Brodie, III, and M.J. Wade. (ed.) 2000. Epistasis and the evolutionary process. Oxford University Press, Oxford. Yi, N., and S. Xu. 2001. Bayesian mapping of quantitative trait loci under complicated mating designs. Genetics 157:1759–1771.
APPENDIX 1 Context-Dependent Gene Effects A key motivation of the MAYG approach was to accommodate for the effects of context-dependent gene effects. Here, a brief description is given to some of the influences of context dependencies in the estimation of QTL effects. See Cooper and Hammer (1996), Schlichting and Pigliucci (1998), Wolf et al. (2000), and Wade (2001, 2002) for further descriptions of the influences of context dependencies in mapping populations and some plant breeding and evolutionary consequences. The presence of context dependency brings into question the value of a given QTL allele, the contrasts between the genotypes possessing different allele combinations, and the gene-to-phenotype relationship for traits (Nelson et al., 2001; Cooper et al., 2002). For example, in the case of epistasis, a QTL allele can have one effect in the presence of one genetic background and a different effect in the presence of another genetic
background. In some cases, the presence of a given genetic background may change the definition of the favorable allele for the QTL. As discussed by Cheverud and Routman (1995) and Holland (2001), we can distinguish between physiological and statistical epistasis. Figure 6 illustrates features of both perspectives and emphasizes some of the complications that can arise with the existence of context dependencies due to epistasis. Here, three genetic models with different amounts of context dependency are considered: (i) gene A is independent of all other genes (Fig. 6a), (ii) gene A interacts with gene B (Fig. 6b), and (iii) gene A interacts with genes B and C (Fig. 6c). There are many contrasts that can be used to study the influences of epistasis on the relative performance of multilocus genotypes. Here we consider the contrast between the homozygous genotype classes at a single locus (i.e., taking a “gene’s eye view”; Wade, 2002). The focus of Fig. 6 is on visualizing the genotypic contrast between AA and aa for gene A. For the scenario where there is no context dependency (Fig. 6a), genotype AA always has the highest trait value, and hence there is a clear definition of the favorable homozygous class. However, for the scenarios where gene A interacts with other genes (i.e., Fig. 6b,c), the value of the genotype classes and the definition of the favorable genotype class are less well defined. For example, in Fig. 6b the highest performing combination for gene A is genotype AA when combination BB is present at gene B, and genotype aa is the highest performing genotype when combination bb is present at gene B. Genotype class AA has the highest effect when averaged across all combinations of gene B (Fig. 6b; vertical bars). In Fig. 6c, the genotype combination AABBCC has the highest performance (line plot) but genotype aa has the highest value when averaged across all background genotypes for genes B and C (vertical bars). Within a population of individuals, the background effects are not represented equally since allele and genotype frequencies are not the same, or even in HardyWeinberg equilibrium. Furthermore, the frequency of alleles and genotype combinations can differ from one population to the next and from one generation to the
Fig. 6. The effect of genotype combinations (line plots; referred to as physiological epistasis by Cheverud and Routman, 1995) and average genotype effect of gene A across the genetic background (vertical bars; statistical estimation of the average genotype value for the aa and AA genotypes of gene A) for three hypothetical genetic models; (a) a single independent additive gene (gene A); K ⫽ 0, (b) a digenic network where gene A interacts with gene B; K ⫽ 1, and (c) a trigenic network where gene A interacts with genes B and C; K ⫽ 2. K ⫽ level of epistasis. Values of the vertical bars show the effect of the two homozygous genotype classes for gene A averaged across all background genotype combinations in the network.
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
PODLICH ET AL.: MAPPING AS YOU GO
next. When context dependencies due to epistasis are present, each population samples a different set and frequency of background effects, resulting in distinct trait phenotypes and hence QTL allele effects will differ among populations. Therefore, the effect of a QTL allele or genotype combination is population specific and thus any estimate of QTL effect is in context with a given population of individuals in a given environment. To illustrate this property, 10 000 independent populations were created for each of the three genetic models considered in Fig. 6. A random set of allele frequencies were independently defined for each population. Figure 7a shows the distribution of the estimated QTL effect size for gene A, where the QTL effect was represented as the difference in value for the homozygous genotype classes for gene A (i.e., average effect of AA minus the average effect of aa). A positive effect size indicates that genotype class AA was favorable and a negative effect size indicates that genotype class aa was favorable. For the genetic models where there were no context dependencies (Fig. 6a), the estimated effect of gene A was relatively consistent across the 10 000 populations (Fig. 7a; K ⫽ 0). In contrast, the genetic models that contained context dependencies (Fig. 6b,c) had highly variable estimates for gene A (Fig. 7a; K ⫽ 1 and K ⫽ 2). In these two scenarios, both the magnitude of the effects and the definition of the favorable genotype class varied among the populations. These results illustrate how the definition of the highest value genotype and
1569
estimated effect of an allele can differ among random sets of genetic background. The genetic backgrounds of germplasm generated over cycles of selection are not random. Thus, the variation shown in Fig. 7a is not necessarily representative of the variation that may be expected from germplasm generated over consecutive cycles of a breeding program. Instead, the change in frequencies of alleles in a population is likely to be more systematic. This is because of the coancestral or pedigree relationships that exist among individuals generated over cycles of the breeding program. Figure 7b demonstrates this property for each genetic model shown in Fig. 6. Here the effect of gene A is estimated over 10 cycles of selection, where each estimate is independent and is based on the germplasm available in the current cycle of selection. Each line represents an independent run of selection. As was the case in Fig. 7a, there was variation among the estimates of QTL effects for the genetic models with context dependencies (Fig. 7b; K ⫽ 1 and K ⫽ 2). In this case, the differences were less variable among consecutive cycles of selection than they were among the 10 000 random populations (Fig. 7a cf. Fig. 7b for K ⫽ 1 and K ⫽ 2). However, it is also important to emphasize that there were some differences among the cycles of selection. The figures show a deviation in the magnitude of QTL effects as well as intermittent change in the definition of the favorable genotype class from cycle to cycle. These results emphasize the presence of a QTL
Fig. 7. The effect of gene A in different populations of individuals. The three genetic models used in the construction of this figure are an extension of those presented in Fig. 6. In all cases, the genetic models had 10 genes. Genes not represented in Fig. 6 were defined as having additive effects (i.e., equivalent to Fig. 6a). For example, for genetic models (K ⫽ 2), the first three genes were defined as in Fig. 6c and the remaining seven genes were defined to have additive independent effects. Figure 7a shows the distribution of allele effect size for gene A estimated in 10 000 independent populations. Figure 7b shows the estimated allele effect for gene A across 10 cycles of selection for 30 different runs of selection. A positive effect size indicates that genotype class AA was favorable and a negative effect size indicates that genotype class aa was favorable. K ⫽ level of epistasis.
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
1570
CROP SCIENCE, VOL. 44, SEPTEMBER–OCTOBER 2004
Fig. 8. The relative performance of five breeding strategies for six general classes of genetic model. In all cases, performance is represented as the difference in response between a given breeding strategy and the Mapping Start Only (MSO) method. Positive values indicate the breeding strategy had a higher response than the MSO method, and negative values indicate the breeding strategy had a lower response than the MSO method. The performance differences are expressed in terms of normalized trait value. Each line represents average performance across 20 000 runs of the breeding program (600 000 runs in total). A categorization of the E(NK) models considered is given in Table 1; E ⫽ the number of different environment types conditioning gene-by-environment interactions in the target population of environments, and K ⫽ level of epistasis.
effect that is dependent on the evolving population structure at any point in the sequence of breeding cycles.
APPENDIX 2 Additional in silico Investigations of the Mapping As You Go Method Appendix 2 describes the results of three additional simulation experiments that were conducted to investigate factors that extend beyond those examined in the main section of the paper. The additional factors that were considered were as follows: (i) a comparison of the MAS methods (MSO and MAYG) to phenotypic selection, (ii) an implementation of the MAYG method with a mixed model analysis for QTL estimation (i.e., Jannink and Jansen, 2001), and (iii) an implementation of the MAYG method using a version of the haploMQM approach for QTL estimation (Jansen et al., 2003). Unless otherwise noted, the same parameters as described in the Materials and Methods section were used in these experiments. Comparison of the Marker-Assisted Selection Methods to Phenotypic Selection The performances of the MSO and MAYG methods were compared with phenotypic selection (Fig. 8). The MAS strategies were implemented as described in the Materials and Methods. The average phenotypic performance of the hybrids across a MET with 10 locations was used as the basis for phenotypic selection. A heritability value of 0.1 on a single-plant basis was used in all three implementations. For the models with no context de-
pendencies (Fig. 8; top-left panel) the two MAS strategies outperformed phenotypic selection for all of the cycles of selection considered here. However, when context dependencies were present (Fig. 8; all panels except top left), MAS outperformed phenotypic selection across the first 10 to 15 cycles. However, in the longer term, phenotypic selection outperformed MSO, and in some cases, phenotypic selection outperformed versions of the MAYG method. Mapping As You Go Method with a Mixed Model Analysis for QTL Estimation The response of the two MAS approaches (MSO and MAYG), with a more sophisticated QTL analysis method than was considered in the main section of the paper, was investigated (Fig. 9). For QTL estimation in each of the MAS approaches, a mixed model analysis of the phenotypic and genotypic information that took into consideration within and among cross information was implemented. A similar approach was described by Jannink and Jansen (2001). The results showed that there were no differences between the two MAS approaches when no context dependencies were present (Fig. 9; topleft panel). However, when context dependencies were present, the MAYG approach demonstrated advantages and outperformed the MSO approach in the long term. These results are consistent with those reported in the main section of the paper (i.e., Fig. 3). Mapping As You Go Method with a Version of the HaploMQM Approach for QTL Estimation The two MAS approaches (MSO and MAYG) were implemented using a version of the haploMQM method
Reproduced from Crop Science. Published by Crop Science Society of America. All copyrights reserved.
PODLICH ET AL.: MAPPING AS YOU GO
1571
Fig. 9. The relative performance of the Mapping Start Only (MSO) and Mapping As You Go methods for six general classes of genetic model. In all cases, performance is represented as the difference in response between a given breeding strategy and the MSO method. The performance differences are expressed in terms of normalized trait value. Each line represents average performance across 1000 runs of the breeding program (24 000 runs total). A categorization of the E(NK) models considered is given in Table 1; E ⫽ the number of different environment types conditioning gene-by-environment interactions in the target population of environments, and K ⫽ level of epistasis.
for QTL allele estimation (Jansen et al., 2003). The relative performances of the MAS approaches are shown in Fig. 10. For this experiment, an 1800-cM genetic map with markers spaced every 5 cM was assumed. Effects were estimated for multiple-marker haplotype combinations, where a given haplotype was defined to
span four adjacent marker locations. A high and low linkage disequilibrium situation was considered. The results of the experiment were consistent with the trends observed for the other implementations of the MAYG method. Namely, the MAYG method outperformed the MSO when context dependencies were present.
Fig. 10. The relative performance of the Mapping Start Only (MSO) and Mapping As You Go methods for three general classes of genetic models (E ⫽ 10; K ⫽ 0, 1, 2) for two different starting population configurations (low and high linkage disequilibrium [LD] among markers). In all cases, performance is represented as the difference in response between a given breeding strategy and the MSO method. The performance differences are expressed in terms of normalized trait value. Each line represents average performance across 1000 runs of the breeding program (24 000 runs in total). A categorization of the E(NK) models considered is given in Table 1; E ⫽ the number of different environment types conditioning gene-by-environment interactions in the target population of environments, and K ⫽ level of epistasis.