Molecular Breeding 15: 75–85, 2005. © 2005 Springer. Printed in the Netherlands.


Strategies for efficient implementation of molecular markers in wheat breeding

D.G. Bonnett1,2,*, G.J. Rebetzke1 and W. Spielmeyer1,2

1CSIRO Plant Industry, P.O. Box 1600, Canberra, ACT 2601, Australia; 2Graingene, 65 Canberra Avenue, Griffith, ACT 2603, Australia; *Author for correspondence (e-mail: [email protected])

Received 15 December 2003; accepted in revised form 31 August 2004

Key words: Breeding strategy, Marker-assisted selection (MAS), Molecular marker, Population genetics, Population size

Abstract

Although molecular markers allow more accurate selection in early generations than conventional screens, the large numbers of individuals involved can make selection impracticable, while screening in later generations may provide little or no advantage over conventional selection techniques. Investigation of different crossing strategies, and consideration of when to screen, what proportion to retain and the impacts of dominant vs. codominant marker expression, revealed important choices in the design of marker-assisted selection programs that can produce large efficiency gains. F2 enrichment increased the frequency of selected alleles, allowing large reductions (commonly around 90%) in the minimum population size for recovery of target genotypes and/or selection at a greater number of loci. Increasing homozygosity by inbreeding from F2 to F2:3 also reduced population size by around 90% in some crosses, with smaller incremental reductions in subsequent generations. Backcrossing was a useful strategy to reduce population size, compared with a biparental population, where one parent contributed more target alleles than the other, and was complementary to F2 enrichment and increasing homozygosity. Codominant markers removed the need for progeny testing, reducing the number of individuals that had to be screened to identify a target genotype. However, although codominant markers allow target alleles to be fixed in early generations, minimum population sizes are often so large in F2 that it is not efficient to do so at this stage. Formulae and tables for calculating genotypic frequencies and minimum population sizes are provided to allow extension to different breeding systems, numbers of target loci, and probabilities of failure. The principles outlined are applicable to implementation of markers for both quantitative trait loci (QTL) and major genes.

Introduction

Marker technology has progressed to the point where the major practical limitation to implementation in plant breeding is the handling of populations large enough to identify target recombinants at the many loci we can now screen (Howes et al. 1998; Koebner and Summers 2003). For example, markers at 12 unlinked loci are routinely used for selection in Graingene® breeding populations and 10 in the Australian Grain Technology® breeding program (Kuchel et al. 2003). In addition to these marker loci, populations may need to be large enough to retain sufficient variation for the many traits for which there are no markers and which must be selected using conventional phenotypic screens.

Irrespective of improvements in marker technology, critical minimum population sizes are necessary to ensure with a high degree of certainty that a target genotype is present in a breeding population (Sedcole 1977). This minimum population size varies depending on the number of loci to be combined, frequencies of alternative alleles at these loci, extent of genetic linkage of the marker with the target locus, marker expression (dominant vs. codominant), level of homozygosity and any linkage between target alleles (Falconer and Mackay 1996).

Knowledge of minimum population size allows efficient resource allocation by ensuring that population sizes are large enough for a high probability of success but do not consume more resources than necessary. An understanding of the effects of inbreeding, population type (e.g. biparental, backcross), allelic frequency and accuracy of phenotyping on minimum population size is essential in the development of resource-efficient crossing and selection strategies.

Strategies that increase the frequency of target genotypes should improve the efficiency of MAS by allowing smaller populations or selection at more loci. Increasing the frequency of target genotypes through 'enrichment' of target alleles in F2 (F2 enrichment) by selecting for carriers (or against non-target homozygotes), backcrossing to increase the frequency of recurrent parent alleles, and inbreeding (or development of doubled-haploid (DH) populations) to increase the frequency of homozygotes in the population are effective strategies to achieve these objectives. Marker expression (dominant vs. codominant) does not change the population size needed to ensure the presence of a target genotype but it does affect the amount of testing required to identify an individual of this genotype.

This paper examines population genetic issues relating to MAS using as examples three simple crosses typical of those that form the basis of most breeding programs for wheat and many other crops. Population sizes with and without enrichment of allelic frequencies in the F2 generation and for different levels of inbreeding, numbers of loci and marker phenotype are considered.

Materials and methods

Population sizes necessary to recover target combinations of alleles from biparental, backcross and topcross (or three-way cross) populations between Australian wheat varieties Goldmark and Sunstate, and the long coleoptile, semidwarf CSIRO germplasm line HM14BS were calculated. Genotypes of these lines at eight target loci are shown in Table 1. The eight marker-linked genes in Table 1 are currently being used in selection in the Graingene® germplasm development program to develop new breeding lines for evaluation and commercial release. Alleles at the RhtD1 and Rht8 loci affect plant height, the Glu-B1, Glu-D1 and Glu-A3 loci code for grain storage proteins, Sr2 is an adult plant stem rust resistance gene, Cre1 a cereal cyst nematode resistance gene (McIntosh et al. 1998) and VPM an Aegilops ventricosa chromosome translocation carrying genes for leaf (Lr37), stem (Sr38) and stripe (Yr17) rust resistance (Bariana and McIntosh 1993). The VPM chromosome segment does not recombine with wheat chromosomes and so behaves genetically as a single locus. To simplify examples, molecular markers were assumed to be 'perfect' markers although the Rht8 and Sr2 diagnostic markers are a small chromosomal distance from the respective gene (Korzun et al. 1998; Spielmeyer et al. 2003).

Table 1. Wheat lines and genotypes at eight marker-linked loci

Locus and marker phenotype:

Variety/line  RhtD1*   Rht8+    Sr2++    Cre1++  VPM++  Glu-B1   Glu-D1   Glu-A3
              (codom)  (codom)  (codom)  (dom)   (dom)  (codom)  (dom)    (codom)
HM14BS        RhtD1a   Rht8     sr2      cre1    vpm    Glu-B1a  Glu-D1d  Glu-A3e
Goldmark      RhtD1b   rht8     sr2      Cre1    vpm    Glu-B1i  Glu-D1a  Glu-A3c
Sunstate      RhtD1b   rht8     Sr2      cre1    VPM    Glu-B1i  Glu-D1d  Glu-A3b

Target alleles in bold font.
* RhtD1b formerly designated Rht2, RhtD1a formerly designated rht2. RhtD1b confers a dwarfing phenotype, RhtD1a is the wild-type, non-dwarfing allele.
+ Rht8 confers a semidwarf phenotype, rht8 is the alternative non-dwarfing allele.
++ Alleles conferring resistance in uppercase, susceptibility alleles in lowercase.

The RhtD1, Rht8, Sr2, Glu-B1 and Glu-A3 molecular markers are codominant, and the Cre1, VPM and Glu-D1d markers are dominant in genic expression (Korzun et al. 1998; Ogbonnaya et al. 2001; Ellis et al. 2002; Ma et al. 2003; Spielmeyer et al. 2003; Zhang et al. 2004). There is no linkage between any of these marker loci. In all examples, a key objective was to select for Rht8 and against RhtD1b to allow recovery of semidwarf lines with long coleoptiles (Rebetzke and Richards 2000).

Minimum population sizes necessary to recover a target genotype were calculated according to the following formulae. The accepted probability of failing to obtain a target genotype was set at 5% (p = 0.05). All values in tables were independently calculated twice to ensure accuracy.

Genotypic frequency (G)

Genotypic frequencies are the product of frequencies at individual loci. While the ultimate goal of selection strategies is recovery of a genotype homozygous at all loci, target genotypes in earlier stages of selection may be either homozygous or heterozygous at one or more loci. At a single locus, if the frequency of a target homozygote of genotype AA is designated p, the frequency of the heterozygote (Aa) as q and the frequency of the non-target homozygote (aa) as r (where p + q + r = 1), with one generation of inbreeding the frequencies can be calculated according to the formulae (after Li 1955):

$$G_{AA} = p + q/4$$

$$G_{Aa} = q/2$$

$$G_{aa} = r + q/4$$

These frequencies then become p, q and r for the next cycle of inbreeding. These formulae apply regardless of the frequencies of p, q and r, which will vary with cross type and any selection for or against a particular allele. For example, in the F2 generation of a biparental cross, expected genotypic frequencies of AA, Aa and aa are 0.25, 0.50 and 0.25, respectively. If aa homozygotes are culled in the F2 generation the resulting frequencies will be 0.33, 0.67 and 0.00. In the F1 generation of a single backcross the expected frequencies of AA, Aa and aa are 0.50, 0.50 and 0.00, changing to 0.625, 0.25 and 0.125 in the F2. With selection against aa in the F2 generation the frequencies become 0.714, 0.286 and 0. In a completely inbred population, the frequencies of homozygotes for each allele will be equal to the allelic frequency and there will be no heterozygotes in the population. This is observed in DH populations.

Number of individuals to progeny test

Progeny testing is used to differentiate homozygotes from heterozygotes when using dominant markers. Progeny testing is not necessary when the probability of an individual selected with a dominant marker at one or more loci being homozygous is over 95%, as occurs in highly inbred or DH-derived populations. This probability can be calculated by dividing the frequency of homozygotes for segregating target alleles by the frequency of carriers. Frequencies of homozygotes and carriers with common initial genotypic frequencies and different levels of inbreeding are given in Table 2. If progeny testing is required, more progeny need to be screened when an individual is being assessed for homozygosity at multiple unlinked loci, in order to achieve a 5% cumulative probability of misclassification across all loci. The formula to determine this number is:

$$N = \ln\left(1 - 0.95^{1/t}\right) / \ln(0.75)$$

where N is the number of progeny to screen (the size of the progeny population) and t is the number of loci being screened.
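The inbreeding recursion and the progeny-test formula described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own implementation: the function names (`inbreed`, `cull_non_target`, `progeny_to_test`) are hypothetical, and rounding the progeny number up to the next whole plant is an assumption made here.

```python
import math

def inbreed(p, q, r):
    """One cycle of inbreeding: new frequencies of (AA, Aa, aa), after Li (1955)."""
    return p + q / 4, q / 2, r + q / 4

def cull_non_target(p, q, r):
    """Enrichment step: discard aa homozygotes and rescale the remainder to sum to 1."""
    kept = p + q
    return p / kept, q / kept, 0.0

def progeny_to_test(t, failure=0.05):
    """Progeny needed so that misclassification across t unlinked loci stays at `failure`.

    Rounding up to a whole number is an assumption of this sketch.
    """
    per_locus_error = 1 - (1 - failure) ** (1 / t)
    return math.ceil(math.log(per_locus_error) / math.log(0.75))

# Biparental cross: the F1 is all Aa, so one cycle of inbreeding gives the F2.
print(inbreed(0.0, 1.0, 0.0))   # (0.25, 0.5, 0.25)

# Single backcross: F1 frequencies (0.5, 0.5, 0.0) carried to the F2.
bc1_f2 = inbreed(0.5, 0.5, 0.0)
print(bc1_f2)                   # (0.625, 0.25, 0.125)

# Selection against aa in that F2.
print([round(f, 3) for f in cull_non_target(*bc1_f2)])   # [0.714, 0.286, 0.0]

# Progeny to test for one and for two dominant-marker loci.
print(progeny_to_test(1), progeny_to_test(2))
```

The printed frequencies match the worked values quoted in the text for the biparental and backcross examples.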

Population size to recover a target genotype

The population size (N) needed to recover at least one individual with a specified probability of failure (x) can be calculated according to the formula (Hanson 1959):

$$N = \ln x / \ln(1 - G)$$

where G = genotypic frequency.

In all examples, the effects of enriching the frequency of target alleles by removing homozygous null genotypes from the population in F2 and inbreeding to reduce the frequency of heterozygotes were examined. Minimum population sizes for a 5% probability of failure of detecting at least one individual of a target genotype are presented in Table 3 for populations with two alternative alleles of equal initial frequency and for differing numbers of loci. In each example, data are presented for populations developed through single-seed descent (SSD) and also for a population of DH lines. Where F2 enrichment was not applied, F2:3, F3:4 and F4:5 generations were assumed to be derived by SSD from the F2. Where F2 enrichment was applied, selected F2s were each assumed to contribute equal numbers of progeny to the DH and F2:3 populations, with F3:4 and F4:5 generations derived by SSD from the F2:3.

Table 2. Frequencies of homozygotes (Homo, AA) and carriers of a target allele (Carrier, A) for different allelic frequencies and levels of inbreeding

Allelic frequency:
  0.25   e.g. non-recurrent parent allele in BC1
  0.5    e.g. biparental cross
  0.75   e.g. recurrent parent allele in BC1
  0.67   e.g. following F2 enrichment of biparental cross
  0.857  e.g. following F2 enrichment of recurrent parent allele in BC1

        0.25            0.5             0.75            0.67            0.857
Gen.    Homo   Carrier  Homo   Carrier  Homo   Carrier  Homo   Carrier  Homo   Carrier
F2      0.125  0.375    0.250  0.750    0.625  0.875    0.333  1.000    0.714  1.000
F2:3    0.188  0.313    0.375  0.625    0.688  0.813    0.500  0.833    0.786  0.929
F3:4    0.219  0.281    0.438  0.563    0.719  0.781    0.583  0.750    0.821  0.893
F4:5    0.234  0.266    0.469  0.531    0.734  0.766    0.625  0.708    0.839  0.875
F5:6    0.242  0.258    0.484  0.516    0.742  0.758    0.646  0.688    0.848  0.866
F6:7    0.246  0.254    0.492  0.508    0.746  0.754    0.656  0.677    0.853  0.862
F7:8    0.248  0.252    0.496  0.504    0.748  0.752    0.661  0.672    0.855  0.859
F8:9    0.249  0.251    0.498  0.502    0.749  0.751    0.664  0.669    0.856  0.858
F9:10   0.250  0.250    0.499  0.501    0.750  0.750    0.665  0.668    0.857  0.858
DH      0.250  0.250    0.500  0.500    0.750  0.750    0.667  0.667    0.857  0.857

Table 3. Population sizes required for enrichment (Enrich) vs. fixation (Fix) of target alleles in biparental F2 populations, and to obtain at least one target homozygous genotype in later-generation enriched (Enrich) and non-enriched (Rand) populations, for different numbers of segregating loci

F2 columns: population required for fixation (Fix) vs. enrichment (Enrich) (p = 0.05).
F2:3 to DH columns: population size required to obtain a target homozygote at all loci in non-enriched (Rand) and enriched (Enrich) populations (p = 0.05).

       F2                F2:3            F3:4            F4:5            F5:6            DH
Loci   Fix      Enrich   Rand    Enrich  Rand    Enrich  Rand    Enrich  Rand    Enrich  Rand    Enrich
1      11       3        7       5       6       4       5       3       5       3       5       3
2      47       4        20      11      15      8       13      6       12      6       11      6
3      191      6        56      23      35      14      28      11      25      10      23      9
4      766      8        151     47      81      25      61      18      53      16      47      14
5      3067     11       403     95      186     43      131     30      111     26      95      22
6      12270    16       1076    191     426     75      281     49      231     40      191     33
7      49081    21       2872    382     975     129     601     79      478     63      382     50
8      196327   29       7659    766     2231    222     1284    128     988     98      766     76
9      785312   39       20427   1533    5100    382     2741    205     2040    152     1533    114
10     3141252  52       54473   3067    11660   656     5848    329     4213    236     3067    172
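As a rough check on how the entries in Table 3 arise, the Hanson (1959) formula can be combined with the single-locus inbreeding recursion from the Materials and methods. The sketch below is illustrative only: `min_population` and `homozygote_freq` are hypothetical helper names, rounding up to a whole individual is an assumption made here, and multi-locus frequencies are taken as products of single-locus frequencies, which holds only for unlinked loci.

```python
import math

def min_population(g, x=0.05):
    """Smallest N with at most probability x of missing a genotype of frequency g.

    Implements N = ln(x) / ln(1 - g), rounded up (the rounding is an assumption).
    math.log1p(-g) evaluates ln(1 - g) accurately when g is very small.
    """
    return math.ceil(math.log(x) / math.log1p(-g))

def homozygote_freq(p, q, cycles):
    """Frequency of AA after `cycles` generations of inbreeding from (AA=p, Aa=q)."""
    for _ in range(cycles):
        p, q = p + q / 4, q / 2
    return p

loci = 6
f2_random = (0.25, 0.5)        # biparental F2 without selection
f2_enriched = (1 / 3, 2 / 3)   # biparental F2 after culling aa homozygotes

print("F2 fix:     ", min_population(0.25 ** loci))
print("F2 enrich:  ", min_population(0.75 ** loci))   # carriers (AA or Aa) at every locus
print("F2:3 rand:  ", min_population(homozygote_freq(*f2_random, 1) ** loci))
print("F2:3 enrich:", min_population(homozygote_freq(*f2_enriched, 1) ** loci))
print("DH rand:    ", min_population(0.5 ** loci))
print("DH enrich:  ", min_population((2 / 3) ** loci))
```

For six loci these calls should give the same values as the corresponding row of Table 3; changing `loci` or the failure probability `x` extends the calculation to other scenarios.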

Results

The examples presented are typical of cross types used by commercial wheat breeders and also by breeders of other self-pollinating and some cross-pollinating species. The examples are used to illustrate genetic and statistical considerations that must be satisfied at all stages of crossing and selection using molecular markers.

Example 1. HM14BS (RhtD1a, Rht8)/Sunstate (Sr2, VPM, Glu-B1i) biparental cross population

The target genotype from this cross is homozygous for RhtD1a, Rht8, Sr2, VPM, Glu-B1i and Glu-A3b. With five of the six marker loci showing codominant expression, a large number of different selection strategies are possible. Strategies outlined in Table 4 are a subset illustrating the important principles determining relative efficiencies.

Strategy 4.1 is the simplest strategy, involving two selection stages and lacking an F2 enrichment step. The first stage is to select individuals homozygous for target alleles at the five codominant loci (RhtD1a, Rht8, Sr2, Glu-B1i, Glu-A3b) and carrying the dominant VPM marker allele. The second stage is to progeny test selected individuals to identify VPM homozygotes. Strategy 4.1 best demonstrates the reductions in population size that can be achieved through inbreeding. A single generation of inbreeding from F2 to F2:3 before selection is initiated reduces the number of individuals that must be tested from 12368 to 2077. Further reductions are achieved if selection is delayed until later generations. In a DH population only 191 individuals would need to be tested, a reduction of almost 98.5% compared to an F2. Reductions in population size in more inbred populations are seen in all strategies and are due to a halving of the frequency of heterozygotes with each cycle of inbreeding.

Strategies 4.2, 4.3 and 4.4 involve F2 enrichment with or without selection of homozygotes at one or more loci in F2. In strategy 4.2, enrichment is applied to all target loci while in strategies 4.3 and

Table 4. Frequencies and population sizes required to identify the target genotype RhtD1aRhtD1a Rht8Rht8 Sr2Sr2 VPMVPM Glu-B1iGlu-B1i Glu-A3bGlu-A3b from HM14BS/Sunstate (p = 0.05)

Generation

Frequency of target homozygous genotype

Number of individuals for target genotype (p = 0.05)

Number of individuals after screening RhtD1aRht8Sr2VPM Glu-B1iGlu-A3ba

Progeny to test from each plant

4.1. Without selection of carriers in F2 F2
