unbiased prediction - Europe PMC

0 downloads 0 Views 556KB Size Report
Oct 15, 1990 - where y is an n x 1 vector of records, X, Z and W are n x p, n x n and n x 2n ... geny (yp ; parents) and records of individuals without progeny (y,!r ...
Original

article

Reduced animal model for marker assisted selection using best linear unbiased prediction RJC Cantet*

C Smith

University of Guelpla, Centre for Genetic Im rovement of Livestock, P Department of Animal and I’oultry Science, Guelph, Ontario, N1G 2Wl, Canada

(Received

15 October

1990; accepted 11 April 1991)

Summary - A reduced animal model (RAM) version of the animal model (AM) incorpo-

rating independent marked quantitative trait loci (M(aTL’s) of Fernando and Grossman (1989) is presented. Both AM and RAM permit obtaining Best Linear Unbiased Predictions of MQTL effects plus the remaining portion of the breeding value that is not accounted for by independent M(aTL’s. RAM reduces computational requirements by a reduction in the size of the system of equations. Non-parental MQTL effects are expressed as a linear function of parental MQTL effects using marker information and the recombination rate (r) between the marker locus and the MQTL. The resulting fraction of the MQTL variance that is explained by the regression on parental MQTL effects is

2[(1- r) 2 2 + /2 ] r when the individual is not inbred and both parents are known. Formulae are

obtained to

breeding values

the computations when backsolving for non-parental MQTL and all non-parents have one record. A small numerical example is also

simplify in

case

presented. maker assisted selection marker

/

best linear unbiased

prediction /

reduced animal model

/

genetic

Résumé - Un modèle animal réduit pour la sélection assistée par marqueurs avec BLUP. Une version du ncodèle animal réduit (RAM) basée sur le modèle animal (AM) de Fernando et Crossman (1989) avec loci indépendants de caractères quantitatifs marqués (MQTL) est présentée. Dans les2 cas, RAM et AM, on obtient les meilleurs prédictions linéaires sans biais (BLUP) des effets des MQTL en plus de la portion restante de la valeur

génétique inexpliquée par les MG!TL indépendants. L’emploi de RAM diminue les exigences de calcul par une réduction de la taille du système d’équations. Les effets des MQTL reon-parentaux sont exprimés sous la forme d’une fonction linéaire des effets des MQTL parentaux à l’aide de l’information provenant du marqueur et du taux de recombinaison (r) entre le locus marqueur et le MQTL. La proportion résultante de la variance du MG!TL * On leave from : Departamento de Buenos Aires, Argentina Correspondence and reprints **

Zootecnia, Facultad de Agronomia, Universidad de

par la régression des effets des MQTL parentaux est donnée par l’expression dans le cas d’un individu non consanguin avec parents connus. Des + 2!(1 formules sont dérivées pour simplifier les calculs lorsque l’on résout pour les effets des MQTL et des valeurs génétiques non parentaux dans le cas où tous les individus non parents possèdent une seule observation. Un exemple numérique est également donné.

expliquée

2 r)

r2] /2

sélection assistée par marqueurs

/

BLUP

/

modèle animal réduit

/

marqueur

génétique INTRODUCTION

(1989) obtained best linear unbiased for alleles at a marked quantitaeffects of the additive predictors (Henderson, 1984) tive trait locus (MQTL) and of the remaining portion of the breeding value. They used an animal model (AM; Henderson, 1984) under a purely additive mode of inheritance. Letting p be the number of fixed effects in the model, n the number of animals in the pedigree file and m the number of M(!TL’s, the number of equations in the system for this AM is p + n(2m + 1). For large m, n or both, solving such a system may not always be feasible. The reduced animal model (RAM; Quaas and Pollak, 1980) is an equivalent model, in the sense of Henderson (1985), to the AM and provides the same results, but with a smaller number of equations to be solved. In this paper, the RAM version of the model of Fernando and Grossman (1989) is obtained. The resulting system of equation is of order p + s(2m + 1), s being the number of parents. In general s is much smaller than n. Therefore, the advantage due to the reduction in the number of equations by using RAM is considerable. A numerical example is included to illustrate the application.

In

a

recent paper, Fernando and Grossman

THEORY For simplicity, derivations are presented for a model with one MQTL. The extension to the case of 2 or more independent M(!TL’s is covered in the section entitled More than one MQTL. In the notation of Fernando and Grossman (1989), MP and Mm are alleles at the marker locus that individuali inherited from its paternal (p) and its maternal (m) parents, and vf and viare the additive effects of the paternal and maternal MQTL’s, respectively. The recombination frequency between the marker allele and the MQTL is denoted as r. We will use the expression &dquo;breeding value&dquo; to refer to the additive effects of all genes that affect the trait excluding the MQTL(s). Matrix tion

expressions

for the animal model with

A matrix version of equation

(3)

genetic

in Fernando and Grossman

marker informa-

(1989)

is :

where y is an n x 1 vector of records, X, Z and W are n x p, n x n and n x 2n incidence matrices which relate data to the unknown vector of fixed effects !, the

random vector of additive breeding values u and the random vector v of additive effects of the individual MQTL effects, respectively. The 2n x 1 vector v is ordered within animal such that vf always precedes f!. The matrices Z and W will have zero rows for animals that do not have records on themselves but that are related to animals with records. Non-zero rows of Z and W have 1 and 2 elements equal to 1, respectively, with the remaining elements being zero. First and second moments of y are given by :

where Acr! and G ,ware the variance-covariance matrices of u and v, respectively. 2 The scalars a A 2wand o,2are the variance components of the additive effects of breeding values, the MQTL additive effects and of the environmental effects. RAM requires partitioning the data vector y into records of individuals with progeny (yp ; parents) and records of individuals without progeny (y,!r ; non-parents) so that y’ = [y%, y’ 1. A conformable partition can be used in X, Z, W, u, v and e. Using this idea (1) can be written as :

To obtain RANI, u N and v N should be expressed as linear functions of up and vp, respectively. Since an individual’s breeding value can be described as the average of the breeding value of its parents plus an independently distributed Mendelian sampling residual (!) (Quaas and Pollak, 1980), for u N we can write :

where P is an (n - s) x s matrix relating non-parental to parental breeding values. Each row of P contains at most two 0.5 values in the columns pertaining to the BV’s of the sire and of the dam. Now, E(!) 0 and Var(cp) = D A aA, where D A is a diagonal matrix with diagonal elements equal to : 1 - 0.25(a!! + add), if both sire and dam of the non-parent are known 1 - 0.25a , if only the sire is known ss 1 - 0.25ad!, if only the dam is known 1, if both parents are unknown with a ss and add being the diagonal elements of A corresponding to the sire and the dam, respectively. A scalar version of the relationship between v N and vp can be obtained from equations (8a) and (8b) in Fernando and Grossman (1989) and these are : =

The subscripts o, s and d denote the individual, its sire and its dam, respectively. The coefficients bis are either 1-ror r according to any of these 4 possible patterns of inheritance of the marker alleles : Paternal marker

The above

Maternal marker

developments

lead

us

to the

following relationship

between v!Br and

! : The

2(n — s)

x

2s matrix F relates the additive effects of the

MQTL

of

non-

to the additive effects of the MQTL of parents and s is the vector with and element i + 1 equal to the residual &0’. Each element i equal to residual Leti and k be the row row of F, contains at most, 2 non-zero elements : the

parents

eo

bis.

indices for the MQTL marked by MÓ and A/o&dquo; respectively. Let j and j+ 1 be the column indices corresponding to the additive effects of the MQTL for the sire that transmits i : j refers to the paternal grandsire and j +1 to the paternal granddam. Also, let1 and 1 +1 be the column indices corresponding to the dam that transmits i + 1 : corresponds to the maternal grandsire andl + 1 to the maternal granddam. Then Fij = b 3 and +1 . All remaining elements of F 4 , Fi,!+1 = bz, F!,! = b l ,i b k F are 0. When marker information is unavailable, r is taken to be 0.5 (Fernando and Grossman, 1989) and all bis are 0.5. To exemplify, consider individuals 1 (male), 2 (female) and 3 (progeny of 1 and 2). Animals 1 and 2 are unrelated and 3 has paternal and maternal marker alleles originating from the dams of 1 and 2, = with namely5’ alleles M V!l,)vp31 d and 1 M.! 2> respectively. Then,> v [v’,1 v &dquo;, í p v’n!’ 3 1 v 2 2 = V!! S : t1, v l vi p 2, V2n]’ and yM w3, ’U!i!’. The matrix W 1 =

=

For

r

!7J!,

=

0.2, the matrix F is

2

x

4 and

equal

to :

The residuals e have E(s) 0 and Var(e) ufl. Fernando and Grossman c G (1989) showed that G«u fl is diagonal with non-zero elements equal to Var(e’) = , and f s d are the 2r(1 - r)(1 - fg)u’§and Var( ü) 2r(1 - r)(1 - fd,)o, 2, where f E the of the coefficients at the of sire and dam, respectively. They MQTL inbreeding =

=

=

express the

probability

that the

paternal and maternal alleles

of

an

individual for

a given MQTL are the same. These f’s are the of -diagnonal elements f diagonal blocks of the matrix G v (Fernando and Grossman, 1989). Using (3) and (4) in (2) gives :

in the 2

x

2

or

On

* letting e

=

N + e

# + W,ve, N Z

we

have that :

here Q = !! I(n-s) w where / !! !i&dquo;v , UA eand !!!!!!!!!°&dquo;’fi$ilie (6) MA =

Mixed model equations for

v 2 2 U /U e-

+ Z,!DAZ;!aA + WNGEW!,av,aAm, = !A!!e and av =

are :

The matrices A v that P and G vP are the corresponding submatrices of A and G belong to parents. Equations (7) give the solutions for RAM with genetic markers. Of practical importance is the case where all non-parents have only one record so I. Then, W that Z N 1 are diagonal (see Appendix A). The WN and QE G N diagonal elements of W NGe W! are derived in Appendix A and they are equal to : =

s -f 2r(1-r)(2- f ), when both the sire and the dam of the non-parent d + 2r(1 - r)(1 - f ) 1, when only the sire is known s 2r(1 - r)(1 - f ) + 1, when only the dam is known d 2, if both the sire and the dam of the non-parent

are

are

known

unknown.

If there is zero probability that the paternal and maternal alleles at the MQTL of parent p are the same (ie fp = 0), the contribution to the diagonal element of W NGe: W! is 2r (I - r) (if marker information is available) or 1/2 (if marker information is unavailable). This occurs because, in the absence of marker information, there is equal probability of receiving the MQTL from the grandsire and from the granddam, and r = 0.5 (Fernando and Grossman, 1989).

A further simplification to (7) occurs when parents do not have records so that Zp and Wy are zero and the model becomes a sire-darn model. A program for RAM, such as the one presented by Schaeffer and Wilton (1987) and modified to include marker information can be employed to solve equations (7). More than

MQTL

one

Multiple MQTL (k, say) modification of model

where

is

j!

Var(vi)

=

k

a

x

G,,iu 2 ,,i

ZA’D.4Z!.(x,t

+

Backsolving

can

be dealt with

assuming independence by the following

(1) :

1 vector with all elements equal to 1. We will assume that * = I + Cov(v2, vi!, ) = 0. For k = 2 and letting Q

and

Wn!(GEIa&dquo;1

+

GE2cx.(2)W!,

RAM equations for

(8)

are :

for non-parents

After

solving for fixed effects, parental breeding values and parental effects of the MQTL, the breeding values and additive MQTL effects of non-parents can be calculated. This is accomplished by writing the equations for § and i from the mixed model equations of (5). This gives :

and after

a

little

Appendix parents have the

algebra :

B shows how to obtain solutions of one

equations (10), when all nonrecord, by solving (n - s) independent systems of order 2. Using

predictors obtained from (7) and (10)

are :

in

(3)

and

(4),

solutions for non-parents

EXAMPLE We 4

use

the

same

data that Fernando and Grossman (1989) employed. There are parents and 1 is a non-parent. The file is :

are

individuals, 3 of them

Notice that individual 4 is inbred. A fixed effect was included and the matrix from adjoining the incidence matrix X and the vector of observations y,

resulting ie

[Xly]

is :

Variance components used were (J! = 100, a § = 10 and Q! = 500 and r = 0.1. The matrices G and are in Fernando and Grossman (1989). u presented U G solutions for AM were obtained. The coefficient matrix for AM is : First,

and the

right-hand site vector is [445, 505, 235, 210, 250, 255, 235, 235, 210, 210, 250, 250, 255, 255]’. The vector of solutions is [222.5, 251.764, 2.08109, -2.08109, - 0.083214, 1.16537, 0.213435, 0.216098, -0.202783, -0.226749, 0.213102, -0.229745, 0.231409, 0.174809]. There are 11 equations in the system for RAM (as compared to 14 in AM) since there is

(individual 4) and Q is a scalar : 1.1136 l+(0.5/oc,))+2[0.5(0.5) + (0.9) (0.1)]/