Comparison of Three Methods of Sire Evaluation

R. H. MILLER, B. T. McDANIEL, and R. D. PLOWMAN
Animal Husbandry Research Division, USDA, Beltsville, Maryland

Received for publication July 3, 1967.

Abstract

Three methods of sire evaluation were compared empirically: herdmate comparison, least squares, and maximum likelihood. Michigan Dairy Herd Improvement Association Holstein data representing 10,620 first lactation records and 1,003 herd-year groups were analyzed. Forty individual bulls representing three artificial insemination units were evaluated. The rank correlations among herdmate comparison and least squares estimates ranged from 0.94 to 0.97. The rank correlations between the maximum likelihood evaluations and those computed by least squares and herdmate comparisons were .99 and .97, respectively. The sampling errors of the maximum likelihood sire constants were about 1.5% smaller than those for least squares estimates.

Most of the recent research in sire evaluation methodology has centered on the herdmate comparison procedure. The evidence indicates that the herdmate comparison procedure gives a reasonably accurate assessment of bulls used in artificial insemination [Heidhues et al. (6), McDaniel et al. (13)]. However, little attention has been given to the development of possible alternative procedures which possess more desirable properties.

One probable reason is that of computational limitations. The herdmate comparison is relatively easy to program and does not have elaborate computational requirements. The herdmate comparison can be computed on a sequential basis, thus reducing computer storage requirements in contrast to procedures which require a matrix inversion process.

There are several classes of nongenetic effects which must be considered in sire evaluation. The number of sets of factors involved causes a geometric increase in computer storage requirements for nonsequential processes, such as matrix inversion. Henderson and Carter (10) apportioned the total variation in age-adjusted dairy records as follows:

                          Milk     Fat
                          -----(%)-----
Sire                        7       7
Herd                       30      33
Year-season                 4       5
Sire × herd                 2       2
Herd × year-season         14      15
Residual                   43      38

Obviously, any model for sire evaluation must contain herd and year-season effects. Given this model, one may proceed in one of two ways: a) adjust the data for herd and year-season effects and then analyze the data ignoring herds and year-seasons, or b) obtain a simultaneous solution for the effects of sires, herds, and year-seasons. Procedure a) might be one as elaborate as Henderson's Method Two (8) or as simple as the herdmate comparison. Procedure b) would involve the application of a linear model requiring matrix inversion for a solution (since the data are highly nonorthogonal).

Cunningham (1) pointed out that the herdmate comparison procedure amounts to a two-stage process in which herd and year-season effects are removed (by expressing the records as deviations from herd-year-season means) and sires are evaluated by analyzing these deviations, ignoring herds and year-seasons. This means that sires are compared on a within-herd-year-season basis. Possible interactions of sires with herds and year-seasons are ignored. Also, sire comparisons contained in differences among herd-year-season means are not used. Due to nonorthogonality, the herd-year-season means contain some information on sire effects.

A variation on the herdmate comparison procedure [Henderson (9)] does utilize a portion of the inter-herd-year-season variance, but for a different purpose. This adjustment (the adjusted daughter average) takes into account that a portion of this variation represents genetic differences among the cow groups in these herds.

As an alternative to the herdmate comparison, Cunningham (1) suggested the simultaneous consideration of the sire, herd, and year-season effects by linear estimation. The model he suggested is

$$y_{ijk} = \mu + t_i + b_j + e_{ijk},$$


where $t_i$ represents sire effects and $b_j$ represents herd-year-season effects. The sire-by-herd-year-season interaction is ignored.

With the preceding linear model, there are four alternative sets of assumptions from an estimation standpoint:

A: Herd-year-season effects fixed and sires fixed.
B: Herd-year-seasons random and sires fixed.
C: Herd-year-season effects fixed and sires random.
D: Both sires and herd-year-seasons random.

The assumptions in A lead to a least squares analysis, while those in B require a maximum likelihood analysis of the nature described by Cunningham (1). Model B also corresponds to the recovery of interblock information in incomplete blocks. Both methods involve matrix inversion.

The purpose of this study was to evaluate the relative merits of sire evaluation using Models A and B, as compared to the standard herdmate comparison procedure. Since Models A and B have not been widely used in the context of sire evaluation, we shall attempt to summarize briefly criteria which should be considered in choosing an optimum method of analysis. For further information, References (1), (2), and (14) may be consulted.

Nature of effects. The choice between Models A and B depends in part on whether it is appropriate to regard herd-year-season effects as a random sample from some much larger population. One problem in dealing with this issue is that several distinct factors are combined into one category. Season effects are fixed. Years are probably appropriately regarded as fixed effects. There seems to be somewhat more basis for assuming that herds are sampled at random. Sire effects also should be considered random. However, as pointed out by Harvey (3), sire constants may subsequently be regressed on the basis of a function of the appropriate variance components.

Sampling variance of estimates.
As discussed in References (1), (2), and (14), the sampling variance of the estimates of fixed effects is a minimum when maximum likelihood (ML) estimation is employed appropriately. This comes about because the block (random effect) totals contain information on the fixed effects which can be combined with the intrablock estimates. There is little information available to predict how much can be gained by maximum likelihood estimation in nonorthogonal data situations. Certainly, the gain in precision is a function of the degree and pattern of confounding. Without confounding, the block comparisons would be free of fixed effects.

A second factor influencing the gain in precision through maximum likelihood estimation is $\sigma_e^2/\sigma_b^2$, the ratio of the error variance to the block variance. Cunningham (1) stated that the advantage of this method decreases as the relative magnitude of $\sigma_b^2$ decreases. However, further study is needed on this question. When $\sigma_e^2/\sigma_b^2$ is zero, block effects are perfectly repeatable, and we have the least squares model. When $\sigma_e^2/\sigma_b^2$ becomes very large, a completely randomized model (one with no block term) is appropriate.
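The role of the variance ratio can be made concrete by writing the estimating equations in matrix form. The following is only a sketch in notation that the paper itself does not use: $X$ and $Z$ are assumed incidence matrices for the fixed effects and the random blocks, $\beta$ and $b$ are the corresponding effect vectors, and $k$ stands for the ratio $\sigma_e^2/\sigma_b^2$. The usual restrictions on the fixed effects are assumed wherever $X'X$ is singular.

```latex
\begin{aligned}
y &= X\beta + Zb + e, \qquad
\operatorname{Var}(b) = I\sigma_b^2, \quad
\operatorname{Var}(e) = I\sigma_e^2, \quad
k = \sigma_e^2/\sigma_b^2 \\[4pt]
\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + kI \end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{b} \end{bmatrix}
&=
\begin{bmatrix} X'y \\ Z'y \end{bmatrix} \\[4pt]
k \to 0 &: \quad \text{the least squares equations with blocks fitted as fixed effects} \\
k \to \infty &: \quad \hat{b} \to 0, \quad X'X\hat{\beta} = X'y
\quad \text{(blocks ignored, completely randomized model)}
\end{aligned}
```

Written this way, the two limiting cases named in the text correspond to the two extremes of the single constant $k$ added to the block diagonals.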

Bias. The preceding comments have centered on the question of sampling error. Henderson (7) pointed out that certain kinds of selection introduce a bias into least squares estimates. The example he used was one in which a bias due to selection among cows (blocks) was introduced into estimates of yearly environmental and genetic trend.

Randomization. In the context of an incomplete block experiment, one makes up a set of treatment combinations (sires) and allocates these to the blocks (herd-year-seasons) by a suitable randomization process. In sire evaluation this implies that a given combination of sires would be as likely to occur in the herd with the poorest environment as in the herd with the best environmental conditions. Harvey (5) has stressed the importance of this assumption. This matter may be more critical for maximum likelihood analysis than for the customary least squares procedure.

Data

This study involved an empirical comparison of three methods of sire evaluation: herdmate comparison, least squares, and maximum likelihood. The primary practical problem in using linear estimation is to keep the number of equations to be solved directly within manageable limits. The difficulty, therefore, largely lies in restricting the number of bulls to be compared. Two principal avenues of reducing the number of sire constants were used: 1) the data were restricted to an area serviced by only one artificial breeding cooperative; in this case the data were restricted to Holstein herds in Michigan; 2) the bulls used in these herds were categorized into three groups: a) natural-service sires, b) contemporary artificial insemination (AI) sires, and c) other AI sires. The bulls in Group c) included older AI bulls as well as those which had fewer than 25 daughters. Only the bulls in Group b) were fitted individually.

The scope of the problem was also reduced by use of first records exclusively (the first available record at less than 36 months of age). Thus, it was not necessary to include a separate term for cow differences in the model. The magnitude of the computational effort was also reduced by restricting the data to calving dates within a three-year period, 1961-1963.

Finally, certain requirements were made for the initial inclusion of a particular herd in the study. Since the primary objective of the study was to evaluate methods of comparing AI sires, herds using natural service would make no contribution to this goal. Therefore, no herd-years were selected initially which had less than three first-calf heifers produced by artificial service. In addition, each herd-year group was required to have at least six heifers calving. Animals whose sires were not identified were excluded, as were the progeny of grade sires. All lactations were at least 75 days. The records of cows culled before completing a lactation were extended to a 305-day basis.

A total of 2,844 herd-year groups was examined in the initial selection of the data. Of these groups, 1,504 were omitted because less than six first-lactation records remained when the above restrictions were imposed. An additional 337 herd-years were excluded because there were less than three first lactations of cows produced by AI service. Thus, the data consisted of 10,986 lactations representing 1,003 herd-year groups and 1,095 sires.

The data were then surveyed in detail to determine a final classification of the sires represented. All non-AI sires were pooled together into one group. Artificial insemination sires were separated into two groups, one in which an estimate for each individual bull was to be obtained, and a residual category for which all remaining bulls were to be estimated as a group. The following criteria were employed in assigning AI sires to the two categories: 1) artificial insemination bulls with less than 25 daughters were assigned to the residual AI group, 2) all bulls belonging to other than the three primary studs were assigned to the residual group, 3) all bulls entering service prior to 1956 were placed in the residual group, 4) all bulls assigned to the individual grouping were required to have daughters calving in 1963. These restrictions resulted in the selection of 40 bulls to be evaluated individually. These bulls were distributed among the three studs as follows: Stud A, 23 sires; Stud B, 8 sires; and Stud C, 9 sires.

As the data were coded to sire groups, certain additional types of records were discarded: 1) non-AI progeny of AI sires, 2) 318 progeny of 12 sires whose date of entry into service was uncertain, and 3) 31 records with birth dates subsequent to 1961. A total of 10,620 records was actually used for analyses to be described. As shown in Table 1, the number of daughters per sire ranged from 26 to 724. There were 5,470 daughters of these 40 sires. There were 2,554 and 2,585 records in the non-AI and residual AI sire categories, respectively, excluding single-record herd years. Milk and fat were adjusted to a mature-equivalent basis using the standard Dairy Herd Improvement Association age factors.

J. Dairy Science Vol. 51, No. 5

Methods and Results

The 40 bulls described in Table 1 were evaluated using three different criteria: a) herdmate comparison, b) least squares, and c) maximum likelihood.

TABLE 1. Number of records per sire (a).

Sire no.   No. of records     Sire no.   No. of records
    1           128              21            88
    2            29              22            41
    3           176              23            38
    4            53              24            69
    5            86              25           485
    6            61              26            28
    7           395              27            28
    8           147              28            28
    9            62              29            44
   10           549              30            58
   11           101              31           478
   12           374              32            75
   13            40              33           220
   14            67              34            35
   15            90              35           724
   16            53              36            35
   17           169              37            26
   18            29              38            42
   19           151              39            36
   20            87              40            45

(a) Totals do not include records of daughters in herd years where there were no other animals present.

Herdmate comparison. Herdmate comparisons were computed in the customary manner: the record of each daughter was expressed as a deviation from the average of all daughters of other sires calving in the same herd, year, and season. Two seasons were defined on the basis of previous data on trends in lactation averages by month of calving: November to April and May to October. There were 3,720 lactations in Season 1 (November to April) and 6,889 lactations in Season 2 (May to October). In contrast to usual sire evaluation procedures, this procedure is a true contemporary comparison, since only animals of approximately the same age are compared.

Least squares. The application of least squares estimation to unbalanced multiple-classification data has been described in detail by Harvey (3). However, the procedure does not appear to have been used previously for large-scale dairy sire evaluation. The model fitted was:

$$y_{ijklmn} = \mu + h_i + a_j + b_k + (ab)_{jk} + g_l + s_{lm} + e_{ijklmn},$$

where $h_i$ is an effect common to observations in the $i$th herd year, $a_j$ is an effect present for all observations in the $j$th season of calving, $b_k$ is an element peculiar to the $k$th year of birth, $(ab)_{jk}$ is an effect possessed by all observations in the $j$th season and $k$th year of birth, $g_l$ is the effect of the $l$th sire group ($g_1$ refers to the group of all the non-AI sires and the residual AI sires, $g_2$ pertains to the group of 40 individual sires), $s_{lm}$ refers to the $m$th sire or sire class in the $l$th sire group, and $e_{ijklmn}$ is a random error peculiar to the $n$th daughter in the $lm$th sire class.

In the application of least squares to obtain estimates of the group comparisons, all effects except error are assumed fixed in repeated sampling. As pointed out by Harvey (3), least squares estimates of random effects should be regressed to the over-all mean using a function of the appropriate variance components. Sires are usually considered to be random effects. Although the group of sires studied may not be a random selection, individual sires contribute only a random complement of genetic material to each offspring. This is in contrast to evaluating fertilizer applications, for example, where each compound compared can be applied in exactly the same form and quantity to all experimental units allocated to it.

Four birth-year groups were employed: 1958-1961. Although birth year and year of record are correlated, the equations for birth years were included because the correlation is not perfect and a reasonable solution should be obtainable. A model omitting birth years can be fitted from sums of squares obtained for the above model by simply deleting the birth-year equations. Birth-year comparisons are presumed to reflect changes in mean genetic merit in the population. The difficulty in interpretation of the birth-year comparison is increased by the presence of the sire effects in the model. A large portion of the variation due to sires is accounted for by including the $s_{lm}$ effect in the model. Therefore, the birth-year comparisons largely involve changes in genetic merit arising from the dam's contribution, since they are partially adjusted for sire effects.

The effects $(ab)_{jk}$ are assumed to reflect differences in the influence of seasons on yield among the various birth-year groups. Such fluctuations presumably constitute a form of genetic-environmental interaction. The numbers of observations in the various birth year-season groups are presented in Table 2.

As the model indicates, sires were split into two groups: Group 1 consists of all non-AI sires and all residual AI sires; Group 2 consists of the 40 individual bulls listed in Table 1. The purpose of this division was 1) to restrict the number of degrees of freedom for sires and 2) to compare the 40 individual sires only among themselves, rather than against all sires. The latter mode of comparison roughly corresponds to exclusion of the progeny of all bulls except those to be compared from the herdmate averages of the bulls evaluated.

Maximum likelihood. The method of maximum likelihood was first proposed for applications in dairy cattle data by Henderson (7). In actuality, maximum likelihood covers all types of models, including the least squares situation, since the estimators obtained depend upon the distribution assumptions made regarding the elements of the model. The least squares category applies to those models where error is the only randomly distributed variable. As pointed out by Cunningham and Henderson (2), maximum likelihood estimation results in the familiar procedure for the recovery of interblock information on treatments in the analysis of incomplete block designs.
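The herdmate comparison described earlier in this section can be illustrated with a short sketch. The code below is not the authors' program: the function name `herdmate_deviations`, the record layout, and the toy yields are all assumptions made for illustration. Each record is compared with the mean of the daughters of other sires in the same herd, year, and season, and a record with no such herdmates is skipped, in line with the footnote to Table 1.

```python
from collections import defaultdict


def herdmate_deviations(records):
    """Mean herdmate deviation per sire.

    records: iterable of (sire, herd, year, season, yield_kg) tuples
    (a hypothetical layout, not the format of the original data files).
    """
    # Group the records by herd-year-season.
    groups = defaultdict(list)
    for sire, herd, year, season, y in records:
        groups[(herd, year, season)].append((sire, y))

    deviations = defaultdict(list)
    for members in groups.values():
        for sire, y in members:
            # Herdmates: daughters of OTHER sires in the same group.
            mates = [m_y for m_sire, m_y in members if m_sire != sire]
            if not mates:
                continue  # no herdmates, so no comparison is possible
            deviations[sire].append(y - sum(mates) / len(mates))

    # Unadjusted average deviation for each sire.
    return {s: sum(d) / len(d) for s, d in deviations.items()}


# Toy data (invented yields, kg of milk).
toy = [
    ("A", "h1", 1961, 1, 6000.0),
    ("A", "h1", 1961, 1, 6200.0),
    ("B", "h1", 1961, 1, 5800.0),
    ("B", "h2", 1962, 2, 6500.0),
    ("C", "h2", 1962, 2, 6300.0),
    ("C", "h3", 1962, 1, 7000.0),  # alone in its group: skipped
]
result = herdmate_deviations(toy)
```

Because each group is processed on its own, the calculation can run sequentially over herd-year-season groups, which is the storage advantage the introduction attributes to the herdmate comparison.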
TABLE 2. Number of observations in birth year-season groups (a).

Season   Birth year   No. of records
  1         1958            430
  1         1959          1,073
  1         1960          1,451
  1         1961            766
  2         1958            546
  2         1959          1,920
  2         1960          2,489
  2         1961          1,934

(a) Records from single-observation herd-year groups are not included.

In our study, the model assumed is the same as previously presented for least squares. The only difference is that the herd-year effect, $h_i$, is presumed to be a random variable with variance $\sigma_h^2$. This model is the same as that used by Cunningham (1), except for the $a_j$ and $b_k$ effects. As pointed out by Miller et al. (14), the equations obtained using this model differ from the least squares equations in that the constant $\sigma_e^2/\sigma_h^2$ is added to the diagonal coefficients of the submatrix for herd-years. The repeatability value was assumed to be 0.35 on the basis of a survey of estimates from the literature. In contrast to the model used in Reference (14), in the present study only a single absorption is required, since the random effect (herd years) is not nested within a fixed classification.

In this situation it is necessary to include an equation for $\mu$ to obtain unbiased estimates of the fixed effects. The introduction of the ratio $\sigma_e^2/\sigma_h^2$ increases the total degrees of freedom by one. The absorption of the $h_i$ does not remove the $\mu$ effect, and this must be taken into account in the estimation procedure. An equation, adjusted for absorption of the $h_i$, must be set up to estimate $\mu$; otherwise, the estimates of the $a_j$, $b_k$, $(ab)_{jk}$, and $s_{lm}$ will be seriously biased due to the confounding with $\mu$. When this procedure is applied to balanced incomplete block problems, unbiased maximum likelihood estimates of the treatment differences can be obtained without accounting for $\mu$. If the usual least squares restrictions are applied to treatments after absorbing blocks, the maximum likelihood constants ($\hat{t}_i$) are obtained unbiasedly, since $\mu$ is not confounded. If no restriction is applied to treatments, the solutions yield estimates of $\mu + t_i$, as shown by Cunningham and Henderson (2).

The least squares analyses of variance were carried out using the preceding model.
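As a sketch of how the maximum likelihood equations described above differ from the least squares equations, the toy example below builds the equations for a reduced model containing only $\mu$, sires, and herd-years, and adds an assumed ratio $\sigma_e^2/\sigma_h^2$ to the herd-year diagonals. Several things here are assumptions rather than the authors' procedure: the function names, the invented data, solving the full system directly instead of absorbing herd-years, expressing sires relative to a reference sire, and reading a repeatability of 0.35 as a ratio of (1 - 0.35)/0.35.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def sire_effects(records, ratio):
    """Sire effects relative to the first sire (in sorted order).

    records: (sire, herd_year, yield) tuples (hypothetical layout).
    ratio:   assumed sigma_e^2 / sigma_h^2, added to herd-year diagonals;
             it must be positive for this system to be nonsingular.
    """
    sires = sorted({r[0] for r in records})
    herds = sorted({r[1] for r in records})
    # Unknowns: mu, every sire except the reference, every herd-year.
    sire_col = {s: i for i, s in enumerate(sires[1:], start=1)}
    herd_col = {h: i for i, h in enumerate(herds, start=len(sires))}
    n = len(sires) + len(herds)
    lhs = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for sire, herd, y in records:
        cols = [0, herd_col[herd]]
        if sire in sire_col:
            cols.append(sire_col[sire])
        for i in cols:
            rhs[i] += y
            for j in cols:
                lhs[i][j] += 1.0
    # The only change from least squares: add the ratio to the
    # diagonal coefficients of the herd-year submatrix.
    for h in herds:
        lhs[herd_col[h]][herd_col[h]] += ratio
    solution = solve(lhs, rhs)
    effects = {sires[0]: 0.0}
    effects.update({s: solution[c] for s, c in sire_col.items()})
    return effects


# Invented, deliberately unbalanced toy data.
toy = [("A", "h1", 10.0), ("A", "h1", 12.0), ("A", "h2", 20.0),
       ("B", "h1", 14.0), ("B", "h2", 24.0), ("B", "h2", 26.0)]
k = (1 - 0.35) / 0.35  # assumed mapping from repeatability to ratio
ml_difference = sire_effects(toy, k)["B"]
```

For these toy records the estimated B minus A difference moves from the within-herd-year value when the ratio is close to zero toward the raw difference in sire means when the ratio is very large, which mirrors the two limiting cases discussed under sampling variance.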
The effects of birth years and seasons × birth years were nonsignificant for both milk and fat yield. Season effects were highly significant for both milk and fat production. Differences among the individual AI bulls were highly significant. The F ratios for the one degree of freedom comparison of non-AI versus other AI sires were significant only at the 5% level. These smaller F ratios, in contrast to those for individual sires, are due to the decreased variance brought about by grouping. The model was subsequently reduced by deleting the interaction equations. The results were similar to those for the original model, except that the F ratios for season and birth year were slightly increased. Birth-year differences remained nonsignificant.

The least squares constants from the original model are presented in Tables 3 and 4. The season constants in Table 3 show that cows calving November to April produce about 180 kg of milk and 5 kg of fat more than those freshening May to October. The comparison of the constants for the non-AI sire group to those for the other AI sire group shows that the residual AI bulls were on the average superior by 78 kg of milk and 3 kg of fat. These constants are adjusted for the effects of herds, years, herds × years, seasons, birth years, and seasons × birth years. Wadell and McGilliard (15) found that artificially sired cows in Michigan exceeded their naturally sired herdmates by 56 kg of milk and 1 kg of fat in the first lactation. The present data cover a later period of time than Wadell's study. Also, only about one-third of the total AI progeny in these herds was included in the AI grouping. The present analysis also provided a comparison between the mean of the 40 individual AI bulls and the mean of the non-AI and remaining AI bulls; this difference was 72 kg milk and 2.4 kg fat in favor of the individual AI sires.

TABLE 3. Least squares constants for season, year of birth, and grouped sire effects (a, b, c).

Class of effect   Identification       Milk          Fat
Season            November-April      91 ± 14      2.4 ± 0.5
                  May-October        -91 ± 14     -2.4 ± 0.5
Year of birth     1958               -56 ± 43     -0.9 ± 1.5
                  1959                 4 ± 26      0.4 ± 0.9
                  1960               -14 ± 26     -0.6 ± 0.9
                  1961                67 ± 37      1.2 ± 1.3
Sires             Non-AI sires       -39 ± 18     -1.5 ± 0.6
                  Other AI sires      39 ± 18      1.5 ± 0.6

(a) Season × birth year effects were nonsignificant.
(b) Kg units.
(c) ± indicates standard error.

The individual sire constants are presented in Table 4. The range in standard errors for milk yield was from a high of 243 kg (Sire 37) to a low of 46 kg (Sire 35). The sire constants for milk yield ranged from 996 kg (Sire 28) to -523 kg (Sire 37).

TABLE 4. Least squares constants for individual artificial insemination sires (a).

Sire no.    Milk     Fat      Sire no.    Milk     Fat
    1       -340     -5.4        21         89      4.7
    2         35     -1.4        22        -43     -1.7
    3        -36     -7.5        23         44      3.2
    4       -494     -5.1        24        -62     -3.9
    5        290      7.0        25        117     -3.6
    6       -377    -19.4        26        -52    -11.5
    7        334     11.2        27        135      8.8
    8        388     17.8        28        996     28.6
    9       -499    -13.8        29       -506    -10.8
   10       -259    -10.5        30       -328    -11.5
   11        234     -5.2        31        488     27.3
   12        214     11.7        32        486     14.8
   13       -127    -18.7        33        100     -0.1
   14       -421     -4.7        34        765     29.5
   15        -44      0.9        35        102     -1.1
   16        292      7.2        36       -247     -1.0
   17        353     16.4        37       -523    -26.3
   18       -368    -14.0        38       -512     -3.1
   19       -239      3.1        39        251     16.0
   20        207     -6.9        40       -444    -20.9

(a) Kg units.

Analyses of variance of milk and fat based on the maximum likelihood model were carried out. The appropriate measure of error for testing the magnitude of the maximum likelihood estimates of the season, birth year, season × birth year, and sire effects is the least squares error mean square. The error mean squares from the maximum likelihood analysis are not useful for tests of hypotheses. The F values for seasons, seasons × birth years, and sires were similar to the corresponding values for the least squares analysis. However, the F ratios for the maximum likelihood estimates of birth year effects were highly significant, whereas the least squares values were of negligible magnitude. Also, the F ratio for the group comparison of AI sires versus non-AI sires was nonsignificant for milk yield and significant at only the 5% level for fat production. In the least squares analysis both mean squares were significant at the 5% level.

The maximum likelihood constants for seasons, birth years, and grouped sire effects are presented in Table 5. The seasonal differences estimated by this method were quite comparable to those obtained by least squares. In both cases production in the winter season was higher than that in the summer season. The maximum likelihood birth year constants indicate a gain of about 300 kg of milk over four years. These results apparently reflect factors other than changes in genetic merit, especially since the birth-year constants are partially adjusted for sire effects. Residual age effects may have been involved. The maximum likelihood comparison of non-AI bulls to other AI sires indicates a superiority of about 50 kg of milk and 2.5 kg fat for the AI bulls. These differences are smaller than the least squares values; the maximum likelihood estimate for milk yield was not significantly different from zero at the 5% level. The maximum likelihood analysis also provided an estimate of the over-all mean, being 6,351 kg of milk and 231 kg of fat.

TABLE 5. Maximum likelihood constants for season, year of birth, and grouped sire effects (a, b).

Class of effect   Identification          Milk          Fat
Season            November to April      94 ± 14      2.4 ± 0.5
                  May to October        -94 ± 14     -2.4 ± 0.5
Year of birth     1958                 -150 ± 37     -4.5 ± 1.3
                  1959                  -42 ± 24     -1.3 ± 0.8
                  1960                   35 ± 23      1.2 ± 0.8
                  1961                  158 ± 31      4.7 ± 1.1
Sires             Non-AI sires          -27 ± 17     -1.2 ± 0.6
                  Other AI sires         27 ± 17      1.2 ± 0.6

(a) Kg units.
(b) ± indicates standard error.

The maximum likelihood constants for individual sires are listed in Table 6. Comparison of these results to those for least squares indicates a rather close agreement on the whole. Sire 15 shows the greatest disparity in relation to the least squares estimates in Table 4, being about 115 kg. These constants appear to be somewhat more variable than the least squares estimates.

TABLE 6. Maximum likelihood estimates of sire effects (a).

Sire no.    Milk     Fat      Sire no.    Milk     Fat
    1       -388     -7.1        21        165      9.0
    2        139      2.2        22         24      0.3
    3        -79     -9.3        23        139      5.8
    4       -512     -5.9        24       -105     -5.6
    5        214      4.3        25        102     -4.4
    6       -446    -22.0        26         26     -8.3
    7        291      9.6        27        181     10.2
    8        362     17.0        28      1,050     31.2
    9       -535    -14.8        29       -473     -9.6
   10       -288    -11.5        30       -356    -13.2
   11        211     -6.3        31        478     26.8
   12        180     10.2        32        530     16.6
   13       -117    -18.6        33         75     -1.0
   14       -451     -6.1        34        709     27.5
   15         71      5.0        35         86     -1.9
   16        329      8.2        36       -256     -0.7
   17        308     14.7        37       -552    -26.2
   18       -401    -14.7        38       -481     -2.6
   19       -243      3.1        39        256     16.9
   20        176     -8.3        40       -418    -20.5

(a) Kg units.

The third method of sire evaluation considered was the standard herdmate comparison technique. Herdmates were defined as paternally unrelated animals freshening in the same herd, year, and season as the animal in question. Season definitions were the same as previously stated: November to April and May to October. The mean unadjusted herdmate deviations for the 40 individual AI sires are presented in Table 7. Comparison of these results with those in Tables 4 and 6 discloses a fairly close agreement between the three different estimates of sire effects.

While a visual appraisal of the estimates provided by the least squares, maximum likelihood, and herdmate comparison procedures gives a rough idea of their agreement, a more precise measure is needed. For this reason, the rank correlations among the various estimates were computed by the Spearman procedure and are presented in Table 8. The rank correlations between the herdmate comparison evaluations and the least squares constants were .94 and .97 for milk and fat, respectively. The corresponding correlations between herdmate comparisons and maximum likelihood evaluations were .96 and .98. Maximum likelihood and least squares resulted in almost identical ranking of sires, as indicated by the rank correlation values of .99 in Table 8 ($r_{35}$ and $r_{46}$). The slight tendency for a closer correspondence between the maximum likelihood and least squares results may be due to their computational similarities, in contrast to the herdmate comparison method.

Discussion

TABLE 7. Average herdmate deviations of individual artificial insemination sires (a).

Sire no.    Milk     Fat      Sire no.    Milk     Fat
    1       -351     -5.1        21         58      4.4
    2        257      6.0        22         72      2.8
    3        -34     -7.2        23        219     10.0
    4       -404     -2.0        24        -67     -4.0
    5        294      7.2        25         54     -5.9
    6       -319    -17.6        26        149     -4.9
    7        317      9.6        27        340     15.8
    8        335     16.1        28      1,194     36.0
    9       -478    -11.9        29       -466     -8.6
   10       -336    -13.2        30       -286     -9.0
   11        209     -5.8        31        404     25.9
   12        154     10.0        32        552     17.4
   13        -42    -16.5        33         34     -2.5
   14       -445     -4.3        34        870     33.9
   15         24      5.2        35         54     -2.3
   16        428     11.2        36       -223      1.4
   17        292     14.7        37       -260    -14.4
   18       -228     -8.3        38       -448      0.0
   19       -248      3.0        39        464     23.8
   20        183     -7.8        40       -473    -22.1

(a) Kg units.

TABLE 8. Spearman rank correlations among herdmate comparison, least squares, and maximum likelihood estimates of breeding values of 40 artificial insemination sires (a).

         2      3      4      5      6
1      .81    .94    .79    .96    .81
2             .77    .97    .80    .98
3                    .82    .99    .81
4                           .83    .99
5                                  .83

(a) 1 = herdmate milk; 2 = herdmate fat; 3 = least squares milk; 4 = least squares fat; 5 = maximum likelihood milk; 6 = maximum likelihood fat.

The objective of the present study was to empirically evaluate the three alternative procedures for analyzing progeny test results. In the context of parameter estimation (sire breeding value), two factors must be considered: bias and sampling error. Henderson (7) pointed out that certain kinds of selection may lead to a bias in least squares estimates which can be circumvented by computing the maximum likelihood values. The illustration used was a model fitting birth-year groups and year of record on a within-cow basis. In this case the usual situation is characterized by a correlation between the block (cow) means and the number of observations per block (cow). In the problem of estimating sire breeding values, any relationship between the herd-year average production and the size of herd is likely to be small and negative. Harvey (5) pointed out that application of the maximum likelihood requires that the treatments (sires) be allocated randomly to the blocks (herd years). To investigate this question, the product-moment correlation between the least squares sire constants and the herdmate averages for the corresponding bulls was computed. This correlation was only 0.12. Therefore, any relationship that may have existed between herd effects and the use of bulls was not a simple situation of the best bulls being used in the best herds.
I n the p r e s e n t study, the existence of i n t e r actions of h e r d years a n d sires w i t h seasons has been ignored in the least squares a n d maxim u m likelihood analyses. The h e r d m a t e comp a r i s o n p r o c e d u r e m a k e s allowance f o r possible h e r d - y e a r - s e a s o n interactions, b u t does n o t account f o r i n t e r a c t i o n of sires a n d seasons. All three e v a l u a t i o n p r o c e d u r e s are subject to e r r o r s caused b y i n t e r a c t i o n s of sires with h e r d years. H e n d e r s o n a n d C a r t e r (10) estimated t h a t 2 % of the v a r i a t i o n in agea d j u s t e d records was due to sire X h e r d i n t e r actions.


All three procedures, as applied in the present study, are subject to errors resulting from inaccurate age-adjustment factors. The herdmate comparison requires the use of adjustment factors computed from the data at hand or, more likely, from a previous larger body of data. However, there is no practical reason why age variation cannot be accounted for simultaneously by regression techniques in the least squares and maximum likelihood methods. This would be especially desirable if age × year interactions exist, as indicated by the work of Koh and Henderson (12).

In the present study, effects for birth-year groups and the interaction of birth-year groups with seasons were included in the maximum likelihood and least squares models. Season × birth-year effects were negligible in both analyses, but the results for birth-year groups were equivocal. Small and inconsistent estimates were obtained by least squares, but a large and positive trend was indicated by the maximum likelihood results. Residual age variation may be involved. However, a subsequent analysis of actual production indicated a negative trend over years. In general, it appears to be a desirable goal to adjust estimates of sire breeding value for changes in genetic merit in the population. This is much more feasible with the maximum likelihood and least squares procedures than with herdmate comparisons.

On the whole, with respect to bias, there appear to be valid theoretical reasons for preferring the maximum likelihood and least squares procedures to the herdmate comparison. However, as indicated in Table 8, all three procedures ranked bulls in essentially the same order. There was a slight tendency for closer agreement between least squares and maximum likelihood, as compared to the correlations between these measures and herdmate comparisons.
On the strength of these data, however, there appears to be little basis for preferring one method to another, so far as bias is concerned.

The second criterion useful in judging the merits of the alternative procedures is sampling error. The principal theoretical advantage of maximum likelihood estimation is that the estimates of fixed effects possess smaller standard errors than those provided by any other procedure. It is intuitively evident that this is true, since the diagonals of the random-effects portion of the least squares matrix are increased by the fraction $\sigma_e^2/\sigma_h^2$, thus introducing additional information. This results in reduced diagonal coefficients for the fixed effects portion of the inverse matrix. Using a small numerical example, Cunningham (1) demonstrated that the standard errors of the maximum likelihood constants are slightly smaller.

The standard errors of the sire constants were computed for both the maximum likelihood and least squares sire constants; these are presented in Table 9 for milk yield. The standard errors for the maximum likelihood estimates were computed by using the diagonal coefficients from these estimates and the error mean square from least squares. The use of the residual mean square from the maximum likelihood solution is not correct (14). The results in Table 9 show that the standard errors for the maximum likelihood constants range from the same size to about 5% smaller than the corresponding values for least squares. The average decrease was 1.6%. Thus, in the present application, the advantage of maximum likelihood estimation was small, as compared to least squares.

The present results are incomplete, in the respect that standard errors were not obtained for herdmate comparisons, and thus the relative advantages of the methods cannot be fully assessed. However, it can be stated that the standard errors of the maximum likelihood estimates are smallest, provided that the ratio $\sigma_h^2/(\sigma_h^2 + \sigma_e^2)$ assumed is accurate. The value of repeatability used, 0.35, was based on estimates from the research literature.

TABLE 9. Standard errors for sire constants for mature equivalent milk yield. a, b

            Standard error                  Standard error
Sire no.     LS      ML        Sire no.      LS      ML
    1        99      98           21        131     129
    2       205     203           22        174     172
    3        87      86           23        192     187
    4       179     179           24        135     133
    5       121     119           25         54      53
    6       141     140           26        206     204
    7        60      59           27        227     221
    8        93      92           28        221     216
    9       140     139           29        172     171
   10        52      50           30        147     145
   11       113     111           31         57      55
   12        61      60           32        139     136
   13       186     185           33         81      79
   14       135     134           34        187     187
   15       126     125           35         46      45
   16       165     160           36        197     193
   17        90      88           37        243     232
   18       205     203           38        170     168
   19        94      92           39        191     188
   20       119     118           40        165     165

a Kg units.
b LS denotes least squares; ML denotes maximum likelihood.

J. DAIRY SCIENCE VOL. 51, NO. 5

To examine the adequacy of the approximation, an analysis of variance was conducted to estimate $\sigma_h^2$ from the data. Henderson's Method Three (8) was used to estimate the components of variance. The results are presented in Table 10. The least squares sum of squares for herd years was obtained indirectly as follows:

$$R(\text{herd years}) = \sum_i \frac{\left(\sum_j \sum_k Y_{ijk}\right)^2}{n_{i.}} + R(\text{sires}) - \sum_j \frac{\left(\sum_i \sum_k Y_{ijk}\right)^2}{n_{.j}}$$

where the subscript i refers to herd years, j to the sire classification, and k to individuals within herd-year-sire groups. The symbol R( ) indicates a least squares reduction sum of squares. The model assumed in this analysis included herd years and sires, but no interaction between herd years and sires. As in the incomplete blocks case, these interactions are assumed to be absent. The ratios in Table 10 are expressed in terms of $\hat{\sigma}_h^2 + \hat{\sigma}_e^2$, since the initial maximum likelihood model infers that sires are fixed effects.

The results in Table 10 indicate that the repeatability assumed (0.35) was a reasonably accurate measure of the importance of herd-year effects in the sample of data employed. The assumed value was slightly larger than the estimate obtained from the data.
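The estimates quoted in footnote b of Table 10 can be checked from the tabled mean squares. The following is a minimal sketch, assuming the usual expectations for this kind of analysis (error mean square estimating $\sigma_e^2$, and herd-year mean square estimating $\sigma_e^2 + K\sigma_h^2$, with K taken from footnote c). The function and variable names are illustrative and do not come from the original paper.

```python
# Mean squares transcribed from Table 10, in (kg)^2 units.
MEAN_SQUARES = {
    "milk": {"herd_years": 6_045_626, "error": 1_101_462},
    "fat": {"herd_years": 7_827, "error": 1_363},
}
# Coefficient of sigma_h^2 in the expected herd-year mean square (Table 10, footnote c).
K_HERD_YEARS = 10.585


def herd_year_ratio(ms_herd_years, ms_error, k):
    """Return sigma_h^2 / (sigma_h^2 + sigma_e^2) from two mean squares.

    Assumption (not stated explicitly in the text):
    E[MS error] = sigma_e^2 and E[MS herd-years] = sigma_e^2 + k * sigma_h^2.
    """
    var_error = ms_error
    var_herd_year = (ms_herd_years - ms_error) / k
    return var_herd_year / (var_herd_year + var_error)


ratios = {
    trait: herd_year_ratio(ms["herd_years"], ms["error"], K_HERD_YEARS)
    for trait, ms in MEAN_SQUARES.items()
}
for trait, ratio in ratios.items():
    print(f"{trait}: {ratio:.2f}")
```

Under these assumptions the sketch gives about .30 for milk and .31 for fat, in line with footnote b and somewhat below the assumed value of 0.35.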

TABLE 10. Estimation of $\sigma_h^2$ by Henderson's Method Three. b, c

                             Mean squares a
Source             df          Milk          Fat
R(herd-years)     1,002     6,045,626       7,827
R(sires)             41     9,345,877      16,544
Error             9,576     1,101,462       1,363

a (Kg)² units.
b Estimates of $\sigma_h^2/(\sigma_h^2 + \sigma_e^2)$ = .30 (milk), .31 (fat).
c K-value for herd-years = 10.585 [computed as described in Reference (8)].

Summary and Conclusions

Three methods of sire evaluation were compared empirically: herdmate comparison, least squares, and maximum likelihood. The results for all three methods were in close agreement, as measured by the rank correlations of sire effects. These correlations ranged from .94 to .99, based on breeding values for the same trait (milk or fat). Maximum likelihood theoretically circumvents certain biases which may affect herdmate and least squares results, but such situations were apparently of little importance in the data studied. Maximum likelihood estimates in theory have the smallest possible sampling errors. In the present data the average decrease in sampling error of sire effects was 1.6%, as compared to least squares. The maximum reduction was about 5%.

The model employed also contained effects for birth years, seasons, birth years × seasons, and non-AI sires versus other AI sires. The maximum likelihood and least squares results were reasonably consistent for seasons, birth years × seasons, and non-AI versus AI. However, the two methods gave widely different estimates for birth-year effects. The least squares analysis indicated that birth-year changes were negligible, while the maximum likelihood estimates showed a large positive trend.

It is concluded that the three methods rank bulls essentially the same and with similar precision in situations comparable to those in the data studied. The data were chosen to simulate the objective of discriminating among bulls becoming available for the first time to dairymen in a single state. Differences among the methods may be larger when data covering greater ranges of time and geographical areas are studied.
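The average and maximum reductions in sampling error quoted above can be checked against the standard errors in Table 9. The sketch below assumes that the average is the simple mean of the per-sire percentage decreases; the paper does not say how the 1.6% figure was computed, and the names used here are illustrative.

```python
# Standard errors (kg) for sires 1-40, transcribed from Table 9.
LS = [99, 205, 87, 179, 121, 141, 60, 93, 140, 52,
      113, 61, 186, 135, 126, 165, 90, 205, 94, 119,
      131, 174, 192, 135, 54, 206, 227, 221, 172, 147,
      57, 139, 81, 187, 46, 197, 243, 170, 191, 165]
ML = [98, 203, 86, 179, 119, 140, 59, 92, 139, 50,
      111, 60, 185, 134, 125, 160, 88, 203, 92, 118,
      129, 172, 187, 133, 53, 204, 221, 216, 171, 145,
      55, 136, 79, 187, 45, 193, 232, 168, 188, 165]


def percent_decrease(ls_errors, ml_errors):
    """Per-sire reduction of the ML standard error relative to LS, in percent."""
    return [100.0 * (ls - ml) / ls for ls, ml in zip(ls_errors, ml_errors)]


decreases = percent_decrease(LS, ML)
mean_decrease = sum(decreases) / len(decreases)
max_decrease = max(decreases)

print(f"mean decrease: {mean_decrease:.1f}%")
print(f"max decrease:  {max_decrease:.1f}%")
```

With the tabled values this gives a mean decrease of about 1.6% and a largest single decrease of about 4.5% (sire 37), consistent with the figures in the text. Because the tabled standard errors are rounded to whole kilograms, the per-sire percentages are only approximate.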

Acknowledgments

The authors acknowledge the helpful suggestions of Drs. L. D. Van Vleck, W. R. Harvey, and K. A. Tabler regarding the analysis.

References

(1) Cunningham, E. P. 1965. The evaluation of sires from progeny test data. Anim. Prod., 7: 221.
(2) Cunningham, E. P., and C. R. Henderson. 1966. Analytical technique for incomplete block experiments. Biometrics, 22: 829.
(3) Harvey, W. R. 1960. Least squares analysis of data with unequal subclass numbers. ARS, USDA, ARS-20-8.
(4) Harvey, W. R. 1964. Computing procedures for a generalized least-squares analysis program. Unpublished report.
(5) Harvey, W. R. 1966. Least-squares and maximum-likelihood techniques. Paper presented, Natl. Tech. Symp. and Workshop on Estimating Breeding Values of Dairy Sires and Cows. Washington, D.C.
(6) Heidhues, T., L. D. Van Vleck, and C. R. Henderson. 1960. A comparison between expected and actual accuracy of sire proofs under the New York system. J. Dairy Sci., 43: 878. (Abstract.)
(7) Henderson, C. R. 1949. Estimation of changes in herd environment. J. Dairy Sci., 32: 706. (Abstract.)
(8) Henderson, C. R. 1953. Estimation of variance and covariance components. Biometrics, 9: 226.
(9) Henderson, C. R. 1956. Cornell research on methods of selecting dairy sires. Proc. New Zealand Soc. Anim. Prod., 16: 69-76.
(10) Henderson, C. R., and H. W. Carter. 1957. Improvement of progeny tests by adjusting for herd, year, and season of freshening. J. Dairy Sci., 42: 638. (Abstract.)
(11) Kendrick, J. F. 1955. Standardizing Dairy Herd Improvement Association records in proving sires. DHIA Newsletter, ARS-52-1.
(12) Koh, Y. O., and C. R. Henderson. 1964. Year by age interaction of New York dairy records. J. Anim. Sci., 23: 851. (Abstract.)
(13) McDaniel, B. T., R. H. Miller, and E. L. Corley. 1966. Relationships between initial and later AI sire progeny groups. J. Dairy Sci., 49: 724. (Abstract.)
(14) Miller, R. H., W. R. Harvey, K. A. Tabler, B. T. McDaniel, and E. L. Corley. 1966. Maximum-likelihood estimates of age effects. J. Dairy Sci., 49: 65.
(15) Wadell, L. H., and L. D. McGilliard. 1959. Influence of artificial breeding on production in Michigan dairy herds. J. Dairy Sci., 42: 1079.
