Experience with a Test-Day Model L. R. Schaeffer, J. Jamrozik, G. J. Kistemaker, and B. J. Van Doormaal† Centre for Genetic Improvement of Livestock, Department of Animal and Poultry Science, University of Guelph, Guelph, ON, Canada N1G 2W1 and †Canadian Dairy Network, 150 Research Lane, Suite 307, Guelph, ON, Canada N1G 4T2
ABSTRACT The Canadian Test-Day Model is a 12-trait random regression animal model in which traits are milk, fat, and protein test-day yields, and somatic cell scores on test days within each of first three lactations. Testday records from later lactations are not used. Random regressions (genetic and permanent environmental) were based on Wilmink’s three parameter function that includes an intercept, regression on days in milk, and regression on an exponential function to the power −0.05 times days in milk. The model was applied to over 22 million test-day records of over 1.4 million cows in seven dairy breeds for cows first calving since 1988. A theoretical comparison of test-day model to 305-d complete lactation animal model is given. Each animal in an analysis receives 36 additive genetic solutions (12 traits by three regression coefficients), and these are combined to give one estimated breeding value (EBV) for each of milk, fat, and protein yields, average daily somatic cell score and milk yield persistency (for bulls only). Correlation of yield EBV with previous 305-d lactation model EBV for bulls was 0.97 and for cows was 0.93 (Holsteins). A question is whether EBV for yield traits for each lactation should be combined into one overall EBV, and if so, what method to combine them. Implementation required development of new methods for approximation of reliabilities of EBV, inclusion of cows without test day records in analysis, but which were still alive and had progeny with test-day records, adjustments for heterogeneous herd-test date variances, and international comparisons. Efforts to inform the dairy industry about changes in EBV due to the model and recovering information needed to explain changes in specific animals’ EBV are significant challenges. The Canadian dairy industry will require a year or more to become comfortable with the test-day model
and to realize the impact it could have on selection decisions.
(Key words: genetic evaluation, test-day model, random regressions, production)
Abbreviation key: AM-PM = alternate morning-evening milk recording scheme, CPU = central processing unit, CTDM = Canadian Test-Day Model, LPI = lifetime profit index, TEV = total economic value.
Received June 21, 1999. Accepted November 11, 1999. Corresponding author: L. R. Schaeffer; e-mail: [email protected]. uoguelph.ca. 2000 J Dairy Sci 83:1135–1144
INTRODUCTION
Canada officially adopted a multiple-trait, random regression, test-day animal model (CTDM) in February 1999 to replace the single-trait, repeated lactation records, animal models for production traits and the fixed regression, test-day model for somatic cell score. The CTDM evolved from earlier studies on single-trait, random regression, test-day animal models over a period of 7 yr (4, 5, 7, 8, 10, 11, 14, 15, 16, 17, 18). The differences in genetic evaluations from CTDM compared with the previous official methodology were significant in both theory and practical application. The changes were from a system 1) where lactation 305-d yields were considered as repeated measures of the same trait to a system where each lactation was considered to be a separate trait and the analyses were on test-day yields, 2) where a standard lactation curve was assumed for each cow and lactation to a system where each lactation within a cow could have a different shape of lactation curve, 3) that included 305-d lactation yields from 1957 to a system that analyzed test-day yields only from 1988 to the present, and 4) that could be computed easily in a few days to a system that required 2 wk and a large amount of computer memory. The advantages of the CTDM are that it allows greater flexibility to be built into milk recording programs, it removes environmental effects from test-day records more accurately, it models the shape of the lactation curve and the variability of yields around some general shapes, and it provides genetic evaluations of cows that are 4 to 8% more accurate than evaluations from 305-d yields (11).
Persistency within lactations can be estimated genetically as well as across lactations. The variability between bulls’ daughters regarding where in the lactation they show their superiority is useful information for dairy producers. The objectives of this paper were to provide the exact model and details about the CTDM and to discuss issues that arose in the implementation of this new technology.
CANADIAN TEST DAY MODEL
The CTDM is a multiple-trait, random regression, test-day animal model. On a given test day, the kth cow is at day t in its lactation, in parity n (limited to first, second, or third parity), in herd-test date-parity group i, and calving within the jth time period, region, age group within parity, and season subclass. The cow is observed for 24-h milk, fat, and protein yields and somatic cell score; the observation vector for a cow on a given test day can be written as
$$
\mathbf{y}_{nt:ijk} =
\begin{pmatrix} y_{1nt:ijk} \\ y_{2nt:ijk} \\ y_{3nt:ijk} \\ y_{4nt:ijk} \end{pmatrix}
=
\begin{pmatrix} \text{24-h milk yield, kg} \\ \text{24-h fat yield, kg} \\ \text{24-h protein yield, kg} \\ \text{somatic cell score} \end{pmatrix}.
$$
In some cases, one or more of these traits may be missing because of the sampling scheme in which the cow owner may be enrolled, but 24-h milk yield is always required. The 24-h yields may be actual 24-h weights, or they may be estimated 24-h yields based on alternate morning-evening milk recording scheme (AM-PM) yields or samples from three times a day milkings. The observations are limited to tests between d 5 and 305 of lactation. The limit of 305 d was set partly because of tradition, partly to keep the number of test-day records within reasonable limits, and partly to avoid including cows with nonstandard or aberrant lactation curves. The model equation is assumed to be the same for all four traits. For trait h on DIM t it is
$$
y_{hnt:ijk} = HTD_{hn:i} + \sum_{m=1}^{q} \beta_{hn:jm} z_{tm} + \sum_{m=1}^{q} a_{hn:km} z_{tm} + \sum_{m=1}^{q} p_{hn:km} z_{tm} + e_{hnt:ijk},
$$
where $y_{hnt:ijk}$ is a record of cow k made on day t of lactation n within herd-test date-parity effect i, for a cow belonging to subclass j for time period, region, age group within parity, and season of calving; $HTD_{hn:i}$ is a herd-test date-parity effect on all milking cows in lactation n in that herd on a specific test date; $\beta_{hn:jm}$ are the fixed regression coefficients, which differ by trait, h, and lactation number, n, and which are specific to subclass j of time-region-age-parity-season; $a_{hn:km}$ are the additive genetic, random regression coefficients, which differ by trait, h, and lactation number, n, and which are specific to each animal, k; $p_{hn:km}$ are the permanent environmental, random regression coefficients, which differ by trait, h, and lactation number, n, and which are specific to each cow k; $e_{hnt:ijk}$ are the residual effects for each observation; q is the number of covariates describing the shape of the lactation curves; and $z_{tm}$ are the covariates associated with DIM, assumed to be the same for both fixed and random regressions. The model had five regions within Canada for the Holstein breed (only one region for the other breeds), two seasons of calving, and initially only one time period for all breeds. In the lactational animal model, periods of 5 yr each were formed because differences between age groups were found to change over time, which causes biases in genetic evaluations and genetic trends (13). Wilmink’s function (20) was chosen to describe the shapes of the lactation curves. The function is
$$
W(t) = \sum_{m=1}^{3} b_m z_{tm},
$$
where $z_{t1} = 1$; $z_{t2} = t$; and $z_{t3} = \exp(-0.05t)$. The value of −0.05 in the third covariate was estimated by Wilmink (20) from Dutch Holstein Friesians. Based on Canadian data, the value that gave the highest correlation with actual yields was −0.035, but the value that gave the lowest mean absolute error was −0.07. The differences in results between −0.035 and −0.07 were not practically significant. Therefore, the value of −0.05 was kept for this model. The full model includes test-day records from the first three lactations. It is a four-trait model with separate effects for lactations 1, 2, and 3, specified within the equation of the model. In total there are 12 traits. For each trait there are three regression coefficients to be estimated per animal for the additive genetic effects and three regression coefficients per cow to be estimated for the permanent environmental effects. This gives 36 additive genetic effects per animal and 36 permanent environmental effects per cow with records that need to be estimated.
Journal of Dairy Science Vol. 83, No. 5, 2000
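The three covariates of Wilmink's function described in this section can be illustrated with a short Python sketch. It is only an illustration of the curve W(t): the function names and the regression coefficients in the example are hypothetical and are not values from the CTDM.

```python
import math


def wilmink_covariates(t, k=-0.05):
    """Return (z_t1, z_t2, z_t3) = (1, t, exp(k * t)) for day in milk t."""
    if not 5 <= t <= 305:
        # The text limits observations to tests between d 5 and 305.
        raise ValueError("test days are limited to DIM 5 through 305")
    return (1.0, float(t), math.exp(k * t))


def curve_value(coeffs, t, k=-0.05):
    """Evaluate W(t) = sum over m of b_m * z_tm for three coefficients."""
    z = wilmink_covariates(t, k)
    return sum(b * zm for b, zm in zip(coeffs, z))


# Hypothetical coefficients, chosen only to give a plausible curve shape.
b_demo = (30.0, -0.04, -12.0)
for day in (5, 60, 150, 305):
    print(day, round(curve_value(b_demo, day), 2))
```

The same covariate vector multiplies the fixed, additive genetic, and permanent environmental regression coefficients, which is why a single helper is enough here.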
SYMPOSIUM: TEST-DAY MODELS
Covariance Matrices
Because the observation vectors $\mathbf{y}_{nt:ijk}$ are separated in time, the residual (or temporary environmental) effects were assumed to be uncorrelated both within cows and between cows. The residual covariance matrix for $\mathbf{y}_{nt:ijk}$ depends on the days in milk and the lactation number, and can be written as
$$
R_{nt} =
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & r_{14} \\
r_{12} & r_{22} & r_{23} & r_{24} \\
r_{13} & r_{23} & r_{33} & r_{34} \\
r_{14} & r_{24} & r_{34} & r_{44}
\end{pmatrix},
$$
where $r_{hh'}$ is the residual covariance between traits h and h′. If trait 4 (somatic cell score) was missing on a particular test day, then $r_{14}$, $r_{24}$, $r_{34}$, and $r_{44}$ would all be zero in $R_{nt}$. Rather than having 305 different residual covariance matrices per lactation, the lactation was divided into four periods: 5 to 45 d, 46 to 115 d, 116 to 265 d, and 266 to 305 d. The residual covariance matrices were assumed to be the same within each period. These matrices must be adjusted for the accuracy of the 24-h yields. For example, if the accuracy of 24-h yields based on a single morning milking is 0.89, then all diagonal elements of $R_{nt}$ would be divided by 0.89, thereby increasing the residual variances. The accuracy of 24-h yields based on the sum of morning and evening yields was assumed to be 1.00. Accuracies for other recording schemes are determined from special studies as in (19). The random permanent environmental regression coefficients for a cow, arranged by traits within lactation number, are represented by $\mathbf{p}_k$, a 36 by 1 vector. Their covariance matrix is a matrix of order 36 by 36 represented by P, and the covariance matrix for all cows with test-day records is a block diagonal matrix, $I \otimes P$, where every block is of order 36. Similarly, the random additive genetic regression coefficients for animal k are represented by $\mathbf{a}_k$, a 36 by 1 vector, and their covariance matrix is a matrix G of order 36 by 36. The covariance matrix for all animals is $A \otimes G$, where A is the numerator relationship matrix for all animals to be evaluated, which includes inbreeding coefficients. Thus, the regression coefficients are genetically related between animals, and within an animal the regression coefficients for milk yield in first lactation, for example, are genetically correlated with the regression coefficients for fat yield, protein yield, and somatic cell scores in first lactation, but also with coefficients for these traits and milk yield in later lactations.
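The handling of the residual covariance matrix described above (one matrix per DIM period, diagonal elements divided by the accuracy of the 24-h yields, and zeros for a missing somatic cell score) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the matrices in the example are placeholders rather than estimates from the CTDM.

```python
import numpy as np

# DIM periods that share one residual covariance matrix (from the text).
PERIODS = ((5, 45), (46, 115), (116, 265), (266, 305))


def period_index(dim):
    """Return 0..3 for the period that contains this day in milk."""
    for i, (first, last) in enumerate(PERIODS):
        if first <= dim <= last:
            return i
    raise ValueError("DIM must lie between 5 and 305")


def residual_matrix(r_by_period, dim, accuracy=1.0, scs_missing=False):
    """Build R_nt for one test day of one lactation.

    r_by_period: four 4x4 matrices (milk, fat, protein, SCS order).
    accuracy:    accuracy of the 24-h yields (1.00 for AM plus PM weights).
    """
    r = np.array(r_by_period[period_index(dim)], dtype=float)  # fresh copy
    diag = np.diag_indices(4)
    r[diag] = r[diag] / accuracy  # lower accuracy inflates the variances
    if scs_missing:               # trait 4 not observed on this test day
        r[3, :] = 0.0
        r[:, 3] = 0.0
    return r


# Placeholder matrices for illustration only, NOT estimates from the paper.
r_demo = [np.eye(4) * (i + 1) for i in range(4)]
print(residual_matrix(r_demo, dim=30, accuracy=0.89, scs_missing=True))
```

The sketch only builds the adjusted matrix as the text describes it; how the rows and columns of a missing trait are treated when the matrix is inverted is not covered here.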
Through these genetic correlations it is possible for a cow to have a single test-day milk yield and to have genetic evaluations for all traits and lactations. Both G and P are assumed to be constant throughout Canada for a breed. Separate matrices were estimated for each breed in Canada, as reported by Jamrozik et
al. (8) but using a model with random regressions for permanent environmental effects as stated by Schaeffer (17). Under a 305-d lactation yield animal model, the heritability of milk, fat, and protein yields was assumed to be 0.33 for all breeds, but in the CTDM every breed has been shown to have a very different set of parameters in G, P, and R.
Heterogeneous Variance Adjustment
Heterogeneous herd-test date-parity variances exist in test-day production data. Data were adjusted for heterogeneous herd-test date-parity variances on a trait by trait basis. Test-day yields in the first three lactations were analyzed in a fixed model that included the effects of DIM within lactations, effects of herd-test date-parity, fixed regressions (Wilmink’s function) within time periods-regions-age-season-parity classes, and fixed regressions within herd groups. The residuals ($R = y - \hat{y}$) from this model were used to estimate an overall residual standard deviation ($\sigma$) and to estimate residual standard deviations for each DIM within parity ($\sigma_{t*p}$). At the same time, adjustments for the accuracy of the test-day record using the standard deviation of 24-h, AM-PM, or three times a day testing schemes ($\sigma_{acc}$) were made. The residuals were standardized as
$$
R_{std} = R \left( \frac{\sigma}{\sigma_{t*p}} \cdot \frac{\sigma}{\sigma_{acc}} \right),
$$
to estimate herd-test date-parity variances. Herd-test date-parity SD were smoothed by averaging SD from five previous test dates ($\sigma_i$) in the herd and the overall average SD for animals in the genetic base ($\sigma_{mean}$) to derive a factor to adjust observations. The genetic base consists of cows that calved within the year 3 yr prior to the current year. That is,
$$
\text{Factor} = \frac{25\,\sigma_{mean} + \sum_{i=0}^{5} (C_i \, df_i \, \sigma_i)}{\left[ 25 + \sum_{i=0}^{5} C_i \, df_i \right] \sigma_{mean}},
$$
where $C_i$ is the correlation between SD on different test dates, which depends on the interval in days between test dates, ranging from 0.70 if the interval is 10 d to 0.51 if the interval is 90 d; $df_i$ is the number of observations in a herd-test date-parity subclass; and 25 is a weight roughly equivalent to a herd-test date-parity subclass with eight cows in it tested every 30 d (i.e., $25 = \sum_{i=0}^{5} C_i \times 8$). Finally, the observations were adjusted by
$$
y_{adj} = \frac{(y - \hat{y})}{\text{Factor}} + \hat{y},
$$
where $\hat{y}$ is the predicted y from the analysis. The data adjusted for herd-test date-parity heterogeneity of variances were used as input to the genetic evaluation model.
Data
The immediate challenge in implementation of a test-day model is the need to handle individual test-day yields rather than 305-d lactation yields. The number of test-day records is roughly 10 times greater than the number of 305-d lactation records. A historical database of test-day records had to be constructed. Data were retrieved from four milk recording centers in Canada with varying measures of success, but test-day records before 1988 were not available except at the cost of reentering the data from microfiche. In contrast, completed 305-d lactation records date back to 1957. Several levels of edits and minimum standards were applied to the data prior to inclusion in the database. These were on the herd level, the cow level, and the test-day level. Historical test-day records also had to be matched to existing 305-d completed lactations. At least 80% of the cows in the 305-d lactation file had to have matching test-day records by year of calving for that year of test-day records to be included in genetic evaluation. This 80% minimum was met in the Maritimes and Quebec for cows first calving in 1988 and all subsequent years, 1990 in Ontario, 1991 in British Columbia, and 1992 in the Prairie provinces. The average percentage over all years and regions was 97%. All cows were required to have first lactation test-day records, and only records from the first three lactations were included in genetic evaluation. The number of test-day records, cows, and herd-test date-parity subclasses are given in Table 1 for each breed for the first official run in February 1999. Cows averaged 15.7 test-day records (over the first three lactations), and herd-test date-parity subclasses averaged 9.8 test-day records.

Table 1. Numbers of test-day records, cows, and herd-test date-parity (HTDP) subclasses for February 1999 Canadian Test-Day Model.

Breed               TD records    Cows         HTDP
Holstein            20,628,231    1,318,603    2,057,893
Ayrshire             1,004,424       60,661      122,573
Jersey                 543,769       35,502       71,038
Guernsey                98,395         6666       14,344
Brown Swiss             96,590         6412       22,679
Canadienne              26,977         1747         4497
Milking Shorthorn         9792          679         2155
Total               22,408,178    1,430,270    2,295,179

Even though there would be fewer years of test-day data for genetic evaluations, the pedigrees of animals and the relationship matrix would
include the same animals as in the previous official runs.
Computing
The calculation of genetic evaluations in the Holstein breed required approximately 25 min of central processing unit (CPU) time per iteration. Mixed model equations for the multiple-trait random regression model on the full data set were solved by iteration on data with Gauss-Seidel and block iteration techniques. Two copies of the data set (sorted by HTD and cow, respectively) and inverted diagonal matrices for each block were read in each iteration. Pedigree data were stored in memory with animals ordered from youngest to oldest. The computer was an HP-UX 9000/800 (Hewlett-Packard Company, Palo Alto, CA) workstation with 2 gigabytes of random access memory. A minimum of 300 iterations is performed per run; starting values are solutions from the previous run. The achieved convergence criterion, the sum of squares of differences between new and old solutions divided by the sum of squares of the new solutions, was less than $9.9 \times 10^{-8}$. Use of solutions from the previous run greatly reduces the number of iterations that are needed. With nearly 2 million animals in the Holstein breed (cows with records and ancestor dams and sires) and with 36 genetic regression coefficients per animal, the amount of memory needed to hold solutions and right-hand sides of mixed model equations during the iteration process was close to 1.3 gigabytes of random access memory. To save solutions and diagonal blocks for all animals, as well as other information needed for publication of results, a total of 16 gigabytes of disk storage was required. Thus, adequate computing resources are a key requirement for running the CTDM. Computer technology will likely advance, and better computing algorithms may be found to reduce the time needed per run (6).
Genetic Evaluations
Each animal receives three additive genetic regression coefficients for each of 12 traits, or 36 total solutions.
These coefficients are used to calculate 305-d EBV for each of the 12 traits. Let $\hat{a}_{khnm}$ represent the mth coefficient for trait h in lactation n and animal k; then a 305-d EBV is
$$
EBV_{khn} = 305\,\hat{a}_{khn1} + 46{,}665\,\hat{a}_{khn2} + 19.5042\,\hat{a}_{khn3},
$$
for milk, fat, and protein yields (expressed in kilograms) and
$$
ETA_{khn} = 0.5\,(\hat{a}_{khn1} + 153\,\hat{a}_{khn2} + 0.063948\,\hat{a}_{khn3}),
$$
for average somatic cell score over the lactation, expressed as an ETA. See Jamrozik et al. (7) for derivation of the coefficients used in these equations. The EBV and ETA are expressed relative to a genetic base, defined as cows calving in the calendar year 3 yr prior to the current year and having test-day records in the analysis. The Canadian genetic base is a rolling genetic base that gets updated in the first run of each year, contrary to other countries, such as the United States, which updates its genetic base every 5 yr. Persistency of milk yield can also be computed from the solutions to the CTDM. First, calculate EBV for yields on d 60 and d 280 of lactation as
$$
EBV_{khn:60} = \hat{a}_{khn1} + 60\,\hat{a}_{khn2} + 0.049787\,\hat{a}_{khn3},
$$
and
$$
EBV_{khn:280} = \hat{a}_{khn1} + 280\,\hat{a}_{khn2} + 0.00000083\,\hat{a}_{khn3}.
$$
If $\bar{y}_{60}$ and $\bar{y}_{280}$ are the average yields of cows in the genetic base on d 60 and 280, respectively, then persistency expressed in ETA is
$$
ETA_{khn:p} = \frac{0.5\,(EBV_{khn:280} - EBV_{khn:60}) + \bar{y}_{280}}{\bar{y}_{60}} \times 100\%.
$$
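The constants in the EBV equations above can be reproduced numerically from the Wilmink covariates. The sketch below assumes, as a numerical observation rather than the published derivation (see (7) for that), that the 305-d weights are the covariates summed over d 1 to 305 and that the somatic cell score weights are the same sums divided by 305. Function names are hypothetical.

```python
import math

K = -0.05  # exponent of the third Wilmink covariate


def covariates(t):
    """Covariates (1, t, exp(-0.05 t)) for day in milk t."""
    return (1.0, float(t), math.exp(K * t))


def summed_covariates(first=1, last=305):
    """Sum each covariate over the days first..last (assumed range)."""
    days = range(first, last + 1)
    return tuple(sum(covariates(t)[m] for t in days) for m in range(3))


def ebv_305(a):
    """305-d yield EBV from three additive genetic coefficients."""
    return sum(w * am for w, am in zip(summed_covariates(), a))


def eta_scs(a):
    """ETA for average somatic cell score over the lactation."""
    return 0.5 * sum(w / 305.0 * am for w, am in zip(summed_covariates(), a))


def eta_persistency(a, ybar_60, ybar_280):
    """Persistency ETA (%) from EBV on d 60 and d 280 and base averages."""
    ebv_60 = sum(z * am for z, am in zip(covariates(60), a))
    ebv_280 = sum(z * am for z, am in zip(covariates(280), a))
    return (0.5 * (ebv_280 - ebv_60) + ybar_280) / ybar_60 * 100.0


w = summed_covariates()
print(w[0], w[1], round(w[2], 4))           # weights for the 305-d EBV
print(w[1] / 305.0, round(w[2] / 305.0, 6))  # weights for the SCS average
```

Under these assumptions the printed weights agree with 305, 46,665, 19.5042, 153, and 0.063948 as given in the text, and `covariates(60)[2]` and `covariates(280)[2]` agree with 0.049787 and 0.00000083.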
Persistency ETA were calculated for milk yield only, but for each lactation separately, and only for bulls. Persistency ETA for cows could be affected by number of days open in that lactation, but this effect was assumed to average out over all progeny of a dairy bull. Without reliable conception dates on cows during the lactation, it is not possible to accurately account for number of days open in the model. Before the adoption of the CTDM, dairy producers were accustomed to seeing only one EBV for each yield trait, and the separate EBV for each lactation needed to be reduced to one figure each for milk, fat, protein, somatic cell score, and persistency. The yield traits had to be standardized to a common SD before they could be combined. The desired SD ($\sigma_{des}$) was the SD of the old 305-d lactation EBV for officially published bulls. The SD ($\sigma$) of yield trait EBV of bulls that qualified for official publication and were born from 1984 to 1993 inclusive was used to standardize the EBV within each lactation, as
$$
EBV_{std} = EBV \left( \frac{\sigma_{des}}{\sigma} \right).
$$
Standardized milk, fat, and protein yield EBV were averaged across lactations 1, 2, and 3 to give one official
EBV per yield trait. EBV for somatic cell score were combined differently because mastitis problems were economically more important in second lactations, so that the combined EBV (2) was ETA = 0.25 ETA1st + 0.65 ETA2nd + 0.10 ETA3rd. Similarly, persistency of milk yield was deemed more important in first lactations so that the combined EBV for persistency was ETA = 0.50 ETA1st + 0.25 ETA2nd + 0.25 ETA3rd. These combining procedures are subject to change in the future as further economic studies are conducted. All EBV and ETA information is available on the World Wide Web from Canadian Dairy Network (1). An example of the EBV information available on four dairy bulls from CTDM is given in Table 2. The production trait EBV have been standardized to the same SD within lactations so that they can be compared across lactations. Bull A has EBV for yield traits that are fairly similar for each lactation, indicating that its daughters maintain the same level of genetic superiority for yield traits as they mature. The daughters of Bull B, however, show a marked decline in superiority from first to third lactation, indicating a bull that may not be as profitable as the first. Thus, by publishing all 12 EBV plus three EBV for persistency of milk yield, more information can be gained by a dairy producer than from combined evaluations. Given that considerable effort has been made to calculate 36 genetic solutions per animal with the CTDM, combining information across lactations is a transition step. Indeed, as dairy producers become aware of the information as presented in Table 2, then the more they desire to see it, and plans to move to this type of expression are in progress for next year. Reliabilities Reliability of EBV must be approximated because a generalized inverse of the mixed model equations for the CTDM cannot be computed directly. 
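As a brief aside on the combining procedures of the Genetic Evaluations section, the standardization, the simple average for yield traits, and the weighted averages for somatic cell score and persistency can be sketched as below. The function names are hypothetical; the weights are those stated in the text, and the example values are the lactation-specific figures published for Bull A in Table 2.

```python
SCS_WEIGHTS = (0.25, 0.65, 0.10)          # lactations 1, 2, 3
PERSISTENCY_WEIGHTS = (0.50, 0.25, 0.25)  # lactations 1, 2, 3


def standardize(ebv, sd_desired, sd_observed):
    """Rescale a lactation EBV to the desired SD: EBV * sd_des / sd."""
    return ebv * sd_desired / sd_observed


def combine_yield(std_ebv_by_lactation):
    """Average standardized yield EBV across the three lactations."""
    return sum(std_ebv_by_lactation) / len(std_ebv_by_lactation)


def combine_weighted(eta_by_lactation, weights):
    """Weighted combination used for SCS and persistency ETA."""
    return sum(w * x for w, x in zip(weights, eta_by_lactation))


# Bull A in Table 2 (values already standardized within lactation).
print(round(combine_yield((2164, 2490, 2289))))                        # 2314
print(round(combine_weighted((2.79, 3.09, 3.11), SCS_WEIGHTS), 2))     # 3.02
print(combine_weighted((72, 57, 53), PERSISTENCY_WEIGHTS))             # 63.5
```

The first two results match the combined milk EBV (2314) and combined SCS ETA (3.02) listed for Bull A, and 63.5 is consistent with the published persistency of 64 after rounding.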

Table 2. Example information available on bulls from the Canadian Test-Day Model.

Bull  Herds  Daus.^1  Lact.  TD    Rel^2  EBV^3 Milk  EBV Fat  EBV Protein  ETA SCS^4  ETA Pers.^5
ID                    (no.)  recs         (kg)        (kg)     (kg)                    (%)
A     26     35              384   80     2314        46       76           3.02       64
                      1      279          2164        42       77           2.79       72
                      2      105          2490        50       78           3.09       57
                      3        0          2289        47       74           3.11       53
B     55     72              666   87     1768        70       64           3.19       63
                      1      515          2144        93       78           3.23       69
                      2      151          1659        62       59           3.16       58
                      3        0          1500        56       55           3.27       55
C     56     90             2386   90     2288        54       77           3.30       69
                      1     1196          2471        55       84           3.43       79
                      2      813          2538        63       85           3.24       59
                      3      377          1857        43       62           3.42       59
D     57     64              999   88     1603        42       52           2.75       66
                      1      434          1408        34       49           3.02       75
                      2      321          1699        50       55           2.68       58
                      3      244          1702        41       53           2.58       55

^1 Daus. = Daughters.
^2 Rel = Reliability percentage.
^3 EBV = Estimated breeding value.
^4 ETA SCS = Estimated transmitting ability somatic cell scores.
^5 Pers. = Persistency evaluation in percentage.

In the future, Canada expects that fat and protein yields may not be available for all test-day yields, and in the past records of protein yields were not always collected. Therefore, the reliability of all yield traits should be based on the trait with the most limited information, i.e., protein yield. Thus, reliabilities should be conservative approximations. The approximation procedure is similar to that described by Graser and Tier (3). Recall that G is the additive genetic covariance matrix of order 36. Let $H_k$ be the diagonal block of the mixed model equations for animal k, which consists of the information from each test-day record on the animal, referred to as $Z_k' R^{-1} Z_k$, plus $G^{-1}$. Absorption of the permanent environmental effect of the animal is necessary; hence calculate
$$
C_k = \left( H_k - (Z_k' R^{-1} Z_k)(Z_k' R^{-1} Z_k + P^{-1})^{-1}(Z_k' R^{-1} Z_k) \right)^{-1}.
$$
Now extract submatrices of $C_k$ and of G of order 9 by 9 corresponding to the elements dealing only with protein yield regression coefficients, and let them be denoted by $C_{kp}$ and $G_p$, respectively. Form the following matrix
$$
M =
\begin{pmatrix}
305 & 46{,}665 & 19.5042 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 305 & 46{,}665 & 19.5042 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 305 & 46{,}665 & 19.5042
\end{pmatrix},
$$
then calculate $C_r = M C_{kp} M'$ and $G_r = M G_p M'$, which will be of order 3 by 3. Finally, let $f' = (0.33333 \;\; 0.33333 \;\; 0.33333)$, then $g_r = f' G_r f$ and $c_r = f' C_r f$ are scalars representing the averaging of the three lactations. An effective number of progeny, $n_e$, can be derived as
$$
n_e = \lambda \left[ \left( \frac{g_r}{c_r} \right) - 1 \right],
$$
where $\lambda = (4 - h^2)/h^2$. This quantity is calculated for every animal with test-day records, and for ancestor animals this quantity starts at zero. M and f′ are defined differently for somatic cell score and persistency. Next, a selection index is constructed based on the number of effective maternal and paternal half-sibs and on the number of effective progeny of the animal in question. The accuracy from this index is used as the reliability. In a single-trait situation, the reliabilities from this approximation procedure were highly correlated with the diagonals of the inverse of the mixed model equations (12). For the CTDM the approximate reliabilities were compared to the exact reliabilities on a small data set, and were found to agree closely (9). This approximation takes into account the number of test-day records per individual per lactation, the actual days in milk when they were recorded, the accuracy of each test-day record, and relatives’ information through the relationship matrix.
Economic Indexes
Canada has the Total Economic Value index (TEV) for commercial producers, which includes production,
longevity, and mastitis subindices, and the more popular Lifetime Profit Index (LPI) which includes only production and conformation subindices. Research is needed on modifying the production subindex in both TEV and LPI to utilize three separate EBV per yield trait and persistency evaluations. Given that the CTDM can distinguish among bulls on their persistency both within and across lactations, these traits may be more highly related to longevity than any combined EBV. The test-day record database should also be able to provide culling rates on bulls’ daughters that were not readily available in the past. EXPERIENCES Publication Criteria The publication criteria for bulls do not change greatly between an animal model and CTDM. Bulls should have 20 daughters with one test-day record beyond 90 DIM, and these daughters should appear in at least 10 herds. A minimum reliability of 60% on protein yield EBV is also required. For breeds other than Holstein, the minimums are 10 daughters with tests after 90 DIM in at least five herds. The publication standards for cows has changed considerably. A cow must have at least two supervised testday records with protein yields, and one of those must be after 60 DIM. Of the last two test-day records, at least one must be supervised, and the average interval between test-day records in the current lactation must not be greater than 50 d. A minimum reliability of 30% on protein yield EBV is also required. These are the requirements for official publication of cows on top lists or official pedigrees in a breed, but all cows receive EBV, and these are delivered to the herd owners for their own management usage. Breed associations may have additional requirements for their awards programs. These criteria can readily be applied to cows that change herds or that are imported into Canada. 
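The publication criteria listed above amount to a small set of rules, which can be sketched as below. This is an illustration only: the record layout and function names are hypothetical, the average interval is computed here as the DIM span divided by the number of gaps, and the 60% reliability minimum is assumed to apply to all breeds because the text does not state a separate value for breeds other than Holstein.

```python
from dataclasses import dataclass


@dataclass
class DayRecord:
    dim: int            # days in milk at the test
    supervised: bool    # supervised test day
    has_protein: bool   # protein yield recorded


def bull_publishable(n_daughters, n_herds, reliability_pct, holstein=True):
    """Daughters counted are those with a test-day record beyond 90 DIM."""
    min_daughters, min_herds = (20, 10) if holstein else (10, 5)
    return (n_daughters >= min_daughters
            and n_herds >= min_herds
            and reliability_pct >= 60)  # assumed for all breeds


def cow_publishable(records, reliability_pct):
    """Apply the cow criteria to the records of the current lactation."""
    recs = sorted(records, key=lambda r: r.dim)
    qualifying = [r for r in recs if r.supervised and r.has_protein]
    if len(qualifying) < 2 or not any(r.dim > 60 for r in qualifying):
        return False
    if not any(r.supervised for r in recs[-2:]):
        return False
    if len(recs) > 1:
        average_interval = (recs[-1].dim - recs[0].dim) / (len(recs) - 1)
        if average_interval > 50:
            return False
    return reliability_pct >= 30


february = [DayRecord(15, True, True), DayRecord(45, True, True),
            DayRecord(75, True, True)]
may = february + [DayRecord(d, False, True) for d in (105, 135, 165)]
august = may + [DayRecord(195, True, True)]
print(cow_publishable(february, 35), cow_publishable(may, 35),
      cow_publishable(august, 35))
```

The three calls correspond to a cow that is publishable at 75 DIM, loses that status after three unsupervised tests, and regains it after a further supervised test.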
Under these standards, a cow could appear on an official list in February at say 75 DIM, followed by three unsupervised tests such that the cow was not publishable in May, followed by one or two supervised tests so that she was publishable again in August. Another cow might have unsupervised test-day records throughout lactations 1 and 2, but in third lactation its records are supervised, and now its EBV are publishable because the standards just are applied to the current lactation. A goal of the CTDM was to allow milk recording to have greater flexibility in the types of recording schemes that it offered to producers. Many producers found the monthly supervised scheme too costly and invasive of their time. At the same time, many breeders did not want to see official lists that included cows
that were not supervised at least partially during the lactation. Therefore, the possibility exists for a cow to have 10 unsupervised test-day records with a reliability greater than 30% and a very high EBV, but which would not appear on an official publication list. Producers can decide which recording scheme fits their budget and if having their cows appear on a publishable list is important, they will have to ask for supervised testing at least every other test day. At present the recording schemes chosen by herd owners has not changed much due to the CTDM. A new nationwide recording scheme is being adopted slowly. As mentioned previously, only test-day records from cows first calving since 1988 (later in some regions) and only test-day records from the first three lactations were used in the CTDM. Consequently, the CTDM excluded cows that completed their third lactation before 1990 and that were still alive and producing milk. These cows could possibly have daughters and would therefore receive EBV for all traits. Because they do not have any test-day records, supervised or unsupervised, they would not be eligible for official publication lists, but they may have appeared on official lists prior to CTDM, and may be worthy as bull dams. Therefore, a ‘blending’ algorithm was developed to merge their EBV based on a lactation model and on completed 305-d lactation records with calving dates up to March 1, 1994, with their test-day model EBV. If they were officially listed previously, then they would continue to appear on official lists until they were no longer alive. These cows should eventually disappear, and the blending can be terminated. Comparison to Previous Lactation Model Several test runs of the CTDM were made prior to February 1999 to prepare the industry for the change and to become comfortable with the changes in rankings of animals that would inevitably occur. 
In November 1998, a comparison between the test-day model combined EBV and the official 305-d lactation EBV found correlations between the two to be 0.97 (0.90 for somatic cell score), based on over 4200 Holstein bulls. The results for all breeds are shown in Table 3.

Table 3. Correlations between lactation model EBV^1 and Canadian Test-Day Model EBV for bulls only from November 1998.

Breed          No.    Milk EBV   Fat EBV   Protein EBV   No.    SCS
Holstein       4293   0.970      0.971     0.972         4186   0.903
Ayrshire        328   0.934      0.929     0.925          308   0.893
Jersey          196   0.988      0.986     0.983          189   0.915
Brown Swiss      84   0.882      0.915     0.892           55   0.845
Guernsey         58   0.945      0.956     0.956           53   0.806
Canadienne       21   0.946      0.906     0.904          ...   ...
M. Shorthorn     13   0.902      0.920     0.914          ...   ...

^1 EBV = Estimated breeding value.

Correlations of the separate lactation EBV from the test-day model with the November official EBV were slightly lower than 0.97, as seen in Table 4. These results, plus estimated genetic correlations between lactations of 0.76 and the fact that shapes of lactation daily yields are very different between lactations, indicate that the three lactations are distinctly different traits. The maximum increases and maximum decreases in combined protein EBV between the November official 305-d lactation EBV and the November 1998 test-day
model EBV are given in Table 5. Thus, the change from a lactation model to a test-day model could be very dramatic for some bulls. Although average changes in EBV and correlations indicated good agreement between models, the listings of the top bulls showed more significant re-rankings. In a May 1998 test run, for example, six new Holstein bulls appeared in the top 10, 22 in the top 50, and 31 in the top 100 compared to the results from the official lactational model. Unfortunately, these are the bulls in which producers are primarily interested, and which are marketed in other countries. Changes in the rankings of these bulls raise more suspicions about a genetic evaluation system than any shifts in rankings below the top 100. Good explanations and detailed information about these bulls are necessary to defuse the suspicions. While the general overall results were reassuring, the true test of confidence by producers is based on the top bulls and good answers are needed on these bulls.
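As a rough illustration of why a high overall correlation can coexist with noticeable re-ranking among the top bulls, the following Python sketch computes the kinds of summary statistics reported in Tables 3 and 5 (correlation, maximum decrease and increase, counts of absolute changes above 10 and 15 kg) together with the number of new entrants in a top-N list. The bull identifiers and EBV values are invented for illustration and are not taken from the Canadian data. The function names are hypothetical, and the equal-weight combination of the three lactation EBV follows the footnote to Table 4.

```python
from math import sqrt


def combined_ebv(lact1, lact2, lact3):
    """Equal-weight combination: (Lactation 1 + Lactation 2 + Lactation 3) / 3."""
    return (lact1 + lact2 + lact3) / 3.0


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def top_n(ebv, n):
    """Identifiers of the n bulls with the highest EBV."""
    return set(sorted(ebv, key=ebv.get, reverse=True)[:n])


def compare_models(old, new, n_top, thresholds=(10, 15)):
    """Summarize changes in EBV (new minus old) for bulls present in both runs."""
    ids = sorted(old.keys() & new.keys())
    old_common = {i: old[i] for i in ids}
    new_common = {i: new[i] for i in ids}
    diffs = [new_common[i] - old_common[i] for i in ids]
    return {
        "correlation": pearson(list(old_common.values()), list(new_common.values())),
        "max_decrease": min(diffs),
        "max_increase": max(diffs),
        # number of bulls whose absolute change exceeds each threshold (kg)
        "n_over": {t: sum(abs(d) > t for d in diffs) for t in thresholds},
        # bulls in the new top list that were not in the old top list
        "new_in_top": len(top_n(new_common, n_top) - top_n(old_common, n_top)),
    }


# Hypothetical protein EBV (kg) from a lactation model ...
lactation_model = {"A": 60, "B": 58, "C": 55, "D": 40,
                   "E": 30, "F": 20, "G": 10, "H": 0}

# ... and hypothetical per-lactation EBV (kg) from a test-day model.
test_day_by_lactation = {
    "A": (50, 48, 46), "B": (60, 60, 60), "C": (57, 54, 51), "D": (58, 56, 54),
    "E": (30, 33, 36), "F": (20, 19, 18), "G": (12, 11, 10), "H": (2, 0, -2),
}
test_day_model = {bull: combined_ebv(*ebv)
                  for bull, ebv in test_day_by_lactation.items()}

summary = compare_models(lactation_model, test_day_model, n_top=3)
for name, value in summary.items():
    print(name, value)
```

With these made-up values the correlation is about 0.94, yet one of the three top-ranked bulls is replaced and two bulls change by more than 10 kg, which mirrors on a small scale the pattern described above.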
Table 4. Correlations between lactation model EBV and Canadian Test-Day Model EBV for separate lactations from November 1998 for 4293 Holstein bulls only.
Trait          Lactation 1   Lactation 2   Lactation 3   Combined¹
Milk EBV       0.948         0.940         0.914         0.970
Fat EBV        0.952         0.939         0.915         0.971
Protein EBV    0.952         0.948         0.920         0.972
¹Combined = ¹⁄₃(Lactation 1 + Lactation 2 + Lactation 3).
SYMPOSIUM: TEST-DAY MODELS
Table 5. Maximum increases and decreases between Canadian Test-Day Model EBV and lactation model EBV for protein yield from November 1998, and number of bulls with absolute changes greater than 10 or 15 kg.
Breed          No.    Max. decrease (kg)   Max. increase (kg)   No. with > 10 kg   No. with > 15 kg
Holstein       4293   −23                  34                   328                71
Ayrshire        328   −16                  28                    28                13
Jersey          196   −15                  17                    11                 2
Brown Swiss      84   −18                  21                    25                10
Guernsey         58    −5                  18                     8                 1
Canadienne       21    −6                  11                     1                 0
M. Shorthorn     13   −15                   7                     1                 0
Comparison Between CTDM Runs
To study the stability that could be expected in the CTDM between runs, a dataset based on test-day records up to February 1997 was analyzed and compared to the official February 1999 results. Bulls with daughter TD records between 200 and 305 DIM in each lactation were separated according to the lactation number of their oldest 20 daughters in February 1997. The average changes, range of changes, and correlation between February 1997 and February 1999 in protein yield EBV for each lactation and combined EBV are shown in Table 6. Bulls with only first lactation daughters in 1997 had greater changes in their combined EBV and lower correlations with their combined EBV in February 1999. Bulls with daughters in third lactation in February 1997 showed the smallest changes in EBV and highest correlations with February 1999 results. Canada makes four official runs per year for production traits, and therefore, quarterly runs between February 1997 and February 1999 are currently being simulated to study expected changes between runs. The changes shown in Table 6 were smaller than those in Table 5. Thus, the change in models resulted in greater changes in EBV than the addition of two years of test-day records within the test-day model. The CTDM is expected to yield more stable EBV than the lactation model over time.
International Considerations
Canada is heavily involved in the exportation of bull semen throughout the world, and has participated in Interbull activities since the early 1980s. Canada had to submit EBV from a test run (May 1998) of the CTDM and to provide validity tests on estimates of genetic trend to be included in the February 1999 Interbull evaluation of bulls. Interbull did a test run in September 1998 involving 22 countries and 6 breeds. Four countries had made substantive changes to their genetic evaluation procedures, including Canada’s switch to the test-day model. Canada supplied 3884 bull EBV for the Holstein breed compared to 4685 that were sent in August from the lactation model. Interbull re-estimated the sire genetic standard deviations for each trait as well as the genetic correlations among countries. Interbull expected, in September, that Canada would change officially to the test-day model in February 1999. Genetic correlations generally decreased, but not by more than 0.03 for any trait or breed. Genetic correlations between Canada and Germany, however, actually increased because Germany also uses a test-day model, although a fixed regression model.
The number of Canadian bulls in the top 100 lists (expressed on the Canadian scale) decreased in comparison with the August 1998 Interbull run. Canada had about 800 fewer bulls from the CTDM for the test run, and in general, many of the older bulls had fewer daughters than they used to have under the lactation model because test-day records were available for cows first calving from 1988 only. For example, a bull could have dropped from 10,000 daughters to only 10 daughters in the test-day model. This same bull would still have many daughters in other countries, and consequently the Interbull evaluation would be more strongly influenced by the daughters in other countries than by those in Canada. The same bull would have many sons in Canada, whose Interbull evaluations might also be affected. If second-country proofs are deemed to be biased, then this bull’s Interbull evaluation and those of his sons could be biased. The answer to this problem is not clear, but the problem presents a significant marketing challenge for Canadian exporters.
Because Canada plans to have separate EBV for each lactation as official proofs in 2000, from the CTDM,
then Interbull will have to determine how to analyze three protein yield EBV from one country rather than just one. The three protein yield EBV would be highly correlated, so that each could not be considered as an EBV based on a group of independent daughters, as they would be between countries. A multiple-trait extension to current Interbull analyses is technically feasible, but making the appropriate changes and testing programs by September 1999 is likely not possible. Canada may not be able to participate in future international evaluations.
Table 6. Differences between Canadian Test-Day Model protein yield EBV from February 1997 and February 1999 for Holstein bulls only.
Group of bulls¹   No.    Statistic       Lactation 1   Lactation 2   Lactation 3   Combined²
1st               105    Ave. diff.      −2.78         −4.09          0.13         −2.26
                         Max. decrease   −12           −23           −28           −18
                         Max. increase     7            21            21            14
                         Correlation     0.980         0.901         0.863         0.944
2nd               286    Ave. diff.       0.26         −2.53         −2.66         −1.65
                         Max. decrease    −9           −20           −28           −13
                         Max. increase    14            13            17            11
                         Correlation     0.994         0.980         0.927         0.982
3rd               1682   Ave. diff.       1.60          1.30          1.13          1.34
                         Max. decrease   −12           −13           −15           −10
                         Max. increase    19            22            17            15
                         Correlation     0.996         0.994         0.989         0.995
¹Group 1 includes bulls whose daughters have TD records only in lactation 1, Group 2 includes bulls whose daughters have TD records in lactations 1 and 2, and Group 3 includes bulls whose daughters have TD records in all three lactations.
²Combined = ¹⁄₃(Lactation 1 + Lactation 2 + Lactation 3).
CONCLUSIONS
Changing genetic evaluation methods from a 305-d lactation model to a test-day model has many ramifications that need to be considered. Producer acceptance is a key issue that will require good extension efforts. International acceptance also requires extension and coordination with Interbull. The additional benefits that can be obtained from a test-day model, such as persistency within and across lactations, better accounting for herd-test date environments, movement of cows between herds, more flexibility in milk recording schemes, and more accurate genetic selection of bulls and cows, outweigh the short-term disadvantages. The extra computing was not a problem for Canada, but could be too costly for countries with larger dairy cattle populations. New computing algorithms are being explored that would allow more random regression covariates per trait and possibly more lactations to be included. Further efforts are needed to determine the best ways of extracting the full potential from a test-day model analysis.
ACKNOWLEDGMENTS
Financial support from the Ontario Ministry of Agriculture, Food and Rural Affairs, the Cattle Breeding Research Council of Canada, and the Natural Sciences and Engineering Research Council is gratefully acknowledged.