Spatial patterns of tree height variations in a series of Douglas-fir progeny trials: implications for genetic testing
Yong-Bi Fu, Alvin D. Yanchuk, and Gene Namkoong
Abstract: Spatial variation patterns of tree heights at ages 6 to 12 years in a series of Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco) progeny trials conducted on 66 test sites across southern coastal British Columbia were examined with conventional statistics and geostatistical techniques. Tree height varied widely over the years, both within and among the 66 test sites. The estimated proportions of the within-site variance explained by family, row, column, patchiness, and within-plot were on average 11, 7, 5, 12, and 47%, respectively, plus 7% due to unknown factors, and the applied blocking removed about 5% of the within-site variance. Significant gradients in row and column directions were observed in more than 44 test sites, and the estimated slopes ranged on average from 0.33 to 1.52 cm/plot. Patch sizes varied greatly over the test sites and ranged on average from 5.21 to 6.47 plots, indicating that the average patch size for these trials was 18 m across. Temporal variations were large for family variance but small for the variance proportions explained by row, column, patchiness, and within-plot. More gradients and larger patch sizes were found with older trees. Implications of these results are discussed for forest genetic testing.
Received July 16, 1998. Accepted February 11, 1999.
Y.B. Fu¹ and G. Namkoong. Department of Forest Sciences, University of British Columbia, 3041–2424 Main Mall, Vancouver, BC V6T 1Z4, Canada.
A.D. Yanchuk. Research Branch, B.C. Ministry of Forests, 712 Yates Street, Victoria, BC V8W 3E7, Canada.
¹Author to whom all correspondence should be sent at the following address: Plant Gene Resources of Canada, AAFC Saskatoon Research Centre, 107 Science Place, Saskatoon, SK S7N 0X2, Canada.
Can. J. For. Res. 29: 714–723 (1999). © 1999 NRC Canada

Introduction
Progeny trials in forest trees have proceeded for several decades and will continue to play a vital part in forest tree breeding, as these field trials provide valuable means to evaluate and select desirable individuals and populations for various breeding objectives (Namkoong et al. 1988; Fu et al. 1998). Analyses of previous progeny trials (Magnussen 1993a), however, indicated that genetic field trials of forest trees had not been free from technical and practical problems such as missing data, damage, outliers, microsites, competition, and genotype by environment interaction. These problems have made the originally proposed data analyses more complicated and the employed field designs less efficient than previously thought. Concerns have arisen among many tree breeders about the effectiveness of current or future progeny trials. In the last decade, efforts to explore and (or) develop field designs with small blocking (e.g., McCutchan et al. 1985; Loo-Dinkins and Tauer 1987; Williams and Matheson 1994) and to incorporate spatial correlation in data analysis (e.g., Magnussen 1990) have slowly emerged. Gradually, the importance of understanding site variations for genetic testing in forest trees has been recognized, particularly in the development of new field designs (McCutchan et al. 1985; Loo-Dinkins 1992; Magnussen 1993a). If a site exhibits a strong environmental gradient in a given direction, for example, the field layout could be arranged to reduce the experimental error (Williams and Matheson 1994). If patch sizes are expected to be large, block size could be adjusted to remove more site variation (Fu et al. 1998). Also, if the spatial patterns of site variation are known, one could confidently formulate or select the proper spatial models in the analyses of the trial data (Magnussen 1990). However, a priori information on site variation is generally scarce, particularly with respect to forest genetic trials (Fu et al. 1998), and few studies have been made to characterize the variation patterns over test sites (Magnussen 1990).

Fig. 1. Schematic variogram, illustrating the range, nugget, and sill. See text for definitions.

Site variation is the norm in forest genetic field trials, as the areas used are usually quite large (2–4 ha) and are often on slopes or terrains where environmental gradients (e.g., soil depth, drainage) and patchy microsite patterns in forest soils exist. Characterizing variation patterns for the sites to be used for trials, however, presents great challenges, as it requires detailed information on the sites in terms of site topography, soil properties, water levels, etc., and on their relations to growth patterns of various trees, all of which are difficult to obtain. Also, information from direct assessments of the sites to be used may not always be reliable for long-term predictions, as site variations are dynamic over time. Moreover, site variations may be site specific, and the information on variation patterns of one test site may not be applicable to the other sites of interest, which makes such characterizations less appealing before starting a trial or even for monitoring over time.
In this study, we take an indirect approach to characterize the patterns of spatial variation that have been displayed in existing progeny trials.
These patterns, particularly if characterized from a large number of test sites, should be useful for the development of forest genetic field trials in general. These characterizations should also provide information useful for assessing the performance of previous field designs in terms of design efficiency and for assisting efficient data analyses. To this end, we selected a series of Douglas-fir
(Pseudotsuga menziesii (Mirb.) Franco) progeny trials that were conducted on the 66 test sites widely distributed in southern coastal British Columbia (Heaman 1978; Yanchuk 1996). The residuals extracted from the observations of tree height over years 6–12 were examined using geostatistical techniques, namely variography to characterize patch variations and the median-polishing method to infer marked gradients over test sites (Cressie 1991).
Geostatistics, originally used in the mining industry (Matheron 1963), has proven useful for characterizing and mapping spatial variations in many other scientific fields (Cressie 1991) but has received less attention in forestry (Moeur 1993; Liu and Burkhart 1994; Clarke et al. 1997). Geostatistics consists of variography and kriging. Variography uses variograms to characterize and model the spatial variance of samples, whereas kriging uses the modeled variance to predict values between samples. A variogram expresses the variation between plots as a function of the distance separating them; it is defined below and illustrated in Fig. 1. As the distance between plots (called the lag) increases, a variogram increases in value, meaning that the values among closely spaced plots are more highly correlated than those among more distant plots. A variogram will tend to reach a maximum, or sill, at a distance, or range, beyond which the values of plots are independent. The value of the variogram at zero lag is usually called the nugget (or the magnitude of within-plot variance, as in a progeny trial); it represents unsampled spatial variation within plots, as well as the measurement error and “white noise.” Sill minus nugget can also be interpreted as the patch variation among correlated plots in a progeny trial. Clearly this technique can characterize small-scale stochastic structure such as patchiness in terms of patch size (range) and patch variation.
For the determination of large-scale deterministic structure such as gradients, the median-polishing technique can be applied (Cressie 1991).
The immediate objectives of this study were to (i) calculate the average proportions of within-site variance explained by family, row, column, patchiness, and within-plot effects, and the average proportion of within-site variance removed by the blocking used; (ii) evaluate the proportions of the test sites showing significant gradients in row and column directions; (iii) determine the average patch size; and (iv) provide empirical evidence for justifying the use of incomplete block designs in forest progeny trials.
Materials and methods

Progeny trial and test site
The Douglas-fir progeny trials are regarded as a typical effort in British Columbia tree improvement programs. The trials were established from 1976 to 1986 on 88 test sites that were widely distributed in southern coastal areas (Fig. 2), with the aim of evaluating genetic variances and breeding values for coastal Douglas-fir in British Columbia. They included eight series of six-parent-tree disconnected half-diallel tests carried out over 10 years (Heaman 1978; Yanchuk 1996). Each of these eight series was conducted on 11 different forest sites (Fig. 2), with each of about 150 full-sib families represented by four-tree row plots in four replicates on each site. Crosses were fully randomized within replicates (i.e., diallels were not blocked in replicates). Measurements of tree height and breast-height diameter were made twice over the ages of 6 to 12 for most of the sites. In this analysis, only six of the eight series (i.e., a total of 66 test sites) were used, as complete height measurements up to age 12 and data on rows and columns were available for these sites.

Fig. 2. Distribution of the 66 test sites across southern coastal British Columbia in eight test series of the Douglas-fir progeny trials.
Classical statistical analysis
For each of the 66 sites, normality of the 2000–3000 observations of tree height at the given age was examined by kurtosis and skewness tests (Snedecor and Cochran 1967). All the traits measured were normally distributed (results not shown). These observations were then subjected to an analysis of variance components for family, row, and column, from which the proportions of the within-site variance explained individually by the family, row, and column factors were estimated. In this analysis, only the family, row, and column factors were considered, and all were treated as random. Another analysis of variance components was also conducted, considering family, replicate, row, and column as random factors, to determine how much of the within-site variance the applied blocking (i.e., replicate) had removed. Note that replicates in this trial were arranged on a test site mainly following the direction of arbitrarily defined rows and could be treated as superimposed blocks in the analysis. Also note that family and block were not orthogonal to row and column, and such non-orthogonality could magnify the variation patterns observed in some of the test sites. These analyses of variance components were done using the SAS® VARCOMP procedure (SAS Institute Inc. 1995) and repeated for the 66 test sites for height measured at two ages.
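The skewness and kurtosis checks mentioned above can be illustrated with a short Python sketch. It assumes the common large-sample approximations for the standard errors, sqrt(6/n) for skewness and sqrt(24/n) for excess kurtosis; the exact form of the tests used in the study is not stated in the text, and the function name `skewness_kurtosis_z` is hypothetical.

```python
import math


def skewness_kurtosis_z(values):
    """Return (g1, g2, z_skew, z_kurt) for one site's height observations.

    g1 is the moment coefficient of skewness and g2 the excess kurtosis.
    The z scores use large-sample standard errors (an assumption of this
    sketch, not necessarily the form of the tests used in the study).
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    g1 = m3 / m2 ** 1.5
    g2 = m4 / m2 ** 2 - 3.0
    z_skew = g1 / math.sqrt(6.0 / n)
    z_kurt = g2 / math.sqrt(24.0 / n)
    return g1, g2, z_skew, z_kurt
```

With 2000–3000 observations per site, |z| values well below about 2 for both statistics would be consistent with approximate normality under these assumptions.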
Spatial analysis of residuals
For spatial analyses, residuals of tree height were used, as the environmental variations within the sites are of main concern. Thus family means were calculated and removed from each observation of the family member, which was done with the SAS® GLM procedure (SAS Institute Inc. 1995). The resulting residuals, however, should still retain about half of the genetic variation, as within-family genetic variation cannot be effectively removed from such full-sib progeny data. These residuals could also be confounded with block and diallel effects. Figure 3 illustrates the distributions of the residuals at 11-year height over the three example test sites (CASS42, BOWS46, CYRC50).

Fig. 3. Illustration of spatial patterns of the residuals of tree height at age 11 at the three test sites CASS42, BOWS46, and CYRC50. The size of the dots represents the magnitude of the residuals present in the plot; the larger the dot, the bigger the residual. Note that a positive constant was added to the residuals so that all values were greater than zero, and the values were rescaled to 10 intervals; this adjustment was made only for illustration of the variation pattern, not in the analysis.

Residuals of a given site were first analyzed with a median-polishing technique to (i) obtain the medians of the residuals for each row and column for the analyses of the trends (or gradients) and (ii) generate detrended residuals by removing these medians from rows and columns for the analyses of the small-scale spatial structure. Note that medians rather than means were used here because the former is more robust than the latter to unequal numbers of observations among rows or columns. In this study, median data were generated with a program written in SAS® IML (SAS Institute Inc. 1995) and examined using the SAS® REG procedure (with linear regressions) to determine how many sites exhibited significant linear trends over row and column directions. As the trends in various sites may not always be linear, we also plotted the data and visually inspected the trend patterns. For ease of summary, the visual patterns were roughly described with several classes: L, linear trend; R, randomly distributed; M shaped; N shaped; U shaped; and V shaped. The M-shaped class also covers any W-shaped pattern observed. It should be noted that, for simplicity, we considered the detrending only by rows and columns, not in other orientations. Such median-polishing may not capture all of the large-scale deterministic structure.
Detrended residuals of a given site were analyzed using a
variogram to obtain sill, nugget, and range to characterize patch variations of each site. The main idea of the variogram technique is first to obtain experimental variances as a function of distance between plots and then to fit these experimental variances with a theoretical variance model, so that sill, nugget, and range can be estimated. The experimental variance of the detrended residuals as a function of distance is defined as

[1]  \[ 2\gamma(h) = \frac{1}{m(h)} \sum_{i=1}^{m(h)} \left[ Z(x_i) - Z(x_i + h) \right]^2 \]

where 2γ(h) is the variance for m(h) pairs of the residuals separated by a distance of h, known as a lag, and Z(x_i) and Z(x_i + h) are the values at positions x_i and x_i + h, respectively (Matheron 1963). Plotting the variance over the lag distance (h) generates a graph, usually called a variogram (Fig. 1). From this variogram, one can see the relationship between the sample variance and the lateral distance separating samples. As mentioned above, the lag distance where the variance approaches an asymptotic maximum, known as a sill, is the range across which data are spatially correlated. As the lag distance approaches zero, the variance usually approaches a finite value, called the nugget. The nugget represents the residual variation that cannot be removed by close sampling and is the within-plot variance as in progeny trials with single-tree plots, as mentioned above.
Typically, four variograms are constructed in the orientations of N–S, E–W, NW–SE, and NE–SW. These variograms are expected to be the same when there are no large-scale deterministic structures present in any orientation. This explains why a detrending treatment of the data is preferred before generating a variogram for analyses of patch variation.
The variogram generated from data is actually an estimate of the local variogram in a finite geometric field, usually called the experimental variogram. The experimental variogram is in turn an approximation of the theoretical variogram, which is defined by an infinite number of pairs over an infinite field. The expected value of both the experimental and local variogram is the theoretical variogram. Thus, one could generalize observed spatial patterns by fitting various theoretical spatial models to the experimental variogram with regression techniques. The theoretical spatial models that are frequently examined are spherical, exponential, Gaussian, and hole effect models, but the spherical model (also called the spherical covariance model) is the most commonly used one in geostatistics (Cressie 1991). Specifically, the spherical covariance model describes the spatial pattern as

[2]  \[ 2\gamma(h) = \begin{cases} c_0 + c_1 \left( \dfrac{3}{2}\,\dfrac{h}{\alpha} - \dfrac{1}{2}\,\dfrac{h^3}{\alpha^3} \right), & h \le \alpha \\[1ex] c_0 + c_1, & h > \alpha \end{cases} \]
where c0 is the nugget, c1 the patch variance, c0 + c1 the sill or maximum value of the variogram, h the lag distance, and α the range or maximum distance over which the residuals are correlated. Fitting variograms to spatial models is a bit of an art, and the fitting procedures are usually performed with one of several weighting methods such as least squares (LS), weighted LS, nugget LS, variance LS, and nugget and variance LS. Details on the fitting can be found in geostatistical books (e.g., Cressie 1991).
In this study, we analyzed the detrended residuals using the UNCERT 1.20 program (Wingle et al. 1995) operating on a UNIX system, mainly because this program can handle the large data sets in this study (it is free software and can be retrieved from the anonymous ftp site at uncert.mines.edu/pub/uncert). An isotropic variogram for a test site was generated with the VARIO procedure using the distance of one plot (3-m spacing) as the minimum lag. The basic procedure to generate a variogram was, for a given lag (say, one plot length), to take all pairs of values separated by that lag irrespective of direction, calculate the squared difference for each pair, sum the squared differences over all pairs, and divide the sum by the number of pairs, which gave the value of the variogram for this lag. This procedure was repeated for other lag distances (say two plots, three plots, four plots, and so on). These variogram values versus lag distances were plotted to produce the experimental variogram. In this study, only lags up to two thirds of the maximum plot distance for a test site (i.e., a maximum lag distance of 50 plots) were considered because more than 10 000 pairs of values could be randomly sampled for each of these lags. The observed variogram was fitted using the VARIOFIT procedure with the spherical covariance model mentioned above and the nugget and variance LS weighting method. This weighting method was used for fitting because nugget and sample variance differ among the test sites.
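The variogram procedure described above can be sketched in a few lines of Python. This is an illustration only, not the UNCERT VARIO or VARIOFIT code: the function names are hypothetical, the residuals are assumed to sit on a regular row-by-column grid with one value per plot, and pairs are binned by their Euclidean distance rounded to the nearest plot unit, which is one possible way of pooling directions into an isotropic variogram.

```python
import math


def experimental_variogram(grid, max_lag):
    """Isotropic experimental variogram 2*gamma(h) for lags 1..max_lag.

    grid is a list of rows; each cell holds a detrended residual, or None
    for a missing plot. Following Eq. 1, the value for a lag is the mean
    squared difference over all pairs binned to that lag. Binning by
    rounded Euclidean distance is an assumption of this sketch.
    """
    cells = [(r, c, v)
             for r, row in enumerate(grid)
             for c, v in enumerate(row)
             if v is not None]
    sums = [0.0] * (max_lag + 1)
    counts = [0] * (max_lag + 1)
    for i, (r1, c1, v1) in enumerate(cells):
        for r2, c2, v2 in cells[i + 1:]:
            lag = round(math.hypot(r1 - r2, c1 - c2))
            if 1 <= lag <= max_lag:
                sums[lag] += (v1 - v2) ** 2
                counts[lag] += 1
    return {h: sums[h] / counts[h]
            for h in range(1, max_lag + 1) if counts[h]}


def spherical_model(h, nugget, patch_var, rng):
    """Spherical model of Eq. 2: nugget c0, patch variance c1, range alpha."""
    if h <= rng:
        ratio = h / rng
        return nugget + patch_var * (1.5 * ratio - 0.5 * ratio ** 3)
    return nugget + patch_var
```

Fitting would then amount to choosing `nugget`, `patch_var`, and `rng` so that `spherical_model` tracks the experimental values, for example by weighted least squares. The all-pairs loop is quadratic in the number of plots, so for sites with thousands of plots a random subsample of pairs per lag, as mentioned in the text, would be the practical choice.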
From the fitting, the range, sill, nugget, and patch variance (c1) were obtained. To assess the fitting, visual inspections were also made on the fitting graphs. Dividing c1 by the total sample variance for a test site gave the proportion of the within-site variation explained by patchiness. This was done for the height at two ages for each of the 66 test sites.

Table 1. Summary of the means and SDs of conventional statistics and geostatistical descriptions of variations in tree heights with five age-classes for different factors.

                                    Age-class (number of the sites measured per age-class)
Factor                              6 (33)          7 (33)          9 (12)          11 (32)         12 (22)         Overall*
Site mean (cm)                      220±63          243±56          464±100         608±142         636±112         325±35
Site variance (cm²)                 3559±1792       4723±2298       8638±1338       16 590±6512     18 265±6659     6903±951
Family variance (%)                 8.4±3.7         10.8±4.0        11.8±4.8        10.1±4.7        13.7±5.5        10.5±2.0
Row variance (%)                    6.7±5.8         5.7±2.8         10.4±8.1        6.7±7.7         7.7±5.3         6.5±2.1
Column variance (%)                 5.1±4.9         3.5±3.1         9.9±9.4         7.1±8.4         5.0±4.5         4.7±2.1
Patch variance (%)                  10.7±5.3        13.0±11.9       12.7±4.5        10.8±5.9        12.8±4.9        11.9±2.5
Plot variance (%)                   61.5±10.3       61.1±12.6       54.0±8.8        59.5±10.7       56.0±8.0        57.7±14.3
Block variance (%)                  6.7±6.8         4.0±3.6         9.4±9.1         5.3±6.4         5.4±8.0         5.2±2.6
Coefficient for rows (cm/plot)†     0.65±0.69 (24)  0.51±0.39 (25)  1.52±1.08 (11)  1.46±1.43 (27)  1.06±0.72 (19)  0.73±0.29
Coefficient for columns (cm/plot)†  0.40±0.33 (24)  0.42±0.50 (23)  0.62±0.49 (8)   0.96±1.02 (23)  0.73±0.61 (16)  0.52±0.22
Patch size (plots)‡                 5.97±2.26 (26)  5.21±1.79 (27)  6.47±2.87 (12)  5.97±1.86 (22)  5.69±1.83 (16)  5.76±0.91

*The means and SDs for all the ages measured are obtained with the minimum variance mean and minimum variance estimators, respectively, as
\[ \sum_{i=1}^{5} \frac{m_i}{v_i} \Bigg/ \sum_{i=1}^{5} \frac{1}{v_i} \quad \text{and} \quad 1 \Bigg/ \sum_{i=1}^{5} \frac{1}{v_i} \]
where m_i and v_i are the mean and variance for age-class i (= 1 … 5).
†The value in parentheses is the number of the test sites that showed a significant (P < 0.05) regression of the medians over rows or columns.
‡The value in parentheses is the number of the test sites that showed an obvious visual fit with the spherical covariance model and were used for the calculations of the mean and SD.
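The "Overall" column of Table 1 combines the five age-class summaries by inverse-variance weighting, as stated in the table footnote. The short Python sketch below reproduces the site-mean entry under two assumptions that the footnote does not spell out: that v_i is the square of the SD reported for each age-class, and that the reported overall SD is the square root of the minimum variance. The function name `minimum_variance_summary` is hypothetical.

```python
import math


def minimum_variance_summary(means, sds):
    """Inverse-variance weighted mean and its SD across age-classes."""
    # Weight each age-class by 1 / v_i, with v_i taken as SD squared (assumed).
    weights = [1.0 / sd ** 2 for sd in sds]
    total = sum(weights)
    mean = sum(w * m for w, m in zip(weights, means)) / total
    # The minimum variance is 1 / sum(1 / v_i); its square root is the SD.
    return mean, math.sqrt(1.0 / total)


# Site mean (cm) row of Table 1, age-classes 6, 7, 9, 11, and 12.
site_mean, site_sd = minimum_variance_summary(
    [220, 243, 464, 608, 636], [63, 56, 100, 142, 112])
print(round(site_mean), round(site_sd))  # prints 325 35
```

The result agrees with the tabulated overall site mean of 325±35 cm, which supports this reading of the footnote for that row.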
Temporal comparison of variation in height
As tree height was measured twice for each site over the ages of 6–12 years (i.e., 12 sites at ages 6 and 9; 21 sites at ages 6 and 11; 11 sites at ages 7 and 11; 22 sites at ages 7 and 12), there were five age-classes available (age 6 with 33 sites, age 7 with 33 sites, age 9 with 12 sites, age 11 with 32 sites, and age 12 with 22 sites) to make temporal comparisons of variation patterns in tree height. Such comparisons, although preliminary, provide useful information on the temporal stability of spatial variation.
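To close the methods, the median-polishing step used in the spatial analysis of residuals can be illustrated with a minimal Python sketch. It follows the general Tukey-style procedure of alternately sweeping row and column medians out of a two-way table, and it assumes a complete rectangular table with no missing plots. The function name `median_polish` and the fixed number of sweeps are illustrative choices; this is not the SAS IML program used in the study.

```python
from statistics import median


def median_polish(table, n_iter=10):
    """Median polish of a rectangular table given as a list of rows.

    Returns (overall, row_effects, col_effects, residuals). The row and
    column effects play the role of the row and column medians examined
    for trends; the residuals are the detrended values.
    """
    resid = [list(row) for row in table]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the median of each row into the row effects.
        for r in range(n_rows):
            med = median(resid[r])
            row_eff[r] += med
            resid[r] = [v - med for v in resid[r]]
        # Re-centre the column effects on the overall level.
        shift = median(col_eff)
        overall += shift
        col_eff = [e - shift for e in col_eff]
        # Sweep the median of each column into the column effects.
        for c in range(n_cols):
            med = median(resid[r][c] for r in range(n_rows))
            col_eff[c] += med
            for r in range(n_rows):
                resid[r][c] -= med
        # Re-centre the row effects on the overall level.
        shift = median(row_eff)
        overall += shift
        row_eff = [e - shift for e in row_eff]
    return overall, row_eff, col_eff, resid
```

Regressing `row_eff` on row position and `col_eff` on column position would then give slopes comparable in spirit to the coefficients for rows and columns reported in Table 1, while `resid` would be the input to the variogram analysis.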
Results
Our analyses generated detailed descriptions of the spatial variation patterns exhibited in tree height over the five age-classes from 6 to 12 on the 66 test sites of the Douglas-fir progeny trials. The descriptions include the site mean (cm) and variance; the percentages of the within-site variance due to family, row, column, patchiness, and within-plot; linear regression coefficients for rows and columns and the visual interpretations of trend patterns; and the patch size and the fitting of the spherical covariance model. We summarize these results in Table 1, and the detailed results for each site are available from the first author upon request.
From Table 1, it is clear that site mean and variance, as well as the variation explained by family, row, and column, varied greatly among the test sites. Also clear is that the proportions of the within-site variance that could be explained by family, row, column, patchiness, and within-plot were on average 8.4–13.7, 5.7–10.4, 3.5–9.9, 10.7–13, and 54–61.6%, respectively, depending on the age measured. Note that the proportion explained by within-plot was partly confounded with the within-family genetic variance, which is approximately equal to the proportion due to family (i.e., 8.4–13.7%). Thus the real proportion of the within-site variance explained by within-plot would probably be in the range of 46–48%. Roughly speaking, for these trials about 11% of the within-site variance was due to family (i.e., 22% to genetic factors); 7%, to rows; 5%, to columns; 12%, to patchiness; 47%, to within-plot; and 7%, to unknown factors. However, the proportion of the within-site variance explained by the current blocking ranged on average from 4 to 9.4% (depending on the age measured) and was 5.2% for all the ages combined.
There were 49 and 47 test sites showing significant (P < 0.05) linear gradients over rows and columns, respectively, for the height at ages 6 and 7, and 57 and 47 test sites displaying significant linear trends over rows and columns, respectively, for the height at ages 9–12 (Table 1). The linear regression coefficients averaged over the sites for rows and columns were 0.51–1.52 and 0.40–1.02 cm/plot, respectively, depending on the ages of measurement (Table 1). A linear coefficient of 0.50 or 1.00 here means that a tree of 10 m would have an expected measurement of 10.50 or 11.00 m if the tree was planted 100 rows away. The frequency distributions of estimated slopes for rows and columns for the height at ages 6–7 are given in Figs. 4A and 4B. There were 14 test sites showing significant linear coefficients of 2.00 or larger over rows (results not shown), indicating that there would be a difference of 2 m in tree height between two trees 100 rows apart in these sites. The gradients displayed on the 66 test sites were not always linear in either the row or column direction.
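The slope interpretation given above is a unit conversion. As a quick check, write b for the regression coefficient (cm/plot) and n for the number of rows separating two trees; both symbols are introduced here only for illustration:

\[ \Delta H = b \, n: \qquad 0.50 \times 100 = 50\ \text{cm} = 0.5\ \text{m}, \qquad 1.00 \times 100 = 100\ \text{cm} = 1\ \text{m}, \qquad 2.00 \times 100 = 200\ \text{cm} = 2\ \text{m}. \]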
Examination of the displayed patterns shows that there were 30 test sites displaying linear trends, 9 V shapes, 8 M shapes, 8 N shapes, 7 U shapes, and 4 randomly distributed patterns in the row direction, and 30 test sites with linear trends, 8 V shapes, 13 M shapes, 9 N shapes, 4 U shapes, and 2 randomly distributed patterns in the column direction. Figure 5 illustrates some visual patterns of gradients for rows and columns, where rows and columns both had V shapes in the site CASS42, N shape and linear in BOWS46, and linear and U shape in CYRC50. This is consistent with the patterns shown in Fig. 3. Figure 5 also gives the linear fit of medians for rows and columns.
There were 53 test sites showing an obvious visual fit of the spherical covariance model for the detrended residuals of height at ages 6 and 7 and 50 test sites for the residuals of height at ages 9–12 (Table 1). Figure 6 illustrates the visual fittings of the spherical covariance model for the three test sites (CASS42, BOWS46, CYRC50), where there were reasonable fits over the detrended residuals in CASS42 and CYRC50 but not BOWS46. The reasons why about 24% of the test sites gave no fit with the spherical covariance model are not clear, but large-scale spatial variation in other orientations (other than row and column directions) may be a factor, as shown in the site BOWS46. Considering only the test sites that visually fit the spherical covariance model, the estimates of the range parameter varied greatly among the sites (from 2.7 to 14.5) and were on average from 5.21 to 6.47 (see Table 1), depending on the ages of measurement. Figure 4C shows the frequency distributions of estimated range values for the measurements made at ages 6 or 7. For the whole Douglas-fir trial, the average patch size was roughly 18 m across (6 plots × 3-m spacing).
Temporal patterns of variation in tree height are summarized in Table 1. The means and variances of the test sites and the genetic effects greatly increased with age. Interestingly, the proportions of the within-site variance due to row, column, patchiness, and plot did not change much with respect to age. While gradients became more pronounced when trees grew older (e.g., 0.51 and 1.06 in the row direction for ages 7 and 12 in Table 1), patch sizes increased only slightly (e.g., α̂ = 5.21 and 5.69 for ages 7 and 12; see also Fig. 4). Such a small difference could be due to the short span of ages measured (i.e., 5 years), whereas a typical progeny test of forest trees would last 15–25 years. The difference could be much larger over time and further inflated once competition among trees occurs.

Fig. 4. Frequency distribution of estimated slopes for rows and columns and estimated range values for patchiness from the residuals of tree height at the ages of 6 and 7.

Discussion
Our descriptive statistical analyses of these Douglas-fir progeny trials, combined with geostatistical techniques, clearly demonstrated that there were large variations in tree height at ages 6–12 within and among the 66 test sites. Such variations were caused by genetic and environmental factors, as well as their interactions (specifically plot–environmental interactions). When we examined the variation of a test site by arbitrarily separating related factors into family, row, column, patchiness, and plot, we found that about 11% of the within-site variance was due to family (i.e., 22% to genetic factors), 7% to rows, 5% to columns, 12% to patchiness, 47% to within-plot, and 7% to unknown factors, but the currently applied blocking effectively removed about 5%. Gradients in row and column directions were observed in more than 44 test sites (i.e., two thirds of the test sites used), and the estimated slopes ranged on average from 0.33 to 1.52 cm/plot, depending on the age measured. Range values, estimated with a spherical covariance model, varied greatly over the test sites but were on average from 5.21 to 6.47 plots, indicating that the average patch size for these trials was around 18 m across. Temporal increases were large for site mean, site variance, and family variance but small for the proportions of site variance explained by row, column, patchiness, and within-plot. When trees grew older, more significant gradients were found and larger patch sizes observed. These results have provided empirical evidence that the conventional, large complete blocking cannot accommodate the large environmental variation well on the majority of the sites examined and have also shown that environments exhibited complex patterns that may not be easily modeled with simple gradients and spherical correlations. Such complexities clearly justify the need to develop and implement small incomplete block designs that can better fragment and account for the environmental variations in the progeny trials of forest trees.

Fig. 5. Variation patterns of residual medians of tree height at age 11 over rows and columns at the three test sites CASS42, BOWS46, and CYRC50. Note that the numbers of rows and columns differ for the three test sites.

Characterization of spatial variations with geostatistics
Our spatial analyses of residuals presented here showed that general patterns of patch variation can be characterized in terms of patch variance and size and that various trends over rows and columns can be displayed. These analyses first examined the large-scale deterministic structures with median-polishing methods and then the small-scale stochastic structures with variography, as we are interested in the information on both gradients and patchiness for genetic testing. Such two-step analyses may have an advantage over other spatial analyses in the characterization of spatial variation. For example, Magnussen (1990) assessed the extent of autocorrelations in a jack pine (Pinus banksiana Lamb.) progeny trial planted in three test sites, by computing the correlation coefficients for first-, second-, and third-order neighbours, and found that the first-order correlations could adequately describe the spatial process. Similar studies can also be
found (e.g., Fairfield-Smith 1938; Patterson and Hunter 1983) in which residuals were directly fitted with various spatial variation models without removing gradients. Although the information from these studies is still interesting, it is difficult to apply directly in the development and analysis of forest genetic tests.

Our analyses are adequate for describing the general patterns of variation over many test sites, as we are more interested in questions such as how many of the sites showed a significant linear trend over rows than in whether linear trends truly existed on specific sites. If the analyses were aimed at generating information on specific test sites, such as whether a test site displays spherical or exponential spatial variation, some technical aspects would need more emphasis, particularly the choice of an adequate model for a particular site. This requires a better understanding of the factors or mechanisms influencing site variation (Magnussen 1990), which is rarely possible. For example, although three quarters of the test sites examined (i.e., about 50 sites) showed a visual fit with the spherical model, we do not know with certainty how many of these sites were correctly fitted with the model, as we did not test the fit in detail for each site. It is certain, however, that spatial
Fig. 6. Variograms of the detrended residuals of tree height at different ages, fitted with a spherical covariance model, at the three test sites CASS42, BOWS46, and CYRC50. The dots over lag distance represent the experimental variogram and the fitted line is the modeled variogram. Note that only about two thirds of the maximum distance between plots is shown and that the variance scales on the Y axis differ among variograms.
patterns of variation differ greatly among test sites and detailed characterizations of such patterns for specific sites could provide insights into the dynamics of variation present on test sites.

In our analyses, we treated the derived residuals as truly reflecting environmental variation free of other effects. As mentioned above, this is not strictly true, as they were confounded with the within-family genetic variation, block effects, and partially with diallel effects. The effects of such confounding could be relatively small for describing the general patterns of variation over many test sites but can be quite large for characterizing spatial variation on specific sites. Also, we did not consider outlying values, as there were large numbers of observations for each of these sites, and the effects of outlying values, if present, on the estimates of the range parameter are unknown. Ideally, any outlying values should be removed from the residuals for reliable estimation of the range parameter.

Implications for the Douglas-fir progeny trials

Our results show that 90% of the sites used for the Douglas-fir progeny trials displayed large spatial variations
over years in terms of having gradients in row or column directions and (or) patch variations. Only six test sites showed essentially no trends in row and column directions and very small patch variations (MTFL09, ADAM23, ROON53, FRED59, JORD66, and BACO82). This clearly implies that small blocks should have been applied to the trials planted on the majority of these sites. Averaged over the 66 test sites, about 24% of the site variance could be explained jointly by row, column, and patchiness, whereas the conventional complete blocking explained about 5%. This shows that the randomized complete block designs used in the progeny trials did contribute to reducing site variation in estimating genetic values, but a further 19% could have been removed if more efficient row and column designs, such as latinized row and column designs (e.g., see John and Williams 1998), had been used. Two thirds of the test sites displayed significant linear gradients over rows or columns, and the applied blocking was partially effective in removing gradients. However, blocking in both row and column directions would be preferred as most of the test sites showed linear
trends in row and column directions. Such blocks should have different shapes, as the observed trends were not always linear. Patch variations explained about 12% of the within-site variance observed, and the average patch size for these trials was estimated to be 18 m across. Approximately three quarters of the test sites visually fitted the spherical model. All of these results suggest the importance of considering patch variations in the development of field designs, although it is difficult to remove all the patch variation.

Given such patterns of spatial variation, one would wonder how large the effects of such variation were on the estimation of genetic parameters in these trials. We examined this by looking for relations between estimates of family variance and estimated slopes for rows and columns and between estimates of family variance and estimated range values. Our examination indicated that estimates of family variance generally decreased with increasing gradients and patch variation, although the relations were not statistically significant (results not shown). This might be due to the use of row plots rather than single-tree plots (Magnussen 1993b). Currently, we are examining the effects of the spatial variations on estimating family means and on changes in family rankings.

These results clearly imply that smaller blocks, probably in both row and column directions, should be applied to future progeny trials for possibly higher efficiency in evaluating families or individuals. For example, if the alpha design (see Williams and Talbot 1996) had originally been applied to the trials, a design efficiency of 1.20 (relative to randomized complete block designs) could have been achieved for estimating family means (derived from Table 3 of Fu et al. 1998), when the patch size of 18 m across was considered alone. In the presence of gradients, the design efficiency could have been higher than 1.20.
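The row and column trends and the patch residuals discussed in this section rest on a first detrending step by median polish. The following is only a minimal sketch of that step, assuming a complete rows × columns grid of plot values; the function names `median_polish` and `ls_slope`, the toy grid, and the use of an ordinary least-squares slope over the polished effects are illustrative assumptions, not the procedure actually used in these trials.

```python
from statistics import median


def median_polish(grid, n_iter=10, tol=1e-9):
    """Tukey median polish of a complete rows x columns table.

    Returns (overall, row_eff, col_eff, resid) such that
    grid[i][j] == overall + row_eff[i] + col_eff[j] + resid[i][j].
    """
    resid = [[float(v) for v in row] for row in grid]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals into the row effects.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            resid[i] = [v - row_med[i] for v in resid[i]]
        # Re-centre the column effects; their median moves to the overall term.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep the column medians out of the residuals into the column effects.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Re-centre the row effects in the same way.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


def ls_slope(values):
    """Least-squares slope of values against their position index (per plot)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx


# Hypothetical 5 x 7 grid of plot heights (cm): a row gradient of 1.5 cm/plot,
# a column gradient of 0.5 cm/plot, and one local bump standing in for a patch.
grid = [[100 + 1.5 * i + 0.5 * j for j in range(7)] for i in range(5)]
grid[2][3] += 8.0

overall, row_eff, col_eff, resid = median_polish(grid)
row_slope = ls_slope(row_eff)  # gradient over rows, cm/plot
col_slope = ls_slope(col_eff)  # gradient over columns, cm/plot
```

In this toy grid the polish recovers the two gradients and leaves the added bump in the residual at row 2, column 3, which is the kind of detrended residual that would then be passed to variography.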
Implications for future genetic testing of forest trees

The patterns of spatial variation derived from this large series of Douglas-fir progeny trials may not necessarily be applicable to genetic testing of other forest species. Test sites in southern British Columbia may differ from those elsewhere, and the patterns of expressed microsite variation may also differ for different traits measured and for other tree species. However, some implications of these analyses and results are clear. First, characterization of spatial variation in existing or old test sites with geostatistical techniques is possible for the progeny trials of other tree species, as shown here, and can provide useful information for the development of related progeny trials. Second, some patterns of spatial variation observed in Douglas-fir trials may be considered in the development of future genetic tests of other major tree species in British Columbia, such as Sitka spruce (Picea sitchensis (Bong.) Carr.), as these species grow in similar regions and test sites may be similar. Third, these spatial analyses provide evidence that the scope for improving field genetic testing is still large and indicate that implementing smaller blocks in row and (or) column directions in forest genetic trials can be more efficient than previously thought. Fourth, patch variation can be important, as shown here, but it is difficult to remove even with small blocks. Consideration of patch variation in the development of effective field tests (such as by adjusting block sizes) and inclusion of a covariance model (such as the spherical covariance model) in analyses of test data should be given due attention.
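To make the last point concrete, the sketch below shows what including a spherical covariance model in an analysis could look like at its simplest: a covariance function of distance and the resulting plot-by-plot covariance matrix. It is a hedged illustration only. The function names, the unit-spaced square lattice of plots, the isotropy, and the parameter values (a sill of 12, a nugget of 2, and a range of 6 plots, chosen near the average range reported here) are assumptions made for the example.

```python
from math import hypot


def spherical_covariance(h, sill, range_plots):
    """Spherical covariance at distance h; zero at and beyond the range."""
    if h >= range_plots:
        return 0.0
    r = h / range_plots
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)


def spherical_semivariance(h, sill, range_plots, nugget=0.0):
    """Matching semivariogram: nugget + sill - C(h) for h > 0, and 0 at h = 0."""
    if h <= 0:
        return 0.0
    return nugget + sill - spherical_covariance(h, sill, range_plots)


def plot_covariance_matrix(n_rows, n_cols, sill, range_plots, nugget=0.0):
    """Covariance matrix among plots laid out on a unit-spaced grid.

    Plots are ordered row by row; the nugget is added on the diagonal only.
    Assumes isotropy and equal spacing in both directions (an assumption;
    real row plots are usually rectangular).
    """
    coords = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    cov = []
    for i1, j1 in coords:
        row = []
        for i2, j2 in coords:
            h = hypot(i1 - i2, j1 - j2)
            c = spherical_covariance(h, sill, range_plots)
            if h == 0:
                c += nugget
            row.append(c)
        cov.append(row)
    return cov


# Illustrative values only: 3 x 4 plots, patch range of 6 plots.
cov = plot_covariance_matrix(3, 4, sill=12.0, range_plots=6.0, nugget=2.0)
```

A matrix of this form is what a mixed-model or kriging-type analysis would use in place of the independent-error assumption; plots farther apart than the range contribute no covariance, which is why the estimated range is read as a patch size.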
Acknowledgments

We would like to thank Drs. Peter Clarke, Steen Magnussen, and John King for their stimulating discussion on the research; Dr. William Wingle for his excellent help in the data analyses with his UNCERT program; and Dr. Rowland Burdon for his constructive comments on the early version of the manuscript. This research is financially supported by Forest Renewal BC research grant No. HQ96445RE.
References

Clarke, G.P.Y., Haines, L.M., and Fu, Y.B. 1997. Kriging and field experiments. In Proceedings of 51st Session of the International Statistical Institute, Istanbul, Turkey. Book 2. pp. 229–232.
Cressie, N.A.C. 1991. Statistics for spatial data. John Wiley & Sons, New York.
Fairfield-Smith, H. 1938. An empirical law describing heterogeneity in the yields of agricultural crops. J. Agric. Sci. 26: 1–29.
Fu, Y.B., Clarke, G.P.Y., Namkoong, G., and Yanchuk, A. 1998. Incomplete block designs for genetic testing: statistical efficiencies of estimating family means. Can. J. For. Res. 28: 977–986.
Heaman, J.C. 1978. Choosing strategies for a breeding program in Douglas-fir [Pseudotsuga menziesii (Mirb.) Franco] from British Columbia. In Proceedings of the 3rd World Consultation on Forest Tree Breeding, Canberra, Australia. pp. 1205–1214.
John, J.A., and Williams, E.R. 1998. t-latinized designs. Aust. N.Z. J. Stat. 40: 111–118.
Liu, J.P., and Burkhart, H.E. 1994. Spatial characteristics of diameter and total height in juvenile loblolly pine (Pinus taeda L.) plantations. For. Sci. 40: 774–786.
Loo-Dinkins, J.A., and Tauer, C.G. 1987. Statistical efficiency of six progeny test field designs on three loblolly pine (Pinus taeda L.) site types. Can. J. For. Res. 17: 1066–1070.
Loo-Dinkins, J.A. 1992. Field test design. In Handbook of quantitative forest genetics. Edited by L. Fins, S. Friedman, and J.V. Brotschol. Kluwer Academic Publishers, Boston. pp. 96–139.
Magnussen, S. 1990. Application and comparison of spatial models in analyzing tree-genetics field trials. Can. J. For. Res. 20: 536–546.
Magnussen, S. 1993a. Design and analysis of tree genetic trials. Can. J. For. Res. 23: 1144–1149.
Magnussen, S. 1993b. Bias in genetic variance estimates due to spatial autocorrelation. Theor. Appl. Genet. 86: 349–355.
Matheron, G. 1963. Principles of geostatistics. Econ. Geol. 58: 1246–1266.
McCutchan, B.G., Ou, J.X., and Namkoong, G. 1985. A comparison of planned unbalanced designs for estimating heritability in perennial crops. Theor. Appl. Genet. 71: 536–544.
Moeur, M. 1993. Characterizing spatial patterns of trees using stem-mapped data. For. Sci. 39: 756–775.
Namkoong, G., Kang, H.C., and Brouard, J.S. 1988. Tree breeding principles and strategies. Monographs on Theoretical and Applied Genetics. Springer-Verlag, New York.
Patterson, H.D., and Hunter, E.A. 1983. The efficiency of incomplete block designs in National List and Recommended List cereal variety trials. J. Agric. Sci. 101: 427–433.
SAS Institute Inc. 1995. SAS® user’s guide, version 6.12 edition. SAS Institute Inc., Cary, N.C.
Snedecor, G.W., and Cochran, W.G. 1967. Statistical methods. 6th ed. Iowa State University Press, Ames.
Williams, E.R., and Matheson, A.C. 1994. Experimental design and analysis for use in tree improvement. CSIRO publication, Canberra, Australia.
Williams, E.R., and Talbot, M. 1996. ALPHA+. Experimental designs for variety trials, version 2.3. Design user manual. CSIRO, Canberra, Australia, and SASS, Edinburgh.
Wingle, W.L., Poeter, E.P., and McKenna, S.A. 1995. UNCERT user’s guide: a geostatistical uncertainty analysis package applied to groundwater flow and contaminant transport modeling (software and user’s guide). Colorado School of Mines, Denver.
Yanchuk, A.D. 1996. General and specific combining ability from disconnected partial diallels of coastal Douglas-fir. Silvae Genet. 45: 37–45.
© 1999 NRC Canada