ALFALFA-GRASS FORAGE QUALITY PREDICTION IN NEW YORK STATE
A Thesis Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Master of Science
by David Parsons January 2006
© 2006 David Parsons
ABSTRACT
Timing of spring forage harvest is critical to obtain optimal quality for animal production. For forage that serves as the primary fiber source in the diet, neutral detergent fiber (NDF) is the principal forage quality variable of concern. In New York State, most alfalfa is grown in mixed stands with grass, and models for estimating pure alfalfa NDF have not been applicable, even for the alfalfa component of the sward. Stands of first-cut pure alfalfa and alfalfa and grass mixes were sampled at two experimental sites and farmers’ fields in 19 New York counties during May and June 2004 and 2005. A range of plant measurements and environmental characteristics were recorded and used to develop prediction equations. The first objective was to evaluate predictive accuracy of existing equations for estimation of spring alfalfa NDF content in New York growing conditions, and estimate additional prediction equations based on data easily accessible to farmers. Equations were developed to estimate alfalfa NDF content in spring growth in New York State, using easily available variables and a typical cutting height used by farmers. Models with two or three explanatory variables were shown to have greater predictive accuracy than models containing more variables. Models combining alfalfa height, growing degree days and Julian date offer the greatest potential to increase predictive accuracy. Stage of maturity did not improve prediction accuracy. Predictions using the PEAQ equation with alfalfa sampled in NY were biased, possibly due to differences in cutting height between observations used to fit the equation, and typical cutting heights in New York State. An equation previously fit to
New York, using only the explanatory variable alfalfa height, was shown to be less biased. The second objective was to develop equations for estimating total mixed stand NDF with an emphasis on farmer-usable equations based on easily obtainable data. Regression equations can be used to estimate the NDF of alfalfa, assisting farmers in decision making at harvest time. In New York State, where most alfalfa is grown in mixed stands with grass, there are no available models to estimate NDF. The most important explanatory variables were the fraction of grass and alfalfa height. Growing degree days and Julian date improved goodness of fit but were biased between years. Categorization of the grass fraction into 0.2, 0.4, 0.6 or 0.8 allows estimation without requiring species separations. Categorization decreased R2 and increased RMSE but yields a variable that could be more easily utilized by farmers. Model validation found significant biases with some model estimates; however, biases and prediction errors were small enough to suggest that the results are practically applicable to New York farms. The third objective was to test the suitability of existing equations for estimating the NDF of the alfalfa component of mixed stands and to better understand how alfalfa is affected by the presence of grass in mixed stands. The applicability of PEAQ and other models to mixed stand alfalfa NDF estimation was examined. As with pure alfalfa, PEAQ was shown to be the most biased model, possibly due to the difference between the cutting height used to generate the PEAQ equation and the cutting height used in this study. For farmer fields a model based on MAXHT was shown to be the best one-variable model. Presence of grass increases the number of nodes and increases alfalfa height; however, the relationship between alfalfa height and NDF is not changed. Application of the models proposed could greatly help producers in timing harvesting operations to optimize the quality of harvested forage in New York State.
BIOGRAPHICAL SKETCH
David is the son of Jack and Lyn Parsons. He was born in Derbyshire, England, but spent the majority of his childhood years in Tasmania, Australia. He received a Bachelor of Agricultural Science with Honours from the University of Tasmania in 1996. From 1997 to 2000 he worked for the Falkland Islands Government as a forage agronomist. From 2000 to 2003 he worked as a crop and forage agronomist in the Riverina area of New South Wales, Australia. David married his wife Chelsea in 1995 and they have three children, Darby, Bethan, and Harriet.
To Chelsea, for agreeing to be dragged to the other side of the world for me to pursue my studies.
ACKNOWLEDGMENTS
I would like to thank numerous people from Cornell Cooperative Extension for their assistance in collecting samples, in particular Jen Beckman, Peter Barney, Aaron Gabriel, Jeff Miller, Bruce Tillapaugh, Rick Faucett, Michael Hunter, and Michael Davis. In addition, numerous farmers were kind enough to let myself and others collect samples from their fields. Jerry Cherney provided the impetus for this study, and his connections and organizational skills enabled it to be much broader geographically than it otherwise would have been. Jerry was always supportive and offered expertise, time and resources when necessary to ensure the success of the study. I would like to thank Robert Blake who also served on my graduate committee and provided valuable comments in preparation of this thesis. I would like to thank Hugh Gauch Jr. who provided direction in statistical analysis. I would like to acknowledge Sam Beer for assistance in many areas, for being there to discuss ideas with, and to help with the logistics of the whole study, both in the field and the lab. I also thank Kai Ming Zhao, Molly Lebowitz and Leon Hatch for their help with harvesting, sample preparation and analysis. I would like to acknowledge the financial support of the Department of Crop and Soil Sciences. The research was also supported by a Kieckhefer Adirondack Fellowship. I am grateful for the association I have had with many people at Cornell, both faculty and students, particularly fellow graduate students who have
helped me stay positive and focussed in the dull windowless tower that is Bradfield Hall. Finally, and by no means last in importance, I would like to thank my family, both here and overseas for their encouragement and support. Maybe when I’m all finished I’ll be able to cook dinner a bit more regularly. Hopefully my children will have some memory of their time in Ithaca and feel that this has been worthwhile.
TABLE OF CONTENTS
1. Estimation of Spring Forage Quality for Alfalfa in New York State
1.1 Abstract
1.2 Introduction
1.3 Estimation of alfalfa NDF content
1.4 Prediction equations for alfalfa NDF estimation
1.5 Model selection for use by producers
1.6 The value of alfalfa maturity in estimation of alfalfa NDF
1.7 Validation of the PEAQ Equation and other models
1.8 Conclusions
1.9 References
2. Fiber Content Estimation of Mixed Alfalfa-Grass Stands
2.1 Abstract
2.2 Introduction
2.3 Materials and methods
2.3.1 Field Study
2.3.2 Data analysis
2.4 Results and discussion
2.4.1 Practical Equations for Producers
2.5 Conclusions
2.6 References
3. Alfalfa Morphology and Fiber Estimation in Mixed Stands
3.1 Abstract
3.2 Introduction
3.3 Materials and methods
3.3.1 Sampling of mixed stands
3.3.2 Alfalfa morphology experiment
3.3.3 Predictive Equations and Statistical Analysis
3.4 Results and discussion
3.4.1 Evaluation of existing equations
3.4.2 Development of new equations
3.4.3 Alfalfa morphology in mixed stands
3.5 Conclusions
3.6 References
LIST OF FIGURES
Figure 1.1 Relationship between predicted and actual NDF using the PEAQ model
Figure 1.2 Relationship between predicted and actual NDF using the NYPQ model
Figure 1.3 Components of mean squared deviation (MSD) for the PEAQ and NYPQ models
Figure 2.1 Components of mean squared deviation (MSD) for mixed alfalfa-grass multiple regression models
Figure 2.2 Components of mean squared deviation (MSD) for mixed alfalfa-grass multiple regression models using categories for the fraction of grass in the sward
Figure 3.1 Components of mean squared deviation (MSD) of regression models used to estimate the NDF of the alfalfa component of mixed stands
Figure 3.2 Comparison of regression models used to estimate the NDF of the alfalfa component of mixed stands
Figure 3.3 Biplot for sample means and variables
LIST OF TABLES
Table 1.1 Variable selection procedure summary, listing the three best models for each number of variables, sorted by Schwarz Bayesian criterion
Table 1.2 Practical equations for estimating alfalfa NDF in NY State
Table 1.3 Coefficient of determination (r2), root mean square error (RMSE), slope (b) and y-intercept (a), derived from the regressions of PEAQ and NYPQ estimates on observed forage quality values
Table 2.1 Descriptions of variables evaluated as potential predictors of NDF content in swards of alfalfa and grass
Table 2.2 Alfalfa maturity categories used to assign a numerical value to the most mature stem in the sampling area
Table 2.3 Grass maturity categories used to assign a numerical value to the most mature tiller in the sampling area
Table 2.4 Selected regression models to estimate mixed alfalfa and grass NDF
Table 2.5 Coefficient of determination (r2), root mean square error (RMSE), slope (b), and y-intercept (a) derived from the regressions of equation estimates on observed forage quality values from the corresponding dataset pair
Table 2.6 Selected regression models to estimate mixed alfalfa and grass NDF using categories for the fraction of grass in the sward
Table 2.7 Coefficient of determination (r2), root mean square error (RMSE), slope (b), and y-intercept (a) derived from the regressions of equation estimates on observed forage quality values from the corresponding dataset pair
Table 2.8 Nested variable selection for estimation of mixed alfalfa and grass NDF using categories for the fraction of grass in the sward
Table 3.1 Descriptions of variables evaluated as potential predictors of mixed alfalfa and grass NDF
Table 3.2 Coefficient of determination (r2), root mean square error (RMSE), slope (b), and y-intercept (a) derived from the regressions of equation estimates on observed forage quality values
Table 3.3 Selected regression models to estimate alfalfa NDF in mixed stands
Table 3.4 Treatment means for alfalfa height, alfalfa stem diameter, number of alfalfa nodes, alfalfa internode length, alfalfa leaf-to-stem ratio, alfalfa yield, grass yield and bluegrass yield
LIST OF ABBREVIATIONS
DM, dry matter; GDD, growing degree days; LC, lack of correlation; MSD, mean squared deviation; NDF, neutral detergent fiber; NU, nonunity slope; PEAQ, predictive equations for alfalfa quality; RMSE, root mean squared error; SB, squared bias; SBC, Schwarz Bayesian criterion.
1. Estimation of Spring Forage Quality for Alfalfa in New York State

1.1 Abstract

Equations were developed to estimate alfalfa NDF content in spring growth in New York State, using easily available variables and a typical cutting height used by growers. Models with two or three explanatory variables had greater predictive accuracy than models containing more variables. Models combining alfalfa height, growing degree days and Julian date offer the greatest potential to increase predictive accuracy. Stage of maturity did not improve prediction accuracy. Predictions using the PEAQ equation with alfalfa sampled in NY were biased, possibly due to differences in cutting height between observations used to fit the equation, and typical cutting heights in New York State. An equation previously fit to New York, using only the explanatory variable alfalfa height, was less biased.
1.2 Introduction

Timing of spring forage harvest is critical to obtain optimal quality for animal production. For forage that serves as the primary fiber source in the diet, the content of neutral detergent fiber (NDF) is a key forage quality variable. The optimal forage NDF for high producing dairy cows is approximately 45% for alfalfa silage (2). In addition there is a relatively small range in optimal alfalfa NDF (3), emphasizing the need for quick and accurate methods for determining NDF. Numerous methods have been developed to estimate alfalfa NDF, including predictive equations based on weather, chronological age and plant morphology (5). The most widely used of these are the predictive equations for alfalfa quality (PEAQ) models (8), which consider plant height and maturity stage. Although the initial NDF model was calibrated for Wisconsin, equations have been validated for other regions of the United States, including Ohio (10) and New York (1). Objectives of this study were to evaluate predictive accuracy of existing equations for estimation of spring alfalfa NDF content under New York growing conditions, and to estimate additional prediction equations based on data easily accessible to growers.
1.3 Estimation of alfalfa NDF content

Stands of pure first-cut alfalfa were sampled in growers’ fields in 19 NY counties during May and June 2004 and 2005. Additional samples were taken from various field experiments that included alfalfa, near Dryden and Ithaca, NY. Fields with alfalfa height of at least 12 in. were identified. In 2004 a representative sampling area was identified visually; in 2005 a hoop of 36 in. diameter was used to define a representative portion of the field as the sample area. Height of the tallest alfalfa stem in the sample area was measured to the terminal bud (MAXHT). The maturity categories of Kalu and Fick (9) were used to assign a numerical value to the most mature stem in the sample area (MAXSTAGE). A representative sample of 1 to 1.5 lb of alfalfa was hand clipped from the sample area at a height of 4 in. to approximate typical alfalfa harvest height. The time of sampling was recorded and converted to a 24 hour decimal time (TIME). The date of sampling was transformed to Julian date, the number of days from the beginning of the year (JDATE). The altitude of the field was recorded (ALTF). The geographic coordinates, which were not used as explanatory variables, were overlaid with the coordinates of all NY State weather stations using Manifold (Enterprise Edition 6.50, CDA International Ltd., San Mateo, CA). Voronoi cells were created to enable determination of the nearest weather station for each field and its altitude (ALTWS) was used as a potential predictor. Accumulated growing degree days were calculated for each site, using base 32°F (GDD32) and base 41°F (GDD41). Growing degree day accumulation was initiated when the mean temperature exceeded the base for five consecutive days.

In total, 109 samples of alfalfa were taken. The aim of the study was to develop robust equations for estimation of alfalfa NDF content. Therefore samples were collected by numerous people, over a wide geographic area and two climatically different years, from fields with variable alfalfa and weed densities. Samples were analyzed for NDF content based on the procedure described by Van Soest et al. (12), using the ANKOM fiber analyzer with filter bags. Selection of explanatory variables was performed using PROC RSQUARE and regression analysis was performed using PROC GLM (SAS for Windows Release 9.1, SAS Institute Inc., Cary, NC).
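The growing degree day rule described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the study: the function name accumulated_gdd is hypothetical, the input is assumed to be a list of daily mean temperatures in °F starting at the beginning of the year, each day is assumed to contribute (mean temperature minus base) floored at zero, and accumulation is assumed to begin on the first day of the first five-day run above the base (the text does not say whether it begins on the first or the last day of that run).

```python
from typing import Optional, Sequence


def accumulated_gdd(daily_mean_f: Sequence[float], base_f: float,
                    run_length: int = 5) -> float:
    """Accumulate growing degree days (deg F) from daily mean temperatures.

    Accumulation starts once the daily mean has exceeded the base on
    `run_length` consecutive days (five in the text).
    """
    start: Optional[int] = None
    run = 0
    for i, temp in enumerate(daily_mean_f):
        run = run + 1 if temp > base_f else 0
        if run == run_length:
            # Assumption: count from the first day of the qualifying run.
            start = i - run_length + 1
            break
    if start is None:
        return 0.0  # the start condition was never met
    # Days cooler than the base contribute zero (assumption).
    return sum(max(temp - base_f, 0.0) for temp in daily_mean_f[start:])


# Illustrative temperatures only; not data from the study.
temps = [30, 35, 43, 44, 45, 46, 47, 40, 50]
gdd41 = accumulated_gdd(temps, base_f=41.0)
gdd32 = accumulated_gdd(temps, base_f=32.0)
```

With a lower base the start condition is met earlier and every day contributes more, so GDD32 is always at least as large as GDD41 for the same temperature record.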
1.4 Prediction equations for alfalfa NDF estimation

In Table 1.1 the best three models are included for equations containing one to eight variables. Model evaluation was based on several statistics. The coefficient of determination (r2 or R2) is the proportion of the variation explained by variables in the model and estimates goodness of fit of the model to the data. Root mean square error (RMSE) is the standard deviation around the regression line and is another measure of goodness of fit. In model development RMSE gives a measure of calibration error of the model and has the same units as the variable predicted, in this case NDF as a percentage of dry matter content. A good model therefore has a high R2 and a low RMSE. However, R2 and RMSE assess model accuracy in terms of the data used to construct the model; neither of these statistics assesses predictive accuracy of the model for independent data sets. The Schwarz Bayesian criterion (SBC) is a statistic that has components relating to both the fit of the model and the number of parameters in the model and, in contrast to R2 and RMSE, is a better assessment of predictive accuracy. Model selection based on statistics such as R2 and RMSE tends to overfit the data used for model construction at the expense of accuracy for new data (6).

Table 1.1 Variable selection procedure summary, listing the three best models for each number of variables, sorted by Schwarz Bayesian criterion (SBC).

Number of   r2 or
variables   R2      RMSE    SBC      Variables in model
1           0.88    1.87    144.1    MAXHT
1           0.58    3.56    283.9    GDD41
1           0.58    3.60    286.3    GDD32
2           0.92    1.61    115.4    GDD41 MAXHT
2           0.91    1.62    116.3    GDD32 MAXHT
2           0.90    1.71    128.2    JDATE MAXHT
3           0.92    1.57    112.7    GDD32 ALTWS MAXHT
3           0.92    1.57    112.7    GDD41 ALTWS MAXHT
3           0.92    1.58    114.5    GDD32 ALTF MAXHT
4           0.92    1.54    113.0    JDATE GDD41 ALTWS MAXHT
4           0.92    1.56    115.1    JDATE GDD41 ALTF MAXHT
4           0.92    1.57    116.2    JDATE GDD32 ALTWS MAXHT
5           0.92    1.54    116.6    JDATE GDD41 ALTWS TIME MAXHT
5           0.92    1.55    117.1    JDATE GDD41 ALTWS MAXSTAGE MAXHT
5           0.92    1.55    117.3    JDATE GDD32 GDD41 ALTWS MAXHT
6           0.93    1.55    120.8    JDATE GDD41 ALTWS TIME MAXSTAGE MAXHT
6           0.92    1.55    120.9    JDATE GDD32 GDD41 ALTWS TIME MAXHT
6           0.92    1.55    121.2    JDATE GDD41 ALTF ALTWS TIME MAXHT
7           0.93    1.55    125.1    JDATE GDD32 GDD41 ALTWS TIME MAXSTAGE MAXHT
7           0.93    1.55    125.2    JDATE GDD41 ALTF ALTWS TIME MAXSTAGE MAXHT
7           0.93    1.55    125.4    JDATE GDD32 GDD41 ALTF ALTWS TIME MAXHT
8           0.93    1.56    129.4    JDATE GDD32 GDD41 ALTF ALTWS TIME MAXSTAGE MAXHT

The results in Table 1.1 will first be considered in terms of how many predictor variables should be included in the model. Focusing on the best model for each number of variables, R2 with one variable in the model is 0.88 and increases with more variables in the model to a maximum value of 0.93. The best one-variable model has an RMSE of 1.87; RMSE initially decreases with extra variables, reaching a minimum of 1.54 with four- or five-variable models. After this point, the RMSE begins to rise slightly and the eight-variable model has an RMSE of 1.56. The best one-variable model has an SBC value of 144.1. The minimum SBC value of 112.7 is reached with three variables in the model. As further variables are added SBC begins to rise and has a value of 129.4 for the eight-variable model. These results suggest that although models with a high number of variables increase the goodness of fit of the model, predictive accuracy is optimized with just three variables. This outcome exemplifies a widely observed response called Ockham’s hill wherein models with too few parameters underfit real signal whereas models with too many
parameters overfit spurious noise, so a relatively parsimonious model is most predictively accurate (6).

Regarding the most useful variables for alfalfa NDF prediction, the best one-variable model is based on MAXHT, with an R2 of 0.88, RMSE of 1.87 and an SBC of 144.1 (Table 1.1). The next best one-variable model, based on GDD41, has a much lower R2 (0.58), and values of RMSE (3.56) and SBC (283.9) almost two times those of the previous model. The best two-variable model includes both MAXHT and GDD41 and results in an SBC of 115.4, lower than that of the best one-variable model. The best three-variable model includes MAXHT, either GDD41 or GDD32, and ALTWS, and the corresponding SBC of 112.7 is the lowest of all models.
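To make the trade-off between fit and parameter count concrete, the sketch below recomputes SBC from the rounded RMSE values in Table 1.1. It assumes the form commonly used for least-squares regression, SBC = n ln(SSE/n) + p ln(n), where n is the number of observations (109 here), p is the number of estimated parameters including the intercept, and SSE is recovered as RMSE squared times (n - p). The thesis does not state the formula, so this is an assumption, and the function name sbc is hypothetical.

```python
import math


def sbc(n_obs: int, n_params: int, rmse: float) -> float:
    """Schwarz Bayesian criterion for a least-squares model (assumed form)."""
    sse = rmse ** 2 * (n_obs - n_params)  # error sum of squares from RMSE
    # First term rewards fit; second term penalizes each extra parameter.
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


N_OBS = 109  # alfalfa samples reported in section 1.3

# (number of variables, RMSE, SBC) for the best model of each size, Table 1.1
table_rows = [(1, 1.87, 144.1), (2, 1.61, 115.4), (3, 1.57, 112.7), (8, 1.56, 129.4)]

# Parameters = variables + 1 for the intercept.
recomputed = [sbc(N_OBS, k + 1, rmse) for k, rmse, _ in table_rows]
```

Because the tabulated RMSE values are rounded to two decimals, the recomputed values agree with the table only to within a few tenths of a unit, so rankings among models with nearly equal SBC cannot be recovered this way. The sketch does show the general pattern: the eight-variable model has almost the same RMSE as the three-variable model but a clearly higher SBC because of the ln(n) penalty on each added parameter.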
1.5 Model selection for use by producers

In choosing models for use by growers, both the predictive accuracy of a parameter and the accessibility of the parameter data should be considered. For example, Table 1.1 suggests that alfalfa NDF estimation models can be improved by incorporating growing degree day data; however, this information is not always readily available to growers, particularly data from a weather station in close proximity to the farm. As an alternative, Julian date is a much more accessible variable, and thus equations including JDATE should be considered even though they may be less accurate than those including growing degree days.

Table 1.2 lists some potentially useful equations for growers. The model incorporating only alfalfa height (eq. 1) may be of sufficient accuracy to meet the goal of farmers, which is to estimate the standing alfalfa NDF to aid in determining when the alfalfa is ready for spring harvest. The advantage of a simple one-variable model is that calculations could easily be made by the farmer in the field, using a stick with NDF estimates on the side, or alternatively just a calculator. If increased model accuracy is desired and the information is available, growing degree days could be added to the model (eq. 2). In addition, if growing degree day information is available it is likely that the altitude of the weather station is also known, and thus a three-variable model with the best predictive accuracy could be used (eq. 4). However, if growing degree day data were not available a farmer could improve predictive accuracy by incorporating Julian date in the model (eq. 3) and possibly also the altitude of the field (eq. 5).
Table 1.2 Practical equations for estimating alfalfa NDF in NY State.

Equation   r2 or
no.        R2      RMSE    SBC      Model
1          0.88    1.87    144.1    6.77 + 1.03(MAXHT)
2          0.92    1.61    115.4    6.89 + 0.0076(GDD41) + 0.85(MAXHT)
3          0.90    1.71    128.2    -7.03 + 0.11(JDATE) + 0.94(MAXHT)
4          0.92    1.57    112.7    6.38 + 0.0072(GDD41) + 0.00090(ALTWS) + 0.84(MAXHT)
5          0.91    1.63    120.6    -8.00 + 0.11(JDATE) + 0.0013(ALTF) + 0.95(MAXHT)
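As an illustration of how the equations in Table 1.2 might be applied, the sketch below wraps three of them as Python functions. The coefficients are copied from the table; the function names are hypothetical. Units are assumptions inferred from section 1.3: MAXHT in inches, GDD41 in °F degree days, JDATE as day of the year, and the result as NDF in percent of dry matter. Equations 4 and 5 are omitted because the units of ALTWS and ALTF are not stated in the text.

```python
def ndf_eq1(maxht_in: float) -> float:
    """Eq. 1: alfalfa NDF (% DM) from tallest stem height (in.)."""
    return 6.77 + 1.03 * maxht_in


def ndf_eq2(gdd41: float, maxht_in: float) -> float:
    """Eq. 2: adds growing degree days, base 41 deg F."""
    return 6.89 + 0.0076 * gdd41 + 0.85 * maxht_in


def ndf_eq3(jdate: int, maxht_in: float) -> float:
    """Eq. 3: adds Julian date (day of the year)."""
    return -7.03 + 0.11 * jdate + 0.94 * maxht_in


# Illustrative inputs only: a 30 in. stand on day 140 with 600 GDD41.
estimate_height_only = ndf_eq1(30.0)       # about 37.7
estimate_with_gdd = ndf_eq2(600.0, 30.0)   # about 37.0
estimate_with_date = ndf_eq3(140, 30.0)    # about 36.6
```

The height-only form is the one that could be printed on a measuring stick, since each height maps to a single NDF estimate.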
1.6 The value of alfalfa maturity in estimation of alfalfa NDF

Equations proposed by Hintz and Albrecht (8) and validated by Sulc et al. (11) focused on the use of both alfalfa height and maturity in estimating alfalfa NDF. Results of the variable selection procedure in this study (Table 1.1) confirmed the usefulness of MAXHT, but failed to demonstrate the usefulness of MAXSTAGE. To further determine whether MAXSTAGE is a significant variable in estimating alfalfa NDF, it was added to a model containing MAXHT. When MAXSTAGE is added to the model the R2 rises from 0.88 to 0.89, and the RMSE drops from 1.87 to 1.86, which is negligible from a forage quality standpoint. Moreover, when MAXHT is already in the model, MAXSTAGE does not significantly (P > 0.13) contribute to further explaining any variation in alfalfa NDF. In addition the SBC increases from 144.1 to 146.5, suggesting that the predictive accuracy of the model may actually decrease with MAXSTAGE in the model. Although it is recognized that with a larger number of samples MAXSTAGE is more likely to be statistically significant, the value of MAXSTAGE in a model for farmer use is questionable. These results confirm the work of Cherney (1) that an equation with MAXHT alone is acceptable for New York State, possibly because alfalfa in the Northeast often remains vegetative for prolonged periods with little change in the maturity stage (3). These results suggest that adding MAXSTAGE, besides making calculations more complex for the farmer, contributes little additional information and is possibly detrimental to the predictive accuracy of the model.
Table 1.3 Coefficient of determination (r2), root mean square error (RMSE), slope (b) and y-intercept (a), derived from the regressions of PEAQ and NYPQ estimates on observed forage quality values.

                                       Prob.                     Prob.
Model   r2      RMSE    b       SEb    b=1.0    a        SEa     a=0
PEAQ    0.88    1.89    1.37    0.048  ***      -16.0    1.59    ***
NYPQ    0.88    1.87    1.31    0.046  ***      -9.3     1.35    ***

*** Significant at a probability level of 0.001.
1.7 Validation of the PEAQ Equation and other models

Validation tests described by Fick & Janson (4) were applied to the PEAQ model (8) and also to an equation derived by Cherney (1) for NY State, hereafter referred to as NYPQ. Equations were tested by regressing the actual laboratory measurements on the estimated values from the predictive equations. A nearly perfect prediction equation would have an intercept a=0, a slope b=1, R2 near 1 and RMSE near 0. Results in Table 1.3 show that both models are similar in their goodness of fit, with R2 of 0.88 for both models and RMSE of 1.89 for PEAQ and 1.87 for NYPQ. The b-values of 1.37 for PEAQ and 1.31 for NYPQ were of similar magnitude, with slopes significantly different from 1. Both models also had y-intercepts significantly different from 0, although the a-value for NYPQ (-9.3) was smaller in magnitude than that of PEAQ (-16.0), indicating less bias in the NYPQ model. The relationships between predicted and actual NDF for the PEAQ and NYPQ models (Fig. 1.1 and 1.2) indicate bias in both models.

Gauch et al. (7) proposed the partitioning of mean squared deviation (MSD) into squared bias (SB), nonunity slope (NU) and lack of correlation (LC) as an alternative method to better understand the appropriateness of a statistical model. The three components add up to MSD and have distinct meanings, with SB, NU and LC relating to translation, rotation and scatter, respectively. PEAQ has a much larger overall MSD (19.98) than NYPQ (4.98) (Fig. 1.3). The LC component for PEAQ (3.49), denoting scatter of the data, is very similar to that for NYPQ (3.44) and both are within acceptable levels. The NU component for PEAQ (1.93), denoting rotation, is greater than that for NYPQ (1.46) and again both are within acceptable levels. Finally, the SB component for PEAQ (14.56), denoting translation, is much greater than that for NYPQ (0.09). These results suggest that the NYPQ model predicts NDF better than PEAQ for these data, primarily due to a much lower SB.

The high SB of the PEAQ model means that its use would regularly result in overestimation of NDF values. Although there are numerous possible reasons for this, one possibility is the cutting heights used for the models. The PEAQ equation was based on a cutting height of 1.5 in., whereas the NYPQ model and this study were based on a cutting height of 4 in., representing a typical cutting height. Sampling lower would include material of higher NDF near the base of the plant; however, the effect of this extra 2.5 in. would be diluted with increasing plant height. This may account for the larger NU in the PEAQ equation and the increasing overestimation of NDF for low values of NDF.
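The partition of MSD into SB, NU and LC can be sketched as follows. The formulas are my reading of the decomposition of Gauch et al. (7): with X the model predictions and Y the measured values, SB is the squared difference of the means, NU is (1 - b)^2 times the variance of X, where b is the slope of the regression of Y on X, and LC is (1 - r^2) times the variance of Y, with variances computed by dividing by N. The function name msd_components and the example numbers are hypothetical; they are not data from this study.

```python
from statistics import fmean
from typing import Dict, Sequence


def msd_components(predicted: Sequence[float],
                   measured: Sequence[float]) -> Dict[str, float]:
    """Split mean squared deviation into SB, NU and LC."""
    n = len(predicted)
    mean_x, mean_y = fmean(predicted), fmean(measured)
    # Population (divide-by-N) variances and covariance.
    sxx = sum((x - mean_x) ** 2 for x in predicted) / n
    syy = sum((y - mean_y) ** 2 for y in measured) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(predicted, measured)) / n
    slope = sxy / sxx                 # b, regression of measured on predicted
    r2 = sxy ** 2 / (sxx * syy)       # squared correlation
    return {
        "SB": (mean_x - mean_y) ** 2,   # translation
        "NU": (1 - slope) ** 2 * sxx,   # rotation
        "LC": (1 - r2) * syy,           # scatter
        "MSD": sum((x - y) ** 2 for x, y in zip(predicted, measured)) / n,
    }


# Illustrative values only: predictions that run high at low NDF.
parts = msd_components(predicted=[30, 34, 38, 42], measured=[25, 31, 36, 42])
```

The three components sum to MSD by construction, which matches the values reported in the text for PEAQ (14.56 + 1.93 + 3.49 = 19.98).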
[Figure 1.1: scatter plot of actual NDF (%) against predicted NDF (%), both axes 15 to 45, with a 1:1 reference line. Equation shown on the plot: NDF = 16.89 + 0.69(MAXHT) + 0.81(MAXSTG).]
Figure 1.1 Relationship between predicted and actual NDF using the PEAQ model.
[Figure 1.2: scatter plot of actual NDF (%) against predicted NDF (%), both axes 15 to 45, with a 1:1 reference line. Equation shown on the plot: NDF = 12.27 + 0.785(MAXHT).]
Figure 1.2 Relationship between predicted and actual NDF using the NYPQ model.
[Figure 1.3: bar chart of mean squared deviation (MSD, axis 0 to 25) for the PEAQ and NYPQ models, each bar divided into SB, NU and LC.]
Figure 1.3 Components of mean squared deviation (MSD) for the PEAQ and NYPQ models. The three components are squared bias (SB), non-unity slope (NU), and lack of correlation (LC).
1.8 Conclusions

Accurate estimation of first-cut alfalfa NDF for NY can be achieved with few variables. In addition to traditional statistics to assess model accuracy, a measure of predictive accuracy, such as SBC, can result in the selection of more parsimonious models. Alfalfa height alone is a good predictor of alfalfa NDF; however, model accuracy can be increased by including another variable in the model, such as growing degree days or Julian date. Including alfalfa maturity did not increase model accuracy. The PEAQ equation was biased for NY State fields when alfalfa was harvested at a 4-inch stubble height. Ultimately farmers are only interested in knowing the NDF content of the portion of the plant that they will harvest. Thus we conclude that the NYPQ equation, using only alfalfa height, is a more appropriate model for New York State conditions. With further validation, the additional models in this paper offer potential for improved accuracy of predicting alfalfa NDF content.
1.9 References

1. Cherney, J.H. 1995. Spring alfalfa harvest in relation to growing degree days. p. 29-36. In Proc. Natl. Alfalfa Symp., 25th, Syracuse, NY. 27-28 Feb. 1995. Certified Alfalfa Seed Council, Woodland, CA.
2. Cherney, J.H., D.J.R. Cherney, D.G. Fox, L.E. Chase, and P.J. Van Soest. 1994. Evaluating forages for dairy cattle. In Proc. Amer. Forage & Grassld. Council. 3:207.
3. Cherney, J.H., and R.M. Sulc. 1997. Predicting first cutting alfalfa quality. p. 53-65. In Silage: Field to Feedbunk. Proceedings from the North American Conference. 11-13 Feb. 1997, Hershey, PA. NRAES-99. Northeast Regional Agricultural Engineering Service, Ithaca, NY.
4. Fick, G.W., and C.G. Janson. 1990. Testing mean stage as a predictor of alfalfa forage quality with growth chamber trials. Crop Sci. 30:678-682.
5. Fick, G.W., P.W. Wilkens, and J.H. Cherney. 1994. Modeling forage quality changes in the growing crop. p. 757-795. In (G.C. Fahey, ed.) Forage Quality, Evaluation, and Utilization. ASA, CSSA, SSSA, Madison, WI.
6. Gauch, H.G. 2002. Scientific method in practice. Cambridge University Press, Cambridge.
7. Gauch, H.G., J.T.G. Hwang, and G.W. Fick. 2003. Model evaluation by comparison of model-based predictions and measured values. Agron. J. 95:1442-1446.
8. Hintz, R.W., and K.A. Albrecht. 1991. Prediction of alfalfa chemical composition from maturity and plant morphology. Crop Sci. 31:1561-1565.
9. Kalu, B.A., and G.W. Fick. 1981. Quantifying morphological development of alfalfa for studies of herbage quality. Crop Sci. 21:267-271.
10. Sulc, R.M. 1996. Equations for predicting quality of alfalfa. p. 115-124. In 1996 Proc. Tri-State Dairy Nutrition Conf., Fort Wayne, IN, 14-15 May 1996.
11. Sulc, R.M., K.A. Albrecht, J.H. Cherney, M.H. Hall, S.C. Mueller, and S.B. Orloff. 1997. Field testing a rapid method for estimating alfalfa quality. Agron. J. 89:952-957.
12. Van Soest, P.J., J.B. Robertson, and B.A. Lewis. 1991. Methods for dietary fiber, neutral detergent fiber, and nonstarch polysaccharides in relation to animal nutrition. J. Dairy Sci. 74:3583-3597.
2. Fiber Content Estimation of Mixed Alfalfa-Grass Stands

2.1 Abstract

Regression equations can be used to estimate the NDF of alfalfa, assisting producers in decision making at harvest time. In New York State, where most alfalfa is grown in mixed stands with grass, there are no available models to estimate NDF. The objectives of this experiment were to develop equations for estimating total mixed stand NDF with an emphasis on producer-usable equations based on easily obtainable data. Stands of first-cut alfalfa and grass (0.1 to 0.9 fraction grass) were sampled at two experimental sites and producers’ fields in 19 New York counties during May and June 2004 and 2005. A range of plant measurements and environmental characteristics were recorded and used to develop prediction equations. For selection of two to five variable models using 899 data points, R2 ranged from 0.89 to 0.94 and RMSE ranged from 21.2 to 30.1 g kg-1. The most important explanatory variables were the fraction of grass and alfalfa height. Growing degree days and Julian date improved goodness of fit but were biased between years. Categorization of the grass fraction into 0.2, 0.4, 0.6 or 0.8 allows estimation without requiring species separations. Categorization decreased R2 and increased RMSE but is a variable that could be more easily utilized by producers. Model validation found significant biases with some model estimates; however, biases and prediction errors were small enough to suggest that the results are practically applicable to New York farms.
2.2 Introduction

Timing of spring forage harvest is critical to obtain optimal quality for animal production. For forage that serves as the primary fiber source in the diet, neutral detergent fiber (NDF) is the principal forage quality variable of concern. The target NDF at harvest is 400 g kg-1 for pure alfalfa silage and 500 g kg-1 for pure grass silage (Cherney et al. 1994). In addition, the range of optimal alfalfa NDF is relatively small (Cherney & Sulc, 1997), emphasizing the need for quick and accurate methods for determining NDF.

A number of methods have been developed to estimate alfalfa NDF, including models based on weather, chronological age and plant morphology (Fick et al. 1994). The most widely used of these are the PEAQ equations (Hintz & Albrecht, 1991). Hintz and Albrecht found that equations using the height of the tallest stem and the maturity of the most mature stem in the sample gave acceptable root mean squared error (RMSE) compared to more complex methods involving mean stage by weight (Fick & Janson, 1990) or mean stage by count (Allen & Fick, 1990; Kalu & Fick, 1981). Although the initial model was validated for Wisconsin, equations have been developed for other regions of the United States, including Ohio (Sulc, 1996) and New York (Cherney, 1995). In addition, the original PEAQ equations have been evaluated in New York, Pennsylvania, Ohio, California and Wisconsin (Sulc et al. 1997). Results indicated some biases in using the equations outside the state of development; however, the prediction errors were sufficiently low to suggest the PEAQ equations are robust over a wide range of environments.

The estimation of forage quality is more complex in New York, where an estimated >80% of alfalfa is grown in mixed stands with grass (W.D.
Pardee, Personal Communication, 2004). Grasses in New York can increase in NDF very rapidly during the harvest period (Cherney et al. 1993), and producers often harvest stands containing grass before the fields of pure alfalfa are harvested. Consequently, there is a need for simple field-based methods for estimating the NDF of alfalfa stands with a grass component. Not only is the accuracy of the PEAQ model unknown for estimating the NDF of the alfalfa portion of the sward, but there are also no available equations for estimating the NDF of the grass portion. In addition, it is not known how alfalfa and grass interact at different proportions in the sward to affect the NDF of each component. For practical purposes, a producer is ultimately interested in the NDF of the total mixture rather than the NDF of the individual components. Thus the objectives of this experiment were (i) to develop equations for estimating total mixed stand NDF using a combination of environmental measurements and sward characteristics and (ii) to develop producer-usable equations based on easily obtainable data, with a focus on measurable sward characteristics.
2.3 Materials and methods

2.3.1 Field Study

Spring growth of mixed alfalfa and grass stands was sampled at two experimental sites and 150 producers’ fields in 19 New York counties during May and June 2004 and 2005. The experimental sites were the Cornell University Caldwell Field Research Farm (42.45°N, 76.46°W, 276 m, 0-2% slope) near Ithaca, NY, and Mount Pleasant Research Farm (42.46°N, 76.37°W, 520 m, 0-6% slope) near Dryden, NY. The soil at Caldwell Field is a
Table 2.1 Descriptions of variables evaluated as potential predictors of NDF content in swards of alfalfa and grass.

Variable    Description
ALTD        Difference between altitudes of weather station and field (m)
ALTF        Altitude of sampling field (m)
ALTWS       Altitude of weather station (m)
GCANOPY     Height of the grass canopy in the sample area (cm)
GDD0        Accumulated growing degree days, base 0°C
GDD5        Accumulated growing degree days, base 5°C
GEST        Estimated fraction of grass in the sample area
GFRAC       Actual fraction of grass in the sample
GGRP        Grouped fraction of grass in the sample¹
GMAXHT      Height of the tallest grass plant in the sample area (cm)
GMAXNDX     Developmental stage of most mature grass tiller²
GMAXSTG     Developmental stage of most mature grass tiller in the sample area using simplified system³
GSPECIES    Major grass species in each sample area
JDATE       Number of days from the beginning of the year
MAXHT       Height of the tallest alfalfa stem in the sample area (cm)
MAXSTAGE    Morphological stage of development of the most mature alfalfa stem in the sample area⁴
TIME        Time of sampling (decimal hours)

¹ Fraction grass groups: 0.2, 0.4, 0.6 or 0.8.
² Determined using system of Moore and Moser, 1995.
³ Simplified system based on Moore and Moser, 1995 (see Table 2.3).
⁴ Determined using system of Kalu and Fick, 1981 (see Table 2.2).
Niagara silt loam (fine-silty, mixed, active, mesic Aeric Epiqualfs) and the soil at Mount Pleasant is a Mardin silt loam (coarse-loamy, mixed, mesic Typic Fragiudepts). The experimental design at each site was a randomized complete block design with four blocks. Each block included three alfalfa-grass species mixtures at two grass seeding rates and one plot of pure alfalfa, giving a total of 28 plots. Each plot measured 2.7 by 6 m (16.2 m2) with 0.15 m between plots and 0.3 m alleys between blocks. Plots were seeded on 19 May 2003 at Caldwell Field and 23 May 2003 at Mount Pleasant. All plots were seeded at Caldwell Field with ‘Hytest 340PLH’ alfalfa (Medicago sativa L.) at 13.44 kg ha-1 and at Mount Pleasant with ‘Hytest 104PLH’ alfalfa at 13.44 kg ha-1 using a Brillion seeder (Brillion Farm Equipment, Brillion, WI). Grass plots were seeded using a Carter seeder (Carter Mfg., Brookston, IN). ‘Richmond’ timothy (Phleum pratense L.) was seeded at 3.36 and 6.72 kg ha-1, ‘Okay’ orchardgrass (Dactylis glomerata L.) was seeded at 4.48 and 8.96 kg ha-1, and ‘Rival’ reed canarygrass (Phalaris arundinacea L.) was seeded at 6.72 and 13.44 kg ha-1. Seeding rates were calculated using pure live seed. Lime, phosphorus and potassium were applied according to soil test recommendations. Plots were lightly hand-weeded in April 2004 and April 2005.

Producer fields with an alfalfa height of at least 30 cm were selected, and fields and plots were sampled using the same methods. To define a representative portion of the field or plot as the sample area, an area of approximately 1 m2 was visually identified in 2004, and in 2005 a hoop of comparable area was used. The data collected and variable abbreviations are summarized in Table 2.1. Height of the tallest alfalfa stem in the sample area
was measured to the terminal bud (MAXHT). The alfalfa maturity categories of Kalu and Fick (1981) (Table 2.2) were used to assign a numerical value to the most mature stem in the sample area (MAXSTAGE). The major grass species was recorded (GSPECIES). The height of the tallest grass tiller in the sample area was measured by fully extending the leaf (GMAXHT). The average grass canopy height of the sample area was measured with no extension of leaves (GCANOPY). The developmental stage of the most mature grass tiller in the sample area was determined using the staging system of Moore and Moser (1995) (GMAXNDX). Determination of the index number requires knowledge of the total number of leaves or nodes that will appear before reaching the next development stage. Because this system requires prior knowledge of development norms for each member species, a simplified grass staging system (GMAXSTG) was created that is potentially more usable by nonscientists (see Table 2.3). Time of sampling was recorded and converted to a decimal number (TIME); for example, 2:30 pm was converted to 14.5. The fraction of grass in the sample area was visually estimated (GEST). A representative sample of 500 to 750 g of alfalfa and grass was hand clipped from the sample area at a height of 10 cm, an approximation of typical harvest height.

Date of sampling was transformed to its Julian date (JDATE), the number of days from the beginning of the year. The altitude of the field was recorded (ALTF), as were the geographic co-ordinates. Co-ordinates of the fields were overlaid with the co-ordinates of all New York State weather stations using Manifold (Enterprise Edition 6.50, CDA International Ltd., San Mateo, CA). Voronoi cells were created to determine the nearest weather station for each field, thus enabling the calculation of individualized growing degree days. Accumulated growing degree days were calculated using both
base 0°C (GDD0) and base 5°C (GDD5). Accumulation of growing degree days was initiated when the mean temperature exceeded the base for five consecutive days. The altitude of the nearest weather station (ALTWS) was used as a potential explanatory variable.
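The accumulation rule described above can be illustrated with a minimal sketch. The daily contribution, max(0, mean temperature - base), and the choice to begin accumulating on the first day of the five-day run are assumptions made for illustration only; the text does not specify either detail. The function name and the temperature series are hypothetical.

```python
from typing import Optional, Sequence


def accumulated_gdd(daily_mean_temps_c: Sequence[float],
                    base_c: float,
                    run_length: int = 5) -> float:
    """Accumulate growing degree days after the start rule is satisfied.

    Assumptions (not stated in the text):
      * a day's contribution is max(0, mean temperature - base);
      * accumulation starts on the first day of the first run of
        `run_length` consecutive days with mean temperature above the base.
    """
    run = 0
    start: Optional[int] = None
    for i, temp in enumerate(daily_mean_temps_c):
        # Count consecutive days above the base; reset on any day at or below it.
        run = run + 1 if temp > base_c else 0
        if run == run_length:
            start = i - run_length + 1
            break
    if start is None:
        # The start rule was never met, so nothing is accumulated.
        return 0.0
    return sum(max(0.0, temp - base_c) for temp in daily_mean_temps_c[start:])


# Hypothetical daily mean temperatures (°C) from one weather station.
temps = [2, 6, 7, 4, 6, 7, 8, 9, 10, 3, 12]
gdd0 = accumulated_gdd(temps, base_c=0.0)
gdd5 = accumulated_gdd(temps, base_c=5.0)
```

With a base of 5°C the first qualifying five-day run in this series begins on the fifth day, so the earlier warm days are not counted.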
Table 2.2 Alfalfa maturity categories¹ used to assign a numerical value to the most mature stem in the sampling area.

MAXSTAGE value    Stage of maturity
0                 Stem length 30 cm
3                 1-2 nodes with visible buds
4                 >2 nodes with visible buds
5                 1 node with at least 1 open flower
6                 ≥2 nodes with an open flower

¹ Adapted from Kalu and Fick, 1981.
Samples were separated and oven dried at 60°C until a constant dry weight was reached. Sub-samples of dried alfalfa and grass were weighed and the actual fraction of grass in the sample (GFRAC) was calculated. Samples with GFRAC <0.1 or >0.9 were not used for further analysis. It is difficult for producers to accurately estimate the fraction of grass in a mixed stand in the field. Thus, samples were allocated grouping values of 0.2, 0.4, 0.6, or 0.8 (GGRP) in
accordance with the nearest GFRAC value. Grass and alfalfa samples (0.25 g) were analysed for NDF content using the procedure described by Van Soest et al. (1991), using the ANKOM fiber analyser with filter bags.
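The grouping of GFRAC into GGRP can be sketched as a nearest-value lookup. This is a minimal illustration rather than the procedure actually used; in particular, the handling of values that fall exactly midway between two groups is not described in the text, and the function name is hypothetical.

```python
# Group values for the grass fraction, as listed for GGRP.
GRASS_GROUPS = (0.2, 0.4, 0.6, 0.8)


def grass_group(gfrac: float) -> float:
    """Return the GGRP value closest to the measured grass fraction.

    Assumption: values exactly midway between two groups fall to whichever
    group min() encounters first; the text does not state a tie rule.
    """
    return min(GRASS_GROUPS, key=lambda group: abs(group - gfrac))


# Hypothetical measured grass fractions and their assigned groups.
groups = [grass_group(g) for g in (0.27, 0.33, 0.55, 0.72)]
```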
Table 2.3 Grass maturity categories¹ used to assign a numerical value to the most mature tiller in the sampling area.

GMAXSTG value    Stage description
1                1 collared leaf
2                2 collared leaves
3                3 collared leaves
4                4 collared leaves
5                5 collared leaves
6                6 collared leaves
7                7 collared leaves
8                8 collared leaves
9                1 palpable or visible node
10               2 palpable or visible nodes
11               3 palpable or visible nodes
12               4 palpable or visible nodes
13               5 palpable or visible nodes
14               6 palpable or visible nodes
15               7 palpable or visible nodes
16               Inflorescence emerging
17               Inflorescence emerged, peduncle not elongated
18               Peduncle elongated
19               Anthesis
20               Post anthesis

¹ Based on a simplification of Moore and Moser, 1995.
2.3.2 Data analysis

The dataset was randomly partitioned into two replicates, hereafter referred to as split1 and split2. The purpose of this partitioning was to assess whether data from numerous sites and two sampling years can be pooled. The dataset was also split by year, hereafter referred to as Y2004 and Y2005. The purpose of this split was to analyze bias between years and determine whether data from different years can be combined.

The PROC RSQUARE variable selection procedure (SAS for Windows Release 9.1, SAS Institute Inc., Cary, NC) was used on the combined dataset to identify models that maximized the coefficient of determination (r2 or R2) and minimized the root mean squared error (RMSE) and the Schwarz Bayesian criterion (SBC). RMSE has the same units as the predicted variable; in model construction it is the calibration error of the model, and in model validation it is the prediction error of the model. The SBC is a statistic that has components relating to both fit and the number of parameters in the model. The SBC is a better indicator of predictive accuracy than R2 and RMSE.

Four potential prediction equations were chosen for the combined dataset, and PROC GLM was used to fit equations containing the same explanatory variables for split1, split2, Y2004 and Y2005. Model evaluation was performed by applying the equations derived from the Y2004 data to the Y2005 dataset and vice versa. Similarly, the equations derived from split2 were applied to split1 and vice versa. A number of parameters were used in evaluating the equations, as no single statistical test can adequately describe the goodness of fit of a model. All regression equations were tested for an intercept at the origin (a = 0) and a slope of unity (b = 1).
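The tests of a = 0 and b = 1 can be illustrated with an ordinary least-squares sketch. It assumes that observed values are regressed on model estimates and that the tests are t-tests using the usual standard errors; the analysis reported here was run in SAS, so this is an illustration of the idea rather than the procedure used. The function name and data are hypothetical, and p-values are omitted because they require the t distribution.

```python
import math
from typing import Dict, Sequence


def validation_regression(predicted: Sequence[float],
                          observed: Sequence[float]) -> Dict[str, float]:
    """Regress observed on predicted and form t statistics for a = 0 and b = 1.

    Assumes at least three points and some residual scatter (otherwise the
    standard errors are zero and the t statistics are undefined).
    """
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(predicted, observed))
    s2 = sse / (n - 2)                 # residual variance
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {
        "a": a, "b": b, "se_a": se_a, "se_b": se_b,
        "t_a": a / se_a,               # tests intercept = 0
        "t_b": (b - 1.0) / se_b,       # tests slope = 1
    }


# Hypothetical model estimates and observed NDF values (g kg-1).
result = validation_regression([400, 420, 440, 460, 480],
                               [395, 425, 438, 466, 476])
```

Each t statistic would be compared with the t distribution on n - 2 degrees of freedom to obtain the significance levels reported for slope and intercept.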
Gauch et al. (2003) proposed the partitioning of mean squared deviation (MSD) into its components: squared bias (SB), nonunity slope (NU) and lack of correlation (LC). This step can aid understanding about the appropriateness of a statistical model. The three components have distinct meanings, with SB relating to translation, NU relating to rotation and LC relating to scatter.
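The partitioning can be sketched as follows, assuming the usual definitions from Gauch et al. (2003): with X the model estimates and Y the observed values, SB is the squared difference of the means, NU is (1 - b)^2 times the mean squared deviation of X (where b is the slope of Y regressed on X), and LC is (1 - r^2) times the mean squared deviation of Y. The function name and data are hypothetical.

```python
from typing import Dict, Sequence


def partition_msd(predicted: Sequence[float],
                  observed: Sequence[float]) -> Dict[str, float]:
    """Split mean squared deviation into SB, NU and LC.

    Definitions assumed here: SB = translation, NU = rotation, LC = scatter,
    computed so that MSD = SB + NU + LC.
    """
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    syy = sum((y - mean_y) ** 2 for y in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))

    msd = sum((x - y) ** 2 for x, y in zip(predicted, observed)) / n
    b = sxy / sxx                        # slope of observed on predicted
    r2 = sxy ** 2 / (sxx * syy)          # squared correlation
    sb = (mean_x - mean_y) ** 2          # squared bias (translation)
    nu = (1.0 - b) ** 2 * sxx / n        # nonunity slope (rotation)
    lc = (1.0 - r2) * syy / n            # lack of correlation (scatter)
    return {"MSD": msd, "SB": sb, "NU": nu, "LC": lc}


# Hypothetical estimates and observations (g kg-1) with a 10-unit mean offset.
parts = partition_msd([400, 420, 440, 460, 480],
                      [405, 435, 448, 476, 486])
```

Because the three components sum to MSD, a model whose MSD is dominated by SB has a different problem (a consistent offset) from one dominated by LC (scatter), even when the two MSD values are similar.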
2.4 Results and discussion

To determine the similarity of the experimental plot data to the producer field data, variable selection methods were used to develop promising equations based only on the producer field data. Two datasets were fitted to these equations: (i) the producer field dataset and (ii) the total dataset including plots and producer fields. The MSD values from these regressions indicate that the MSD is only slightly increased when the entire dataset is used. For example, the MSD of a three-variable model based on GFRAC, MAXHT and GDD5 increased from 478 to 491. Because including the plot data when evaluating the producer field equations does not dramatically increase the MSD, we can conclude that the plot data can be used in conjunction with the producer field data without significant error. Therefore, all subsequent analyses are based on the aggregated plot and producer field datasets.

Table 2.4 shows promising equations for mixed alfalfa and grass estimation selected from the variable selection procedure. Using the entire dataset the best two-variable model (Eq. 1) consisting of MAXHT and GFRAC
Table 2.4 Selected regression models to estimate mixed alfalfa and grass NDF, based on the entire dataset and datasets split by year (2004, 2005) and split randomly (split1, split2).¹

Eq. no.   n     R2     RMSE (g kg-1)   SBC    Model²

Entire dataset
1         899   0.89   30.1            5682   87.1 + 3.2(MAXHT) + 313(GFRAC)
2         899   0.94   22.4            5187   91.2 + 2.1(MAXHT) + 290(GFRAC) + 0.28(GDD5)
3         899   0.92   25.0            5374   -229 + 2.6(MAXHT) + 307(GFRAC) + 2.5(JDATE)
4         899   0.94   21.2            5115   -58.2 + 2.1(MAXHT) + 290(GFRAC) + 0.19(GDD5) + 1.1(JDATE) + 0.033(ALTWS)

Y2004 dataset
5         714   0.89   29.5            4394   92.2 + 3.2(MAXHT) + 312(GFRAC)
6         714   0.94   21.8            4006   96.2 + 1.9(MAXHT) + 290(GFRAC) + 0.28(GDD5)
7         714   0.93   23.7            4116   -302 + 2.1(MAXHT) + 309(GFRAC) + 3.3(JDATE)
8         714   0.94   21.2            3981   -36.7 + 1.9(MAXHT) + 291(GFRAC) + 0.20(GDD5) + 1.0(JDATE) + 0.038(ALTWS)

Y2005 dataset
9         185   0.88   29.6            1266   73.7 + 3.5(MAXHT) + 320(GFRAC)
10        185   0.95   19.1            1108   56.3 + 2.5(MAXHT) + 278(GFRAC) + 0.40(GDD5)
11        185   0.96   17.9            1085   -491 + 2.4(MAXHT) + 291(GFRAC) + 4.3(JDATE)
12        185   0.96   17.2            1079   -309 + 2.3(MAXHT) + 283(GFRAC) + 0.17(GDD5) + 2.9(JDATE) - 0.0020(ALTWS)

Split 1 dataset
13        435   0.88   29.9            2761   90.0 + 3.3(MAXHT) + 315(GFRAC)
14        435   0.93   22.9            2544   97.0 + 2.1(MAXHT) + 289(GFRAC) + 0.27(GDD5)
15        435   0.92   24.7            2613   -219 + 2.6(MAXHT) + 3.0(GFRAC) + 2.5(JDATE)
16        435   0.94   21.7            2520   -48.9 + 2.1(MAXHT) + 287(GFRAC) + 0.18(GDD5) + 1.1(JDATE) + 0.032(ALTWS)

Split 2 dataset
17        464   0.90   29.5            2934   87.9 + 3.3(MAXHT) + 310(GFRAC)
18        464   0.95   21.4            2657   86.4 + 2.1(MAXHT) + 290(GFRAC) + 0.30(GDD5)
19        464   0.93   24.6            2780   -238 + 2.5(MAXHT) + 310(GFRAC) + 2.6(JDATE)
20        464   0.95   20.1            2617   -69.5 + 2.1(MAXHT) + 292(GFRAC) + 0.21(GDD5) + 1.2(JDATE) + 0.034(ALTWS)

¹ R2, coefficient of determination; RMSE, root mean squared error; SBC, Schwarz Bayesian criterion.
² For descriptions of model variables see Table 2.1.
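As an illustration of how the entire-dataset equations in Table 2.4 would be applied, the sketch below evaluates Eqs. 1 to 3 for a single hypothetical sample. The coefficients are taken from the table; the function names, the input values (MAXHT = 60 cm, GFRAC = 0.4, GDD5 = 300, sampling date 25 May 2005) and the use of the calendar day of year as JDATE are assumptions for illustration only.

```python
from datetime import date


def julian_date(sampling_date: date) -> int:
    """Number of days from the beginning of the year (1 January = 1)."""
    return sampling_date.timetuple().tm_yday


def ndf_eq1(maxht_cm: float, gfrac: float) -> float:
    """Eq. 1 (entire dataset): MAXHT and GFRAC only."""
    return 87.1 + 3.2 * maxht_cm + 313 * gfrac


def ndf_eq2(maxht_cm: float, gfrac: float, gdd5: float) -> float:
    """Eq. 2 (entire dataset): adds growing degree days, base 5 °C."""
    return 91.2 + 2.1 * maxht_cm + 290 * gfrac + 0.28 * gdd5


def ndf_eq3(maxht_cm: float, gfrac: float, jdate: int) -> float:
    """Eq. 3 (entire dataset): adds Julian date instead of GDD5."""
    return -229 + 2.6 * maxht_cm + 307 * gfrac + 2.5 * jdate


# Hypothetical sample; none of these values come from the dataset.
jdate = julian_date(date(2005, 5, 25))
estimates = {
    "Eq. 1": ndf_eq1(60, 0.4),
    "Eq. 2": ndf_eq2(60, 0.4, 300),
    "Eq. 3": ndf_eq3(60, 0.4, jdate),
}
```

For these hypothetical inputs the three equations give estimates of roughly 404, 417 and 412 g kg-1.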
has an R2 of 0.89 and an RMSE of 30.1. Three-variable models with either GDD5 or JDATE added gave better results than the two-variable model. The model with GDD5 (Eq. 2) has an R2 of 0.94 and an RMSE of 22.4, signifying a better goodness of fit than the two-variable model. The model with JDATE (Eq. 3) has an R2 of 0.92 and an RMSE of 25.0, signifying a better goodness of fit than the two-variable model, but a poorer fit than the model containing GDD5. The model with JDATE was included in the equations because JDATE is simple to calculate, whereas GDD5 requires access to meteorological data. To assess the possibility of using a large combination of predictor variables, the best five-variable model was also selected (Eq. 4). This resulted in an R2 of 0.94, equivalent to the best three-variable model, and an RMSE of 21.2, lower than the best three-variable model. The two-variable PEAQ model for NDF (Hintz & Albrecht, 1991) had an R2 of 0.89 and an RMSE of 26.2. Thus, in comparison, the results for mixed alfalfa-grass are promising in terms of goodness of fit, with acceptable values for both R2 and RMSE.

The SBC value for the two-variable model is 5682. There is a drop in SBC to 5374 with the addition of JDATE to the model, indicating improved fit. There is a further drop to 5187 for the GDD5 three-variable model. The SBC for the five-variable model is the lowest (5115), suggesting that predictive accuracy can be improved for mixed alfalfa-grass with the use of models with a high number of predictors.

The variables used to construct the models for the entire dataset were used to develop equations for the datasets split1, split2, Y2004 and Y2005 (see Table 2.4). For the Y2004, split1 and split2 datasets, R2 and RMSE
values are of similar magnitude to the entire dataset. In contrast, for the Y2005 dataset the R2 values are comparatively higher and the RMSE values lower for the three-variable models and the five-variable model. The SBC values cannot be compared between datasets due to the different number of data points; however, the trend is similar for all datasets. For the Y2004, split1 and split2 datasets the SBC from highest to lowest is in the order: two variables, three variables with JDATE, three variables with GDD5, and five variables. The exception is the Y2005 dataset, which has a lower SBC for the three-variable model with JDATE than for the three-variable model with GDD5.

Table 2.5 shows the results of regressing the observed NDF values from the split1, split2, Y2004 and Y2005 datasets on the models developed from the corresponding dataset. For the Y2004 and Y2005 datasets the ranges of values for R2 (0.88 to 0.95) and RMSE (19.5 to 29.7) are acceptable. For Eqs. 6-8 (Y2005 dataset) all b-values exceed unity and all intercepts are negative. Thus NDF content is overestimated, particularly at low values. For Eq. 5 the b-value does not differ from unity and the a-value is not significantly different from 0. For Eqs. 9-12 (Y2004 dataset) the b-values are all significantly less than 1 and the a-values are all significantly greater than 0. For this replicate of dataset and equations NDF is underestimated, particularly at lower values of NDF. These results for Eqs. 5-12 make it difficult to assess the comparative validity of the models. For example, the model with the lowest R2 (Eq. 5) is also the only one that does not have a significantly biased slope or intercept. Conversely, Eq. 7, which has the highest R2, also has an a-value of -55.18, which is significantly less than 0. Thus it is difficult to compare the strengths
Table 2.5 Coefficient of determination (r2), root mean square error (RMSE), slope (b), and y-intercept (a) derived from the regressions of equation estimates¹ on observed forage quality values from the corresponding dataset pair. Dataset pairs were i) Y2004 and Y2005 ii) split1 and split2.

                                     Slope                          Intercept
Eq. no.   r2     RMSE (g kg-1)   b       SEb      Prob. b=1.0   a        SEa     Prob. a=0

Y2005 dataset, Y2004 models
5         0.88   29.5            1.050   0.0286   NS            -21.89   12.24   NS
6         0.94   21.5            1.119   0.0215   ***           -41.96   9.04    ***
7         0.95   19.5            1.061   0.0184   **            -55.18   8.37    ***
8         0.94   20.7            1.105   0.0204   ***           -46.44   8.74    ***

Y2004 dataset, Y2005 models
9         0.89   29.7            0.952   0.0127   ***           20.22    5.78    ***
10        0.93   24.3            0.860   0.0092   ***           47.74    4.34    ***
11        0.92   24.7            0.933   0.0101   ***           61.08    4.29    ***
12        0.93   23.9            0.901   0.0095   ***           58.00    4.17    ***

Split2 dataset, split1 models
13        0.90   29.3            0.994   0.0158   NS            -0.97    7.04    NS
14        0.94   21.6            1.027   0.0117   *             -13.43   5.17    **
15        0.93   24.4            1.015   0.0132   NS            -8.38    5.85    NS
16        0.95   20.1            1.031   0.0109   **            -14.47   4.82    **

Split1 dataset, split2 models
17        0.88   29.9            1.007   0.018    NS            0.80     8.15    NS
18        0.93   22.8            0.969   0.0129   *             13.97    5.90    *
19        0.92   24.7            0.981   0.0142   NS            8.81     6.48    NS
20        0.94   21.7            0.965   0.0122   **            15.60    5.55    **

*, **, *** Significant at a probability level of 0.05, 0.01 and 0.001, respectively.
¹ Equation numbers correspond to those in Table 2.4.
and limitations of the models solely using these methods.

Partitioning MSD provides a way to better understand predictive success and a basis for addressing the dilemma of which model is best. The results for the Y2005 dataset (Table 2.5) show that the order of lowest to highest MSD is the five-variable model (488, Eq. 8), the three-variable model with GDD5 (587, Eq. 6), the two-variable model (878, Eq. 5) and the three-variable model with JDATE (1184, Eq. 7). Eq. 5 has very low values for SB (0) and NU (15) but a high value for LC (863), signifying that the main contributor to MSD is scatter. Eq. 7 has a very high value for SB (785), a value of 22 for NU and a value of 377 for LC, signifying that the largest contributor to MSD is translation error.

Fig. 2.1 shows the SB, NU and LC components of MSD for the models. The prominent feature of Fig. 2.1 is that the models differ greatly regarding which component contributes the most to MSD. For the Y2005 dataset it is evident that Eq. 5 has an MSD almost entirely composed of LC. Eqs. 6-8 all have LC values similar to each other; however, the SB component of Eq. 7 dramatically increases MSD. The MSD partitioning pattern is slightly different for the Y2004 dataset. Once again the two-variable model (Eq. 9) has an MSD almost entirely composed of LC. Eqs. 10-12 all have similar values for LC; however, the SB component of the three-variable model with JDATE (Eq. 11) dramatically increases MSD. For Eqs. 10 and 12, SB and NU also contribute a greater proportion of the MSD than for the corresponding equations in the Y2005 dataset. As a result, Eq. 12 has only a marginally lower MSD than Eq. 9, and Eq. 10 has a higher MSD than Eq. 9, even though the LC component is much smaller. We can conclude from Fig. 2.1 that the 714 points used to construct the Y2004 model were more successful in estimating the NDF of the Y2005 dataset than the 185 points used to construct the Y2005 model in estimating
the NDF of the Y2004 dataset. These results demonstrate the danger of building a model from data of one year and testing it on data from a different year, particularly when the predictor variables are seasonally dependent, such as growing degree days and Julian date. In particular, inclusion of Julian date (Eqs. 7 and 11) increased model bias. A further reason for understanding the components of MSD is that the implications of these errors can be very different. For example, if there is a legitimate explanation for an observed translation error (SB), it may be possible to adjust the model to compensate for it. In contrast, errors due to scatter (LC) may be more difficult to ameliorate.

The results for the split1 and split2 datasets are similar to each other. Coefficients of determination (R2) range from 0.88 to 0.94 for split1 and 0.90 to 0.95 for split2. RMSE values range from 21.7 to 29.9 for split1 and 20.1 to 29.3 for split2. Goodness of fit in terms of both R2 and RMSE is best for the five-variable models (Eqs. 16 and 20) and worst for the two-variable models (Eqs. 13 and 17). The three-variable models with GDD5 (Eqs. 14 and 18) have better R2 and RMSE than those with JDATE (Eqs. 15 and 19). All of the models have slopes close to 1.0, ranging from 0.965 to 1.031; however, b-values are significantly different from 1 for Eqs. 14 and 18 (P