Agricultural Information Research 19(2), 2010. 36–42
Available online at www.jstage.jst.go.jp/
Original Paper
Potential Predictability of Local Paddy Rice Yield Variation Using a Crop Model with Local Areal Information
Toshichika Iizumi*1), Kenji Ishida2), Masayuki Yokozawa1) and Motoki Nishimori1)
1) Agro-Meteorology Division, National Institute for Agro-Environmental Sciences, 3-1-3 Kannondai, Tsukuba 305-8604, Japan
2) Department of Rural Planning, National Institute for Rural Engineering, 2-1-6 Kannondai, Tsukuba 305-8609, Japan
Abstract This study examined the potential predictability of paddy rice yield variation on a local scale (approximately 2 km×2 km) by using a prefectural-scale dynamic paddy rice simulation model (the PRYSBI model) with local observed weather data, taking 160 local areas in Tochigi Prefecture, Japan, as the study area. In a comparison of the simulated and observed local yields during the 11-year period (1993, 1995, 1998–2006), the PRYSBI model showed high capability to simulate the interannual variation of area-mean local yield over the study area, although with a quite large root-mean-square error (RMSE) of 2.1 Mg ha⁻¹. However, the RMSE had a statistically significant relationship with the local areal features on agriculture. We therefore incorporated the local areal features in the simulated yields by means of a multiplicative model approach. The parameter values of the multiplicative model were estimated by using the Bayesian approach, which can account for the uncertainty of parameter values in a stochastic manner. By this means, the potential predictability in terms of the coefficient of determination (r²) and RMSE between the simulated and observed local yields improved from r²=0.430 to r²=0.527 and from RMSE=2.0 Mg ha⁻¹ to RMSE=0.4 Mg ha⁻¹ compared to the PRYSBI model alone. The potential predictability of local yield could thus be improved by incorporating the local areal features in the output of the crop model.
Keywords Bayesian approach, crop model, crop yield variation, local areal feature, prediction
* Corresponding Author E-mail: [email protected]

Introduction

Interannual variation of crop yield often accompanies variations in the surrounding natural environment, especially weather conditions. The prediction of crop yield variation has been studied in the world's major agricultural areas to support agricultural decision making (e.g., Hansen et al. 2004) and is still an important research issue. The typical spatial resolution of prediction ranges from 110 km (Challinor et al. 2005) to 280 km (Hansen et al. 2004), except for the state-of-the-art work of Baigorria et al. (2008) (20 km). However, higher resolution is needed for farmers and agricultural cooperatives in Japan because many farmlands in Japan are scattered over suburbs with mixed land use and over intermediate and mountainous areas with complex terrain. Therefore, the development of a methodology to translate the predicted variation of crop yield on a large scale to that on a smaller area (so-called "downscaling") is the research issue addressed in this study.

On a smaller scale, crop yield depends on farm management, agricultural policy, market price, and technological options. Unfortunately, it is difficult to obtain such detailed information from individual farms without an intensive field survey. On the other hand, some studies report that some local areal features related to agriculture, which are available from widely published agricultural censuses, show a statistical relationship with local yield. For example, Iizumi et al. (2007, 2008) explain that the insured paddy rice yield loss is often more severe in local areas with higher percentages of elderly population and non-regular agricultural workers than in average local areas. Consequently, incorporating the local areal features in prediction may improve the predictability of local yield variation, although the local areal features are insufficient to cover the whole of the socioeconomic, political, and technological conditions for paddy rice production.

For this study, we selected 160 local areas in the eastern part of Tochigi Prefecture, Japan, as the study area (Fig. 1). The study area includes various terrain and socioeconomic conditions (see Iizumi et al. 2007 for details). We examined the potential predictability of paddy rice yield on a local scale (a spatial resolution of approximately 2 km×2 km) by using a dynamic paddy rice simulation model designed originally for a prefectural scale and the observed weather data on a local scale. Here, we used the historical observed weather data instead of a seasonal climate forecast; thus, what we examined was the 'potential' capability of the simulation model to predict local yield variation. After that, we assessed the improvement of the potential predictability achieved by incorporating the local areal features in the simulated local yield by means of a multiplicative model approach. Considering that the effect of local areal features on local yield is less clear than that of practical cultivation management, we used the Bayesian approach for the parameter estimation of the multiplicative model to account for the uncertainty of parameter values in a stochastic manner.

Fig. 1 Geographical map of topography, paddy area (green), and 160 local areas (red)

Model and Data

Crop model

We used the Process-based Regional-scale Rice Yield Simulator with Bayesian Inference (PRYSBI) (Iizumi et al. 2009), designed to simulate the typical growth, development, and grain yield of paddy rice on a prefectural scale (approximately 14,000 km²) in accordance with given forcing data and the assumption of optimal management of fertilization, irrigation, and pest control. The originally required forcing data are daily time series of daily maximum and minimum surface air temperatures (Tx and Tn, respectively) and incoming solar radiation (Sr) averaged over the paddy areas in a prefecture, and annual atmospheric carbon dioxide (CO2) concentration. The PRYSBI model was calibrated on the basis of the prefectural data of Tochigi in a previous study of our research group (Iizumi et al. 2009) but not on local areal data. The parameter values of the model were obtained from Iizumi et al. (2009) and are common across the local areas.

In this study, the weather data given to the PRYSBI model differed by local area to account for the differences in local weather conditions. The daily data on Tx, Tn, and Sr, with a spatial resolution of 1 km, were obtained from the Mesh-AMeDAS data provided by the National Institute for Agro-Environmental Sciences (Seino 1993). The procedure for local weather data provision consisted of three steps.

Step 1: Collect the Tx values for the n meshes that include the land of local area j ($Tx_{ijk}$, k=1, ..., n) (°C). The subscripts i, j, and k denote day i, local area j, and mesh k, respectively.

Step 2: Calculate the ratio of paddy area to the area of the 1-km mesh k, $R_{jk}$, by

$$R_{jk} = P_{jk} / M_{jk}, \tag{1}$$

where $P_{jk}$ and $M_{jk}$ denote the paddy area (ha) and the mesh area (=100 ha), respectively. The paddy area distribution, with a spatial resolution of 100 m, was obtained from the Digital Numerical Land Information (GSI 1998) (Fig. 1).

Step 3: Calculate the Tx value on day i for local area j, $Tx_{ij}$, by the weighted average,

$$Tx_{ij} = \sum_{k=1}^{n} R_{jk} \, Tx_{ijk} \Big/ \sum_{k=1}^{n} R_{jk}. \tag{2}$$

We iterated the three steps for each day from 1993 through 2006 for the 160 local areas. The weighted averages of Tn and Sr were calculated in the same manner as for Tx.

The annual atmospheric CO2 concentration data during the same period were obtained from the World Data Center for Greenhouse Gases (WDCGG 2009). The annual values were computed from the monthly values at the Ryori site (39.03°N, 141.82°E) observed by the Japan Meteorological Agency and were used in common for all local areas.

Local paddy rice yield data

The actual local yield of paddy rice was calculated for the 160 local areas on the basis of the insurance records during the 11 years (1993, 1995, 1998–2006) in which yield losses greater than 30% of the standard yield (i.e., actual yields below 70% of the standard yield) were reported. The insurance records were provided by the National Agricultural Insurance Association and the local organization of agricultural insurance. To obtain the actual local yield, first, we calculated the insured yield
loss for each local area by dividing the total insured loss of all participating members in a local area by the total insured paddy area in the local area. Second, the actual local yield was calculated on the basis of the following definition of insured yield loss,

$$L_i = \begin{cases} \phi \bar{y}_i - y_i & (\phi \bar{y}_i > y_i) \\ 0 & (\phi \bar{y}_i \le y_i) \end{cases} \tag{3}$$

where $L_i$ is the insured local yield loss (Mg ha⁻¹), $\phi$ is the coverage ratio (=0.7 in the study area), $\bar{y}_i$ is the standard yield (Mg ha⁻¹), and $y_i$ is the actual local yield (Mg ha⁻¹). The subscript i denotes the year i. Thus, an insured yield loss of zero was recorded if the actual yield was greater than the criterion $\phi \bar{y}_i$. In contrast, if the actual yield was less than the criterion, the shortfall below 70% of the standard yield was recorded as the insured yield loss.

Therefore, the actual local yield was correctly computable from Eq. (3) if the insured local yield loss was not zero. Otherwise, the actual local yield was treated as missing. The number of local areas that experienced yield loss varied from year to year; however, quite a few of the 160 local areas experienced yield loss (Fig. 2). In total, we obtained non-missing actual local yields for 38.7% of the 1,760 possible cases (160 local areas×11 years) (hereafter referred to as the observed local yields).

Due to this, the observed local yield used is skewed to the lower side in contrast with the 'true' probability distribution of local yield. This limitation of data availability makes the evaluation of the PRYSBI model impossible for the years with near-standard and better-than-standard yields. However, the potential predictability of the PRYSBI model is assessable for the years with severe yield losses, which are practically important in prediction.

Local areal feature data

The local areal features related to agriculture were obtained from the Agricultural Census Settlement Cards 2005 provided by the Association of Agriculture and Forestry Statistics. We computed the annual values of local areal features from 1993 through 2006 on the basis of the quinquennial census data from 1990 to 2005 by applying temporal linear interpolation (or extrapolation) to the two nearest-neighboring values.

To select the candidate local areal features for analysis, first, we extracted more than 80 local areal features whose definitions are common throughout the study period. Second, we removed or aggregated similar local areal features. For instance, the numbers of 15–29, 30–39, and 40–59 year-old regular agricultural workers were aggregated. Third, we examined the correlation coefficient between the local areal features and the observed local yield and identified the local areal features that have a strong positive or negative correlation. Finally, we selected and used the following four local areal features for this study: (1) Labor (%), the percentage of 15–59 year-old regular agricultural workers in the total number of agricultural workers; (2) Area (ha), the total acreage of managed paddy area; (3) Seller (%), the percentage of paddy rice farms whose paddy rice sales were the largest component of their total farm sales in the total number of farms; and (4) Helper (%), the percentage of paddy rice farms that entrusted some or all of their paddy rice cultivation management to other farms in the total number of farms.

Results and Discussion

Potential predictability: The PRYSBI model alone

Fig. 2 shows the comparison of the simulated yield averaged over the local areas and the corresponding observations during the 11 years. The number of local areas used for the average varied annually because of the limited availability of observed local yield. The interannual variation of the area-mean simulated yield resembles that of the observation in terms of temporal pattern. However, the absolute value of the area-mean simulated yield was almost double that of the observations. The obtained Pearson's correlation coefficient (r) and root-mean-square error (RMSE) between the area-mean simulated and observed local yields were r=0.675 and RMSE=2.1 Mg ha⁻¹, respectively.

Fig. 2 Time series (a) and scatter plot (b) of simulated and observed local yields during the 11 years. The number of local areas for areal average is shown in (a). In (b), gray and black plots indicate local yield and area-mean local yield, respectively

Fig. 3 Geographical patterns of (a) correlation coefficients and (b) root-mean-square error (Mg ha⁻¹) between simulated and observed local yields during 11 years

The correlation and RMSE between the simulated and observed local yields during the 11 years for each local area are shown in Fig. 3. We displayed only the local areas in which the observed local yield was available for more than 5 years. The calculated correlations were r≧0.6 in most local areas over the study area except for the western area (Fig. 3a), indicating that the PRYSBI model has high capability to simulate the interannual variation of the local yield in many local areas. On the other hand, larger RMSEs (≧2.0 Mg ha⁻¹) were observed over the northeastern and eastern areas (Fig. 3b). The RMSE observed is discussed in the next section.
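The evaluation described above (recovering the actual yield from the insured loss defined in Eq. (3), then scoring the simulation against the non-missing observations) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' code: the function names, the argument names, and the `min_years` threshold (set to 6 to mirror "more than 5 years") are assumptions introduced here.

```python
import numpy as np


def actual_yield_from_loss(loss, standard_yield, coverage=0.7):
    """Invert Eq. (3): y = coverage * standard_yield - loss when loss > 0.

    A zero insured loss only says the yield was at or above the criterion,
    so the actual yield is returned as missing (NaN) in that case.
    """
    loss = np.asarray(loss, dtype=float)
    standard_yield = np.asarray(standard_yield, dtype=float)
    y = coverage * standard_yield - loss
    return np.where(loss > 0, y, np.nan)


def skill_scores(simulated, observed, min_years=6):
    """Pearson r and RMSE over the years where both series are available.

    Returns (nan, nan) if fewer than `min_years` valid pairs exist
    (hypothetical threshold, chosen to match "more than 5 years").
    """
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    valid = ~np.isnan(sim) & ~np.isnan(obs)
    if valid.sum() < min_years:
        return np.nan, np.nan
    s, o = sim[valid], obs[valid]
    r = np.corrcoef(s, o)[0, 1]
    rmse = np.sqrt(np.mean((s - o) ** 2))
    return r, rmse


if __name__ == "__main__":
    # Hypothetical insured losses and standard yields (Mg/ha) for one local area
    loss = [0.4, 0.0, 1.1, 0.7, 0.2, 0.9, 0.0, 0.5, 1.3, 0.6, 0.3]
    standard = [5.0] * 11
    observed = actual_yield_from_loss(loss, standard)
    simulated = np.full(11, 5.2)  # placeholder for the crop model output
    print(skill_scores(simulated, observed))
```

Because only years with a non-zero insured loss yield an observation, the scores computed this way describe skill in loss years only, which is the limitation noted in the section on local paddy rice yield data.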
Fig. 4 Geographical patterns of (a) Labor (%), (b) Area (ha), (c) Seller (%), and (d) Helper (%) averaged over 11 years (see Section "Local areal feature data" for details)
Error of the simulated local yield and the local areal features

Fig. 4 shows the geographical patterns of the four local areal features (i.e., Labor, Area, Seller, and Helper). Obvious areal differences were noted: the higher percentage of Labor (≧50%) in the western and central areas; the larger acreage of Area (≧15 ha) in the western and southern areas; the higher percentage of Seller (≧80%) in the western and central areas; and the higher percentage of Helper (≧70%) in the southern and central areas. According to these spatial patterns, it is reasonable to summarize that the western area is the major area of paddy rice production for sale. The northeastern area is in poorer condition for paddy rice production in terms of labor and land, and thus the farmers in this area produce paddy rice for their own consumption rather than for sale. The southern and central areas are in more advantaged conditions than the northeastern area, although consolidation of paddy areas from many farmers to a few major farmers is still taking place.

Fig. 5 depicts the relationship between the RMSE in simulated local yield and each of the local areal features. The RMSE decreased with increasing Labor, Area, and Seller. The obtained correlation coefficients are statistically significant at the 1% level for all local areal features except Helper. The RMSE was widely scattered regardless of the level of Helper.

Fig. 5 Scatter plots between the RMSE in simulated local yield and (a) Labor, (b) Area, (c) Seller, and (d) Helper. Regression line and correlation coefficient are shown. Asterisks indicate the statistical significance of the correlation coefficient at the 1% level

The obtained relationships between the RMSE and the local areal features indicate that large simulation errors are observed in the local areas characterized by lower values of Labor, Area, and Seller than in the other local areas. Therefore, although the relationship between the local areal features discussed and practical cultivation management is not clear, we chose to incorporate some of these local areal features in the prediction because the PRYSBI model does not account for any local areal features in its simulation.

Bayesian inference

Here, we developed the following multiplicative model to incorporate the local areal features in the simulated local yield:

$$Y_{OBS} = a \, Y_{SIM} \cdot L^{b} \cdot A^{c} \cdot S^{d} \cdot e^{\varepsilon}, \tag{4}$$

where $Y_{OBS}$ is the observed local yield; $Y_{SIM}$ is the simulated local yield; L is Labor; A is Area; S is Seller; ε is the error term; and a, b, c, and d are the parameters. Helper was eliminated because no statistically significant correlation was found, as described in the above section. The implication of this multiplicative model is that observed local yields are expressed as a function of simulated local yields and scaling factors for the PRYSBI model bias (a) and local areal features (b, c, and d). By the logarithmic transformation, we obtained the following equation,

$$\ln Y_{OBS} = \ln a + \ln Y_{SIM} + b \ln L + c \ln A + d \ln S + \varepsilon. \tag{5}$$

Then we assumed that ln a is the intercept term and ε is the error term that follows a normal distribution with zero mean and variance σ² (i.e., ε ~ N(0, σ²)).

The relationship between the local areal features and the local
yield could vary due to other factors not considered here. To account for such uncertainty, we used Bayesian inference, which estimates each parameter value as a probability distribution. The posterior probability density functions (PDFs) of the parameter values were estimated on the basis of the six odd-year data using the Gibbs sampler algorithm (Geman and Geman 1984) in the statistical software R (R Development Core Team 2009).

Table 1 Posterior means, standard deviations (SD), and 95% (2.5–97.5%) probability intervals of the parameter values

Variables     Mean       SD        2.5%       97.5%
ln a         –0.7805    0.1129    –1.0031    –0.5562
b (Labor)    –0.0288    0.0096    –0.0475    –0.0099
c (Area)      0.0815    0.0095     0.0628     0.1003
d (Seller)    0.0302    0.0181    –0.0058     0.0657
σ²            0.0159    0.0012     0.0138     0.0184
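As an illustration of how posterior samples for the log-linear model of Eq. (5) might be drawn, the sketch below implements a two-block Gibbs sampler in Python with NumPy. It is not the authors' R code. The paper does not state the priors, so a flat prior on (ln a, b, c, d) and p(σ²) ∝ 1/σ² are assumed here, and the function and argument names are hypothetical.

```python
import numpy as np


def gibbs_loglinear(y_obs, y_sim, labor, area, seller,
                    n_iter=15000, burn_in=5000, seed=0):
    """Gibbs sampler for Eq. (5) under assumed noninformative priors.

    Returns an array of shape (n_iter - burn_in, 5) whose columns are
    draws of ln a, b, c, d, and sigma^2.
    """
    rng = np.random.default_rng(seed)
    # ln(Y_SIM) enters Eq. (5) with a fixed coefficient of one,
    # so it is moved to the left-hand side.
    z = np.log(np.asarray(y_obs, float)) - np.log(np.asarray(y_sim, float))
    X = np.column_stack([np.ones_like(z),
                         np.log(np.asarray(labor, float)),
                         np.log(np.asarray(area, float)),
                         np.log(np.asarray(seller, float))])
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ z           # least-squares estimate
    chol = np.linalg.cholesky(xtx_inv)     # chol @ chol.T == xtx_inv
    sigma2 = 1.0                           # arbitrary starting value
    draws = np.empty((n_iter, p + 1))
    for t in range(n_iter):
        # Block 1: coefficients | sigma2 ~ N(beta_hat, sigma2 * (X'X)^-1)
        beta = beta_hat + np.sqrt(sigma2) * chol @ rng.standard_normal(p)
        # Block 2: sigma2 | coefficients ~ Inverse-Gamma(n/2, SSR/2)
        resid = z - X @ beta
        sigma2 = 0.5 * (resid @ resid) / rng.gamma(shape=0.5 * n)
        draws[t, :p] = beta
        draws[t, p] = sigma2
    return draws[burn_in:]


def summarize(draws):
    """Posterior mean, SD, and 2.5/97.5 percentiles for each column."""
    return (draws.mean(axis=0), draws.std(axis=0, ddof=1),
            np.percentile(draws, 2.5, axis=0),
            np.percentile(draws, 97.5, axis=0))
```

Calling `gibbs_loglinear` three times with different `seed` values and stacking the results would mimic a three-chain setup; `summarize` then gives quantities of the kind listed in Table 1. With different priors the conditional distributions would change accordingly.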
We used the Gelman-Rubin (G-R) statistics (Gelman and Rubin 1992) to assess the convergence (see Iizumi et al. 2009 for details). After iterating 15,000 times for each of three chains, we discarded the first 5,000 samples as the burn-in period and computed the posterior PDFs of the parameter values from the remaining 30,000 samples consisting of 3 chains×(15,000–5,000) samples (Table 1). The obtained G-R values were one for all parameters, indicating the convergence of all parameter values to their stationary distributions. In the log-transformed model of Eq. (5), parameter values smaller (larger) than zero indicate the downward (upward) adjustment of simulated local yields. The parameter value of Area was comparatively larger than those of Seller and Labor. However, the 97.5-percentile value of Seller was larger than the 2.5-percentile value of Area; this suggests that Seller provided more important information than Area in a few cases.

Potential predictability: The PRYSBI model with local areal features

Finally, we assessed the improvement of potential predictability when incorporating the local areal features in the simulated local yield by means of the multiplicative model approach. To do so, first, we adjusted the simulated local yield during the 11 years based on the posterior means of the parameter values and obtained Fig. 6. We then compared the two types of simulated local yields (with or without adjustment). Even though the validation period included the independent five even-year data, the calculated coefficient of determination (r²) and RMSE between the simulated local yield with adjustment and the corresponding observation improved from r²=0.430 to r²=0.527 and from RMSE=2.0 Mg ha⁻¹ to RMSE=0.4 Mg ha⁻¹, respectively, compared to the simulated local yield without adjustment. Thus, the incorporation of local areal features in the output of the PRYSBI model has the potential to improve the potential predictability of local yield variation.

Fig. 6 Scatter plot between simulated local yield with adjustment (red) and without adjustment (gray) against the corresponding observations during the 11 years

Concluding Remarks

This study assessed the potential predictability of local yield variation by using the dynamic crop simulation model and some local areal features on agriculture. The obtained results showed that the potential predictability of local yield could improve by incorporating the local areal features in the output of the crop model, compared to the crop model alone.

The local areal features are easy to access because they are available from published agricultural censuses; however, the relationship between the local areal features and practical cultivation management still needs to be addressed. In addition, agricultural policy, technical options, and market price remain to be considered in the prediction. Knowledge from farm management studies would help to solve these problems. We believe that interdisciplinary work bridging crop model simulation and farm management studies is novel and has the potential to make the prediction more realistic.

Acknowledgement

We acknowledge the National Agricultural Insurance Association and the local organization of agricultural insurance for providing the agricultural insurance records. This study was supported by the Global Environmental Research Fund (S-4 and S-5-3) of the Ministry of the Environment, Japan.
References

Baigorria, G. A., J. W. Jones and J. J. O'Brien (2008) Potential predictability of crop yield using an ensemble climate forecast by a regional circulation model, Agric. For. Meteorol., 148: 1353–1361.
Challinor, A. J., T. R. Wheeler, J. M. Slingo, et al. (2005) Simulation of crop yields using ERA-40: Limits to skill and nonstationarity in weather yield relationships, J. Appl. Meteorol., 44: 516–531.
Gelman, A. and D. B. Rubin (1992) Inference from iterative simulation using multiple sequences, Statistical Sci., 7: 457–511.
Geman, S. and D. Geman (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, Trans. Pattern Anal. Machine Intelligence, 6: 721–741.
Geographical Survey Institute (GSI) (1998) 'User's Guide for Numerical Map: Revised 2nd Edition', Japan Map Center, Japan, 500 pp. (in Japanese).
Hansen, J. W., A. Potgieter and M. K. Tippett (2004) Using a general circulation model to forecast regional wheat yields in northeast Australia, Agric. For. Meteorol., 127: 77–92.
Horie, T., H. Nakagawa, M. Ohnishi, et al. (1995) Rice production in Japan under current and future climates, ed. Matthews, R. B., M. J. Kropff and D. Bachelet, 'Modeling the Impact of Climate Change on Rice Production in Asia', IRRI and CAB International, UK, 143–164.
Iizumi, T., K. Ishida, S. Hirako, et al. (2007) Influence of rural socioeconomic characteristics on rice yield damage: A case study using GIS in Motegi-cho and Ichikai-cho, Tochigi, J. Japan Agric. Syst. Soc., 23: 273–282.
Iizumi, T., K. Ishida, S. Hirako, et al. (2008) Resistance to cool-summer damage resulting from level of cultivation practices represented by farm household characteristics, J. Japan Agric. Syst. Soc., 24: 103–112.
Iizumi, T., M. Yokozawa and M. Nishimori (2009) Parameter estimation and uncertainty analysis of a large-scale crop model for paddy rice: Application of a Bayesian approach, Agric. For. Meteorol., 149: 333–348.
R Development Core Team (2009) R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Austria.
Seino, H. (1993) An estimation of distribution of meteorological elements using GIS and AMeDAS data, J. Agric. Meteorol., 48: 379–383.
World Data Center for Greenhouse Gases (WDCGG) (2009) WMO Global Atmosphere Watch, , browsed on Jun. 23, 2009.

Received July 17, 2009
Accepted January 25, 2010
Agro-informatics & Technology