
Agricultural Information Research 19(2), 2010. 36–42

Available online at www.jstage.jst.go.jp/

Original Paper

Potential Predictability of Local Paddy Rice Yield Variation Using a Crop Model with Local Areal Information

Toshichika Iizumi*1), Kenji Ishida2), Masayuki Yokozawa1) and Motoki Nishimori1)

1) Agro-Meteorology Division, National Institute for Agro-Environmental Sciences, 3-1-3 Kannondai, Tsukuba 305-8604, Japan
2) Department of Rural Planning, National Institute for Rural Engineering, 2-1-6 Kannondai, Tsukuba 305-8609, Japan

Abstract

This study examined the potential predictability of paddy rice yield variation on a local scale (approximately 2 km×2 km) by using a prefectural-scale dynamic paddy rice simulation model (the PRYSBI model) with locally observed weather data, taking 160 local areas in Tochigi Prefecture, Japan, as the study area. A comparison of the simulated and observed local yields during the 11-year period (1993, 1995, 1998–2006) showed that the PRYSBI model had high capability to simulate the interannual variation of area-mean local yield over the study area, but with a rather large root-mean-square error (RMSE) of 2.1 Mg ha–1. However, the RMSE had a statistically significant relationship with the local areal features on agriculture. We therefore incorporated the local areal features in the simulated yields by means of a multiplicative model approach. The parameter values of the multiplicative model were estimated by using the Bayesian approach, which can account for the uncertainty of parameter values in a stochastic manner. By this means, the potential predictability in terms of the coefficient of determination (r²) and RMSE between the simulated and observed local yields improved from r²=0.430 to r²=0.527 and from RMSE=2.0 Mg ha–1 to RMSE=0.4 Mg ha–1 compared to the PRYSBI model alone. The potential predictability of local yield could thus be improved by incorporating the local areal features in the output of the crop model.

Keywords Bayesian approach, crop model, crop yield variation, local areal feature, prediction

* Corresponding Author E-mail: [email protected]

Introduction

The interannual variation of crop yield often accompanies variations in the surrounding natural environment, especially weather conditions. The prediction of crop yield variation has been studied in the world’s major agricultural areas to support agricultural decision making (e.g., Hansen et al. 2004) and is still an important research issue. The typical spatial resolution of prediction ranges from 110 km (Challinor et al. 2005) to 280 km (Hansen et al. 2004), except for the state-of-the-art work of Baigorria et al. (2008) (20 km). However, higher resolution is needed for farmers and agricultural cooperatives in Japan because many farmlands in Japan are scattered over suburbs with mixed land use and over intermediate and mountainous areas with complex terrain. Therefore, the development of a methodology to translate a predicted variance of crop yield on a large scale to that on a smaller area (so-called “downscaling”) is the research issue addressed in this study.

On a smaller scale, crop yield depends on farm management, agricultural policy, market price, and technological options. Unfortunately, it is difficult to obtain such detailed information from individual farms without intensive field surveys. On the other hand, some studies report that some local areal features related to agriculture, which are available from widely published agricultural censuses, show a statistical relationship with local yield. For example, Iizumi et al. (2007, 2008) explain that the insured paddy rice yield loss is often more severe in local areas with higher percentages of elderly population and non-regular agricultural workers than in average local areas. Consequently, incorporating the local areal features in prediction may improve the predictability of local yield variation, although the local areal features are insufficient to cover the whole of the socioeconomic, political, and technological conditions for paddy rice production.

For this study, we selected 160 local areas in the eastern part of Tochigi Prefecture, Japan, as the study area (Fig. 1). The study area includes various terrain and socioeconomic conditions (see Iizumi et al. 2007 for details). We examined the potential predictability of paddy rice yield on a local scale (a spatial resolution of approximately 2 km×2 km) by using a dynamic paddy rice simulation model designed originally for a prefectural scale and the observed weather data on a local scale. Here, we used historical observed weather data instead of seasonal climate forecasts; thus, what we examined was the ‘potential’ capability of the simulation model to predict local yield variation. After that, we assessed the improvement of the potential predictability achieved by incorporating the local areal features in the simulated local yield by means of a multiplicative model approach. Considering the fact that the effect of local areal features on local yield is less clear than that of practical cultivation management, we used the Bayesian approach for the parameter estimation of the multiplicative model to account for the uncertainty of parameter values in a stochastic manner.

Fig. 1 Geographical map of topography, paddy area (green), and 160 local areas (red)

Model and Data

Crop model

We used the Process-based Regional-scale Rice Yield Simulator with Bayesian Inference (PRYSBI) (Iizumi et al. 2009), designed to simulate the typical growth, development, and grain yield of paddy rice on a prefectural scale (approximately 14,000 km²) in accordance with given forcing data and the assumption of optimal management for fertilization, irrigation, and pest control. The originally required forcing data are daily time series of daily maximum and minimum surface air temperatures (Tx and Tn, respectively) and incoming solar radiation (Sr) averaged over the paddy areas in a prefecture, and annual atmospheric carbon dioxide (CO2) concentration. The PRYSBI model was calibrated on the basis of the prefectural data of Tochigi in the previous study of our research group (Iizumi et al. 2009) but not on local areal data. The parameter values of the model were obtained from Iizumi et al. (2009) and are common across the local areas.

In this study, the weather data given to the PRYSBI model differed by local area to account for the differences in local weather conditions. The daily data on Tx, Tn, and Sr, with a spatial resolution of 1 km, were obtained from the Mesh-AMeDAS data provided by the National Institute for Agro-Environmental Sciences (Seino 1993). The procedure for local weather data provision consisted of three steps. Step 1: Collect the Tx values for the n meshes that include the land of local area j ($T_{x,ijk}$, k=1, ..., n) (°C). The subscripts i, j, and k denote day i, local area j, and mesh k, respectively. Step 2: Calculate the ratio of paddy area to the area of 1-km mesh k, $R_{jk}$, by

$$R_{jk} = P_{jk} / M_{jk}, \qquad (1)$$

where $P_{jk}$ and $M_{jk}$ denote the paddy area (ha) and the mesh area (=100 ha), respectively. The paddy area distribution, with a spatial resolution of 100 m, was obtained from the Digital Numerical Land Information (GSI 1998) (Fig. 1). Step 3: Calculate the Tx value on day i for local area j, $T_{x,ij}$, by the weighted average,

$$T_{x,ij} = \frac{\sum_{k=1}^{n} R_{jk}\, T_{x,ijk}}{\sum_{k=1}^{n} R_{jk}}. \qquad (2)$$

We iterated the three steps for each day from 1993 through 2006 for the 160 local areas. The weighted averages of Tn and Sr were calculated in the same manner as for Tx.

The annual atmospheric CO2 concentration data during the same period were obtained from the World Data Center for Greenhouse Gases (WDCGG, 2009). The annual values were computed from the monthly values at the Ryori site (39.03°N, 141.82°E) observed by the Japan Meteorological Agency and were used in common for all local areas.

Local paddy rice yield data

The actual local yield of paddy rice was calculated for the 160 local areas on the basis of the insurance records during the 11 years (1993, 1995, 1998–2006) in which yields below 70% of the standard yield were reported. The insurance records were provided by the National Agricultural Insurance Association and the local organization of agricultural insurance. To obtain the actual local yield, first, we calculated the insured yield loss for each local area by dividing the total insured loss of all participating members in a local area by the total insured paddy area in the local area. Second, the actual local yield was calculated on the basis of the following definition of insured yield loss,

$$L_i = \begin{cases} \phi \bar{y}_i - y_i & (\phi \bar{y}_i > y_i), \\ 0 & (\phi \bar{y}_i \le y_i), \end{cases} \qquad (3)$$

where $L_i$ is the insured local yield loss (Mg ha–1), $\phi$ is the coverage ratio (=0.7 in the study area), $\bar{y}_i$ is the standard yield (Mg ha–1), and $y_i$ is the actual local yield (Mg ha–1). The subscript i denotes the year i. Thus, an insured yield loss of zero was recorded if the actual yield was greater than the criterion. In contrast, if the actual yield was less than the criterion, the shortfall below 70% of the standard yield was recorded as the insured yield loss.

Therefore, the actual local yield was correctly computable from Eq. (3) if the insured local yield loss was not zero. Otherwise, the actual local yield was treated as missing data. The number of local areas that experienced yield loss varies from year to year; however, out of the 160, quite a few local areas experienced yield loss (Fig. 2). In total, we obtained non-missing actual local yields for 38.7% of the 1,760 possible cases (160 local areas×11 years) (hereafter referred to as the observed local yields).

Due to this, the observed local yield used is skewed to the lower side in contrast with the ‘true’ probability distribution of local yield. This limitation of data availability makes the evaluation of the PRYSBI model impossible for years with near-standard and better-than-standard yields. However, the potential predictability of the PRYSBI model is assessable for the years with severe yield losses, which is practically important in prediction.

Fig. 2 Time series (a) and scatter plot (b) of simulated and observed local yields during the 11 years. The number of local areas used for the areal average is shown in (a). In (b), gray and black plots indicate local yield and area-mean local yield, respectively

Local areal feature data

The local areal features related to agriculture were obtained from the Agricultural Census Settlement Cards 2005 provided by the Association of Agriculture and Forestry Statistics. We computed the annual values of local areal features from 1993 through 2006 on the basis of the quinquennial census data from 1990 to 2005 by applying temporal linear interpolation (or extrapolation) to the two nearest-neighboring values.

To select the candidate local areal features for analysis, first, we extracted more than 80 local areal features whose definitions are common throughout the study period. Second, we removed or aggregated similar local areal features. For instance, the numbers of 15–29, 30–39, and 40–59 year-old regular agricultural workers were aggregated. Third, we examined the correlation coefficient between the local areal features and observed local yield and detected the local areal features that have a strong positive or negative correlation. Finally, we selected and used the following four local areal features for this study: (1) Labor (%), the percentage of 15–59 year-old regular agricultural workers in the total number of agricultural workers; (2) Area (ha), the total acreage of managed paddy area; (3) Seller (%), the percentage of farms whose paddy rice sales were the largest component of their total farm sales, relative to the total number of farms; and (4) Helper (%), the percentage of paddy rice farms that entrusted some (or all) of their paddy rice cultivation management to other farms, relative to the total number of farms.

Results and Discussion

Potential predictability: The PRYSBI model alone

Fig. 2 shows the comparison of the simulated yield averaged over the local areas and the corresponding observations during the 11 years. The number of local areas for the average varied annually because of the limited availability of observed local yield. The interannual variation of the area-mean simulated yield sometimes resembles that of the observation in terms of temporal pattern. However, the absolute value of the area-mean simulated yields was almost double that of the observations. The obtained Pearson’s correlation coefficient (r) and root-mean-square error (RMSE) between the area-mean simulated and observed local yields were r=0.675 and RMSE=2.1 Mg ha–1, respectively.

Fig. 3 Geographical patterns of (a) correlation coefficients and (b) root-mean-square error (Mg ha–1) between simulated and observed local yields during 11 years

The correlation and RMSE between the simulated and observed local yields during the 11 years for each local area are shown in Fig. 3. We only displayed the local areas in which the observed local yield was available for more than 5 years. The calculated correlations were r≧0.6 in most local areas over the study area except for the western area, where they could not be evaluated (Fig. 3a), indicating that the PRYSBI model has high capability to simulate the interannual variation of the local yield in many local areas. On the other hand, larger RMSEs (≧2.0 Mg ha–1) were observed over the northeastern and eastern areas (Fig. 3b). The observed RMSE is discussed in the next section.
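The two skill scores reported here, Pearson's r and the RMSE, can be computed over only those pairs for which an observed yield exists. The following is a minimal sketch in Python with NumPy; the function name `skill_scores` and the sample numbers are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def skill_scores(simulated, observed):
    """Return (r, r_squared, rmse) over pairs where both values are present."""
    sim = np.asarray(simulated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    valid = ~np.isnan(sim) & ~np.isnan(obs)  # skip missing observations
    sim, obs = sim[valid], obs[valid]
    if sim.size < 2:
        raise ValueError("need at least two valid pairs")
    r = np.corrcoef(sim, obs)[0, 1]          # Pearson's correlation coefficient
    rmse = np.sqrt(np.mean((sim - obs) ** 2))
    return r, r ** 2, rmse


# Illustrative numbers only (Mg ha-1); they are not data from this study.
sim = [5.0, 6.0, 7.0, 8.0]
obs = [2.0, 3.0, np.nan, 5.0]
r, r2, rmse = skill_scores(sim, obs)
print(f"r={r:.3f}, r2={r2:.3f}, RMSE={rmse:.2f}")
```

The example is built so that the simulated series follows the observed pattern with a constant offset, which gives a high correlation together with a large RMSE, the same qualitative situation described for the PRYSBI model alone.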

Fig. 4 Geographical patterns of (a) Labor (%), (b) Area (ha), (c) Seller (%), and (d) Helper (%) averaged over 11 years (see Section “Local areal feature data” for details)
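The annual feature values averaged in Fig. 4 are derived from quinquennial census data (1990 to 2005) by linear interpolation or extrapolation from the two nearest census years. A minimal Python sketch of that step follows; the function name `annual_value` and the census values are hypothetical and serve only to illustrate the calculation.

```python
def annual_value(census, year):
    """Linearly interpolate or extrapolate from the two nearest census years."""
    if len(census) < 2:
        raise ValueError("need at least two census years")
    # Two census years closest to the target year (ties broken by earlier year).
    y0, y1 = sorted(sorted(census, key=lambda y: (abs(y - year), y))[:2])
    v0, v1 = census[y0], census[y1]
    return v0 + (v1 - v0) * (year - y0) / (y1 - y0)


# Hypothetical Labor (%) values for one local area; not data from the paper.
labor_census = {1990: 40.0, 1995: 50.0, 2000: 45.0, 2005: 55.0}
study_years = [1993, 1995] + list(range(1998, 2007))
labor_annual = {y: annual_value(labor_census, y) for y in study_years}
labor_mean = sum(labor_annual.values()) / len(labor_annual)
```

For 2006, which lies beyond the last census, the two nearest census years are 2000 and 2005, so the value is extrapolated along the line through those two points.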



Error of the simulated local yield and the local areal features

Fig. 4 shows the geographical patterns of the four local areal features (i.e., Labor, Area, Seller, and Helper). Obvious areal differences were noted: the higher percentage of Labor (≧50%) in the western and central areas; the larger acreage of Area (≧15 ha) in the western and southern areas; the higher percentage of Seller (≧80%) in the western and central areas; and the higher percentage of Helper (≧70%) in the southern and central areas. According to these spatial patterns, it is reasonable to summarize that the western area is the major area producing paddy rice for sale. The northeastern area is in poorer condition for paddy rice production in terms of labor and land, and thus the farmers in this area produce paddy rice for their own consumption rather than for sale. The southern and central areas are in more advantaged conditions than the northeastern area, although consolidation of paddy areas from many farmers to a few major farmers is still taking place.

Fig. 5 depicts the relationship between the RMSE in simulated local yield and each of the local areal features. Smaller RMSE emerged along with increases in Labor, Area, and Seller. The obtained correlation coefficients are statistically significant at the 1% level for all local areal features except for Helper. The RMSE was widely scattered regardless of the level of Helper.

Fig. 5 Scatter plots between the RMSE in simulated local yield and (a) Labor, (b) Area, (c) Seller, and (d) Helper. Regression line and correlation coefficient are shown. Asterisks indicate the statistical significance of the correlation coefficient at the 1% level

The obtained relationships between the RMSE and the local areal features indicate that large simulation errors are observed in the local areas characterized by lower values of Labor, Area, and Seller than in the other local areas. Therefore, although the relationship between the local areal features discussed and practical cultivation management is not clear, we chose to incorporate some of these local areal features in the prediction because the PRYSBI model does not account for any local areal features in its simulation.

Bayesian inference

Here, we developed the following multiplicative model to incorporate the local areal features in the simulated local yield:

$$Y_{OBS} = a\, Y_{SIM} \cdot L^{b} \cdot A^{c} \cdot S^{d} \cdot e^{\varepsilon}, \qquad (4)$$

where $Y_{OBS}$ is the observed local yield; $Y_{SIM}$ is the simulated local yield; L is Labor; A is Area; S is Seller; ε is the error term; and a, b, c, and d are the parameters. Helper was eliminated because no statistically significant correlation was found, as described in the above section. The implication of this multiplicative model is that observed local yields are expressed as a function of simulated local yields and scaling factors for the PRYSBI model bias (a) and local areal features (b, c, and d). By taking the natural logarithm of both sides, we obtained the following equation,

$$\ln Y_{OBS} = \ln a + \ln Y_{SIM} + b \ln L + c \ln A + d \ln S + \varepsilon. \qquad (5)$$

Then we assumed that ln a is the intercept term and ε is the error term that follows a normal distribution with zero mean and variance σ² (i.e., ε ~ N(0, σ²)).
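Because ln Y_SIM enters Eq. (5) with a fixed coefficient of one, the model can be treated as an ordinary linear regression of ln Y_OBS − ln Y_SIM on an intercept, ln L, ln A, and ln S. The sketch below, in Python with NumPy, shows that recasting and checks it against the multiplicative form of Eq. (4) without the error term. The function names, input values, and parameter values are hypothetical and chosen only to exercise the code.

```python
import numpy as np


def multiplicative_model(y_sim, labor, area, seller, a, b, c, d):
    """Eq. (4) without the error term: expected adjusted yield."""
    return a * y_sim * labor**b * area**c * seller**d


def log_linear_design(y_obs, y_sim, labor, area, seller):
    """Recast Eq. (5) as z = X @ [ln a, b, c, d] + eps.

    ln Y_SIM has a fixed coefficient of one, so it moves to the left-hand side.
    """
    z = np.log(y_obs) - np.log(y_sim)
    X = np.column_stack([np.ones_like(z), np.log(labor), np.log(area), np.log(seller)])
    return z, X


# Hypothetical inputs and parameter values, chosen only to exercise the code.
y_sim = np.array([5.0, 5.5, 6.0])       # Mg ha-1
labor = np.array([30.0, 50.0, 60.0])    # %
area = np.array([8.0, 15.0, 20.0])      # ha
seller = np.array([60.0, 80.0, 90.0])   # %
theta = np.array([np.log(0.5), -0.03, 0.08, 0.03])  # [ln a, b, c, d]

y_adj = multiplicative_model(y_sim, labor, area, seller, np.exp(theta[0]), *theta[1:])
z, X = log_linear_design(y_adj, y_sim, labor, area, seller)
```

With noise-free inputs generated from Eq. (4), `X @ theta` reproduces `z`, which confirms that the two forms describe the same relationship.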

Table 1 Posterior means, standard deviations (SD), and 95%-probability intervals (2.5–97.5%) of the parameter values

Variable      Mean      SD        2.5%      97.5%
ln a         –0.7805    0.1129   –1.0031   –0.5562
b (Labor)    –0.0288    0.0096   –0.0475   –0.0099
c (Area)      0.0815    0.0095    0.0628    0.1003
d (Seller)    0.0302    0.0181   –0.0058    0.0657
σ²            0.0159    0.0012    0.0138    0.0184

The relationship between the local areal features and the local yield could vary due to other factors not accounted for here. To account for such uncertainty, we used Bayesian inference, which estimates each parameter value as a probability distribution. The posterior probability density functions (PDFs) of the parameter values were estimated on the basis of the six odd-year data using the Gibbs sampler algorithm (Geman and Geman 1984) in the statistical software R ver. 4.1 (R Development Core Team 2009).
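For readers who want to see the mechanics, a Gibbs sampler for a normal linear regression of the form of Eq. (5) can be sketched as follows. This is an illustration in Python with NumPy rather than the R code used in the study, and it rests on assumptions that the text does not state: a non-informative prior p(β, σ²) ∝ 1/σ², synthetic data, and shorter chains than the 15,000 iterations described in the text. The function names are hypothetical. A Gelman-Rubin type diagnostic is included because the study uses that statistic to assess convergence.

```python
import numpy as np


def gibbs_linear_regression(X, z, n_iter=3000, burn_in=1000, n_chains=3, seed=0):
    """Gibbs sampler for z = X @ beta + eps, eps ~ N(0, sigma2).

    Assumed prior (not taken from the paper): p(beta, sigma2) proportional to 1/sigma2.
    Returns an array of shape (n_chains, n_iter - burn_in, p + 1); last column is sigma2.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ z           # least-squares estimate
    chol = np.linalg.cholesky(xtx_inv)
    chains = np.empty((n_chains, n_iter - burn_in, p + 1))
    for c in range(n_chains):
        sigma2 = rng.uniform(0.1, 10.0)    # dispersed starting value per chain
        for t in range(n_iter):
            # beta | sigma2, z ~ N(beta_hat, sigma2 * (X'X)^-1)
            beta = beta_hat + np.sqrt(sigma2) * (chol @ rng.standard_normal(p))
            # sigma2 | beta, z ~ Inverse-Gamma(n / 2, SSR / 2)
            ssr = np.sum((z - X @ beta) ** 2)
            sigma2 = 1.0 / rng.gamma(shape=n / 2.0, scale=2.0 / ssr)
            if t >= burn_in:
                chains[c, t - burn_in, :p] = beta
                chains[c, t - burn_in, p] = sigma2
    return chains


def gelman_rubin(chains):
    """Potential scale reduction factor per parameter (values near 1 suggest convergence)."""
    m, n, _ = chains.shape
    within = chains.var(axis=1, ddof=1).mean(axis=0)
    between = n * chains.mean(axis=1).var(axis=0, ddof=1)
    pooled = (n - 1) / n * within + between / n
    return np.sqrt(pooled / within)


# Synthetic data with known parameters (hypothetical, for checking the sampler only).
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 3))])
true_beta = np.array([-0.8, -0.03, 0.08, 0.03])
z = X @ true_beta + rng.normal(scale=0.1, size=200)

chains = gibbs_linear_regression(X, z)
samples = chains.reshape(-1, chains.shape[-1])   # pool the chains after burn-in
posterior_mean = samples.mean(axis=0)
interval_95 = np.percentile(samples, [2.5, 97.5], axis=0)
r_hat = gelman_rubin(chains)
```

The pooled samples give posterior means, standard deviations, and 2.5 and 97.5 percentiles in the same layout as Table 1; the actual priors and sampler settings of the study may differ from those assumed here.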

We used the Gelman-Rubin (G-R) statistic (Gelman and Rubin 1992) to assess the convergence (see Iizumi et al. 2009 for details). After iterating 15,000 times for each of three chains, we discarded the first 5,000 samples as the burn-in period and computed the posterior PDFs of the parameter values from the remaining 30,000 samples, consisting of 3 chains×(15,000–5,000) samples (Table 1). The obtained G-R values were one for all parameters, indicating the convergence of all parameter values to their respective stationary distributions. Parameter values smaller (larger) than one indicate the downward (upward) adjustment of simulated local yields. The parameter value of Area was comparatively larger than those of Seller and Labor. However, the 97.5-percentile value of Seller was larger than the 2.5-percentile value of Area; this suggests that Seller provided more important information than Area in a few cases.

Potential predictability: The PRYSBI model with local areal features

Finally, we assessed the improvement of potential predictability when incorporating the local areal features in the simulated local yield by means of the multiplicative model approach. To do so, first, we adjusted the simulated local yield during the 11 years based on the posterior means of the parameter values and obtained Fig. 6. We then compared the two types of simulated local yields (with or without adjustment). Even though the validation period included the independent five even-year data, the calculated coefficient of determination (r²) and RMSE between the simulated local yield with adjustment and the corresponding observation improved from r²=0.430 to r²=0.527 and from RMSE=2.0 Mg ha–1 to RMSE=0.4 Mg ha–1, respectively, compared to the simulated local yield without adjustment. Thus, the incorporation of local areal features in the output of the PRYSBI model has the potential to improve the potential predictability of local yield variation.

Fig. 6 Scatter plot between simulated local yield with adjustment (red) and without adjustment (gray) against the corresponding observations during the 11 years

Concluding Remarks

This study assessed the potential predictability of local yield variation by using a dynamic crop simulation model and some local areal features on agriculture. The obtained results showed that the potential predictability of local yield could improve by incorporating the local areal features in the output of the crop model, compared to the crop model alone.

The local areal features are easy to access because they are available from published agricultural censuses; however, the relationship between the local areal features and practical cultivation management needs to be addressed. In addition, there is room for considering agricultural policy, technical options, and market price in the prediction. Knowledge from farm management studies would help to solve these problems. We believe that interdisciplinary work bridging crop model simulation and farm management study is novel and has the potential to make the prediction more realistic.

Acknowledgement

We acknowledge the National Agricultural Insurance Association and the local organization of agricultural insurance for providing the agricultural insurance records. This study was supported by the Global Environmental Research Fund (S-4 and S-5-3) of the Ministry of the Environment, Japan.


References

Baigorria, G. A., J. W. Jones and J. J. O’Brien (2008) Potential predictability of crop yield using an ensemble climate forecast by a regional circulation model, Agric. For. Meteorol., 148: 1353–1361.

Challinor, A. J., T. R. Wheeler, J. M. Slingo, et al. (2005) Simulation of crop yields using ERA-40: Limits to skill and nonstationarity in weather yield relationships, J. Appl. Meteorol., 44: 516–531.

Gelman, A. and D. B. Rubin (1992) Inference from iterative simulation using multiple sequences, Statistical Sci., 7: 457–511.

Geman, S. and D. Geman (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, Trans. Pattern Anal. Machine Intelligence, 6: 721–741.

Geographical Survey Institute (GSI) (1998) ‘User’s Guide for Numerical Map: Revised 2nd Edition’, Japan Map Center, Japan, 500 pp. (in Japanese).

Hansen, J. W., A. Potgieter and M. K. Tippett (2004) Using a general circulation model to forecast regional wheat yields in northeast Australia, Agric. For. Meteorol., 127: 77–92.

Horie, T., H. Nakagawa, M. Ohnishi, et al. (1995) Rice production in Japan under current and future climates, ed. Matthews, R. B., M. J. Kropff and D. Bachelet, ‘Modeling the Impact of Climate Change on Rice Production in Asia’, IRRI and CAB International, UK, 143–164.

Iizumi, T., K. Ishida, S. Hirako, et al. (2007) Influence of rural socioeconomic characteristics on rice yield damage: A case study using GIS in Motegi-cho and Ichikai-cho, Tochigi, J. Japan Agric. Syst. Soc., 23: 273–282.

Iizumi, T., K. Ishida, S. Hirako, et al. (2008) Resistance to cool-summer damage resulting from level of cultivation practices represented by farm household characteristics, J. Japan Agric. Syst. Soc., 24: 103–112.

Iizumi, T., M. Yokozawa and M. Nishimori (2009) Parameter estimation and uncertainty analysis of a large-scale crop model for paddy rice: Application of a Bayesian approach, Agric. For. Meteorol., 149: 333–348.

R Development Core Team (2009) R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Austria.

Seino, H. (1993) An estimation of distribution of meteorological elements using GIS and AMeDAS data, J. Agric. Meteorol., 48: 379–383.

World Data Center for Greenhouse Gases (WDCGG) (2009) WMO Global Atmosphere Watch, , browsed on Jun. 23, 2009.

Received July 17, 2009
Accepted January 25, 2010
Agro-informatics & Technology