Pedosphere 23(4): 417–421, 2013
ISSN 1002-0160/CN 32-1315/P
© 2013 Soil Science Society of China
Published by Elsevier B.V. and Science Press
Application of Visible/Near-Infrared Spectra in Modeling of Soil Total Phosphorus∗1
HU Xue-Yu1,2,∗2
1 School of Environmental Studies, China University of Geosciences, Wuhan 430074 (China)
2 Soil and Water Science Department, University of Florida, Gainesville, FL 32611 (USA)
(Received September 5, 2012; revised March 22, 2013)
ABSTRACT
Overabundance of phosphorus (P) in soils and water is of great concern and has received much attention in Florida, USA. Therefore, it is essential to analyze and predict the distribution of P in soils across large areas. This study was undertaken to model the variation of soil total phosphorus (TP) in Florida. A total of 448 soil samples were collected from different soil types. The samples were analyzed by a chemical reference method and scanned in the visible/near-infrared (VNIR) region of 350–2 500 nm. A partial least squares regression (PLSR) calibration model was developed between the chemical reference values and the VNIR spectra. The coefficient of determination ($R^2$) and the root mean square error (RMSE) of the calibration and validation sets, and the residual prediction deviation (RPD), were used to evaluate the model. The $R^2$ values in calibration and validation for log-transformed TP (log TP) were 0.69 and 0.65, respectively, indicating that the VNIR calibration obtained in this study accounted for at least 65% of the variance in log TP using only VNIR spectra. The high RPD of 2.82 suggested that the spectral model derived in this study was suitable and robust for predicting TP in a wide range of soil types representative of Florida soil conditions.
Key Words: Florida, partial least squares regression, prediction, spectral model, visible/near-infrared spectroscopy
Citation: Hu, X. Y. 2013. Application of visible/near-infrared spectra in modeling of soil total phosphorus. Pedosphere. 23(4): 417–421.
∗1 Supported by the National Natural Science Foundation of China (No. 41071159) and the Cooperative Ecosystem Studies Unit-Natural Resources Conservation Service (NRCS), USA.
∗2 Corresponding author. E-mail: [email protected].

Soil phosphorus (P) exists in inorganic and organic forms, and each form is a continuum of many P compounds existing in equilibrium with each other and ranging from solution P (taken up by plants and soil organisms) to very stable P in immobilized forms (Olsen and Sommers, 1982; White, 1997). To raise the concentration of solution P for high crop yields, P fertilizer is applied at high rates. In agricultural land, P enrichment in surface soils is likely related to human activities such as PO$_4^{3-}$ fertilizer application (Chen and Ma, 2001). Arable lands are the main diffuse source of P pollution to surface-water bodies (Sharpley et al., 1992; Ulén and Fölster, 2007), which can accelerate eutrophication in the lakes and streams surrounding these areas. Therefore, cost-effective assessment and monitoring of soil P concentrations is critical for sustainable land management and environmental conservation.

In recent years, visible/near-infrared diffuse reflectance spectroscopy (VNIRS) has been used to estimate soil properties (McCarty et al., 2002; Viscarra Rossel et al., 2006; Vasques et al., 2008). Models of soil carbon obtained using VNIRS often predict soil carbon with high accuracy, explaining more than 80% of its variability (Chang and Laird, 2002). Vasques et al. (2008) investigated the feasibility of using VNIRS to determine the concentration of carbon in soils collected in the Santa Fe River Watershed, Florida, and demonstrated that the VNIRS models were robust and stable enough to be applied to similar soils. Existing studies have shown that VNIRS can provide rapid and economical information about soil physical and chemical properties, e.g., moisture, carbon, nitrogen, phosphorus, calcium, and cation exchange capacity (Ben-Dor and Banin, 1995; Chang et al., 2001; Reeves and McCarty, 2001; Odlare et al., 2005; Cozzolino and Morón, 2006; Maleki et al., 2006; Wetterlind et al., 2008; Mouazen et al., 2009). However, many of these investigations involved a limited number of samples, or samples from a limited number of sites with a similar range of soils. A robust model obtained using VNIRS should be capable of predicting soil P values across large regions with a range of soil types where variations in the inorganic and organic components of soils are present (Dunn et al., 2002). Thus, the objective of the present work was to explore the use of VNIRS to predict soil total phosphorus in a wide range of soils in Florida in the southeastern United States.

MATERIALS AND METHODS

Soil samples

Soils used in this study had been sampled and characterized as a part of the Florida Cooperative Soil Survey Program conducted jointly by the University of Florida Soil and Water Science Department and the United States Department of Agriculture Natural Resources Conservation Service from the mid-1960s until the 1990s. A total of 448 surface soil samples (genetic horizons A, A1, Ap, O, O1, and Op) were selected from a pool of 8 272 archived samples to ensure taxonomic and geographic representation. Sampling depth varied from the top 1 to 76 cm, with a mean of 16 cm. The soil samples proportionally represent all soil orders that occur in Florida. Twenty-eight percent of the samples occurred in Spodosols, 22% in Entisols, 19% in Ultisols, 14% in Alfisols, 10% in Histosols, 4% in Mollisols, and 3% in Inceptisols. Geographical representation of the samples was achieved via an adequate scattering of samples throughout the state and a representation of the most extensively occurring soil series mapped within each county. These soils were from 51 counties that covered about 80% of the total land area of Florida (Chen et al., 1999; Chen and Ma, 2001).

Chemical reference analysis of phosphorus

Basic sample preparation consisted of air-drying, ball-milling and sieving through a 0.25-mm mesh. Soil samples were digested using the United States Environmental Protection Agency Method 3052 (USEPA, 1995). Approximately 1.000 g of soil was weighed into a 120-mL Teflon pressure digestion vessel, and 9 mL of concentrated HNO$_3$, 4 mL of concentrated HF, and 1 mL of concentrated HCl were added.
Samples and reagents were well mixed, sealed, and digested in a CEM-2000 digestion microwave oven (CEM Corp., Matthews, NC) for 20 min at 0.83 × 10$^6$ Pa (120 psi). After cooling, 2 g of H$_3$BO$_3$ were added to the digested solution to neutralize excess HF. The samples were then filtered through Whatman #42 filters (Whatman International Ltd., Maidstone, UK) and diluted to 100 mL with deionized distilled water. Total phosphorus (TP) was determined on an inductively coupled plasma spectrophotometer (Thermo
Jarrell Ash ICAP 61-E, Norwalk, CT). The method detection limit for the P determination was 10 mg P kg$^{-1}$ soil (Chen and Ma, 2001). The laboratory TP measurements were positively skewed, so a logarithmic transformation was applied to TP to normalize the distribution before running the regression. Thus, the VNIRS model was developed based on log-transformed values that approximate a Gaussian distribution.

Measurement of spectra in the visible/near-infrared range

Soil samples were dried for 12 h at 45 °C and then placed in a glass chamber. They were scanned for reflectance with a QualitySpec Pro spectroradiometer (Analytical Spectral Devices Inc., Boulder, CO), which measures reflectance in the wavelength range of 350–2 500 nm at 1-nm intervals. Each soil sample was scanned four times, with the sample rotated by 90° between scans. A reference spectrum using Spectralon (LabSphere, North Sutton, NH) was collected prior to the first scan and after every 25 samples (100 scans). An average spectral curve was calculated from the four scans of each soil sample.

Pre-processing of soil spectra

The collected soil spectral curves were composed of 2 151 reflectance measurements (bands) for each sample. Typical soil reflectance curves in the VNIR region for various soil types are shown in Fig. 1. The average soil spectral curves, obtained from the four rotations, were smoothed across a moving window of 9 nm using the Savitzky-Golay algorithm (Savitzky and Golay, 1964) to reduce random noise, and then averaged (pooled) across 10-nm intervals to reduce the dimensionality of the data and to match the spectral resolution of the spectroradiometer in the near-infrared region. This reduced the soil spectra to 214 reflectance values. Finally, a Savitzky-Golay first derivative using a first-order polynomial across a 9-band window was applied (Vasques et al., 2008). Pre-processing of soil spectra was implemented in the Unscrambler 9.5 software (CAMO Technologies Inc., Woodbridge, NJ).

Fig. 1 Typical soil reflectance curves in the wavelength range of 350–2 500 nm for various soil types.

Development of the visible/near-infrared spectroscopy model

The 448 soil samples were split randomly into 302 samples for calibration (model development) and 146 samples for validation. Partial least squares regression (PLSR) was performed on the calibration set using the orthogonalized PLSR algorithm for one Y-variable (PLS-1) and full cross-validation. Compared with other multivariate calibration techniques such as stepwise multiple linear regression (SMLR), principal components regression (PCR), regression tree and committee trees, PLSR performed best when modeling soil carbon using visible/near-infrared spectra (Vasques et al., 2008). PLSR reduces the data, noise, and computation time, with minor loss of the information contained in the original variables. In this study, PLSR was conducted in the Unscrambler 9.5 (CAMO Technologies Inc., Woodbridge, NJ) to relate the soil spectral data to the chemical reference data of soil TP using the calibration set. Cross-validation was used to avoid overfitting of the calibration model. The VNIRS model was validated on the independent validation set using the coefficient of determination ($R^2$; Eq. 1) and the residual prediction deviation (RPD; Eq. 3). The root mean square error (RMSE; Eq. 2) was also provided.

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (1)$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \qquad (2)$$

$$\mathrm{RPD} = \mathrm{SD}_v \Big/ \left(\mathrm{RMSE}_v \sqrt{n/(n-1)}\right) \qquad (3)$$
where $\hat{y}_i$ is the predicted value; $\bar{y}$ is the mean of the observed values; $y_i$ is the observed value; $n$ is the number of predicted/observed values with $i = 1, 2, \cdots, n$; SD$_v$ is the standard deviation of the validation set; and RMSE$_v$ is the root mean square error of validation.

RESULTS AND DISCUSSION

Descriptive statistics

The descriptive statistics of measured soil TP for the whole dataset, the calibration set, and the validation set are shown in Table I. Considering the whole dataset, TP showed a positively skewed distribution, with a mean of 214.18 mg kg$^{-1}$, a median of 62.50 mg kg$^{-1}$, and values ranging from 0.01 to 2 870.00 mg kg$^{-1}$. The results showed that enough variation existed in the dataset to develop a calibration model. The highest TP values, based on the mean, occurred in Histosols, and mean TP decreased by soil order as follows: Histosols > Inceptisols, Mollisols > Ultisols > Entisols, Alfisols > Spodosols. The high TP in Histosols was possibly because the Histosols had the greatest contents of organic C and total Fe and Al, which contribute to P adsorption and fixation. Anthropogenic inputs of P in Histosols were also evident, which led to a higher total P in the Histosols than in the major Florida mineral soils (Entisols, Spodosols, Ultisols) (Chen and Ma, 2001).

TABLE I
Descriptive statistics of measured soil total phosphorus (TP)
TP values are in mg kg$^{-1}$; log-transformed TP values are in log (mg kg$^{-1}$).

| Statistics | TP: whole set | TP: calibration | TP: validation | Log TP: whole set | Log TP: calibration | Log TP: validation |
|---|---|---|---|---|---|---|
| No. of observations | 448 | 302 | 146 | 448 | 302 | 146 |
| Mean | 214.18 | 199.09 | 244.34 | 1.82 | 1.83 | 1.78 |
| Standard error of mean | 17.40 | 19.52 | 34.60 | 0.04 | 0.04 | 0.08 |
| Median | 62.50 | 70.00 | 60.00 | 1.80 | 1.84 | 1.78 |
| Standard deviation | 364.05 | 333.52 | 418.13 | 0.90 | 0.82 | 1.04 |
| Coefficient of variation | 169.98 | 167.52 | 171.12 | 49.77 | 45.16 | 58.45 |
| Skewness | 3.62 | 4.10 | 3.00 | −2.16 | −2.14 | −2.08 |
| Kurtosis | 16.80 | 23.12 | 10.23 | 7.76 | 8.83 | 6.04 |
| Range | 2 869.99 | 2 869.99 | 2 609.99 | 5.46 | 5.46 | 5.42 |
| Minimum | 0.01 | 0.01 | 0.01 | −2.00 | −2.00 | −2.00 |
| Maximum | 2 870.00 | 2 870.00 | 2 610.00 | 3.46 | 3.46 | 3.42 |

Log-transformed TP (log TP) had an approximately Gaussian distribution. The minimum and maximum values of log TP were −2.00 and 3.46 log (mg kg$^{-1}$), respectively, and the mean and median were 1.82 and 1.80 log (mg kg$^{-1}$), respectively. The mean log TP of the calibration and validation sets did not differ significantly according to Student's t test (P = 0.30) at a 0.05 significance level. This similarity indicated that the randomly separated validation samples appropriately represented the population under study.
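The comparison of the calibration and validation sets can be illustrated with a minimal sketch that recomputes a two-sample t statistic from the rounded log TP summary statistics in Table I. Several choices here are assumptions rather than details given in the paper: the pooled-variance (equal-variance) form of Student's t test, a two-sided alternative, and a standard normal approximation for the p-value (reasonable only because the degrees of freedom are large). The helper name `pooled_t_from_stats` is hypothetical. Because the inputs are rounded and the exact test settings are not reported, the result need not reproduce the reported P = 0.30; it only shows that the difference in means is small relative to its standard error.

```python
import math
from statistics import NormalDist


def pooled_t_from_stats(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Student's t statistic (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Rounded log TP statistics from Table I: (mean, standard deviation, n).
calibration = (1.83, 0.82, 302)
validation = (1.78, 1.04, 146)

t_stat, df = pooled_t_from_stats(*calibration, *validation)

# Assumption: with several hundred degrees of freedom, the t distribution is
# approximated here by the standard normal distribution.
p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.3f}, df = {df}, approximate two-sided p = {p_two_sided:.3f}")
```

With exact sample values instead of rounded summaries, an exact t distribution (for example from a statistics library) would be the more appropriate choice for the p-value.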
Visible/near-infrared spectroscopy model of soil total phosphorus

Compared with stepwise multiple linear regression (SMLR), PLSR is a robust statistical method that uses all the available reflectance data to build the models (Vasques et al., 2008). In a preliminary test, SMLR was used to calibrate the VNIRS model of TP, but it yielded an $R^2$ of only 0.39 in calibration. In comparison, PLSR produced much better results (Table II).

TABLE II
The spectral model a) of log-transformed soil total phosphorus

| n | Calibration $R_c^2$ | Calibration RMSE$_c$ (log (mg kg$^{-1}$)) | Validation $R_v^2$ | Validation RMSE$_v$ (log (mg kg$^{-1}$)) | RPD |
|---|---|---|---|---|---|
| 9 | 0.69 | 0.32 | 0.65 | 0.37 | 2.82 |

a) n = number of partial least squares factors used in the model; $R_c^2$ = coefficient of determination of calibration; RMSE$_c$ = root mean square error of calibration; $R_v^2$ = coefficient of determination of validation; RMSE$_v$ = root mean square error of validation; RPD = residual prediction deviation.

The $R^2$ values in calibration and validation for log TP indicated that the VNIRS calibration obtained in this study accounted for at least 65% of the variance in log TP using only VNIR spectra. In agricultural and environmental applications, while much effort has been applied to the development of calibrations, no critical levels of RPD have been set for near-infrared reflectance spectroscopy (NIRS) analysis of soils (Dunn et al., 2002). Chang et al. (2001) grouped the ability of NIRS to predict soil properties into three categories based on RPD values: < 1.4, 1.4–2.0, and > 2.0. RPD values above 2.0 were considered to indicate stable and accurate predictive models; RPD values between 1.4 and 2.0 indicated fair models that could be improved; and RPD values below 1.4 indicated poor predictive capacity. Dunn et al. (2002) suggested that when using NIRS for the analysis of agricultural soils, suitable limits for RPD may be: < 1.6 (poor); 1.6–2.0 (acceptable); > 2.0 (excellent). In the present study, the RPD obtained using VNIRS was 2.82, which was better than those reported by other authors for the measurement of P concentration in soils (Confalonieri et al., 2001; Viscarra Rossel et al., 2006; Morón and Cozzolino, 2007; Chen et al., 2008; Mouazen et al., 2009), and comparable in accuracy to the model developed by Malley et al. (1999) using NIRS. The calibration results obtained in this study suggested that the VNIRS model for soil TP was reliable and robust.

Prediction of soil total phosphorus using VNIRS model

Fig. 2 illustrates the relationship between VNIRS optical data and chemical reference values for log TP in soil samples. The more accurate the predictive model, the more closely all points cluster near the theoretical 1:1 (solid line) correspondence. Student's t test indicated that there was no significant difference at a 0.05 significance level between predicted and measured values of log TP (P = 0.3583), and the VNIRS-predicted values were highly correlated with the measured values obtained by the chemical reference method (P = 0.001).

Fig. 2 Predicted versus measured log-transformed TP (log TP) by the partial least squares regression calibration model.

CONCLUSIONS

Visible/near-infrared spectroscopy using partial least squares regression (PLSR) offered a rapid approach to predict total phosphorus in Florida soils with reasonable accuracy ($R_v^2$ = 0.65). The methodology used in this study has been applied to other soil properties and in areas other than Florida. Given that soil total phosphorus is costly and laborious to measure, VNIRS models can improve the cost-efficiency of estimating soil total phosphorus. These results showed the potential of VNIRS as a method for the routine determination of soil total phosphorus. Thus, the VNIRS technique may be useful for monitoring changes in phosphorus input and output in the context of land use change.

ACKNOWLEDGEMENTS

I would like to thank those who participated in the Florida Cooperative Soil Survey for their collection and characterization of the large number of Florida soil samples used in this study. Thanks also to Drs. G. M. Vasques, L. Q. Ma and S. Grunwald in the Soil and Water Science Department of University of Florida and to Dr. Chen Ming in the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences for their helpful comments and suggestions as well as kind assistance during the study and manuscript preparation.

REFERENCES

Ben-Dor, E. and Banin, A. 1995. Near-infrared analysis as a rapid method to simultaneously evaluate several soil properties. Soil Sci. Soc. Am. J. 59: 364–372.
Chang, C. W. and Laird, D. A. 2002. Near-infrared reflectance spectroscopic analysis of soil C and N. Soil Sci. 167: 110–116.
Chang, C. W., Laird, D. A., Mausbach, M. J. and Hurburgh, C. R. 2001. Near-infrared reflectance spectroscopy-principal components regression analyses of soil properties. Soil Sci. Soc. Am. J. 65: 480–490.
Chen, M. and Ma, L. Q. 2001. Taxonomic and geographic distribution of total phosphorus in Florida surface soils. Soil Sci. Soc. Am. J. 65: 1539–1547.
Chen, M., Ma, L. Q. and Harris, W. G. 1999. Baseline concentrations of 15 trace elements in Florida surface soils. J. Environ. Qual. 28: 1173–1181.
Chen, P. F., Liu, L. Y., Wang, J. H., Shen, T., Lu, A. X. and Zhao, C. J. 2008. Real-time analysis of soil N and P with near infrared diffuse reflectance spectroscopy. Spectrosc. Spect. Anal. 28: 295–298.
Confalonieri, M., Fornasier, F., Ursino, A., Boccardi, F., Pintus, B. and Odoardi, M. 2001. The potential of near infrared reflectance spectroscopy as a tool for the chemical characterization of agricultural soils. J. Near Infrared Spec. 9: 123–131.
Cozzolino, D. and Morón, A. 2006. Potential of near-infrared reflectance spectroscopy and chemometrics to predict soil organic carbon fractions. Soil Till. Res. 85: 78–85.
Dunn, B. W., Beecher, H. G., Batten, G. D. and Ciavarella, S. 2002. The potential of near-infrared reflectance spectroscopy for soil analysis: a case study from the Riverine plain of south-eastern Australia. Aust. J. Exp. Agr. 42: 607–614.
McCarty, G. W., Reeves III, J. B., Reeves, V. B., Follett, R. F. and Kimble, J. M. 2002. Mid-infrared and near-infrared diffuse reflectance spectroscopy for soil carbon measurement. Soil Sci. Soc. Am. J. 66: 640–646.
Maleki, M. R., Van Holm, L., Ramon, H., Merckx, R., De Baerdemaeker, J. and Mouazen, A. M. 2006. Phosphorus sensing for fresh soils using visible and near infrared spectroscopy. Biosystems Eng. 95: 425–436.
Malley, D. F., Yesmin, L., Wray, D. and Edwards, S. 1999. Application of near-infrared spectroscopy in analysis of soil mineral nutrients. Commun. Soil Sci. Plan. 30: 999–1012.
Morón, A. and Cozzolino, D. 2007. Measurement of phosphorus in soil by near infrared reflectance spectroscopy: Effect of reference method on calibration. Commun. Soil Sci. Plan. 38: 1965–1974.
Mouazen, A. M., Maleki, M. R., Cockx, L., Van Meirvenne, M., Van Holm, L. H. J., Merckx, R., De Baerdemaeker, J. and Ramon, H. 2009. Optimum three-point linkage set up for improving the quality of soil spectra and the accuracy of soil phosphorus measured using an on-line visible and near infrared sensor. Soil Till. Res. 103: 144–152.
Odlare, M., Svensson, K. and Pell, M. 2005. Near infrared reflectance spectroscopy for assessment of spatial soil variation in an agricultural field. Geoderma. 126: 193–202.
Olsen, S. R. and Sommers, L. E. 1982. Phosphorus. In Page, A. L. et al. (eds.) Methods of Soil Analysis. Part 2. Chemical and Microbiological Properties. 2nd Edition. ASA and SSSA, Madison, WI. pp. 403–430.
Reeves, J. B. and McCarty, G. W. 2001. Quantitative analysis of agricultural soils using near infrared reflectance spectroscopy and a fibre-optic probe. J. Near Infrared Spec. 9: 25–34.
Savitzky, A. and Golay, M. J. E. 1964. Smoothing and differentiation of data by simplified least-squares procedures. Anal. Chem. 36: 1627–1639.
Sharpley, A. N., Smith, S. J., Jones, O. R., Berg, W. A. and Coleman, G. A. 1992. The transport of bioavailable phosphorus in agricultural runoff. J. Environ. Qual. 21: 30–45.
Ulén, B. and Fölster, J. 2007. Recent trends in nutrient concentrations in Swedish agricultural rivers. Sci. Total Environ. 373: 473–487.
USEPA. 1995. Test Methods for Evaluating Solid Waste. Vol. IC: Laboratory Manual Physical/Chemical Methods, SW-846, 3rd Edition. U.S. Gov. Print. Office, Washington, DC.
Vasques, G. M., Grunwald, S. and Sickman, J. Q. 2008. Comparison of multivariate methods for inferential modeling of soil carbon using visible/near-infrared spectra. Geoderma. 146: 14–25.
Viscarra Rossel, R. A., Walvoort, D. J. J., McBratney, A. B., Janik, L. J. and Skjemstad, J. O. 2006. Visible, near infrared, mid infrared or combined diffuse reflectance spectroscopy for simultaneous assessment of various soil properties. Geoderma. 131: 59–75.
Wetterlind, J., Stenberg, B. and Jonsson, A. 2008. Near infrared reflectance spectroscopy compared with soil clay and organic matter content for estimating within-field variation in N uptake in cereals. Plant Soil. 302: 317–327.
White, R. E. 1997. Principles and Practices of Soil Science: The Soil as a Natural Resource. Blackwell Publishing, Oxford.