
Publisher: Taylor & Francis. Informa Ltd, registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Remote Sensing
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tres20

Random forest regression and spectral band selection for estimating sugarcane leaf nitrogen concentration using EO-1 Hyperion hyperspectral data

Elfatih M. Abdel-Rahman a,b, Fethi B. Ahmed a & Riyad Ismail a

a School of Environmental Sciences, Howard College Campus, University of KwaZulu-Natal, Durban, 4041, South Africa
b Department of Agronomy, Faculty of Agriculture, University of Khartoum, Khartoum North, 13314, Sudan

To cite this article: Elfatih M. Abdel-Rahman, Fethi B. Ahmed & Riyad Ismail (2013): Random forest regression and spectral band selection for estimating sugarcane leaf nitrogen concentration using EO-1 Hyperion hyperspectral data, International Journal of Remote Sensing, 34:2, 712-728

To link to this article: http://dx.doi.org/10.1080/01431161.2012.713142


International Journal of Remote Sensing Vol. 34, No. 2, 20 January 2013, 712–728

Random forest regression and spectral band selection for estimating sugarcane leaf nitrogen concentration using EO-1 Hyperion hyperspectral data


Elfatih M. Abdel-Rahman a,b,*, Fethi B. Ahmed a, and Riyad Ismail a

a School of Environmental Sciences, Howard College Campus, University of KwaZulu-Natal, Durban 4041, South Africa; b Department of Agronomy, Faculty of Agriculture, University of Khartoum, Khartoum North 13314, Sudan

(Received 2 April 2010; accepted 4 December 2011)

Nitrogen (N) is one of the most important limiting nutrients for sugarcane production. Conventionally, sugarcane N concentration is determined using direct methods, such as collecting leaf samples in the field and analysing them in the laboratory. These methods are neither real-time, quick, nor non-destructive. Methods that take advantage of remote sensing, particularly hyperspectral data, can offer reliable techniques for predicting sugarcane leaf N concentration. However, hyperspectral data are voluminous and of high dimensionality, and many hyperspectral features are redundant because adjacent wavebands are strongly correlated. Hence, the analysis of hyperspectral data is complex and needs to be simplified by selecting the most relevant spectral features. The aim of this study was to explore the potential of a random forest (RF) regression algorithm for selecting the spectral features in hyperspectral data that are needed to predict sugarcane leaf N concentration. To achieve this, two Hyperion images were captured over fields of 6–7-month-old sugarcane, variety N19. The machine-learning RF algorithm was used as a feature-selection and regression method to analyse the spectral data. Stepwise multiple linear (SML) regression was also examined for predicting sugarcane leaf N concentration after the redundancy in the hyperspectral data had been reduced. The results showed that sugarcane leaf N concentration can be predicted using both non-linear RF regression (coefficient of determination, R² = 0.67; root mean square error of validation (RMSEV) = 0.15%; 8.44% of the mean) and SML regression models (R² = 0.71; RMSEV = 0.19%; 10.39% of the mean) derived from the first-order derivative of reflectance. It was concluded that the RF regression algorithm has potential for predicting sugarcane leaf N concentration using hyperspectral data.
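The abstract reports model quality through R², RMSEV, and RMSEV as a percentage of the mean, with models built on the first-order derivative of reflectance. The following is a minimal sketch of how such quantities are commonly computed, not the authors' implementation. The function names are hypothetical, the derivative is assumed to be a simple finite difference between adjacent bands, R² is assumed to be 1 minus the ratio of residual to total sum of squares (the paper may use the squared correlation instead), and "% of the mean" is assumed to mean RMSEV divided by the mean observed N concentration.

```python
import math


def first_derivative(reflectance, wavelengths):
    """Finite-difference first-order derivative of a reflectance spectrum.

    Assumption: the derivative is taken between adjacent bands; the paper's
    exact derivative formula is not given in the abstract.
    """
    return [
        (reflectance[i + 1] - reflectance[i]) / (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(reflectance) - 1)
    ]


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse_percent_of_mean(observed, predicted):
    """RMSE expressed as a percentage of the mean observed value."""
    mean_obs = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_obs


if __name__ == "__main__":
    # Hypothetical leaf N concentrations (%) for a tiny validation set.
    observed_n = [1.6, 1.8, 2.0]
    predicted_n = [1.7, 1.8, 1.9]
    print(r_squared(observed_n, predicted_n))
    print(rmse(observed_n, predicted_n))
    print(rmse_percent_of_mean(observed_n, predicted_n))
```

Expressing the error relative to the mean makes the RF and SML results comparable even when the two models are validated on samples with slightly different mean N concentrations.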

*Corresponding author. Email: [email protected]
ISSN 0143-1161 print/ISSN 1366-5901 online © 2013 Taylor & Francis

1. Introduction

Worldwide, nitrogen (N) is one of the most limiting nutrients for crop production (Blumenthal et al. 2001; Zhu et al. 2008; Fageria 2009). The growth of sugarcane, a semi-perennial crop, depends heavily on the application of N fertilizers. Applying an optimum amount of N results in higher sugarcane biomass production and higher N concentration in the plant tissues (Blumenthal et al. 2001). However, excessive N application may cause toxicity symptoms and increase susceptibility to certain pests (Atkinson and Nuss 1989; Rice, Gibert, and Lentini 2006). On the other hand, any loss of N from sugarcane farms contributes to the contamination of surface and underground water (Blumenthal et al. 2001). Field managers now face the challenge of striking a balance between the need to increase sugarcane yield and quality and the need to reduce the environmental impacts of excessive N applications (Ma, Ahuja, and Bruulsema 2009). There is, therefore, a need for accurate and quick field-wide N estimation that could support decisions on applying optimal amounts of N in the right place at the right time. Methods that take advantage of remote sensing, particularly the use of hyperspectral data, can provide non-destructive, cost-effective, and near-real-time monitoring routines for estimating sugarcane N concentration (Tarpley, Reddy, and Sassenrath-Cole 2000). The term “hyperspectral” refers to the remote-sensing technique that enables the capturing of spectral data in many narrow (width