International Journal of Applied Earth Observation and Geoinformation 11 (2009) 290–297
Estimating the parameters of forest inventory using machine learning and the reduction of remote sensing features

Tanel Tamm *, Kalle Remm 1

University of Tartu, Faculty of Science and Technology, Institute of Ecology and Earth Sciences, Department of Geography, Vanemuise 46, 51014 Tartu, Tartumaa, Estonia
ARTICLE INFO

Article history: Received 23 October 2007; accepted 19 March 2009

Keywords: Case-based reasoning; Machine learning; Local statistics; Remote sensing of forests

ABSTRACT

Locally computed statistics of image texture and a case-based reasoning (CBR) system were evaluated for mapping forest attributes. Cluster analysis was preferred to regression models as a method for pre-selecting features. The best stand-based accuracy using satellite sensor images was an RMSE of 74.64 m³ ha⁻¹ (36%) for stand volume and 1.98 m³ ha⁻¹ a⁻¹ (49%) for annual increase in stand volume, with κ = 0.23 for stand growth classes and κ = 0.41 for the dominant tree species in stands. The best pixel-based accuracy using orthophotos was an RMSE of 76.54 m³ ha⁻¹ (41%) for stand volume and 1.87 m³ ha⁻¹ a⁻¹ (44%) for annual increase in stand volume, with κ = 0.24 for stand growth classes and κ = 0.38 for the dominant tree species in stands. Mean saturation within a 30 m radius was the most useful feature when orthophotos were used, and the standard deviation of Landsat ETM+ band 6.2 values within an 80 m radius was the most useful when satellite sensor images were used. The most valuable feature components (radii, channels and local statistics) for orthophotos were the 30 m kernel radius, lightness and the mean of pixel values; for satellite sensor images they were the 80 m kernel radius, the near-infrared channel (ETM 4) and the mean of pixel values.

© 2009 Elsevier B.V. All rights reserved.
* Corresponding author. Tel.: +372 737 5827. E-mail addresses: [email protected] (T. Tamm), [email protected] (K. Remm).
1 Tel.: +372 737 5827.
0303-2434/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.jag.2009.03.006

1. Introduction

Geographical information systems (GIS) and various remote sensing methods are increasingly used to analyse data for the efficient and sustainable use of natural resources. The methods for estimating forest characteristics from image data can be divided into empirical (e.g. the k nearest neighbours, or k-NN, method) and physical (e.g. inversion of a canopy reflectance model) approaches (Gemmel, 1999). The physical approach in the Estonian remote sensing community is represented by a directional multispectral forest reflectance model developed by Kuusk and Nilson (2000). In addition, visual photo-interpretation methods have a long history and are still widely used (Congalton et al., 2002; Lonard et al., 2000). For instance, Kilpeläinen and Tokola (1998) evaluated the total mean volume of the growing stock using stand-level visual photo-interpretation, with a relative accuracy of 14–45% using orthophotos and 20–70% using satellite sensor images.

Stand volume and biomass are the forest characteristics that have most commonly been estimated using computational remote sensing methods. As the following examples show, the remote sensing approach is promising, but the accuracy of the estimations has usually not been very high. In terms of stand-level accuracy expressed as the relative RMSE (root mean square error) of stand volume estimation, Mäkelä and Pekkarinen (2004), using Landsat images, found an accuracy of 48% at stand-level and 60–80% at pixel-level, and using image segmentation, 48% at stand-level and 79% at pixel-level (Mäkelä and Pekkarinen, 2001). According to Labrecque et al. (2006), the accuracy of stand volume estimation at stand-level using the k-NN method was 51%; Katila and Tomppo (2001) found that it ranged from 64% to 295% depending on study area and tree species; Hyyppä et al. (2000) found 56% accuracy and Holmgren et al. (2000) 36% accuracy. Using Landsat images from three different dates, Franco-Lopez et al. (2001) found an accuracy of 83% (49 m³ ha⁻¹). Using a combination of field plots, edge detection from a Landsat image and kriging interpolation, Wallerman et al. (2002) found 41% accuracy. Using segmented 3 × 3 pixel windows, Hall et al. (2006) found 36% accuracy (70 m³ ha⁻¹), and Reese et al. (2003) found 33% for total wood volume and 23% for age estimations. When estimating above-ground biomass using photo-inventory maps, Luther et al. (2006) found an accuracy of 44.8%. Using aerial photographs, Tuominen and Pekkarinen (2005) reported an estimation accuracy (% RMSE) of 57.8%, Muinonen et al. (2001) 18–27%, Hyyppä et al. (2000) 46%, and Tuominen and Pekkarinen (2004) 66.1% using modified colour-infrared aerial photographs.
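Most of the accuracies quoted above are relative RMSE values. The following is a minimal sketch of how such a figure can be computed, assuming the common convention that the RMSE is divided by the mean of the reference values; the cited studies may differ in detail. The function names and the stand volumes are hypothetical and serve only as an illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed convention)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


# Hypothetical stand volumes in m^3 ha^-1 (not data from any cited study).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 320.0]

print(f"RMSE = {rmse(observed, predicted):.1f} m3/ha")
print(f"relative RMSE = {relative_rmse(observed, predicted):.1f} %")
```

Because the relative value depends on the mean of the reference data, two studies with the same absolute RMSE can report quite different percentages.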
Texture measures derived from image data can be divided into statistical ones (describing the statistical distribution of values) and structural ones (describing the spatial distribution of values) (Haralick, 1979). For instance, Coops and Culvenor (2000) used local variance as a statistical characteristic of an image in order to detect the spatial structure of forest stands. Locally calculated variograms and correlograms, used by Muinonen et al. (2001), and a texture index from the Fourier spectra expressing a gradient of coarseness vs. fineness (Couteron et al., 2005) are examples of structural parameters of texture. Both types of textural parameters describe the pattern of pixels within a local neighbourhood. The need to include neighbouring areas around the location has been stressed by Kilpeläinen and Tokola (1998) in prediction systems, by Tuominen and Pekkarinen (2005) in forest remote sensing, and by Remm and Luud (2003) in predictive species distribution mapping.

Plenty of field-estimated national forest inventory (NFI) data exist. Case-based reasoning (CBR) is a methodology especially well suited to situations where a large number of observations of a complex dependent variable exist and highly generalized predictive models are ineffective. The CBR methodology, also known as similarity-based reasoning, has been defined as a multidisciplinary science that is based on the use of former experiences at a minimal level of generalization (Aha, 1998). Overviews of CBR systems have been given, for example, by Aha (1998) and Remm (2004). To make a decision in a new situation, a CBR system finds the most similar examples (cases) and bases the prediction on them. The assumption that similar conditions lead to similar results has been formalized by Hüllermeier (2001). Wilson and Martinez (2000) provide an overview of similarity functions.
k-NN estimation of forest parameters, using remote sensing data and the similarity between plots, has been applied in the Finnish national forest inventory since the 1990s (Tomppo et al., 2002; Tomppo, 1991). k-NN is one of the methods of similarity-based reasoning. The CBR approach differs from simple k-NN in terms of the iterative optimization of weights for features and observations, also called lazy learning (Aha, 1998). An overview of machine learning methods has been given, for example, by Mitchell (1997).

In this study, the CBR methodology, NFI data from the State Forest Management Centre, and image and map data were used to estimate the stand properties that are most relevant for forest management. The following goals were set for the study: (1) to evaluate the estimation of the parameters of the NFI stands, namely the dominant tree species, maturity classes, mean annual increment (m³ ha⁻¹ a⁻¹), and stand volume in the primary layer (m³ ha⁻¹), using a CBR system with a Landsat 7 ETM+ satellite sensor image, chromatic orthophotos, and 1:10 000 base and soil maps; (2) to determine and compare the most indicative features and their components (radii, channels, and local statistics); (3) to evaluate the accuracy of forest maps produced with the CBR system using a satellite sensor image and base and soil maps.

2. Material

In the study area, which is situated in the northern part of Estonia (Fig. 1), 1846 randomly located sample points were generated on orthophotos (gap between points >100 m) and 969 points on the satellite sensor image (gap between points >250 m). These sample points were spatially joined to NFI data gathered by forest inventory experts during the summers of 2001 and 2002. The parameters of the NFI stands that were used in the estimation were: the dominant tree species according to tree stem volume in the primary layer, maturity classes (seedling stand, underbrush, superior underbrush, middle-aged, maturing and mature forest), mean annual increment of the stand volume computed from models (m³ ha⁻¹ a⁻¹), and stand volume computed from stand height and completeness or basal area or the number of trees in the primary layer (m³ ha⁻¹).

The chromatic orthophotos (true colour images) used in the study had a 1 m spatial resolution and were acquired between June and July 2002; they were not obtained with film sensitive to the near-infrared portion of the spectrum. The Landsat 7 ETM+ satellite sensor image (frame 187–018, from 6 July 2001) was received from the project “Image 2000 – the Spatial Reference for Europe”.

Fig. 1. Location of the study area.

3. Methods

3.1. Explanatory variables

Explanatory variables used in case-based machine learning are called features. A feature in this investigation consists of three components: a radius, a channel and a local statistic. The hue (H), saturation (S), and lightness (L) of the orthophotos were computed from the values of the red (R), green (G), and blue (B) channels. The local statistics used were the locally calculated average (a), the proportion over the average (poa), the standard deviation (stdv), the coefficient of variation (cvar) and the mode (m) as statistical texture indicators, and the autocorrelation index Moran’s I (acorr) and Moran’s I weighted with the reciprocal of distance (wMI) as structural texture indicators.

Only pixels within an NFI stand and within a given radius (10, 20, 30, 40, and 50 m) were used when calculating local statistics from aerial photos. The use of a pre-classifier mask permitted the calculation of local statistics only within the NFI stand into which a random point fell. The boundaries of NFI stands were not used as a pre-classifier in the case of the satellite sensor image. Instead, a mask of the forest and young forest classes from the 1:10 000 base map of Estonia was used. Local characteristics of the satellite sensor image were calculated within radii of 40, 80, and 120 m.

3.2. Machine learning

A computer program for machine learning, MLNN (Remm, 2004), and a program for calculating local statistics, LSTATS (Remm, 2005), were used in this study. The machine learning software can be used for learning the set of weights of features and exemplars (called the best predictive weights in this paper) that gives the most accurate similarity-based (case-based) predictions. An exemplar is an
observation or an observation site that is selected from training data to be used in a similarity-based estimation of a dependent variable. The software can be used to predict different types of dependent variables: continuous, multinomial and binomial, and complex characteristics (e.g. the stand formula of a forest).

The calculation of similarity between cases starts from the calculation of partial similarity in single features. The algorithm for calculating similarity depends on the type of each particular feature: nominal or numerical. In the case of a numerical feature ($f$), the difference ($D$) between its values ($T_f$ and $E_f$) for an exemplar ($E$) and a training instance ($T$) is calculated in MLNN as

$$D = \frac{T_f - E_f}{2\, w_E\, w_f},$$

where $w_E$ is the weight of exemplar $E$ and $w_f$ is the weight of feature $f$. Only numerical features were used in this study. The total similarity is calculated as a weighted average of partial similarities.

The first step of the machine learning method used in this study was the inclusion, one by one, of features that had proportionally decreased weights (Fig. 2). The search for the best predictive feature weights followed in 30 iterations. The weights were restandardised after every change; therefore, the mean value of feature weights remained equal to one. The best predictive set of feature weights was then used in exemplar weighting. An individual real-valued weight was given to every exemplar by the machine learning process as an assessment of its influence on the accuracy of the prediction. A zero weight means the exclusion of the instance from the set of exemplars. Machine learning processes were carried out for every dependent variable to find the best predictive weights for features and exemplars. Decisions by the machine learning software were made according to the leave-one-out cross-validation (LOOC) training accuracy.
LOOC means that the predicted value for every training instance is calculated from all other instances, leaving this instance out. A more detailed description of the machine learning algorithms in MLNN can be found in Remm (2004). After the completion of this study, the software solutions MLNN and LSTATS were combined and further developed to form an advanced software system, Constud (Remm and Remm, 2008; Linder et al., 2008).
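The weighted similarity and the leave-one-out procedure of Section 3.2 can be illustrated with a short sketch. This is not the MLNN implementation. In particular, the conversion of the scaled difference D into a partial similarity (here max(0, 1 − |D|)) is an assumption made for illustration, as are all function names and the toy data.

```python
import numpy as np


def restandardise(weights):
    """Rescale weights so that their mean equals one."""
    weights = np.asarray(weights, dtype=float)
    return weights * weights.size / weights.sum()


def total_similarity(instance, exemplar, w_exemplar, w_features):
    """Weighted average of partial similarities over numerical features.

    D = (T_f - E_f) / (2 * w_E * w_f) follows the text; the mapping
    partial similarity = max(0, 1 - |D|) is an assumption of this sketch.
    """
    d = (instance - exemplar) / (2.0 * w_exemplar * w_features)
    partial = np.clip(1.0 - np.abs(d), 0.0, None)
    return float(np.average(partial, weights=w_features))


def loo_predictions(X, y, w_features, w_exemplars, threshold=0.0):
    """Leave-one-out estimates: each instance is predicted from all others."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w_features = np.asarray(w_features, dtype=float)
    n = len(y)
    predictions = np.full(n, np.nan)
    for i in range(n):
        sims = np.zeros(n)
        for j in range(n):
            # A zero exemplar weight excludes the instance from the exemplars.
            if j != i and w_exemplars[j] > 0:
                sims[j] = total_similarity(X[i], X[j], w_exemplars[j], w_features)
        use = sims > threshold  # similarity threshold limits the exemplars used
        use[i] = False          # never use the instance to predict itself
        if use.any():
            predictions[i] = np.average(y[use], weights=sims[use])
    return predictions


# Toy data: one feature, four instances (hypothetical values).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [10.0, 20.0, 30.0, 40.0]
w_features = restandardise([1.0])
w_exemplars = np.ones(4)

predictions = loo_predictions(X, y, w_features, w_exemplars)
loo_rmse = float(np.sqrt(np.mean((predictions - np.asarray(y)) ** 2)))
print(predictions, loo_rmse)
```

A learning loop in this spirit would change one weight at a time, restandardise the weights, and keep the change only if the leave-one-out error decreases.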
Fig. 2. Machine learning and estimation in software MLNN.
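The local statistics described in Section 3.1 are computed from the pixels that lie within a given radius of a sample point and, for orthophotos, within the stand mask. The sketch below illustrates the idea for four of the statistics. It is not the LSTATS code. The function name and the toy raster are hypothetical, the standard deviation is taken as the population value, and the definition of "proportion over the average" as the share of kernel pixels exceeding the kernel mean is an assumption.

```python
import numpy as np


def local_statistics(band, row, col, radius_px, mask=None):
    """Statistics of `band` inside a circular kernel centred on (row, col).

    `mask` is an optional boolean array of the same shape as `band`; only
    True pixels (e.g. pixels of the stand containing the point) are used.
    """
    rows, cols = np.ogrid[:band.shape[0], :band.shape[1]]
    inside = (rows - row) ** 2 + (cols - col) ** 2 <= radius_px ** 2
    if mask is not None:
        inside = inside & np.asarray(mask, dtype=bool)
    values = band[inside].astype(float)
    if values.size == 0:
        raise ValueError("no valid pixels inside the kernel")
    mean = values.mean()
    stdv = values.std()  # population standard deviation (assumption)
    return {
        "a": float(mean),
        "stdv": float(stdv),
        "cvar": float(stdv / mean) if mean != 0 else float("nan"),
        # Assumption: poa = share of kernel pixels above the kernel mean.
        "poa": float((values > mean).mean()),
    }


# Toy 5 x 5 raster with values 0..24 (hypothetical data).
band = np.arange(25).reshape(5, 5)
stats = local_statistics(band, row=2, col=2, radius_px=1)

# Restrict the kernel to a "stand" covering the upper three rows.
stand_mask = np.zeros((5, 5), dtype=bool)
stand_mask[:3, :] = True
masked_stats = local_statistics(band, 2, 2, 1, mask=stand_mask)
print(stats, masked_stats)
```

With 1 m pixels, as in the orthophotos, `radius_px` equals the radius in metres; for the 25 m Landsat pixels a radius in metres would first have to be converted to pixels.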
3.3. Feature reduction

An enormous number of combinations exist for forming a feature from statistics, radii and channels. Therefore, a method of preliminary feature selection is needed to reduce the computing time. Principal components analysis (PCA) is a popular method for the reduction of data dimensionality (Castro-Esau et al., 2004; Chica-Olmo and Abarca-Hernández, 2000; Cingolani et al., 2004; Ranson et al., 2001), and correlation matrices have usually been used in feature pre-selection (Liu et al., 2002; Tuominen and Pekkarinen, 2005). PCA replaces raw variables with a smaller number of generalized ones that are sometimes difficult to interpret and to use outside the initial dataset. The use of very large correlation matrices, such as 150 × 150 features in this study, would have been very challenging and, in addition, different measures of statistical association would have been needed to compare nominal and numerical variables. Therefore, cluster analysis (k-means clustering algorithm) and regression analysis were compared as feature reduction methods instead of PCA in this study. Step-wise linear regression in the case of continuous variables, and generalized linear models (GLZ) in the case of nominal variables, were applied in StatSoft Statistica 6.0.

Initially, the values of 150 features (6 channels, 5 local statistics, and 5 radii) were calculated from orthophotos and then standardized. Next, 30 candidate features were selected out of the 150. The 30 features were
the first ones selected in the regression model using stepwise regression analysis and the most indicative features according to Wald statistics in GLZ. From the cluster analysis of features, the features closest to the 30 cluster centres were selected. Then, six separate machine learning iterations were carried out for every dependent variable using the training dataset of 1846 random points.

The two methods of feature reduction were compared according to the LOOC fit reached by machine learning. For continuous dependent variables, the statistical significance of the difference between the LOOC fits reached using the two pre-selection methods was tested using Student’s t-test. For categorical dependent variables, the difference was tested using the methodology described by Congalton and Green (1999). The difference was the largest and statistically most significant in estimating stand volume (p = 0.008) and dominant tree species (p = 0.047) (Table 1). Differences in fit in the estimation of maturity class and mean annual increment were not significant. Therefore, cluster analysis was preferred, as it reduced the number of features more efficiently.

Table 1
The average LOOC fit in training data using different pre-selection methods.

                            Maturity class κ   Dominant tree species κ   Mean annual increment RMSE (%)   Stand volume RMSE (%)
Cluster analysis            0.49               0.63                      37.80                            34.01
Regression model            0.50               0.66                      38.45                            32.55
Significance of difference  p = 0.770          p = 0.047                 p = 0.230                        p = 0.008

3.4. Generation of maps of estimated values

After the comparison of feature reduction methods, 30 features from orthophotos and 40 from the satellite sensor images were selected using the k-means cluster analysis of forest inventory parameters. Machine learning in MLNN optimized the weights of the features and exemplars separately for every dependent variable. The values of the selected features were calculated at every cell of the output raster. Then the optimized set of weights was used by MLNN in the similarity-based estimation of the dependent variable at every cell of the output map. The estimated value of a numerical dependent variable is calculated as a weighted average of the values of the variable in the most similar exemplars. The number of exemplars used in decisions is controlled by the similarity threshold, a parameter that is optimized during machine learning. An overview of the whole technological process is presented in Fig. 3.

3.5. Validation

An additional set of data points (712 points on orthophotos and 660 on satellite sensor images) with new locations was created to validate the results. The predicted values of the dependent variables were compared with values in the NFI dataset. RMSE was used for continuous variables and κ-analysis (Congalton and Green, 1999) for nominal variables to estimate the accuracy of
the predictions, where κ > 0.8 indicates strong agreement, κ = 0.4–0.8 intermediate agreement, and κ = 0–0.4 poor agreement.

4. Results

4.1. Predictions using orthophotos

The dominant tree species and stand volume in the primary layer were estimated more accurately than the mean annual increment of stand volume and the maturity class of the forest stand. The training accuracies were markedly higher than the validation accuracies: RMSE = 62.02 m³ ha⁻¹ (33%) and 76.54 m³ ha⁻¹ (41%), respectively, for stand volume, and κ = 0.64 and 0.38 for dominant tree species (Table 2). The usefulness of the features was estimated using the average weight for every dependent variable, and the average weight adjusted by the frequency of usage in the predictive sets. The most useful feature was the average saturation value of image colour within a 30-m radius, which was included in the best predictive set of features for all predicted variables (Table 3). The 30-m radius was the component that had the largest contribution to the predictive sets of all the dependent variables (Fig. 4).
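The κ statistic used for the nominal variables (Section 3.5) compares the observed agreement between predicted and reference classes with the agreement expected by chance. The following is a minimal sketch of Cohen's κ and of the agreement classes quoted above. The function names and the example labels are hypothetical, and the handling of the class boundaries (0.4 counted as intermediate, negative values as poor) is an assumption, since the text does not specify it.

```python
from collections import Counter


def cohen_kappa(reference, predicted):
    """Cohen's kappa for two equally long sequences of class labels."""
    n = len(reference)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Chance agreement from the marginal class frequencies.
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def agreement_class(kappa):
    """Agreement classes as quoted in the text (boundary handling assumed)."""
    if kappa > 0.8:
        return "strong"
    if kappa >= 0.4:
        return "intermediate"
    return "poor"


# Hypothetical dominant tree species at six validation points.
reference = ["pine", "pine", "spruce", "birch", "birch", "spruce"]
predicted = ["pine", "spruce", "spruce", "birch", "pine", "spruce"]

kappa = cohen_kappa(reference, predicted)
print(round(kappa, 2), agreement_class(kappa))
```

Under this reading, the validation value of κ = 0.38 for dominant tree species from orthophotos falls just below the intermediate class.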
Fig. 3. Technological schema of generation of maps of the estimated forest inventory parameters.
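The cluster-based pre-selection of Section 3.3 groups similar features and keeps the feature closest to each cluster centre. The sketch below illustrates this with a plain NumPy implementation of k-means (Lloyd's algorithm) applied to standardized feature columns. The study itself used Statsoft Statistica, so the functions, the random initialisation and the synthetic data here are assumptions for illustration only. Constant features (zero standard deviation) are assumed to be absent.

```python
import numpy as np


def standardise(X):
    """Column-wise z-scores (sample points in rows, features in columns)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns cluster centres and point labels."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centres = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centres[c]
            for c in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return centres, labels


def select_representative_features(X, k, seed=0):
    """Indices of the features closest to each of the k cluster centres."""
    profiles = standardise(np.asarray(X, dtype=float)).T  # one row per feature
    centres, labels = kmeans(profiles, k, seed=seed)
    chosen = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue
        d = np.linalg.norm(profiles[members] - centres[c], axis=1)
        chosen.append(int(members[d.argmin()]))
    return sorted(chosen)


# Synthetic data: features 0-2 follow one signal, features 3-4 another.
rng = np.random.default_rng(42)
base_a = rng.normal(size=200)
base_b = rng.normal(size=200)
X = np.column_stack(
    [base_a + 0.05 * rng.normal(size=200) for _ in range(3)]
    + [base_b + 0.05 * rng.normal(size=200) for _ in range(2)]
)
selected = select_representative_features(X, k=2)
print(selected)
```

In the study, the analogous step reduced 150 candidate features to 30, one per cluster, before the weights were learned.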
Table 2
The best pixel-level (pixel area 1 m²) training accuracy and cross-validation accuracy using orthophotos.

                                         Maturity class   Dominant tree species   Mean annual increment     Stand volume
Training [κ or RMSE (% RMSE)]            0.51             0.64                    1.5 m³ ha⁻¹ a⁻¹ (38%)     62.0 m³ ha⁻¹ (33%)
Cross-validation [κ or RMSE (% RMSE)]    0.24             0.38                    1.9 m³ ha⁻¹ a⁻¹ (44%)     76.5 m³ ha⁻¹ (41%)
4.2. Predictions using satellite sensor images

As with the orthophotos, the relative error of the estimation of stand volume was smaller than that of the increment of stand volume; also, the dominant tree species was recognized more reliably than the maturity class (Table 4). The most useful feature was the standard deviation of ETM+ band 6.2 within a radius of 80 m (Table 5). The focal pixel values (25 m × 25 m) were nearly as useful as estimators as the values calculated from a radius of up to about 100 m (Fig. 5). This may be related to the relatively small size of NFI stands. Many pixels from neighbouring NFI stands were included in larger radii, as the satellite sensor image was not segmented using stand boundaries. That explains the significantly smaller indicator values at a radius of 120 m. As in the orthophotos, locally computed averages had the highest indicator values. Features that use Moran’s I weighted with the reciprocal of the distance obtained relatively high indicator values.

Forest attribute maps were created using the results of machine learning, the satellite sensor image, and the 1:10 000 soil map. The
validation of the maps revealed that within the 100-km² mapping area, the RMSE of stand volume estimated at stand-level was 85.8 m³ ha⁻¹ (39%) and the RMSE of the stand mean annual increment was 2.0 m³ ha⁻¹ a⁻¹ (43%) (Figs. 6 and 7).

Table 3
The best features calculated from orthophotos, ordered by average weights and by average weights weighted by frequency of usage (weighted average).

Rank  Growth class   Main tree species   Growth speed   Stand volume   Weighted average
1     S a 30 m       H a 30 m            H poa 30 m     L a 40 m       S a 30 m
2     H stdv 30 m    H stdv 30 m         H a 30 m       L a 20 m       H a 30 m
3     H acorr 30 m   S a 30 m            S cvar 30 m    B a 30 m       S stdv 30 m
4     R stdv 30 m    G stdv 30 m         S a 30 m       S a 30 m       H stdv 30 m
5     G stdv 30 m    S cvar 30 m         L acorr 20 m   L stdv 10 m    B a 30 m

H, hue; S, saturation; L, lightness; R, red; G, green; B, blue; a, locally calculated average; poa, proportion over the average; stdv, standard deviation; cvar, coefficient of variation; acorr, Moran’s I.

Fig. 4. Total indicator value (average weights of features weighted by the frequency of usage) and standard errors of feature components from orthophotos summarised over all dependent variables. H, hue; S, saturation; L, lightness; R, red; G, green; B, blue; a, locally calculated average; poa, proportion over the average; stdv, standard deviation; cvar, coefficient of variation; acorr, Moran’s I.

Table 4
The best pixel-level (pixel area 625 m²) and stand-level (average area 3.4 ha) training and validation accuracy using satellite images and the 1:10 000 soil map.

                                                       Maturity class   Dominant tree species   Mean annual increment     Stand volume
Pixel-level training accuracy [κ; RMSE (% RMSE)]       0.50             0.66                    1.6 m³ ha⁻¹ a⁻¹ (42%)     68.4 m³ ha⁻¹ (34%)
Pixel-level validation accuracy [κ; RMSE (% RMSE)]     0.17             0.31                    2.1 m³ ha⁻¹ a⁻¹ (52%)     87.0 m³ ha⁻¹ (42%)
Stand-level validation accuracy [κ; RMSE (% RMSE)]     0.23             0.41                    2.0 m³ ha⁻¹ a⁻¹ (49%)     74.6 m³ ha⁻¹ (36%)

Table 5
The best features from satellite images and the soil map, ordered by average weights and by average weights weighted by frequency of usage (weighted average). The number in a feature name is a channel of Landsat ETM+.

Rank  Growth class       Main tree species   Growth speed       Stand volume       Weighted average
1     ETM5               ETM4 a 80 m         ETM6.2 stdv 80 m   ETM6.2 stdv 80 m   ETM6.2 stdv 80 m
2     ETM6.1 stdv 40 m   ETM7                ETM6.1 stdv 40 m   ETM6.1 stdv 40 m   ETM5
3     ETM6.2 stdv 80 m   ETM5 a 80 m         ETM4 wMI 80 m      ETM3 wMI 40 m      ETM6.1 stdv 40 m
4     ETM7               ETM3                ETM1 cvar 40 m     ETM7 wMI 120 m     ETM4 a 80 m
5     ETM6.2 a 80 m      ETM4                ETM5               ETM6.2 wMI 120 m   ETM7

Fig. 5. Total indicator value (average weights of features weighted by the frequency of usage) and standard errors of feature components from the satellite image summarised over all dependent variables. 1p, value of the focal pixel; sm, soil map; 1–7, channels of Landsat ETM+; a, locally calculated average; m, mode; stdv, standard deviation; cvar, coefficient of variation; wMI, Moran’s I weighted with the reciprocal of distance.

Fig. 6. Estimated stand volume (m³ ha⁻¹) in the study area.

5. Discussion

This study presents the results of a comparison of features derived from orthophotos and satellite sensor images for the estimation and mapping of NFI parameters of forest stands.

Using orthophotos, a radius smaller than 30 m seems to be too small to describe the spatial pattern of pixel values, and pixels within a larger radius are less representative of the location. The larger the radius, the more pixels fall near the border of an NFI stand if a random location is situated in the central part of an NFI stand. The NFI stands used in this study have an average area of 3.6 ha and usually have an elongated shape. So, a random sample point often fell near the border of the NFI stand. More variation was lost using larger radii, as pixels from the other stands were excluded from the calculation of features in the case of orthophotos. Coops and Culvenor (2000) found that the image local variance was a valuable indicator of the aggregation of trees if calculated within a window with a 20–30-pixel (metre) edge.

Features calculated from HSL channels had slightly greater indicator values than the RGB channels of the orthophotos. One possible explanation is that HSL values are combinations of RGB values and therefore contain more generalized information about the location.

The number of features used in similarity-based prediction systems need not be large. For example, Pekkarinen (2002) demonstrated graphically that about five features are enough to achieve an efficient segmentation of remote sensing images. Remm (2004) used 17 features to estimate the coverage of coniferous trees. On average, the machine learning software selected 12 features for the best predictive sets.

The difference between training accuracy and validation accuracy probably indicates that the present case-base does not describe the total variability of the dependent variables. It should also be stressed that the NFI forest data that were used to train the CBR system themselves included errors in stand volume estimations. According to NFI regulations, these are estimated to be up to 15% for two-thirds of all observations and up to 20% for the rest. The reported estimations are originally pixel-based estimations compared to stand-based validation data. Stand-based estimations that interpolate the internal variability of a forest stand are less variable and more exact. It is estimated that representing the results according to 1 ha stands increases the accuracy by 10% compared to the pixel-level accuracy estimation
(Kilpeläinen and Tokola, 1998). In addition, accuracy is influenced by the size of field plots, because the RMSE and κ statistics include measurement and sampling errors in the field data. This aspect should be considered when comparing the results of different case studies.

Compared to this study, recently published results have not reported a higher accuracy of estimated stand volume using Landsat or chromatic aerial images. The outstanding stand-level accuracy of 18–27% RMSE reported by Muinonen et al. (2001) was achieved by undertaking extensive and precise field work and by using orthophotos with a near-infrared channel and locally calculated variograms. In this study, orthophotos with only RGB values were used.

In this study, a comparison of features derived from orthophotos and satellite sensor images for the estimation and mapping of NFI parameters of forest stands was presented. The features that frequently gave the most accurate estimations for stand volume in machine learning iterations had a component calculated either from the lightness channel, within a 30 m radius or as a local average. Tuominen and Pekkarinen (2005) also found the local average to be the most indicative of several locally computed statistics for forest inventory. Simple local statistics should also be preferred according to the rule of parsimony. More complicated variograms and autocorrelation indices have been demonstrated to be useful mainly in detecting finer-scale canopy structure (Hudak and Wessman, 1998; Lévesque and King, 2003; St-Onge and Cavayas, 1997).

The estimation and mapping of NFI parameters of forest stands can be useful for evaluating the need for in-field data updating. The strengths of CBR systems should be utilized by developing solutions that are integrated with forest information systems.

Fig. 7. Estimated stand mean annual increment (m³ ha⁻¹ a⁻¹) in the study area.

Acknowledgements

The investigation was supported by the Estonian Ministry of Education (SF0180127s08). The authors express their gratitude to the Estonian State Forest Management Centre for the forest inventory data, to Ilmar Part, Robert Szava-Kovats and Michael Haagensen for linguistic corrections, and to two reviewers for valuable suggestions.

References

Aha, D.W., 1998. The omnipresence of case-based reasoning in science and application. Knowledge-Based Systems 11 (5–6), 261–273.
Castro-Esau, K.L., Sánchez-Azofeifa, G.A., Caelli, T., 2004. Discrimination of lianas and trees with leaf-level hyperspectral data. Remote Sensing of Environment 90, 353–372.
Chica-Olmo, M., Abarca-Hernández, F., 2000. Computing geostatistical image texture for remotely sensed data classification. Computers and Geosciences 26, 373–383.
Cingolani, A.M., Renison, D., Zak, M.R., Cabido, M.R., 2004. Mapping vegetation in a heterogeneous mountain rangeland using Landsat data: an alternative method to define and classify land-cover units. Remote Sensing of Environment 92, 84–97.
Congalton, R.G., Green, K., 1999. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. Lewis Publishers.
Congalton, R.G., Birch, K., Jones, R., Schriever, J., 2002. Evaluating remotely sensed techniques for mapping riparian vegetation. Computers and Electronics in Agriculture 37, 113–126. Coops, N., Culvenor, D., 2000. Utilizing local variance of simulated high spatial resolution imagery to predict spatial pattern of forest stands. Remote Sensing of the Environment 71, 248–260.
Couteron, P., Pelissier, R., Nicolini, E.A., Paget, D., 2005. Predicting tropical forest stand structure parameters from Fourier transform of very high-resolution remotely sensed canopy images. Journal of Applied Ecology 42 (6), 1121–1128.
Franco-Lopez, H., Ek, A.R., Bauer, M.E., 2001. Estimation and mapping of forest stand density, volume, and cover type using the k-nearest neighbors method. Remote Sensing of Environment 77, 251–274.
Gemmel, F., 1999. Estimating conifer forest cover with Thematic Mapper data using reflectance model inversion and two spectral indices in a site with variable background characteristics. Remote Sensing of Environment 69, 105–121.
Hall, R.J., Skakun, R.S., Arsenault, E.J., Case, B.S., 2006. Modeling forest stand structure attributes using Landsat ETM+ data: application to mapping of aboveground biomass and stand volume. Forest Ecology and Management 225, 378–390.
Haralick, R.M., 1979. Statistical and structural approaches to texture. Proceedings of the IEEE 67 (5), 786–804.
Holmgren, J., Joyce, S., Nilsson, M., Olsson, H., 2000. Estimating stem volume and basal area in forest compartments by combining satellite image data with field data. Scandinavian Journal of Forest Research 15 (1), 103–111.
Hudak, A.T., Wessman, C.A., 1998. Textural analysis of historical aerial photography to characterize woody plant encroachment in South African savanna. Remote Sensing of Environment 66, 317–330.
Hüllermeier, E., 2001. Similarity-based inference as evidential reasoning. International Journal of Approximate Reasoning 26, 67–68.
Hyyppä, J., Hyyppä, H., Inkinen, M., Engdahl, M., Linko, S., Zhu, Y.H., 2000. Accuracy comparison of various remote sensing data sources in the retrieval of forest stand attributes. Forest Ecology and Management 128, 109–120.
Katila, M., Tomppo, E., 2001. Selecting estimation parameters for the Finnish multisource National Forest Inventory. Remote Sensing of Environment 76, 16–32.
Kilpeläinen, P., Tokola, T., 1998. Gain to be achieved from stand delineation in LANDSAT TM image-based estimates of stand volume. Forest Ecology and Management 124, 105–111.
Kuusk, A., Nilson, T., 2000. A directional multispectral forest reflectance model. Remote Sensing of Environment 72, 244–252.
Labrecque, S., Fournier, R.A., Luther, J.E., Piercey, D., 2006. A comparison of four methods to map biomass from Landsat-TM and inventory data in western Newfoundland. Forest Ecology and Management 226, 129–144.
Lévesque, J., King, D.J., 2003. Spatial analysis of radiometric fractions from high-resolution multispectral imagery for modelling individual tree crown and forest canopy structure and health. Remote Sensing of Environment 84, 589–602.
Linder, M., Remm, K., Proosa, H., 2008. The application of the concept of indicative neighbourhood on Landsat ETM+ images and orthophotos using circular and annulus kernels. SDH 2008. In: Ruas, A., Gold, C. (Eds.), Headway in Spatial Data Handling. Proceedings of the 13th International Symposium on Spatial Data Handling in Montpellier, France. Springer, 147–162.
Liu, Q.J., Takamura, T., Takeuchi, N., Shao, G., 2002. Mapping of boreal vegetation of a temperate mountain in China by multitemporal Landsat TM imagery. International Journal of Remote Sensing 23 (17), 3385–3405.
Lonard, R.I., Judd, F.W., Everit, J.H., Escobar, D.E., Davis, M.R., Crawford, M.M., Desai, M.D., 2000. Evaluation of color-infrared photography for distinguishing annual changes in riparian forest vegetation of the lower Rio Grande in Texas. Forest Ecology and Management 128, 75–81.
Luther, J.E., Fournier, R.A., Piercey, D.E., Guindon, L., Hall, R.J., 2006. Biomass mapping using forest type and structure derived from Landsat TM imagery. International Journal of Applied Earth Observation and Geoinformation 8, 173–187.
Mäkelä, H., Pekkarinen, A., 2001. Estimation of timber volume at the sample plot level by means of image segmentation and Landsat TM imagery. Remote Sensing of Environment 77, 66–75.
Mäkelä, H., Pekkarinen, A., 2004. Estimation of forest stand volumes by Landsat TM imagery and stand-level field-inventory data. Forest Ecology and Management 196, 245–255.
Mitchell, T.M., 1997. Machine Learning. McGraw-Hill Series in Computer Science. The McGraw-Hill Companies Inc.
Muinonen, E., Maltamo, M., Hyppänen, H., Vainikainen, V., 2001. Forest stand characteristics estimation using a most similar neighbour approach and image spatial structure information. Remote Sensing of Environment 78, 223–228.
Pekkarinen, A., 2002. Image segment-based spectral features in the estimation of timber volume. Remote Sensing of Environment 82, 349–359.
Ranson, K.J., Sun, G., Kharuk, V.I., Kovacs, K., 2001. Characterization of Forests in Western Sayani Mountains, Siberia from SIR-C SAR Data. Remote Sensing of Environment 75, 188–200.
Reese, H., Nilsson, M., Granqvist Pahlén, T., Hagner, O., Joyce, S., Tingelöf, U., Egberth, M., Olsson, H., 2003. Countrywide estimates of forest variables using satellite data and field data from the National Forest Inventory. Ambio 32 (8), 542–548.
Remm, K., Luud, A., 2003. Regression and point pattern models of moose distribution in relation to habitat distribution and human influence in Ida-Viru county, Estonia. Journal for Nature Conservation 11, 197–211.
Remm, K., 2004. Case-based predictions for species and habitat mapping. Ecological Modelling 177, 259–281.
Remm, K., 2005. Correlations between forest stand diversity and landscape pattern in Otepää Nature Park, Estonia. Journal for Nature Conservation 13, 137–145.
Remm, M., Remm, K., 2008. Case-based estimation of the risk of enterobiasis. Artificial Intelligence in Medicine 43, 167–177.
St-Onge, B.A., Cavayas, F., 1997. Automated forest structure mapping from high resolution imagery based on directional semivariogram estimates. Remote Sensing of Environment 61, 82–95.
Tomppo, E., 1991. Satellite image-based national forest inventory of Finland. International Archives of Photogrammetry and Remote Sensing 28, 419–424.
Tomppo, E., Nilsson, M., Rosengren, M., Aalto, P., Kennedy, P., 2002. Simultaneous use of Landsat-TM and IRS-1C WiFS data in estimating large area tree stem volume and aboveground biomass. Remote Sensing of Environment 82, 156–171.
Tuominen, S., Pekkarinen, A., 2004. Local radiometric correction of digital aerial photographs for multi source forest inventory. Remote Sensing of Environment 89, 72–82.
Tuominen, S., Pekkarinen, A., 2005. Performance of different spectral and textural aerial photograph features in multi-source forest inventory. Remote Sensing of Environment 94, 256–268.
Wallerman, J., Joyce, S., Vencatasawmy, C.P., Olsson, H., 2002. Prediction of forest stem volume using kriging adapted to detected edges. Canadian Journal of Forest Research 32, 509–518.
Wilson, D.R., Martinez, T.R., 2000. Reduction techniques for instance-based learning algorithms. Machine Learning 38, 257–286.