optimizing continuous fields of tree cover for complex topography at the regional scale of the European Alps using generalized linear models (GLM) which are ...
MODIS based continuous fields of tree cover using generalized linear models Markus Schwarz, Niklaus E. Zimmermann, Lars T. Waser Swiss Federal Research Institute WSL 8903 Birmensdorf, Switzerland Abstract— Knowledge about the land cover over large area is important for monitoring and modeling of ecological and environmental processes. Considerable efforts have recently resulted in the development of global continuous fields for different land cover types at small spatial scales based on NOAAAVHRR and TERRA-MODIS data. Researchers have applied a range of techniques to depict the sub-pixel fraction of land cover types from remotely sensed data. As a result, such products have a high potential to accurately monitor land cover change over time. In this study, a methodology is described for deriving and optimizing continuous fields of tree cover for complex topography at the regional scale of the European Alps using generalized linear models (GLM) which are commonly used in ecological research. The presented study is carried out in Switzerland a central part of the European Alps with a complex topography and a highly fragmented landscape. MODIS data (MOD09) at a spatial resolution of 500m are used as explanatory variables and Landsat TM based data of fractional tree-cover are used as target variable to calibrate the GLM. For purpose of evaluation we calculated different accuracy measures of the resulting continuous fields of tree cover, and we compared the model output with two available global data sets: (1) TERRAMODIS Vegetation Continuous Fields product (MOD44), and (2) the NOAA-AVHRR vegetation continuous fields. Our regionally optimized GLM model produces higher accuracies compared to MOD44- and AVHRR-products. The GLM model achieve an average mean absolute error (MAE) of 0.093, while the average MAE of the MOD44 and the AVHRR products are 0.205 and 0.200, respectively. We conclude that GLM's are appropriate for deriving fractional tree cover for complex topography at the scale of the European Alps. Regional calibrations of vegetation continuous fields may offer significantly improved predictions compared to globally calibrated models. Such regionally calibrated and optimized models may serve as valuable tools for regional monitoring of land cover pattern and its temporal change. Keywords-component: Fractional Tree Cover, Generalized Linear Model, MODIS, Sub-Pixel Classification
I.
INTRODUCTION
The environment of the European Alps with its characteristic complex topography and its fine-scale and often traditional land use regime is exposed considerably to both natural environmental threats and human impacts and exploitation [1]. For sustainable use and management and for
large-area monitoring and modeling of these Alpine terrestrial habitats, consistent land cover information across national borders is indispensable. Various studies in the past showed that remote sensing based data are an appropriate tool to detect land cover change [2]. Considerable efforts have recently resulted in the development of global and continental land cover data at small spatial scales based on the Advanced Very High Resolution Radiometer (NOAA-AVHRR) and the Moderate Resolution Imaging Spectroradiometer (TERRA-MODIS). The paradigm for describing the characteristics of the surface covered by these data sets is to classify each pixel into a land cover type based on a predefined classification scheme [3]. However, this approach has certain limitations [4]. Alternatively, a number of techniques to depict the sub-pixel fraction of landscape components from the same remotely sensed data have been applied like fuzzy membership functions, artificial neural networks, regression trees, decision trees and linear mixture models [5]. The resulting continuous field maps – containing the fraction of landscape components as a continuous variable – offer the advantage of summarizing the effects of spatial heterogeneity better than the discrete land cover maps based on the same data sources. As a result, such products have a higher potential to accurately monitor land cover change over time [6]. In this paper we describe a methodology for deriving continuous fields of tree cover at the regional scale of the European Alps using generalized linear models (GLM). The presented procedure uses MODIS reflectance data (MOD09) at a spatial resolution of 500m. The benefit of such a regionally optimized model is to establish accurate datasets compared to the available global products enabling more flexible change detection, accounting for regional characteristics (e.g. topography, phenology, vegetation). For purposes of comparison we test our resulting continuous fields of tree classification within Switzerland against the two available global datasets: 1) TERRA-MODIS Tree Cover Continuous Fields product (MOD44) based on regression trees at 500m spatial resolution [6] and 2) the NOAA-AVHRR Vegetation continuous fields using linear mixture models at a spatial resolution of 1000m [7].
0-7803-8743-0/04/$20.00 (C) 2004 IEEE
II.
MATERIALS & METHODS
A. Study area All studies were carried out in Switzerland which is located in the heart of the European Alps. The Alps of Southern Central Europe extend over 1000km from the Mediterranean coast of France and North-western Italy through Switzerland, Northern Italy, to Southern Germany, Austria, and Slovenia. The Alps are characterized by an extraordinary biodiversity and a variety of landscapes, which are influenced by geological, morphological and climatologic factors and by a long history of varying traditional land management. One third of this mountainous landscape is covered by trees. B. Calibration data In order to calibrate the model of fractional tree cover, we used atmospherically corrected (6S code) and georeferenced surface reflectance data from the Moderate Resolution Imaging Spectroradiometer (MODIS) on board the TERRA satellite at a spatial resolution of 500m. In particular we used the MODIS level 3 surface reflectance product, MOD09_L3_V004, as an 8-day composite. We used all tiles of h18v04 for the year 2001. The data were transformed to UTM coordinate system with WGS 1984 datum (Zone North 32). Further, the Normalized Difference Vegetation Index (NDVI) was calculated for each 8 day composite. To reduce the negative effect of cloud contamination we synthesized the 8-day composites to monthly images based on the maximum NDVI value per month [8]. Based on these datasets a number of metrics representing the seasonal and/or annual phenological characteristics of vegetation were derived according to DeFries et al. [9] using the 8 months with the highest NDVI-value. These metrics include the minimum, maximum, mean, range, and standard deviation per reflectance band. Finally, from these calibration dataset (i.e. area-wide maps), sub-samples of 15’838 pixels within Switzerland reflecting a regular lattice were re-sampled for calibrating the model. This represents 1/9th of the total surface of Switzerland. The left-out 8/9th of the Swiss data were treated as an independent reference dataset. C. Training and Reference data A training dataset is required to calibrate the model. In our case we used training data from high-resolution satellite imagery for Switzerland. An existing Swiss forest map, based on 11 Landsat-5 TM images dating from 1990,1991 and 1992, was used. Based on information of the Swiss forest inventory we converted the forest cover map into a spatial representation of tree cover per forested pixel using a linear regression model. Further we used the “Swiss areal statistics 92/97”, available within the same GEOSTAT package to convert all pixels containing tree vegetation for areas not classified as forest. The resulting fractional tree cover map was re-projected to UTM coordinate system with WGS 1984 datum (Zone North 32) used in the calibration and finally resampled to the 500m resolution of MODIS, which leads to a data set with more than 150’000 training samples.
D. Available Global data sets We tested both global datasets and compared it against our regionally optimized, GLM-based tree cover product. The NOAA-AVHRR vegetation continuous fields (called AVHRRVCF, thereafter) product is based on data from the Advanced Very High Resolution Radiometer (NOAA-AVHRR) and used a linear mixture model. The data were acquired in 1992-93 at a 1km spatial resolution and processed under the guidance of IGBP (International Geosphere Biosphere Program). The TERRA-MODIS Vegetation Continuous Fields product MOD44 (called MODIS-VCF, thereafter) contains estimates of percent tree cover as developed from globally distributed, highresolution training data. A regression tree model was used to link the MODIS MOD09 500m reflectance data to the training data and phenological metrics [6]. In order to enable a direct comparison with the other data sets used we re-projected the data set to the to UTM coordinate system with WGS 1984 datum (Zone North 32). E. Model development GLMs are an extension of the linear (least-square regression) modeling that allows models to be fit to data with errors following other than (only) Normal distributions, and for dependent variables following other than a Normal distribution, such as the Poisson, Binomial and Multinomial [10]. Ordinary least squares regression assumes e.g. that the model response varies continuously and that it is unbounded. GLMs of the binomial model family overcome this difficulty by linking the binary response to the explanatory covariates through the probability of either outcome, which does vary continuously from 0 to 1 [11]. This approach is often called logit regression. Other model families allow to fit response variables of different restricting characteristics (Poisson regression, etc.). Tree cover fractions show the same restrictions as do binary dependent variables (upper and lower bounds); only they vary continuously between 0 and 1. Thus, GLM of the binomial model family may equally well be applied to fractional cover data as can be applied to presence/absence data. The converted and aggregated national tree cover map was used as target variable. All basic statistics of the 7 500m MODIS bands and the NDVI per pixel for the months used in the respective data set were included as explanatory variables to calibrate the model. We built a full GLM-model containing all possible explanatory variables. F. Validation The evaluation of the simulated continuous tree cover and the two global datasets against the reference data was performed using several statistical measures, namely 1) the mean absolute error (MAE), 2) the overall correct classification rate (CCRO), 3) the kappa and the weighted kappa (K and Kw). CCRO, K and Kw were calculated based on a confusion matrix with the continuous target variable aggregated to classes of 25% width. The weighted kappa coefficient (Kw) is an extended measure of the Cohen’s kappa [12] and can be used to weight the disagreements between simulated and reference map classes.
0-7803-8743-0/04/$20.00 (C) 2004 IEEE
GLM
MODIS-VCF
AVHRR-VCF
41405
41405
41405
9705
9692
9680
9499
16635
13618
Bias
-2.12%
71.6%
40.7%
MAE
0.093
0.205
0.200
CCRO
0.719
0.539
0.497
KW
0.914
0.806
0.802
Total area [km2] Total forest [km2] pred. forest [km2]
III.
RESULTS
The GLM model underestimates the tree cover by 2.1%. The model generates a CCRO of 0.72 and an MAE of 0.093 (an error of 9.3% tree cover on average) when applied to the independent reference data set. The Kw reaches 0.91. Table 1 shows the summarized accuracy measures. The MODIS-VCF data set overestimates the fraction tree cover area generally by 71.6% and achieved a CCRO of 0.539 (Table 1). The associated MAE is 0.205, the Kw reaches 0.81. The AVHRR-VCF data set overestimates the fractional tree cover generally by 40.7% and achieved a CCRO of 0.497 (table 1). The MAE is 0.200 and the κw reaches 0.80. Fig. 1 displays the MAE of the three models along the predicted range of fractional tree cover. The MAE of the fractional cover of the GLM-based model is between 0.1 (at low and at very high cover) and 0.2 (at high cover values). The MAE of the MODIS-VCF data set is clearly higher than the GLM model on average and especially at low cover values. For fractional tree covers higher than 75% the MODIS-VCF model is more accurate than the GLM model. The MAE of the AVHRR-VCF model is very similar to the MODIS-VCF model. However, fractional tree cover of 90% and higher are clearly mapped with less accuracy than both prior models. The mean across the full range of predicted values is the least accurate. Figure 2 displays the bias of the predictions of the fractional tree cover. All three models generally overpredict low fractional tree cover, and underpredict high observed cover fractions. The GLM-mx8 model shows the least bias, i.e. lowest maximum deviation along the full range, and the least overall bias (-2.6%, see table 1). The AVHRR-VCF behaves similar to the GLM-based model, but has higher positive deviations at low and higher negative deviations at high cover values. The MODIS-VCF model mostly overpredicts the fractional tree cover across most of the range, but is the most accurate model at cover values of 70-100%.
bias (Fig. 2) at high tree cover shows overestimation, with saturation towards full tree cover (80-100%). A comparison between the GLM model and the generated Swiss tree cover map shows a high spatial agreement, even though the landscape- and forest-pattern is highly fragmented. Further, the visual exploration of the error map reveals that fractional tree cover is generally overestimated in steep slopes of northern exposure, while likely to be underestimated in slopes of southern exposure. This might be an effect of shadows on the model. Additionally, the differences in the phenological development between north and south facing slopes might additionally affect the distribution of errors. Generally, the stationarity of the spatial error distribution shows that the resulting model can be considered as robust. The MODIS-VCF classification generally overestimates tree fractional cover in the complex topography of the European Alps. In detail, tree cover fractions up to 80% are overestimated, while pixels with higher tree cover (>80%) are rather underestimated. Compared to the regional fitted GLM model the classes with a high fractional tree cover (80-100%), are mapped with higher accuracy. For the lower cover fractions the GLM-based approach revealed better results. For our study area, the overall-accuracy of the MODIS-VCF product is only marginally better than the one obtained by the AVHRR-VCF product. 0.4
0.3
MAE
STATISTICAL PERFORMANCE MEASURES FOR THE GLMMODIS-VCF AND THE AVHRR-VCF DATASETS.
MODEL, THE
0.2
0.1
GLM MODIS-VCF AVHRR-VCF
0 0
0.2
0.4 0.6 fraction of tree cover (reference)
0.8
1
Figure 1. Mean absolute error (MAE) ranges of fractional tree cover for the GLM-Model, the MODIS-VCF and the AVHRR-VCF datasets. 0.4 0.3 0.2 0.1 Bias
TABLE I.
0 0
0.2
0.4
0.6
0.8
1
-0.1
IV.
DISCUSSION
-0.2
The results show that our GLM model revealed highest accuracies when compared to the available global data sets (MODIS-VCF and AVHRR-VCF). Thus, it seems reasonable to generate regional data sets, if one aims at monitoring fractional tree cover at regional scales only. Fig. 1 reveals that the errors of the GLM model tend to be higher at higher actual fractional tree cover. This is consistent with the fact that the
-0.3
GLM MODIS-VCF AVHRR-VCF
-0.4 fraction of tree cover (reference)
Figure 2. Bias ranges of fractional tree cover for the GLM-model, the MODIS-VCF and the AVHRR-VCF datasets.
0-7803-8743-0/04/$20.00 (C) 2004 IEEE
[2]
The errors of the AVHRR-VCF product are the highest for pixel with either a high or a low fractional tree cover. Lowest errors were obtained for cover fractions around 70%. This is not surprising since the maximum fraction, determined by the algorithm, is not exceeding 80% for the AVHRR-VCF, thus the bias is negative for high cover fractions. V.
CONCLUSION
As several prior efforts have shown, vegetation continuous fields provide a useful and promising alternative to the traditional discrete classification approaches. As a consequence, this method offers a sound approach to detect land cover change. Thus we conclude that a) GLM offer a simple, yet accurate method to calibrate vegetation continuous fields, and b) a regional model calibration improves the quality of the model. The GLM-based model shows that the applied statistics used are well suited for deriving fractional tree cover at the scale of the European Alps and are able to handle nonlinear relationships present in the training data set well. On the other hand, we generally conclude that the global models reached comparably high accuracies when tested at a regional scale. Further we conclude that the improved spatial characteristics of MODIS data result in more accurate predictions compared to AVHRR-VCF based maps, not only at the global, but also at the regional scale. REFERENCES [1]
Tasser, E., and Tappeiner, U., 2002. The impact of land-use changes in time and space on vegetation distribution in mountain areas. Applied Vegetation Science, 5, 173-184.
Rogan, J., Franklin, J. and Roberts, D.A., (2002). A comparison of methods for monitoring multitemporal vegetation change using Thematic Mapper imagery. Remote Sensing of Environment, 80(1), 143156 [3] DeFries, R.S., and Townshend, J.R.G. (1994). NDVI-derived land cover classification at global scales. International Journal of Remote Sensing, 15, 3567-3586 [4] Aplin, P., and Atkinson, P.M. (2001). Sub-pixel land cover mapping for per-field classification. International Journal of Remote Sensing, 22, 2853-2858 [5] Adams, J.B., Sabol, D.E., Kapos, V., Almeida, R., Roberts, D.A., Smith, M.O., and Gillespie, A.P. (1995). Classification of multispectral images based on fractions of endmembers: application to land-cover change in the Brazilian Amazon. Remote Sensing of Environment, 52, 137-154 [6] Hansen, M.C., DeFries, R.S., Townshend, J.R.G., Sohlberg, R., Dimiceli, and Caroll, M. (2002). Towards an operational MODIS continuous field of percent tree cover algorithm: examples using AVHRR and MODIS data. Remote Sensing of Environment, 83, 303319 [7] DeFries, R.S., Townshend, J.R.G., and Hansen, M.C. (1999). Continuous fields of vegetation characteristics at the global scale at 1km resolution. Journal of Geophysical Research - Atmospheres, 104, 911-916,925 [8] Holben, B.N. (1986). Characteristics of maximum-value composite images from temporal AVHRR data. International Journal of Remote Sensing, 7, 1417-1434 [9] DeFries, R.S., Hansen, M., Townshend, J.R.G., and Sohlberg, R. (1995).Global discrimination of land cover types from metrics derived from AVHRR Pathfinder data. Remote Sensing of Environment, 54, 209-222 [10] McCullagh. P., and Nelder, J.A. (1989). Generalized linear models. Chapman and Hall, London, 511 [11] Dobson, A.J. (2002). An introduction to generalized linear models. Chapman & Hall/CRC, Boca Raton, 225 [12] Cohen, J., (1968). Weighted Kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70: 213-220.
0-7803-8743-0/04/$20.00 (C) 2004 IEEE