New Strategies for European Remote Sensing, Oluić (ed.) © 2005 Millpress, Rotterdam, ISBN 90 5966 003 X
Using ancillary data to improve classification of degraded Mediterranean vegetation with HyMap spectroscopic images

R. Sluiter & S.M. de Jong
Faculty of Geosciences, Utrecht University, Utrecht, Netherlands
Keywords: Mediterranean, vegetation, classification, ancillary data, HyMap

ABSTRACT: Accurate land cover maps based on remote sensing observations are required for, among other purposes, the evaluation of vegetation change models. In this study, in which we investigate the intensification and extensification of land use in an area in southern France, the accuracy of a purely spectral classification proved insufficient. Therefore, we present a method to classify Mediterranean vegetation communities by integrating environmental and ecological information into a spatio-temporal image classification model: the Ancillary Data Classification Model (ADCM). Compared to a traditional Spectral Angle Mapper classification with 14 classes, the newly proposed ADCM increases overall accuracy from 51% to 69%. We anticipate that the use of additional environmental factors will further improve the classification results.

1 INTRODUCTION
Landscapes in Mediterranean Europe have been disturbed and changed by people for centuries. Nowadays, the structural changes of the landscape are associated with both intensification and extensification. In some areas logging, open mining, urban pressure, recreational pressure and agricultural activities increase. In other areas traditional agricultural activity has ceased and agricultural lands have been abandoned (Bonet 2004). In Mediterranean France, land abandonment is the most widespread change and is caused by technological, social and economic changes. The land abandonment process can have several consequences: 1) Biodiversity, generally high in Mediterranean landscapes, may decrease because species that depend on the traditional pattern of land use disappear (Zavala & Burkey 1997). 2) Especially in dry upland areas soil erosion can increase because soil and water conservation works linked with agriculture deteriorate (Boer 1999) and because overland flow and sediment yield increase due to changes in soil properties (Lasanta et al. 1995). 3) Homogenization (spatial simplification) of the landscape in combination with the increase of fuel accumulation in abandoned land increases the fire risk (Scarascia-Mugnozza et al. 2000). To monitor, understand and predict changes of vegetation communities, methods are needed that facilitate the description of the status of the vegetation communities and the description of the changes of vegetation communities. For example, modern geographic information analysis tools (Burrough & McDonnell 1998) enable us to analyze changes of vegetation communities by computer models (Tappeiner et al. 1998; Mouillot et al. 2001).
Remote sensing data can be used to identify vegetation communities and to calibrate and validate the results of vegetation change models. Many remote sensing classification studies have been done in Mediterranean areas (Carmel & Kadmon 1998; Maselli et al. 2000; Shoshany 2000), but few attempts have been made to combine these studies with vegetation change studies, and those attempts have not produced reasonable results (Plummer 2000). Among many possible factors, we relate this to: 1) the relatively large distance between the fields of ecology, land use change modelling and technical remote sensing: relatively few integrative studies are carried out (Briassoulis 2000); and 2) the spectral and spatial limitations of the remote sensing data. Conventional methods for spectral classification of remote sensing images are per-pixel based. However, due to spectral confusion, open heterogeneous vegetation patterns as commonly found in Mediterranean and semi-arid regions cannot be characterized satisfactorily by per-pixel classifiers. Two alternative methods receive attention: methods that include the spatial domain to analyze imagery (Atkinson & Quattrochi 2000; De Jong et al. 2001; Blaschke et al. 2004; Sluiter et al. 2004) and methods that incorporate ancillary data into the classification process. The latter are based on the assumption that vegetation communities can be related to environmental factors like geology, soil type, soil water availability, elevation, slope, aspect, slope curvature, etc. These ancillary data or knowledge data can be incorporated into the classification process to enhance spectral and contextual classifications (Shoshany 2000; Nagendra 2001). The objective of this study is to improve the classification of Mediterranean vegetation communities by integrating environmental and ecological information in a spatio-temporal model. In the model, both per-pixel and contextual classification techniques are used.
In a later stage of this research, we plan to couple the classification to a vegetation change model. A Mediterranean ecosystem in the La Peyne area, approximately 60 km west of Montpellier, southern France, is used as the case study. In this area, Mediterranean oak forest called ‘maquis’, lower ‘garrigue’ shrub lands and agricultural areas are found. Human and natural disturbance of these ecosystems occurs scattered over the test area as a consequence of forest fires, open mining, logging and ongoing extensive agricultural activities. This paper discusses the concepts of the new spatio-temporal classification model and presents the results of the prototype of the model applied to high-resolution hyperspectral HyMap images. The first classification results are encouraging and show a significant increase in accuracy compared to results from conventional methods.

2 THE CONCEPT OF THE ANCILLARY DATA CLASSIFICATION MODEL
To optimize the classification results of a remote sensing image, the Ancillary Data Classification Model (ADCM) uses the relationships that exist between vegetation communities and several information sources, such as:
• Topographic maps (roads, land use)
• Geological & soil mapping units
• Elevation (DEM)
• DEM derivatives: slope, aspect, curvature
• Input collected from external models: insolation during the year, water availability, wetness index
• Historic information: output from previous change detection studies
• Derived remote sensing products: vegetation indices, absorption features

The ADCM is built within the PCRaster dynamic modelling software (Wesseling et al. 1996; Burrough & McDonnell 1998) because spatial-dynamic relations play an important role in the classification scheme. The classification procedure consists of three phases:
1 A spectral classification with a classifier that can generate rule images for the individual classes; suitable algorithms are the maximum likelihood classifier or the spectral angle mapper classifier (Kruse et al. 1993).
2 A classification based on general expert knowledge (IF THEN logic).
3 A classification based on the combination of probability functions: the probability of the occurrence of a class is calculated by pooling class occurrence probabilities on different predictor variables. The class with the highest pooled probability is assigned to a pixel.
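Phase 1 can be illustrated with the spectral angle mapper: for each class, the rule image holds the angle between a pixel spectrum and a class reference spectrum, and the smallest angle marks the spectrally most likely class. The following is a minimal Python sketch under stated assumptions, not the implementation used in this study: the function names, the three-band spectra and the class names are hypothetical.

```python
import math


def spectral_angle(pixel, reference):
    """Return the angle (in radians) between two spectra treated as vectors."""
    if len(pixel) != len(reference):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(pixel, reference))
    norm_pixel = math.sqrt(sum(a * a for a in pixel))
    norm_reference = math.sqrt(sum(b * b for b in reference))
    if norm_pixel == 0.0 or norm_reference == 0.0:
        raise ValueError("spectra must not be all zeros")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_pixel * norm_reference)))
    return math.acos(cosine)


def classify_pixel(pixel, class_spectra):
    """Return the class with the smallest angle and the per-class angles.

    The per-class angles play the role of the rule images for one pixel.
    """
    angles = {name: spectral_angle(pixel, ref) for name, ref in class_spectra.items()}
    best_class = min(angles, key=angles.get)
    return best_class, angles


# Hypothetical three-band reference spectra (illustrative values only).
reference_spectra = {
    "water": [0.05, 0.03, 0.01],
    "vegetation": [0.04, 0.08, 0.45],
}
label, angles = classify_pixel([0.05, 0.09, 0.50], reference_spectra)
print(label, angles)
```

Because the angle ignores vector length, two spectra that differ only by a brightness factor give an angle close to 0, which is one reason the per-class angles are easy to compare.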
Figure 1. Classification procedure of the ADCM
A flow diagram of the model is shown in figure 1. In phase 1, the rule images indicate the most likely spectral class and show which classes are easily distinguished spectrally and which are spectrally confused. Classes that can be easily distinguished spectrally, for example water, are assigned directly to the right class during phase 2. Additionally, classification rules based on “common logic” can be applied during phase 2. An example is the spectral confusion between agricultural crop classes and natural vegetation classes: if an up-to-date topographic map is available, scattered pixels classified as agricultural crop outside the agricultural area on the map can be assigned to the spectrally most likely alternative natural class. During phase 3, classes that show significant spectral confusion, e.g. natural heterogeneous vegetation, are distinguished using classification rules based on ancillary data. As the vegetation response to a single environmental factor is not always very clear, the responses to different factors are combined during phase 3. Relationships between environmental factors and vegetation communities can be analyzed with several statistical techniques (multiple regression, classification trees, Bayesian approach, correspondence analysis, etc.), which are extensively reviewed by Guisan & Zimmerman (2000). The choice of a suitable method depends on the collected field data, the data type and the ability to incorporate the response curve in the ADCM. The prototype of the ADCM works with a distribution probability (Bayesian) approach during phase 3, because variables can be added and removed easily from the model, spatial dependency can be incorporated and model interpretation is easy. The pooling algorithm of the prototype ADCM is:
$$P = \left( \prod_{i=1}^{n} (1 - p_i) \right)^{1/n}$$

where P is the pooled probability, p_i is the probability on factor i, and n is the number of factors. P ranges from 0 to 1. Future versions of the ADCM will include alternative pooling algorithms based on Bayes' theorem (Aspinall 1992; Aspinall 1993) or on the algorithm given by Desachy et al. (1996).
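The pooling expression can be evaluated directly for a single pixel and class. The sketch below is a plain Python transcription of the formula as printed, given only for illustration: the prototype itself is built in PCRaster, and the function name and the example probabilities are hypothetical.

```python
import math


def pooled_probability(probabilities):
    """Pool per-factor probabilities with the expression as printed:

    P = ((1 - p1) * (1 - p2) * ... * (1 - pn)) ** (1 / n)
    """
    if not probabilities:
        raise ValueError("at least one factor probability is required")
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability outside [0, 1]: {p}")
    # Product of the complements, followed by the n-th root.
    product = math.prod(1.0 - p for p in probabilities)
    return product ** (1.0 / len(probabilities))


# Hypothetical probabilities for two factors (illustrative values only).
factor_probabilities = [0.75, 0.75]
pooled = pooled_probability(factor_probabilities)
print(pooled)
```

With the expression as printed, the result is the geometric mean of the complements 1 − p_i, so it stays between 0 and 1; a single factor with p_i = 1 gives P = 0, and P = 1 only when every p_i is 0. Taking the n-th root keeps results comparable when factors are added or removed.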
3 APPLICATION OF THE ANCILLARY DATA CLASSIFICATION MODEL
We tested the suitability of the prototype of the ADCM for mapping Mediterranean heterogeneous vegetation in the Peyne study area, southern France. Central coordinates are 43°33’N, 3°18’E. During four field campaigns in the summers of 2000-2003 we collected detailed information on vegetation. We performed the ADCM classification on high-resolution airborne HyMap imagery, collected in the summer of 2003. The HyMap sensor operated by HyVista (HyVista 2003) collects data in 126 optical bands across the 450–2500 nm range with a spectral resolution of 15-20 nm. The spatial resolution of the images is 5 m. The area is characterized by various geological and lithological substrates. The large variation over short distances in these substrates, in elevation, in climatic factors and in human disturbance is responsible for a wide range of growing conditions and hence a large variety of vegetation types. The natural vegetation in the area is a degraded stage of evergreen forest, consisting of shrubby formations referred to as ‘matorral’ by Tomaselli (1981). Matorral can be further discriminated by height, density and species composition. A classification scheme developed for the study area is shown in figure 2.
Figure 2. Hierarchical land cover classification scheme. Classes used in the 14 class SAM and ADCM classifications are shown in bold.
For the initial spectral classification of phase 1 the spectral angle mapper (SAM) classifier was used. The SAM classifier compares entire spectra with class spectra by measuring the angle between spectra on the different band intervals. The difference between the spectra is expressed in radians, in this study typically ranging from 0 to 1.3, where 0 indicates a perfect match. The algorithm is very suitable for this study because the SAM classifier makes use of the entire hyperspectral dataset, and because the class images expressed as radians are easy to interpret. The prototype of the ADCM only classifies the 14 classes shown in bold in figure 2. The results will be compared with a standard SAM classification of the same 14 classes. The classification rules for the different classes are as follows:

Phase 2:
• Class 2, 3 and 15: direct spectral classification.
• Class 1: combination of spectral classification and high wetness index value. Wetness index is calculated using a stochastic approach described by Burrough et al. (2000).
• Class 17, 19 and 20: spectral classification constrained by land use from topographical map.
• Class 16 and 20: spectral classification constrained by land use from topographical map and by patch size.
• Class 18: spectral classification constrained by the distance to the road network on the topographical map.

Phase 3:
• Class 7, 8, 11, 12 and 13: spectral classification in combination with the occurrence probabilities of the dominating species on lithologic and geologic units. We calculated the occurrence probabilities based on the field data set.

The SAM based class images are included in the pooling procedure described above as a probability factor: if there is a perfect spectral match, the spectral probability is 1. The classifications of phase 2 and phase 3 are combined and the result will be compared with a standard 14-class SAM classification.

4 RESULTS
The results of the 14-class SAM classification and the 14-class ADCM classification are shown in figure 3. In this figure, we point to three areas of interest. The overall accuracy of the SAM classification is 51%; the overall accuracy of the ADCM classification is 69%. The following differences between the images are important to mention:
• The SAM classification shows significant confusion between dense matorral (class 2) and riparian vegetation (class 1) in the forested region around the lake. The accuracy of both classes increases in the ADCM classification in phase 2.
• Gardens in urban areas are misclassified as vineyard (class 16) in the SAM classification. In the ADCM classification, vineyards are constrained to known agricultural areas in phase 2.
• Very bright bare soils are misclassified as urban outside urban areas. In the ADCM classification, urban is constrained to known urban areas in phase 2.
• The vegetation type dominated by Cistus spec. (class 13), which is abundant in areas 1 and 3, is not identified at all by the SAM classification but is confused with vegetation types that do not occur in these areas. The ADCM classifies the vegetation type well in phase 3.
• In area 2, the low herb and grass class (class 12) is underestimated by the SAM classifier. The ADCM result of phase 3 better matches the field observations.

Overall, the ADCM classification clearly shows less spectral confusion and fewer of the associated salt-and-pepper patterns than the SAM classification. Moreover, the ADCM results show a more consistent pattern of vegetation communities, as observed in the field. We anticipate that the results will improve even more by the addition of more environmental factors, the historic information and the inclusion of spatial pattern. The inclusion of spatial pattern is essential to distinguish classes 4, 5, 6 and 14.

5 CONCLUSION
In this paper we presented and evaluated the new Ancillary Data Classification Model (ADCM), which aims at improved classification of Mediterranean vegetation communities. The ADCM accounts for relations between environmental factors (lithology, geology, water availability and topography) and vegetation communities. The result of the 14-class SAM classification shows that there is significant spectral overlap between certain classes, resulting in an inconsistent pattern of vegetation communities. The ADCM classification identifies these vegetation communities much better, as shown by an increase of overall accuracy from 51% to 69%. Based on the promising results of the prototype of the ADCM, we anticipate that the addition of more environmental factors will further improve the accuracy of the classification product, yielding maps better suited for the evaluation of land cover change models.

Figure 3. A) SAM classification result with 14 classes, B) the ADCM classification results with 14 classes.

REFERENCES
Aspinall, R. 1992. An inductive modeling procedure based on Bayes’ theorem for analysis of pattern in spatial data. Journal of Geographical Information Systems 6(2): pp. 105-121.
Aspinall, R. 1993. Habitat mapping from satellite imagery and wildlife survey using a Bayesian modeling procedure in a GIS. Photogrammetric Engineering & Remote Sensing 59: pp. 537-543.
Atkinson, P. and Quattrochi, D.A. 2000. Special Issue on Geostatistics and Geospatial Techniques in Remote Sensing. Computers & Geosciences 26: pp. 359-490.
Blaschke, T., Burnett, C. and Pekkarinen, A. 2004. Image segmentation methods for object-based analysis and classification. Remote Sensing Image Analysis: including the spatial domain. S. M. De Jong and F. D. Van der Meer. Dordrecht, Kluwer Academic. pp. 211-236.
Boer, M.M. 1999. Assessment of dryland degradation. Utrecht, PhD Thesis Utrecht University. 291 pp.
Bonet, A. 2004. Secondary succession of semi-arid Mediterranean old-fields in south-eastern Spain: insights for conservation and restoration of degraded lands. Journal of Arid Environments 56(2): pp. 213-233.
Briassoulis, H. 2000. Analysis of land use change: theoretical and modeling approaches in: the web book of regional science. WWW document, http://www.rri.wvu.edu/WebBook/Briassoulis/contents.htm accessed 11/11/2003.
Burrough, P.A. and McDonnell, R.A. 1998. Principles of Geographical Information Systems, Oxford University Press. 333 pp.
Burrough, P.A., Van Gaans, P.F.M. and MacMillan, R.A. 2000. High-resolution landform classification using fuzzy k-means. Fuzzy sets and systems 113: pp. 37-52.
Carmel, Y. and Kadmon, R. 1998. Computerized classification of Mediterranean vegetation using panchromatic aerial photographs. Journal of Vegetation Science 9: pp. 445-454.
De Jong, S.M., Hornstra, T. and Maas, H. 2001. An integrated spatial and spectral approach to the classification of Mediterranean land cover types: the SSC method. Journal of Applied Geosciences 3(2): pp. 176-183.
Desachy, J., Roux, L. and Zahzah, E. 1996. Numeric and symbolic data fusion: a soft computing approach to remote sensing analysis. Pattern recognition letters 17: pp. 1361-1378.
Guisan, A. and Zimmerman, N.E. 2000. Predictive habitat distribution models in ecology. Ecological modelling 135(135): pp. 147-186.
HyVista 2003. "HyVista Corporation website". WWW document, http://www.hyvista.com/ accessed 30-10-2003.
Kruse, F.A., Lefkoff, A.B., Boardman, J.B., Heidebrecht, K.B., Shapiro, A.T., Barloon, P.J. and Goetz, A.F.H. 1993. The spectral image processing system (SIPS) - Interactive visualisation and analysis of imaging spectrometer data. Remote Sensing of Environment 44: pp. 145-163.
Lasanta, T., Pérez-Rontomé, C., García-Ruiz, J.M., Machín, J. and Navas, A. 1995. Hydrological problems resulting from farmland abandonment in semi-arid environments: the central Ebro depression. Phys. Chem. Earth 20(3-4): pp. 309-314.
Maselli, F., Rodolfi, A., Bottai, L., Romanelli, S. and Conese, C. 2000. Classification of Mediterranean vegetation by TM and ancillary data for the evaluation of fire risk. International Journal of Remote Sensing 21(17): pp. 3303-3313.
Mouillot, F., Rambal, S. and Lavorel, S. 2001. A generic process-based SImulator for meditERRanean landscApes (SIERRA): design and validation exercises. Forest Ecology and Management 147: pp. 75-97.
Nagendra, H. 2001. Using remote sensing to assess biodiversity. International Journal of Remote Sensing 22(12): pp. 2377-2400.
Plummer, S.E. 2000. Perspectives on combining ecological process models and remotely sensed data. Ecological Modelling 129: pp. 169-186.
Scarascia-Mugnozza, G., Oswald, H., Piussi, P. and Radoglou, K. 2000. Forests of the Mediterranean region: gaps in knowledge and research needs. Forest Ecology and Management 132: pp. 97-109.
Shoshany, M. 2000. Satellite remote sensing of natural Mediterranean vegetation: a review within an ecological context. Progress in Physical Geography 24(2): pp. 153-178.
Sluiter, R., De Jong, S.M., Van der Kwast, J. and Walstra, J. 2004. A Contextual Approach to Classify Mediterranean Heterogeneous Vegetation using the Spatial Re-classification Kernel and DAIS7915 Imagery. Remote Sensing Image Analysis: Including the Spatial Domain. S. M. De Jong and F. D. Van der Meer, Kluwer. pp. 291-310.
Tappeiner, U., Tasser, E. and Tappeiner, G. 1998. Modelling vegetation patterns using natural and anthropogenic influence factors: preliminary experience with a GIS based model applied to an Alpine area. Ecological Modelling 113: pp. 225-237.
Tomaselli, R. 1981. Main physiognomic types and geographic distribution of shrub systems related to Mediterranean climates. Ecosystems of the world 11 Mediterranean-type shrublands. F. Di Castri, D. W. Goodall and R. L. Specht, Elsevier. pp. 95-106.
Wesseling, C.G., Karssenberg, D.J., Burrough, P.A. and Van Deursen, W.P.A. 1996. Integrating dynamic environmental models in GIS: The development of a dynamic Modelling language. Transactions in GIS 1: pp. 40-48.
Zavala, M.A. and Burkey, T.V. 1997. Application of ecological models to landscape planning: the case of the Mediterranean basin. Landscape and Urban Planning 38: pp. 213-227.