Assessing Uncertainty in Spatial Landscape Metrics Derived from Remote Sensing Data
Daniel G. Brown1*, Elisabeth A. Addink1, Jiunn-Der Duh1, and Mark A. Bowersox2
1 School of Natural Resources and Environment, University of Michigan, Ann Arbor, MI 48109-1115
2 Town of Pittsford, Spiegel Community Center, 35 Lincoln Avenue, Pittsford, NY 14543
In Press. In Lunetta, R., Lyon, J.G., Eds. Remote Sensing and GIS Accuracy Assessment, Boca Raton, FL: CRC Press.
Manuscript #02-20
1.0 INTRODUCTION

Recent advances in the field of landscape ecology have included the development and application of quantitative approaches to characterize landscape condition and processes based on landscape patterns (Turner et al., 2001). Central to these approaches is the increasing availability of spatial data characterizing landscape constituents and patterns, which are commonly derived from remote sensor data (e.g., aerial photography or multi-spectral imagery). Spatial pattern metrics provide quantitative descriptions of the spatial composition and configuration of habitat or land-cover (LC) types. They can provide useful indicators of habitat quality, ecosystem function, and the flow of energy and materials within a landscape. Landscape metrics have been used to compare ecological quality across landscapes (Riitters et al., 1995) and across scales (Frohn, 1997), and to track changes in landscape pattern through time (Henebry and Goodin, 2002). These comparisons often lead to quantitative statements of the relative quality of landscapes with respect to some spatial pattern concept (e.g., habitat fragmentation).

Uncertainty associated with landscape metrics has several components, including (i) accuracy (how well the calculated values match the actual values), (ii) precision (how closely repeated measurements agree), and (iii) meaning (how comparisons between metric values should be interpreted). In practical terms, the accuracy, precision, and meaning of metric values are affected by several factors, including the definitions of categories on the landscape map, the accuracy of the map, and the validity and uniqueness of the metric of interest. Standard
methods for assessing LC map accuracy provide useful information, but are inadequate as indicators of spatial metric accuracy because they lack information concerning spatial patterns of uncertainty and the correspondence between the map category definitions and landscape concepts of interest. Further, direct estimation of the accuracy of landscape metric values is problematic. Unlike for LC maps, no standard procedures have been developed to support assessment of landscape metric accuracy. Also, the scale dependence of landscape metric values complicates comparisons between field observations and map-based calculations. As a transformation process, in which mapped landscape classes are transformed into landscape measurements describing the composition and configuration of that landscape, landscape metrics can be evaluated using precision and meaning diagnostics (Figure 1). The primary objective is to acquire a metric with a known and relatively high degree of accuracy and precision that is interpretable with respect to the landscape characteristic(s) of interest.

The research presented in this paper addresses the following questions: (1) How precise are estimates of various landscape metrics derived from satellite images? (2) How sensitive are landscape metrics to differences in the definitions of landscape classes? (3) How sensitive are landscape metrics to landscape pattern concepts of interest (e.g., ecotone abruptness or forest fragmentation) versus potential confounding concepts (e.g., patchiness or amount of forest)?

This chapter presents results from recent research that seeks to evaluate uncertainty in landscape metrics, as defined above. To calculate the precision of landscape metrics, repeated estimates of metric values are used to observe the variation in the estimates. Because measures of precision are based on multiple calculations rather than on comparison with reference data, they are more practical for landscape metric applications than are measures of accuracy.
We discuss two different approaches to performing multiple calculations of landscape metric values. First, redundant mapping of landscapes was
used to calculate the variation in metric values resulting from the redundant maps. Second, spatial simulation was used to evaluate the response of landscape metric values to repeated landscape mapping under a neutral model (Gustafson and Parker, 1992).

Following a general discussion of alternative types of landscape metrics, we review the results of previous research and present new results to illustrate how landscape metric values vary using redundant mapping and simulation methods. First, the precision of estimates of change in metric values between two images was investigated using redundant mapping of sample areas that were defined by the overlap of adjacent satellite scenes (Brown et al., 2000a; Brown et al., 2000b). The approach and the results of the previous work are summarized. Next, we examine the variation in metrics calculated using landscape maps that were derived from the same remote sensing source but classified with different definitions and meanings of the landscape classes. We present comparisons that illustrate the effects of alternative definitions of "forest," and of using land cover versus land use classes, in the calculation of landscape metrics. Finally, research that uses simulation to investigate the interpretability of the construct being measured by the metric is reviewed and evaluated (Bowersox and Brown, 2001). The degree of similarity between several landscape metrics and the concept of ecotone abruptness was evaluated. We also present simulations to illustrate the problem of interpreting the degree of fragmentation, as distinct from the amount of forest, from the values of individual landscape metrics.
2.0 BACKGROUND

A variety of approaches to characterizing landscape pattern are available, each with its own implications for the accuracy, precision, and meaning of a landscape pattern analysis. With the goal of quantitatively describing landscape structure, landscape metrics provide
information about both landscape composition and landscape configuration (McGarigal and Marks, 1995). The most common approach to quantifying these characteristics of landscape structure has been to map defined landscape classes (e.g., habitat types), delineate patches of each landscape class, and then describe the patches. Patches are defined as contiguous areas of homogeneous landscape condition. Landscape composition metrics describe the presence, relative abundance, and diversity of various landscape types. Landscape configuration refers to the "physical distribution or spatial character of patches within the landscape" (McGarigal and Marks, 1995). Summaries of pattern can be made at the level of the individual patch (e.g., size, shape, and relative location), averaged across individual landscape classes (e.g., average size, shape, and location), or averaged across all patches in the landscape (e.g., average size, shape, and location of all patches).

As an alternative to patch-based metrics, another set of metrics focuses on identifying transition zone boundaries that are present in continuous data. This approach has not been used as extensively as the patch approach in landscape ecology (Johnston and Bonde, 1989; Fortin and Drapeau, 1995). One approach to using boundaries is to identify "boundary elements," defined as cells that exhibit the most rapid spatial rates of change, and "sub-graphs," which are strings of connected boundary elements that share a common orientation (direction) of change (Jacquez et al., 2000). The resulting landscape metrics characterize the numbers of boundary elements and sub-graphs and the length of sub-graphs, which is defined by the number of boundary elements in a sub-graph. An important advantage is that boundary-based statistics can be calculated from images directly, skipping the classification step through which errors can propagate. Throughout this chapter, we refer to patch-based metrics, which were calculated
using Fragstats (McGarigal and Marks, 1995), and boundary-based metrics calculated using the methods described by Jacquez et al. (2000).
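To make the idea of a boundary element concrete, the following is a minimal sketch only. It is not the procedure of Jacquez et al. (2000): it assumes a central-difference estimate of the rate of change and a hypothetical `top_fraction` parameter that flags the cells with the steepest change. The function names and the toy NDVI grid are illustrative.

```python
import math

def gradient_magnitude(grid):
    """Central-difference rate of change for interior cells (edge cells stay 0)."""
    n_rows, n_cols = len(grid), len(grid[0])
    mag = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            dx = (grid[r][c + 1] - grid[r][c - 1]) / 2.0
            dy = (grid[r + 1][c] - grid[r - 1][c]) / 2.0
            mag[r][c] = math.hypot(dx, dy)
    return mag

def boundary_elements(grid, top_fraction=0.1):
    """Flag cells whose rate of change is in the top fraction (assumed rule)."""
    mag = gradient_magnitude(grid)
    ranked = sorted((v for row in mag for v in row), reverse=True)
    n_keep = max(1, round(top_fraction * len(ranked)))
    cutoff = ranked[n_keep - 1]
    # Cells with no change at all are never flagged, even if the cutoff is 0.
    return [[v >= cutoff and v > 0 for v in row] for row in mag]

# Toy NDVI surface with one abrupt step between the third and fourth columns.
ndvi = [[0.2, 0.2, 0.2, 0.8, 0.8, 0.8] for _ in range(5)]
flags = boundary_elements(ndvi, top_fraction=0.2)
n_boundary_elements = sum(sum(row) for row in flags)
```

Counting the flagged cells gives the simplest boundary-based metric, the number of boundary elements. Building sub-graphs would additionally require linking adjacent flagged cells that share a similar direction of change, which this sketch omits.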
3.0 METHODS

3.1 Precision of Landscape Change Metrics

To measure imprecision in metric values, overlapping Landsat Multi-Spectral Scanner (MSS) path/row images were redundantly processed for two different study areas in the Upper Midwest to create classifications representing forest, non-forest, water, and other, and maps of the normalized difference vegetation index (NDVI). Images on row 28 and paths 24-25 overlapped in the Northern Lower Peninsula of Michigan, and images on row 29 and paths 21-22 overlapped on the border between Northern Wisconsin and the western edge of Michigan's Upper Peninsula. The georeferenced MSS images at 60 m resolution were acquired from the North American Landscape Characterization (NALC) project during the growing seasons corresponding to three periods: 1973-1975, 1985-1986, and 1990-1991 (Lunetta et al., 1998). Subsequent LC classifications of the four images resulted in accuracies ranging from 72.5% to 91.2% (average 80.5%), based on comparison with aerial photograph interpretations.

For landscape pattern analysis, the two study areas were partitioned into 5 x 5 km cells. A total of 325 cells in the Michigan site and 250 in the Wisconsin/Michigan site were used in the analysis. The partitions were treated as separate landscapes for calculating the landscape metric values. The values of eight pattern metrics, four patch-based and four boundary-based, were calculated for each partition using each of two overlapping images at each of three time periods in both sites.
The precision of landscape metric values was calculated using the difference between metric values calculated for the same landscape partition within the same time period from the two overlapping images. For each metric, these differences were summarized across all landscape partitions using the root mean squared difference (RMSD). To standardize the measure of error for comparison between landscape metrics, the relative difference (RD) was calculated as the RMSD divided by the mean of the metric values obtained in both images of a pair.

3.2 Comparing Class Definitions

3.2.1. Landsat Classifications

To evaluate the sensitivity of landscape metrics to differences in class definitions, we calculated landscape metric values from two independent classifications of land cover in the Huron River watershed, Southeastern Michigan. Landsat Thematic Mapper (TM) imagery was used to create both data sets. The primary difference between the data sets was their class definitions. Accuracy assessments were not available for either data set. Therefore, the analysis serves only as an illustration for evaluating the importance of class definitions. For the first map, Level I LU/LC classes were mapped for the early 1990s using the National Land Cover Data (NLCD) classification for the region. We developed the second data set using TM imagery from July 24, 1988. It was classified to identify all areas of forest, defined as pixels with >40% canopy cover, versus non-forest. Spectral clusters, derived through unsupervised classification (using the ISODATA technique), were labeled through visual interpretation of the image and reclassified. Landscape metrics were computed using Fragstats applied to the forest class from both data sets across the entire watershed. Also, the two data sets were overlaid to evaluate their spatial correspondence.
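As a worked illustration of the RMSD and RD precision measures defined in Section 3.1, the sketch below computes both from paired metric values, one pair per landscape partition. The function names and the edge-density numbers are hypothetical and are not taken from the study.

```python
import math

def rmsd(values_a, values_b):
    """Root mean squared difference between paired metric values."""
    if len(values_a) != len(values_b) or not values_a:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - b) ** 2 for a, b in zip(values_a, values_b)]
    return math.sqrt(sum(squared) / len(squared))

def relative_difference(values_a, values_b):
    """RMSD divided by the mean of the metric values from both images."""
    pooled_mean = (sum(values_a) + sum(values_b)) / (len(values_a) + len(values_b))
    return rmsd(values_a, values_b) / pooled_mean

# Hypothetical edge-density values for four partitions, each mapped twice
# from two overlapping images of the same time period.
image_1 = [10.0, 20.0, 30.0, 40.0]
image_2 = [12.0, 18.0, 33.0, 37.0]

edge_rmsd = rmsd(image_1, image_2)
edge_rd = relative_difference(image_1, image_2)
```

Because RD is unitless, it allows metrics measured in different units (for example, patch counts versus edge density) to be compared on a common scale. It is undefined when the pooled mean is zero.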
3.2.2. Aerial Photography Interpretations

We also compared two classifications of aerial photography over a portion of Livingston County in Southeastern Michigan. The first data set consisted of a manual interpretation of land use (LU) and LC using color-infrared (CIR) aerial photographs (1:24,000-scale) collected in 1995 (SEMCOG, 1995). The classes were based on the Anderson system (Anderson et al., 1976), which we reclassified to high-density residential, low-density residential, other urban, and other. The second was an LC classification created through unsupervised clustering, and subsequent cluster labeling, of scanned CIR photography (1:58,000-scale) collected in 1998. The LC classes were forest, herbaceous, impervious, bare soil, wetland, and open water. The two maps were overlaid to identify the correspondence between the LC classes and the urban LU classes. The percentages of forest and impervious cover were calculated within each of the urban LU types.
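The overlay step described above amounts to a cross-tabulation of two co-registered rasters. The sketch below shows one way to compute the percentage of selected LC classes within each LU class. It assumes both maps have already been resampled to the same grid, and the class labels and toy rasters are hypothetical.

```python
from collections import Counter, defaultdict

def percent_cover_by_land_use(land_use, land_cover, cover_classes):
    """Percentage of each land-use class occupied by selected land-cover classes.

    land_use and land_cover are equally sized 2-D lists of class labels;
    cover_classes lists the land-cover labels to report.
    """
    counts = defaultdict(Counter)
    for lu_row, lc_row in zip(land_use, land_cover):
        for lu, lc in zip(lu_row, lc_row):
            counts[lu][lc] += 1
    result = {}
    for lu, lc_counts in counts.items():
        total = sum(lc_counts.values())
        # A Counter returns 0 for cover classes absent from this land use.
        result[lu] = {c: 100.0 * lc_counts[c] / total for c in cover_classes}
    return result

# Toy 2 x 4 rasters with hypothetical labels.
land_use = [["low_res", "low_res", "high_res", "high_res"],
            ["low_res", "low_res", "high_res", "other"]]
land_cover = [["forest", "forest", "impervious", "impervious"],
              ["forest", "herbaceous", "forest", "wetland"]]

table = percent_cover_by_land_use(land_use, land_cover, ["forest", "impervious"])
```

In this toy example the low-density residential cells are 75% forest, which illustrates how a land-use class can contain substantial cover that a land-cover map would assign to forest.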
3.3 Landscape Simulations

3.3.1 Ecotone Abruptness

An experiment was designed in which 25 different landscape types were defined, each representing a combination of one of five levels of abruptness and one of five levels of patchiness (Bowersox and Brown, 2001). Abruptness was controlled by altering the parameters of a mathematical function that modeled the change from high to low values along the gradient representing forest cover. Patchiness was introduced by combining the mathematical surface with a randomized surface that was smoothed to introduce varying degrees of spatial autocorrelation. Once the combined gradient was created, all cells with a value above a set
threshold were classified as forest, and all cells below it as non-forest. The threshold was set so that each simulated landscape was 50% forested and 50% non-forested. For each type of landscape, 50 different simulations were conducted. The ability of each landscape metric to detect abruptness was then tested by comparing the values of the 50 simulations among the different landscape types. The landscape metric values were compared among the abruptness and patchiness levels using analysis of variance (ANOVA). The ANOVA results were analyzed to identify the most suitable metrics for measuring abruptness, i.e., those exhibiting a high degree of variation between landscape types with variable abruptness levels but a low degree of variation between landscape types with variable patchiness.

In addition to several patch-based metrics (including area-weighted patch fractal dimension, area-weighted mean shape index, contagion, and total edge) and boundary-based metrics (including number of boundary elements, number of sub-graphs, and maximum sub-graph length), the analysis compared the ability of two new boundary-based metrics, designed specifically to measure ecotone abruptness, to distinguish different levels of abruptness. The new metrics characterize (i) the dispersion of boundary elements around an "average ecotone position," calculated as the centroid of all boundary elements, and (ii) the area under the curve of the number of boundary elements versus slope threshold level.

3.3.2 Fragmentation

The sensitivity of several potential measures of forest fragmentation to the amount of forest was also investigated through simulation. The simulation involved three steps: (1) a random map for a 100 x 100 cell landscape was generated, drawing pixel values randomly from a normal distribution with mean 0 and standard deviation 1; (2) the random map was smoothed with a 5 x 5 averaging filter to introduce spatial autocorrelation; and (3) from each smoothed
surface, 10 different classified landscape maps were created by classifying cells as forest or non-forest based on 10 different threshold levels. The threshold levels were defined so that the 10 different maps had a uniformly increasing amount of forest from about 9% to about 91% (Figure 2). Because the maps with different proportions of forest were extracted from the same simulated surface, the underlying pattern of the 10 maps was controlled and the dominant difference among maps was the amount of forest. The simulation process was repeated 10 times to produce a range of output values at each forest proportion level.
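The three simulation steps can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: the averaging window is clipped at the map edge, higher surface values are treated as forest, and the ten forest proportions are taken as k/11 for k = 1 to 10 (which runs from about 9% to about 91%). All function names are hypothetical.

```python
import random

def simulate_surface(size=100, window=5, seed=0):
    """Steps 1-2: standard-normal noise map, then a window x window mean filter."""
    rng = random.Random(seed)
    noise = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
    half = window // 2
    smooth = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            # Assumption: the window is clipped at the map edge.
            rows = range(max(0, r - half), min(size, r + half + 1))
            cols = range(max(0, c - half), min(size, c + half + 1))
            vals = [noise[i][j] for i in rows for j in cols]
            smooth[r][c] = sum(vals) / len(vals)
    return smooth

def classify_by_proportion(surface, proportion):
    """Step 3: label the highest-valued cells forest (1), the rest non-forest (0)."""
    ranked = sorted((v for row in surface for v in row), reverse=True)
    n_forest = round(proportion * len(ranked))
    if n_forest <= 0:
        return [[0 for _ in row] for row in surface]
    threshold = ranked[n_forest - 1]
    return [[1 if v >= threshold else 0 for v in row] for row in surface]

surface = simulate_surface()
proportions = [k / 11 for k in range(1, 11)]  # assumed levels, about 9% to 91%
maps = [classify_by_proportion(surface, p) for p in proportions]
forest_fractions = [sum(map(sum, m)) / 100 ** 2 for m in maps]
```

Because all ten maps are thresholds of one surface, the forest in each map is a subset of the forest in the next, so differences in metric values among the maps reflect the amount of forest rather than a change in the underlying spatial pattern. Repeating the process with different seeds would give the replicate surfaces.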
4.0 RESULTS

4.1 Precision of Landscape Metrics

Comparison among the patch-based metrics (Table 1) indicated that the number and size of patches were estimated much less precisely than the area of forest and the edge density. A likely explanation is that the number of patches and mean patch size metrics required the classification process to classify individual pixels consistently and to place them in consistent patches, both of which can be sensitive to spatially patterned classification error. This suggests that there are differences among metrics in the ∆ precision described in Figure 1.

Comparing patch-based metrics with boundary-based metrics indicated that the majority of boundary-based metrics had greater precision than the majority of the patch-based metrics (Table 1). This can best be explained by the way in which ∆ precision was affected by the procedures used to calculate the metric values. All of the patch-based metrics involved an image classification step, and two of them added a patch identification step. Both of these steps are sensitive to spatial variations in image quality and to the specific procedures used. Because the boundary-based metrics were calculated directly from the NDVI images, there was less
opportunity for propagation of the spatial pattern of error. Further, the boundary-based metrics used only local information (i.e., the NDVI values immediately adjacent to each cell) to characterize pattern, whereas the patch-based metrics used global information, i.e., spectral signatures from throughout the image for classification and patch connections that spanned the entire landscape. This use of global information introduced more opportunities for error in the calculation of landscape metrics.

Additional work evaluated the effects of various processing choices on the precision of metrics (Brown et al., 2000a). The results of this work suggest that (i) haze in the images and differences in seasonal timing are important determinants of metric variability, with less precision resulting from hazier images and from image pairs that are separated by more Julian days, irrespective of the year; (ii) summarizing landscape metrics over larger areas, i.e., using larger landscape partitions, increases the precision of the estimates, though it reduces the spatial resolution; and (iii) post-classification processing, such as sieving and filtering, does not consistently increase the precision and can actually reduce it.

The obvious cost associated with estimating precision through the empirical approach of redundant mapping is that the areas need to be mapped twice. However, the cost may be lower than that of obtaining reference data for accuracy assessment, and the approach can provide reasonable estimates of precision in a pattern analysis context, where comparison with a reference data set is much more problematic. Guindon et al. (2003) used a similar approach to dealing with the precision of LC maps.
4.2. Comparing Class Definitions

4.2.1. Comparing TM Classifications

Across all landscape metrics tested, our forest cover classification of the Huron River watershed suggested that the landscape was much less fragmented than did the NLCD forest class, i.e., that there was more forest, in fewer but larger patches, with less forested/non-forested edge and more core area (Table 2). Comparisons of forest cells indicated that forest cover occurred in several of the non-forest NLCD classes. The definitions of NLCD classes allowed for substantial amounts of forest cover in non-forest classes. For example, in the low-density residential class, “vegetation” could account for 20% to 70% of the cover (USGS, 2001). Also, the NLCD forest classes were not 100% forested. Although 65% of the forested cover in the region (by our definition) was contained within forest classes as defined by NLCD, 25% was located in agricultural areas and