LIDAR-BASED DETECTION OF SHRUBLAND AND FOREST LAND COVER TO IMPROVE IDENTIFICATION OF GOLDEN-CHEEKED WARBLER HABITAT

Jennifer L.R. Jensen and Sandra Irvin
Department of Geography
Texas State University
San Marcos, TX 78666

Adam Duarte
Department of Biology
Texas State University
San Marcos, TX 78666

Papers in Applied Geography, Volume 36: 74-82

1. INTRODUCTION

Aircraft- and satellite-based imagery are valuable resources for land cover/land use assessments and enable up-to-date, rapid identification of landscape features. In terms of wildlife habitat identification, classification of multispectral imagery provides operational datasets that wildlife and resource managers can use to characterize existing and potential habitat for specific species. While such land cover and vegetation map data are invaluable, they have limitations. For example, many terrestrial species require habitat that not only comprises a specific vegetation type or assemblage but also exhibits specific vertical structural characteristics.

The golden-cheeked warbler (Setophaga chrysoparia; hereafter, warbler), which is on the federal list of endangered species, is one such species: it requires not only a specific type of vegetation on the landscape but also a specific vegetation physiognomy. In particular, the warbler requires mature oak-juniper woodlands for breeding habitat, where mature is generally defined as 12.7 cm in diameter at breast height (DBH) (Pulich 1976; Ladd and Gass 1999). While DBH may be difficult to determine from remote sensing datasets, vegetation height is not. Thus, the ability to either a) integrate height data at the time of multispectral image classification, or b) constrain post-classification thematic map output based on vegetation height should result in improved identification of warbler habitat.

In this study, we use a lidar-derived height dataset to constrain National Land Cover Database (NLCD) pixels of interest. In the case of the warbler, the NLCD contains two categories of interest that are often misclassified with respect to one another: Forest and Shrubland, where the primary difference is based on height. Therefore, we address two questions in this paper: 1) do lidar height-based metrics exhibit significant differences between NLCD classes of interest, and 2) can lidar be used post-classification to identify misclassified pixels in the NLCD and/or potential warbler habitat classes?

2. RELEVANT BACKGROUND AND LITERATURE

The National Land Cover Database (Fry et al., 2011) is derived from Landsat multispectral datasets and provides categorical land cover classification. The classification system, originally based on Anderson et al. (1976), was more recently updated to provide a uniform classification of remotely sensed data across multiple agencies and entities. The NLCD contains two coarse categories of interest (Forest and Shrubland) for determining warbler habitat. At the next level of detail, “Forest” is subdivided into three more distinct categories: Deciduous Forest, Evergreen Forest, and Mixed Forest. “Shrubland” is also subdivided into two classes; however, one is specific to Alaska and therefore not applicable to our study.

Distinguishing between the two broad categories of Forest and Shrubland can prove problematic using two-dimensional passive optical remote sensing imagery, since the primary difference in class definitions is that Forest is defined as woody vegetation greater than 5 m tall and Shrubland consists of woody vegetation less than 5 m tall. Both classes use the same 20 percent threshold for total canopy cover. Thus, viewing only the top of the canopy through the lens of a spectral dataset can lead to inconclusive results between the two land cover classes. To differentiate between the classes, a measure of height is required, which lidar-derived height data can provide.

At present, a number of studies have used lidar data, alone and in conjunction with multispectral datasets, to identify suitable wildlife habitat. For example, Bradbury et al. (2005) used lidar data to characterize variability in ground cover and crop height to predict Sky Lark (Alauda arvensis) distribution. Nelson et al. (2005) used profiling lidar to quantify upper canopy height and crown closure to identify potential Delmarva fox squirrel (Sciurus niger cinereus) habitat. Clawges et al. (2008) used airborne lidar data to characterize canopy structure and assess avian species occurrence, diversity, and density in mixed forests. Garcia-Feced et al. (2011) used lidar data to calculate habitat variables surrounding California Spotted Owl (Strix occidentalis occidentalis) nesting habitat. More recently, Farrell et al. (2013) used lidar data to quantify canopy height and cover to predict both warbler and black-capped vireo (Vireo atricapilla) habitat patch occupancy on the Fort Hood Military Reservation.
Although this is not an exhaustive review of the habitat modeling literature, the utility of lidar data as a tool for habitat mapping is recognized and increasingly put into practice (Vierling et al., 2008). Unfortunately, lidar data are not available at the continental scale (as the NLCD classification is), nor are they always available during the same season or year that multispectral datasets are acquired. Nonetheless, the data may be used a posteriori to improve identification of suitable or potential habitat.

3. METHODS

3.1 STUDY AREA

The warbler breeds exclusively in the mature oak-juniper woodlands of the Texas Hill Country and Balcones Escarpment. To manage the species’ recovery process, the U.S. Fish & Wildlife Service (USFWS) geographically divided the warbler’s breeding habitat into eight regions based on geology, vegetation, and watershed boundaries (Figure 1).

3.2 GEOSPATIAL DATA ACQUISITION AND PROCESSING

Lidar data were acquired during leaf-on conditions in 2006 with an Optech 2050 airborne system. The data, available by request from the Capital Area Council of Governments (CAPCOG; http://www.capcog.org), were commissioned to provide a relatively high-density dataset of lidar returns suitable for hydraulic/hydrological model development and environmental impact analysis. The vendor delivered multiple-return lidar data in LAS format with a simple ground/non-ground classification. The vendor also separated the lidar data into a system of tiles (i.e., a tile index), each approximately 1.5 km by 1.7 km, for more efficient display, storage, and distribution. All lidar-related datasets were delivered in the State Plane 4023 (Survey Feet), NAD 83 coordinate system (NAVD88, US Foot for the vertical coordinate system). Prior to lidar processing and analysis, all relevant lidar datasets were reprojected to UTM Zone 14N, NAD 83, and the vertical units were converted to meters.
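The vertical unit conversion mentioned above can be illustrated with a minimal sketch. It assumes the delivered elevations are in US survey feet (defined as exactly 1200/3937 m); the function name and the sample elevations are hypothetical, and the horizontal reprojection to UTM, which requires a projection library, is not shown.

```python
# Sketch only: convert vertical lidar elevations from US survey feet to meters.
# Assumption: the vendor's "US Foot" is the US survey foot (1200/3937 m exactly).
US_SURVEY_FOOT_IN_METERS = 1200.0 / 3937.0


def survey_feet_to_meters(value_ft):
    """Return an elevation in meters for an elevation given in US survey feet."""
    return value_ft * US_SURVEY_FOOT_IN_METERS


# Hypothetical return elevations (survey feet), not values from the study.
elevations_ft = [985.2, 1012.7, 1030.4]
elevations_m = [survey_feet_to_meters(z) for z in elevations_ft]
```

At these elevations the difference between the survey foot and the international foot (0.3048 m) is well under a centimeter, so the choice has little effect on vegetation heights, which are differences between two elevations.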
The 2006 National Land Cover Database was downloaded from the USGS EarthExplorer website (http://earthexplorer.usgs.gov). The thematic map data were subset to the warbler’s range using the warbler recovery unit boundaries provided by the USFWS (1992). The dataset was reprojected from its native Albers Equal Area (WGS84) projection to UTM Zone 14N, NAD 83. Since the warbler’s range includes land cover classes unrelated to its required habitat (e.g., agriculture, urban, and grassland), zonal statistics were used to calculate NLCD class proportions within each lidar tile.
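The class-proportion calculation can be sketched as follows. This is an illustrative plain-Python stand-in for the GIS zonal statistics step, not the workflow used in the study; the function name and the sample pixel list are hypothetical, while the class codes (41, 42, 52) are the NLCD codes reported in Table 1.

```python
from collections import Counter

# NLCD class codes of interest, as listed in Table 1.
CLASSES_OF_INTEREST = {
    41: "Deciduous Forest",
    42: "Evergreen Forest",
    52: "Shrub/Scrub",
}


def class_percentages(pixel_codes):
    """Return the percent of pixels in each class of interest, plus 'Other'."""
    total = len(pixel_codes)
    if total == 0:
        raise ValueError("tile contains no pixels")
    counts = Counter(pixel_codes)
    result = {}
    counted = 0
    for code, name in CLASSES_OF_INTEREST.items():
        n = counts.get(code, 0)
        counted += n
        result[name] = 100.0 * n / total
    # Every remaining class (e.g., grassland, developed) is pooled as "Other".
    result["Other"] = 100.0 * (total - counted) / total
    return result


# Hypothetical tile of ten pixels (71, 81, and 21 are non-habitat NLCD codes).
tile_pixels = [41, 42, 42, 52, 52, 52, 52, 71, 81, 21]
percentages = class_percentages(tile_pixels)
```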
FIGURE 1 BREEDING RANGE FOR THE GOLDEN-CHEEKED WARBLER IN TEXAS, USA. ON THE RIGHT, GOLDEN-CHEEKED WARBLER RECOVERY UNIT BOUNDARIES AND LOCATIONS OF NLCD TILES USED IN ANALYSIS

Since each lidar tile contains approximately 3,310 30 m NLCD pixels, we selected seven lidar tiles within the immediate region (i.e., recovery units 3, 4, and 5) to include in our analysis. Selected tiles contained a relatively high proportion of either Forest or Shrubland, or both classes (Table 1). This selection approach was used to ensure that we compared as many pixels of interest as possible while minimizing processing time associated with the lidar datasets.

TABLE 1 PERCENT NLCD-CLASSIFIED LAND COVER CLASS WITHIN EACH COINCIDENT LIDAR TILE. CLASS NUMBERS CORRESPOND TO NLCD CLASS CODES

NLCD Class (code)      Crabapple  Driftwood  Enchanted Rock  Florence  Leander  San Marcos  Shingle Hills
Deciduous Forest (41)  4.3%       9.7%       6.1%            4.7%      12.6%    8.5%        3.2%
Evergreen Forest (42)  26.1%      55.7%      12.7%           10.5%     45.2%    80.3%       38.6%
Shrub/Scrub (52)       59.5%      4.6%       64.0%           33.6%     13.4%    5.7%        43.1%
Other                  10.2%      30.0%      17.2%           51.3%     28.8%    5.5%        15.2%
For each lidar tile, the lidar returns were filtered into ground and non-ground datasets and exported as shapefiles for use in ArcMap using the LP360 extension (QCoherent Software LLC, Madison, AL) for ArcGIS (ESRI, Redlands, CA). The ground points were used to generate a 30 m digital terrain model (DTM) based on the mean elevation value of ground returns over a 30 m by 30 m area corresponding to the coincident NLCD pixels. The non-ground returns were used with the 3D Analyst function “Add Surface Information” to extract the corresponding DTM elevation and write the value to the non-ground return shapefile attribute table. A height field was added to the attribute table and populated by subtracting the DTM elevation from the non-ground point elevation. The individual heights were summarized within NLCD pixel “zones” (described below) to determine the relationship of selected lidar height metrics to NLCD classes.

Seven regular “grids” of 30 m square polygon features (i.e., fishnets) were created using the individual lidar tile index and snapped to the corresponding NLCD tiles. Creating fishnets for each of the clipped NLCD datasets allowed zonal statistics to be calculated for each of the individual polygons (i.e., each 30 m x 30 m polygon was geographically coincident with a 30 m NLCD pixel). Polygons were treated as “zones,” and since each polygon has a unique feature ID (FID), zonal statistics based on FID were calculated for the lidar-based height data as well as the NLCD class. Zonal statistic extraction of the NLCD data included the land cover class value, while zonal statistics for the lidar data involved calculation of the mean and standard deviation of all individual heights located within the 30 m x 30 m zone. The zonal statistics output tables were joined based on FID and exported for quantitative analysis. Figure 2 provides a side-by-side display of land cover class and mean lidar height calculated for each coincident NLCD pixel.
FIGURE 2 SHINGLE HILLS ANALYSIS TILE WITH NLCD LAND COVER CLASSES (LEFT) AND COINCIDENT 30 M X 30 M MEAN LIDAR CALCULATED HEIGHTS (RIGHT)
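The height calculation and per-pixel summaries described in Section 3.2 can be sketched in a few lines. This is a simplified plain-Python illustration for a single 30 m pixel, not the ArcGIS/LP360 workflow used in the study. The function name and sample elevations are hypothetical, and the use of the population standard deviation in the coefficient of variation (the third metric used in the analysis that follows) is an assumption, since the source does not say which form was used.

```python
from statistics import mean, pstdev


def pixel_height_metrics(ground_z, nonground_z):
    """Summarize lidar heights for one 30 m x 30 m pixel.

    ground_z:    elevations (m) of ground-classified returns in the pixel
    nonground_z: elevations (m) of non-ground returns in the pixel
    """
    # DTM value for the pixel: mean elevation of the ground returns.
    dtm_elevation = mean(ground_z)
    # Height of each non-ground return above the pixel's mean terrain.
    heights = [z - dtm_elevation for z in nonground_z]
    mean_height = mean(heights)
    max_height = max(heights)
    # Coefficient of variation (percent); undefined when mean height is not positive.
    # Assumption: population standard deviation.
    cv = 100.0 * pstdev(heights) / mean_height if mean_height > 0 else float("nan")
    return {"dtm": dtm_elevation, "mean": mean_height, "max": max_height, "cv": cv}


# Hypothetical returns for one pixel (elevations in meters).
metrics = pixel_height_metrics(
    ground_z=[299.5, 300.5],
    nonground_z=[302.0, 304.0, 306.0],
)
```

Because every height in the pixel is measured from one mean ground elevation, returns over sloping terrain are over- or underestimated, a limitation the paper discusses in Section 5.2.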
3.3 LIDAR-NLCD CLASS COMPARISONS AND HEIGHT THRESHOLDING ANALYSES

Zonal statistics calculated for lidar height data consisted of the following three metrics: maximum height, mean height, and coefficient of variation (CV). The CV was calculated to determine how variable heights are within each 30 m pixel, where small values indicate homogeneity among heights and high values indicate considerable heterogeneity. Lidar metrics were paired with their coincident NLCD pixel and analyzed in JMP 10.0 (SAS Institute Inc., Cary, NC). Since the lidar metrics were not normally distributed, a nonparametric ANOVA (Median Test) was performed to test for significant differences among median values of lidar metrics between NLCD classes.

To determine the percentage of NLCD pixels that were potentially misclassified (i.e., a Shrubland pixel with lidar-calculated height greater than 5 m or a Forest pixel with lidar-calculated height less than 5 m), we performed a simple map algebra calculation to threshold mean height values for each class. The number of class-specific pixels that did not meet the height criterion for each class was tallied and reported as a percentage of the total pixels for each class.

4. RESULTS

4.1 LIDAR-NLCD CLASS COMPARISONS

Simple univariate statistics for each of the classes are reported in Table 2. Median values of mean lidar height differ by 2.30 m and 3.05 m between Deciduous and Evergreen Forest pixels, respectively, and Shrub/Scrub pixels, and by 2.93 m between combined Forest pixels and Shrub/Scrub pixels. The relatively lower CV values for all Forest classes suggest greater homogeneity among lidar heights within each NLCD pixel compared to the Shrub/Scrub class.

TABLE 2 MEDIAN VALUES OF LIDAR HEIGHT METRICS FOR NLCD LAND COVER CLASSES.
VALUES WERE EXTRACTED USING DISCRETE LIDAR HEIGHTS SUMMARIZED WITHIN INDIVIDUAL NLCD PIXELS

Median Lidar Metric Values
NLCD Class                   Mean Height (m)  Max Height (m)  CV (%)
Deciduous Forest, n = 1,462  3.17             8.27            65.58
Evergreen Forest, n = 8,097  3.92             8.36            50.67
Combined Forest, n = 9,559   3.80             8.35            52.94
Shrub/Scrub, n = 6,655       0.87             3.12            92.27
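For readers unfamiliar with the Median Test named in Section 3.3, the sketch below shows a textbook form of Mood's median test: each value is classified as above or not above the pooled median, and a chi-square statistic is computed from the resulting contingency table. It is an assumption that this matches the general idea of the JMP procedure; JMP's exact scoring and tie handling may differ, so this code is not expected to reproduce the statistics in Table 3. The function name and sample heights are hypothetical.

```python
from statistics import median


def median_test_chi_square(groups):
    """Return (chi-square statistic, degrees of freedom) for Mood's median test.

    groups: list of lists, one list of metric values per land cover class.
    """
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least one value")
    pooled = [v for g in groups for v in g]
    grand_median = median(pooled)
    n = len(pooled)
    # Count values strictly above the pooled median in each group.
    above = [sum(1 for v in g if v > grand_median) for g in groups]
    total_above = sum(above)
    total_not_above = n - total_above
    if total_above == 0 or total_not_above == 0:
        raise ValueError("all values fall on one side of the pooled median")
    chi_square = 0.0
    for g, a in zip(groups, above):
        size = len(g)
        b = size - a  # values at or below the pooled median
        expected_a = size * total_above / n
        expected_b = size * total_not_above / n
        chi_square += (a - expected_a) ** 2 / expected_a
        chi_square += (b - expected_b) ** 2 / expected_b
    return chi_square, len(groups) - 1


# Hypothetical mean heights (m) for a few Forest and Shrubland pixels.
forest_heights = [3.0, 4.0, 5.0, 6.0]
shrub_heights = [0.5, 1.0, 1.5, 2.0]
chi_square, df = median_test_chi_square([forest_heights, shrub_heights])
```

With one degree of freedom, a statistic above 3.84 corresponds to significance at the 5 percent level.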
As anticipated, nonparametric ANOVA results indicate significant differences in lidar metrics between NLCD classes. Quantile plots for mean and maximum lidar height metrics and combined Forest and Shrubland classes are provided in Figure 3, and statistical test results are reported in Table 3. Chi-square approximations for each one-way test resulted in less than 0.01 percent probability that the deviations of the observed from the expected values between individual or combined Forest classes and Shrubland are due to chance. In short, the overall classification accuracy for the classes of interest is quite good when differences based solely on height metrics are considered.

4.2 IDENTIFICATION OF MISCLASSIFIED NLCD PIXELS AND APPLICATION OF HEIGHT THRESHOLDS

The data were analyzed using two separate thresholds for mean lidar height per pixel to identify potentially misclassified pixels. First, the NLCD defined height threshold for Forest (73.68 percent) of Evergreen Forest and 1,190 of 1,462 (81.40 percent) of Deciduous Forest pixels potentially misclassified as Forest since the mean lidar height calculated for returns
misclassification percentage with only 146 of 6,655 pixels at or above the 5 m height threshold. In terms of warbler-specific habitat requirements, only 24.28 percent of Deciduous Forest and threshold specific to their breeding habitat.
FIGURE 3 QUANTILE PLOTS OF MEAN AND MAXIMUM LIDAR HEIGHT ASSOCIATED WITH INDIVIDUAL NLCD PIXELS OF FOREST AND SHRUBLAND

TABLE 3 NONPARAMETRIC ONEWAY ANALYSIS RESULTS OF LIDAR METRICS AND NLCD LAND COVER CLASSES

Chi-square Critical Values and Probabilities
Analysis Group                                    Mean Lidar Height  Max Lidar Height  CV
Individual Forest Classes and Shrubland (df=2)    4453.56 (0.01%)    2679.04 (0.01%)   3795.59 (0.01%)
Combined Forest Classes and Shrubland (df=1)      4438.13 (0.01%)    2664.10 (0.01%)   3655.05 (0.01%)
A more favorable representation of Forest classification accuracy was obtained by querying pixels whose maximum lidar heights fell above or below the specific thresholds. For example, only about 20 percent of the combined Forest pixels (Evergreen and Deciduous) had maximum heights below either the 5 m NLCD threshold or the 4.6 m warbler habitat threshold. Conversely, the percentage of potentially misclassified Shrubland pixels increased considerably when maximum lidar height was used: 37.60 percent of Shrubland pixels had maximum lidar heights at or above 5 m and 38.98 percent at or above 4.6 m, compared to 2.19 and 3.35 percent of pixels at or above the respective thresholds when mean lidar height was evaluated (Table 4).

A separate analysis of individual tiles indicates a geographic pattern of potential NLCD pixel misclassification (Table 5). For example, the Florence and Leander tiles are adjacent to one another, as are the Crabapple and Enchanted Rock tiles. Both pairs exhibit similar percentages of Shrubland and Forest pixels that do not meet the NLCD class definition for vegetation height. Evaluation of the Enchanted Rock and Crabapple tiles suggests at or close to 100 percent Forest pixel misclassification, as did the analysis of the
Shingle Hills tile. With the exception of the Driftwood tile, all other areas exhibited very low potential misclassification of Shrubland.

TABLE 4 PERCENT OF PIXELS WITHIN EACH NLCD CLASS THAT ARE POTENTIALLY MISCLASSIFIED BASED ON LIDAR-CALCULATED HEIGHT AND NLCD DEFINITIONS OR HABITAT SPECIFIC THRESHOLDS

Number (percent) of Potentially Misclassified Pixels
                  Mean Lidar Height                   Max Lidar Height
NLCD Class        5 m threshold    4.6 m threshold    5 m threshold    4.6 m threshold
Deciduous Forest  1190 (81.40%)    1107 (75.72%)      323 (22.09%)     310 (21.20%)
Evergreen Forest  5969 (73.72%)    5292 (65.36%)      1647 (20.34%)    1602 (19.79%)
Combined Forest   7159 (74.89%)    6399 (66.94%)      1970 (20.61%)    1912 (20.00%)
Shrubland         146 (2.19%)      223 (3.35%)        2502 (37.60%)    2594 (38.98%)
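The thresholding described in Section 3.3 amounts to a per-class tally, which the following sketch illustrates. It assumes, following the text, that a Forest pixel is flagged when its height metric is below the threshold and a Shrubland pixel is flagged when its metric is at or above it. The function name, class labels, and sample heights are hypothetical; the final line only rechecks the combined Forest percentage reported in Table 4.

```python
def tally_misclassified(heights_m, nlcd_class, threshold_m=5.0):
    """Return (count, percent) of pixels that fail the class height criterion.

    heights_m:   one height metric (mean or maximum, in meters) per pixel
    nlcd_class:  "forest" or "shrubland" (labels are illustrative)
    threshold_m: 5.0 for the NLCD definition, 4.6 for the warbler threshold
    """
    if not heights_m:
        raise ValueError("no pixels supplied")
    if nlcd_class == "forest":
        flagged = sum(1 for h in heights_m if h < threshold_m)
    elif nlcd_class == "shrubland":
        flagged = sum(1 for h in heights_m if h >= threshold_m)
    else:
        raise ValueError("nlcd_class must be 'forest' or 'shrubland'")
    return flagged, 100.0 * flagged / len(heights_m)


# Hypothetical per-pixel mean heights (m), not values from the study.
forest_result = tally_misclassified([3.8, 5.2, 4.9, 6.1], "forest")
shrub_result = tally_misclassified([0.9, 5.0, 1.2, 2.4], "shrubland")

# Arithmetic check against Table 4: 7159 of 9559 combined Forest pixels.
combined_forest_percent = round(100.0 * 7159 / 9559, 2)
```

Passing `threshold_m=4.6` applies the warbler-specific habitat threshold in place of the NLCD class definition.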
TABLE 5 PERCENT OF NLCD COMBINED FOREST AND SHRUBLAND PIXELS THAT ARE POTENTIALLY MISCLASSIFIED BASED ON SPECIFIC TILES

Tile Name       Shrub ≥ 5 m (total shrub pixels)  Percent Shrub misclassified  Forest < 5 m (total forest pixels)  Percent Forest misclassified
Crabapple       65 (1759)                         3.7%                         855 (898)                           95.2%
Driftwood       29 (140)                          20.7%                        1248 (2006)                         62.2%
Enchanted Rock  0 (1887)                          0.0%                         555 (555)                           100.0%
Florence        24 (1000)                         2.4%                         341 (443)                           76.9%
Leander         13 (402)                          3.2%                         1337 (1716)                         77.9%
San Marcos      15 (166)                          9.0%                         1574 (2692)                         58.5%
Shingle Hills   0 (1301)                          0.0%                         1249 (1249)                         100.0%
5. DISCUSSION

5.1 APPLICABILITY OF LIDAR-BASED HEIGHT THRESHOLDS TO IDENTIFY POTENTIAL MISCLASSIFIED PIXELS

The analysis to determine whether significant differences existed between individual NLCD classes relevant to warbler habitat proved informative. If lidar data are processed and integrated into a multispectral classification, the data may be used to improve classification accuracy, particularly when the primary differences between classes are not observable spectrally but lie in the vertical characteristics of the dataset. For this analysis, it is encouraging that significant differences in mean and median values exist between the Forest and Shrubland classes, in that they allow a certain measure of confidence in the NLCD classification itself. However, overall classification accuracy may be skewed by very good classification of the Shrubland class, which offsets the poor agreement of the Forest classes with NLCD-defined height thresholds.

Of greater interest is the selection of lidar metrics used to threshold heights for Forest and Shrubland classes. For instance, if mean lidar height is used, then only 25.11 percent of the pixels classified as Forest meet the height criterion defined for that class in the NLCD. While mean height is a common and useful lidar metric for predicting field-observed canopy heights, it may not be an ideal metric over forest stands that have lower percent canopy cover (i.e., 20-50 percent) because many of the returns may come from the ground or short-stature vegetation. A possible remedy would be to eliminate near-ground returns that may come from grasses, forbs, etc. On the other hand, maximum lidar height presents challenges as well because it represents only the highest lidar return over a 900 sq. m area. It is possible that a single tree meets the Forest height threshold, yet the surrounding or dominant vegetation canopy does not. However, lidar systems rarely sample the highest point of a feature within a given landscape (especially forested landscapes), and field-measured height and lidar-modeled height frequently differ by anywhere from 25 cm to several meters, depending on vegetation canopy type and structure (Lim et al. 2003).

5.2 SOURCES OF UNCERTAINTY AND IMPROVEMENTS TO STUDY

A specific source of uncertainty in our study is the averaging of ground returns in the lidar dataset to create the DTM. Since all vegetation heights depend on the accuracy of the lidar return classification (i.e., ground and non-ground) and the subsequent terrain model, discrepancies in height may exist due to the representation of ground elevation alone.
This problem would most often present itself over steep-sloped areas, which are common throughout the Texas Hill Country and Balcones Escarpment, as the average ground elevation would be used to model heights even though the actual terrain may lie above or below that mean within each pixel. A potential solution would be to model the terrain at a higher resolution (e.g., 5 m) and calculate heights with the finer-resolution DTM.

Another source of uncertainty is associated with the NLCD pixels. Although the expected overall classification accuracy of NLCD products is reported to be 85 percent, this figure does not apply to individual classes, nor are there specific accuracy reports for individual areas or ecoregions. Ideally, a fieldwork campaign to validate the NLCD classification should be conducted to quantify the accuracy of the thematic map so that any uncertainty contributed by the map classification itself could be accounted for.

6. CONCLUSION

This study uses the federally endangered golden-cheeked warbler as a case study to provide a framework by which NLCD classes relevant to wildlife species dependent on specific vegetation structure can be refined. While the results indicate significant differences exist between a few lidar-based height metrics and specific NLCD classes, the thresholds applied to the mean and maximum lidar metrics were particularly revealing with respect to potentially misclassified pixels. In terms of warbler-specific habitat, Shrubland pixels that meet the height threshold for Forest could be included as potential habitat (i.e., reclassified as Forest). However, exclusion of Forest pixels that did not meet the height threshold is not advised until more work is performed to determine optimal lidar metrics that accurately represent forest canopy characteristics at 30 m spatial resolution.
In summary, this study provides a practical application of remote sensing data and analysis that can be used to evaluate species habitat availability, monitoring, and conservation efforts.

7. REFERENCES

Anderson, J. R., E. E. Hardy, J. T. Roach, and R. E. Witmer. 1976. A land use and land cover classification for use with remote sensor data. U.S. Geological Survey Professional Paper 964. U.S. Government Printing Office, Washington.

Bradbury, R. B., R. A. Hill, D. C. Mason, S. A. Hinsley, J. D. Wilson, H. Balzter, G. Q. A. Anderson, M. J. Whittingham, I. J. Davenport, and P. E. Bellamy. 2005. Modelling relationships between birds and vegetation structure using airborne LiDAR data: a review with case studies from agricultural and woodland environments. Ibis 147: 443-452.

Clawges, R., K. Vierling, L. Vierling, and E. Rowell. 2008. The use of airborne lidar to assess avian species diversity, density, and occurrence in a pine/aspen forest. Remote Sensing of Environment (Earth Observations for Terrestrial Biodiversity and Ecosystems Special Issue) 112: 2064-2073.

Farrell, S. L., B. A. Collier, K. L. Skow, A. M. Long, A. J. Campomizzi, M. L. Morrison, K. B. Hays, and R. N. Wilkins. 2013. Using LiDAR-derived vegetation metrics for high-resolution, species distribution models for conservation planning. Ecosphere 4(3): 42.

Fry, J., G. Xian, S. Jin, J. Dewitz, C. Homer, L. Yang, C. Barnes, N. Herold, and J. Wickham. 2011. Completion of the 2006 National Land Cover Database for the Conterminous United States. Photogrammetric Engineering and Remote Sensing 77(9): 858-864.

Garcia-Feced, C., D. J. Tempel, and M. Kelly. 2011. LiDAR as a Tool to Characterize Wildlife Habitat: California Spotted Owl Nesting Habitat as an Example. Journal of Forestry 109: 436-443.

Ladd, C., and L. Gass. 1999. Golden-cheeked warbler (Dendroica chrysoparia). Account 420 in A. Poole and F. Gill, editors. The Birds of North America. The Academy of Natural Sciences, Philadelphia, Pennsylvania, and The American Ornithologists' Union, Washington D.C., USA.

Lim, K., P. Treitz, M. Wulder, B. St-Onge, and M. Flood. 2003. LiDAR remote sensing of forest structure. Progress in Physical Geography 27: 88-106.

Nelson, R., C. Keller, and M. Ratnaswamy. 2005. Locating and estimating the extent of Delmarva fox squirrel habitat using an airborne LiDAR profiler. Remote Sensing of Environment 96: 292-301.

Pulich, W. M. 1976. The Golden-Cheeked Warbler: A Bioecological Study. Texas Parks and Wildlife, Austin, Texas, USA.

U.S. Fish and Wildlife Service [USFWS]. 1992. Golden-cheeked warbler (Dendroica chrysoparia) recovery plan. U.S. Fish and Wildlife Service, Albuquerque, New Mexico, USA.

Vierling, K. T., L. A. Vierling, W. A. Gould, S. Martinuzzi, and R. M. Clawges. 2008. Lidar: shedding new light on habitat characterization and modeling. Frontiers in Ecology and the Environment 6: 90-98.