Biodiversity and Conservation 11: 2085–2092, 2002. © 2002 Kluwer Academic Publishers. Printed in the Netherlands.

Regression models for spatial prediction: their role for biodiversity and conservation

A. LEHMANN1,*, J.McC. OVERTON2 and M.P. AUSTIN3

1 Swiss Centre for Faunal Cartography, Terreaux 14, CH-2000 Neuchâtel, Switzerland; 2 Landcare Research, Private Bag 3127, Hamilton, New Zealand; 3 CSIRO Sustainable Ecosystems, GPO Box 284, Canberra City ACT 2601, Australia; *Author for correspondence (e-mail: [email protected]; fax: +41-32-717-7969)

Received 14 June 2002; accepted in revised form 9 July 2002

Abstract. This paper is an introduction to a Special Issue on ‘regression models for spatial predictions’ published in Biodiversity and Conservation following an international workshop held in Switzerland in 2001 (http://leba.unige.ch/workshop). This introduction describes how the exponential growth in computing power has improved our ability to produce spatially explicit assessments of biodiversity and to develop cost-effective conservation management. New questions arising from these modern approaches are listed, and papers presenting examples of applications are briefly introduced.

Key words: Community composition, Conservation, Generalized additive models, Geographic information systems, Generalized linear models, Hotspots, Reserve network, Spatial predictions, Species distribution, Species richness.

Abbreviations: GAM – generalized additive models, GLM – generalized linear models, GIS – geographic information systems.

Introduction

Ecologists are increasingly being asked to produce detailed spatial estimates of natural resources, particularly biodiversity attributes (e.g. Prendergast et al. 1993; Dobson et al. 1997; Scott and Jennings 1997; Reid 1998; van Jaarsveld et al. 1998; Austin 1999; Margules and Pressey 2000; Ferrier 2002). The exponential growth in computing power has allowed both conceptual and operative advances in biodiversity assessment and the possibility of cost-effective conservation management. The advent of suitable software for geographic information systems (GIS), remote sensing, statistical modelling and database management has changed the way ecologists think about data on species distribution (e.g. Franklin 1995; Bio et al. 1998; Leathwick 1998; Guisan and Zimmermann 2000; Overton et al. 2000). Ecologists are redefining the questions they ask using their new analytical tools. These questions are becoming much more directed towards providing answers to problems of conservation and ecosystem sustainability.

Examples of recent questions asked in the literature are:
• What is the potential distribution of species? (e.g. Augustin et al. 1996; Lehmann 1998; Heegaard and Hangelbroek 1999; Elith 2000; Frescino et al. 2000; Osborne et al. 2001; Teixeira et al. 2001)
• How high is the biological diversity of this pond, of ponds in this region, of ponds in this country? (Oertli et al. 2002)
• Can we use biodiversity surrogates in regional conservation planning? (Ferrier and Watson 1997)
• How can we import expert opinion into the modelling of faunal distribution? (Pearce et al. 2001)
• Is the observed species diversity of this region higher or lower than its potential? (e.g. Wolgemuth 1998)
• How can we improve forest inventory? (Moisen and Edwards 1999)
• What is the ecological condition of this ecosystem and is it sustainable?
• What was the composition of the primary forest that previously grew in this pasture? (e.g. Leathwick 2001)
• What will be the impact of climate change on vegetation patterns? (e.g. Guisan and Theurillat 2000)
• How damaging are these weeds and pests for agricultural production in our country?
• What is the most suitable area for a new nature reserve? (e.g. Margules and Pressey 2000)
• Are these remnant patches of forest sufficiently connected to ensure the survival of endangered species?

Overton et al. (this volume) examine how best to provide answers to questions such as these, which span a wide range of detail and spatial extent. Policy-makers need answers to these questions, and estimates of other biodiversity attributes, that are transparent, reliable, spatially explicit and inexpensive. Numerous methods of ecological modelling are currently being used to address these requirements. The proceedings of a recent conference, ‘Predicting Species Occurrences: Issues of Accuracy and Scale’, provide an extensive overview of modelling methods and issues (Scott et al. 2002).

The combination of regression modelling and GIS is a powerful tool for biodiversity and conservation studies (e.g. Lehmann et al. 2002), and is used in all of the papers of this special issue. A model used for biodiversity assessment should have a number of characteristics:

(1) The model should not only be precise, but should also be ecologically sensible, meaningful and interpretable. To achieve this, it is important to develop modelling procedures in agreement with current theory in ecology (Austin and Gaywood 1994). The nature of environmental predictors (direct, resource or indirect gradients), the shape of response curves and the effect of species interactions are all important considerations in developing environmental models.

(2) The model should also be general, which means applicable in other regions or at different times (e.g. Leathwick et al. 1996). This is a crucial issue with regression models because their results can only be generalized to predict within the environmental conditions found in the data used to calibrate the model. In other words, it is relatively safe to interpolate within the environmental space considered, but generally dangerous to extrapolate outside of known boundaries.

(3) The model should be fully data-defined, so that all predictions are fully specified from empirical data, with no parameters or modules that have been guessed or defined as having convenient values.

(4) Finally, a model used for biodiversity and conservation should be expressed in a spatial framework (e.g. Austin and Meyers 1996). In order to produce spatially explicit outputs from point observations using regression models, the environmental predictors selected in a model must correspond to GIS maps, defined at a suitable grid resolution. By replacing the predictors in a model with their corresponding maps, we can derive not only spatial predictions of the response we are interested in, but also predictions of the errors we make.

Two particular methods of regression analysis are subject to intense use and evaluation. Generalized linear models (GLM; McCullagh and Nelder 1997) are an extension of classical regression that allow modelling of responses (dependent variables) such as species richness, abundance classes and presence–absence data. Unlike measures such as biomass or body size, these types of data are usually not normally distributed and should not be modelled in the classical Gaussian framework. With GLM, predictors (independent variables) are introduced in a model essentially in a linear or curvilinear form, as in classical regression.
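As a rough illustration of characteristics (2) and (4) and of a GLM for presence–absence data, the sketch below fits a binomial GLM with a logit link to a single predictor and then applies it to a small predictor map. Everything in it is hypothetical: the data values, the variable names (`temperature`, `presence`, `temperature_map`) and the helper functions are invented for this example and are not taken from any paper in this issue. The fit uses a hand-written Newton-Raphson loop only to keep the example self-contained; in practice a statistical package would be used.

```python
import math

def fit_logistic_glm(x, y, n_iter=25):
    """Fit presence-absence ~ intercept + x (binomial GLM, logit link)
    by Newton-Raphson. Assumes the data are not perfectly separable."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the gradient (g) and information matrix (h)
        # of the binomial log-likelihood at the current coefficients.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Newton step: solve the 2x2 system h * delta = g explicitly.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def predict_map(grid, b0, b1, x_min, x_max):
    """Replace each cell of a predictor map by the predicted probability
    of presence. Cells outside the calibration range [x_min, x_max] are
    set to None rather than extrapolated."""
    out = []
    for row in grid:
        out_row = []
        for v in row:
            if x_min <= v <= x_max:
                out_row.append(1.0 / (1.0 + math.exp(-(b0 + b1 * v))))
            else:
                out_row.append(None)
        out.append(out_row)
    return out

# Hypothetical calibration data: one predictor and presence (1) / absence (0).
temperature = [2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
presence = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
b0, b1 = fit_logistic_glm(temperature, presence)

# Hypothetical 2 x 3 grid of the same predictor (standing in for a GIS layer).
temperature_map = [[3.0, 6.5, 9.0],
                   [12.5, 15.0, 1.0]]
probability_map = predict_map(temperature_map, b0, b1,
                              min(temperature), max(temperature))
for row in probability_map:
    print([None if p is None else round(p, 2) for p in row])
```

Masking cells that fall outside the calibration range is one simple way to respect the warning about extrapolation; with several predictors, a check on the joint environmental space would be needed instead of a per-variable range.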
Generalized additive models (GAMs; Hastie and Tibshirani 1990; Yee and Mitchell 1991) are a non-parametric extension of GLM that allow the introduction of non-linear responses to predictors. The shape of the response is driven by the data and is not predefined, allowing the study of response shape and potentially improving predictions by modelling closer to the actual data.

Regression analysis and the resulting spatial predictions are intrinsically objective because they are explicit, consistent and repeatable, and they create models that are fully data-defined. However, numerous subjective decisions are necessary when building a model, from the choice of a sampling strategy and the nature of what is measured in the field through to the method used to build the model. For these reasons, it is important to determine the effects that these modelling decisions have on the models and predictions, and to define a set of best practices. Continuous improvement in our current best practice in using these methods is necessary if ecologists are to contribute to the management of our natural resources.

Context

The editing of this special issue (Lehmann, Austin and Overton, eds.), together with a parallel issue of Ecological Modelling (Guisan, Hastie and Edwards, eds.), was initiated by the organizers of a workshop held in August 2001 at Riederalp in Switzerland (http://leba.unige.ch/workshop). The workshop and the two special issues share the same aims and philosophy: bringing together ecologists, statisticians and GIS specialists in order to discuss and improve the state of the art of generalized regression modelling for spatial predictions in ecology. A major concern was to improve current practice by achieving a synthesis of ecological and statistical approaches to statistical modelling as applied to biodiversity.

The workshop was divided into six themes corresponding to logical steps in the process of building spatial predictions. Each theme was introduced by a talk, available at http://leba.unige.ch/workshop/video.htm:
• links with theory in ecology (by M. Austin),
• data sampling (by T. Edwards),
• model selection (by T. Hastie),
• model evaluation (by J. Elith),
• spatial predictions (by R. Aspinall),
• from spatial predictions to environmental management (by J. Overton).

The focus of the workshop was oriented towards methodological aspects, except for the last theme. It was therefore decided that the special issue of Ecological Modelling would concentrate on the first five themes, whereas the present special issue would concentrate on applications for biodiversity and conservation, though these methods and examples have much wider application in natural resource management.

Our aim with this special issue is to provide a vision and examples for the use of statistical modelling for conservation management. We hope that these papers will stimulate potential users and convince them that these methods have now reached a level of quality where they should be considered as valuable tools for conservation planning, especially when spatially explicit information is needed. This field of ecology provides many opportunities and challenges, along with much promise to deliver a quantitative basis for biodiversity management and conservation policy.

Special issue content

The first paper presents a vision for the way in which information is integrated and generalized into forms that can be used to support conservation decisions. Jake Overton and co-authors use information pyramids as a metaphor for a general paradigm for informed ecosystem management. The pyramidal metaphor is, however, based upon a rigorous foundation of quantitative and explicit methods of data generalization. Regression analyses and their use in spatial predictions provide an important method for integrating and generalizing information. The paper uses examples of efforts in New Zealand as a case study to demonstrate the concepts. Information pyramids, and their characteristics, provide a powerful perspective for viewing the rest of the papers in this special issue.

The remainder of the papers are organized roughly in increasing order of number of species and complexity of modelling. We begin with papers that investigate single species, then work up to multiple species, species interactions, emergent properties, communities, and biodiversity hotspots. Each of these steps includes a number of conceptual and methodological problems and solutions.

Ramona Maggini and co-authors use GLM to predict the distribution of a threatened ant species in the Swiss National Park and throughout Switzerland. This single-species approach leads to numerous potential applications for the conservation of the species and to a better understanding of its ecology. Population management within the Park, reasons for the decline in Switzerland, a sampling design for new field surveys and the study of the effect of habitat fragmentation are among the possible outcomes discussed by the authors.

Habitat fragmentation is the central topic of the paper presented by Nicolas Ray and co-authors. Its aim is to model the spatial distribution of amphibian populations based on the connectivity between breeding sites. They develop new methods of assessing connectivity using detailed land use maps that generally improve the ability to predict amphibian occurrence. Methods such as these show promise for understanding and managing amphibian populations in areas with intensive human use.

José Teixeira and J. Arntzen predict the potential impact of climate warming on the distribution of a vulnerable species of salamander in Portugal and Spain. This paper examines methods for assessing changes in habitat suitability and degree of fragmentation under climate change scenarios. The approach should help in conservation planning by identifying stronghold areas and potential habitat corridors to maintain this species in the future.

John Leathwick addresses more directly the complex species interactions that are most often ignored in species distribution modelling. Here, New Zealand forest inventories provide a perfect dataset to test our understanding of the niche concept in different competitive contexts. Results indicate strong competitive interactions that underline the difficulties of using current species distribution models to predict distributions under future climatic conditions.

Ana Bio and co-authors provide an example where an entire community of river valley plant species in Belgium was studied and modelled. Their paper discusses limitations in data quantity and quality that can affect the applicability of the ecological modelling procedures used. However, they conclude that despite a restricted number of observations and non-randomly collected, spatially correlated data, ecologically sound models were obtained that predicted most species well.

Anthony Lehmann and co-authors present a species and community level approach for assessing fern biodiversity in New Zealand. The aim of this paper is to investigate broad-scale distributions of species in relation to climatic and landform variables and to identify hotspots of fern diversity. This approach also provides the basis for setting targets for a biodiversity assessment and restoration program. This is the first of several papers that model many species; it develops streamlined procedures for analyzing and predicting each species, as well as methods for organizing and integrating the individual species predictions into higher level assessments of biodiversity.

Margaret Cawsey and co-authors present a concrete example from Australia of how species and community level modelling can provide a basis for assessing the conservation priority to be attached to any particular vegetation remnant, using the predicted original distribution of that vegetation type and the percentage remaining. The approach can also provide a basis for setting priorities for revegetation to connect and expand existing forest remnants in an area that has been extensively cleared since European settlement. This paper also presents an excellent example of the practical difficulties that must be expected in applying modern approaches based on species and community modelling to underpin conservation planning in heavily human-influenced ecosystems.

We end with two papers by Ferrier et al. that describe what we believe is the most extensive and longest running application of species and community modelling to regional conservation planning. The first paper by Ferrier et al. describes the role of statistical modelling and spatial prediction of species distribution in regional conservation planning. The state of New South Wales in Australia has benefited from this work in a series of major land-use planning decisions allocating public land to forestry and conservation uses, and in the design of a network of nature reserves. Streamlined species modelling, dealing with spatial autocorrelation, and evaluation of model results are among the new approaches detailed in this paper. The second paper by Simon Ferrier and co-authors follows on from the first to discuss community level modelling. Current approaches to community level modelling are discussed, and new approaches such as canonical classification and the modelling of compositional dissimilarity are outlined.

Collectively, these papers demonstrate a range of methods and solutions for the application of regression modelling to conservation management. We hope that you will find the approaches and results as compelling as we do.

Acknowledgements

As guest editors we gratefully acknowledge the enthusiasm and help received from Alan Bull from the early stages of the preparation of this special issue. The contribution of Kluwer Academic Publishers to the cost of colour figures was also greatly appreciated by the authors. We also acknowledge the collaborative efforts made with Antoine Guisan, Trevor Hastie and Thomas Edwards, guest editors of the related special issue of Ecological Modelling. As organizers of the GLM/GAM modelling workshop, we thank the staff at Villa Cassel in Riederalp for their wonderful hospitality and efficiency, which greatly contributed to the success of the workshop. The workshop was supported by the Swiss Academy of Science and the Swiss National Science Foundation, as well as private sponsors ESRI and Insightful.

References

Augustin N.H., Mugglestone M.A. and Buckland S.T. 1996. An autologistic model for the spatial distribution of wildlife. Journal of Applied Ecology 33: 339–347.
Austin M.P. 1999. The potential contribution of vegetation ecology to biodiversity research. Ecography 22: 465–484.
Austin M.P. and Gaywood M.J. 1994. Current problems of environmental gradients and species response curves in relation to continuum theory. Journal of Vegetation Science 5: 473–482.
Austin M.P. and Meyers J.A. 1996. Current approaches to modelling the environmental niche of eucalypts: implication for management of forest biodiversity. Forest Ecology and Management 85: 95–106.
Bio A.M.F., Alkemade R. and Barendregt A. 1998. Determining alternative models for vegetation response analysis: a non parametric approach. Journal of Vegetation Science 9: 5–16.
Dobson A.P., Rodriguez J.P., Roberts W.M. and Wilcove D.S. 1997. Geographic distribution of endangered species in the United States. Science 275: 550–553.
Elith J. 2000. Quantitative methods for modeling species habitat: comparative performance and an application to Australian plants. In: Ferson S. and Burgman M.A. (eds) Quantitative Methods in Conservation Biology, Springer, New York.
Ferrier S. 2002. Mapping spatial pattern in biodiversity for regional conservation planning: where to from here? Systematic Biology 51: 331–363.
Ferrier S. and Watson G. 1997. An evaluation of the effectiveness of environmental surrogates and modelling techniques. In: Predicting the Distribution of Biological Diversity, Environment Australia, Canberra, Australia.
Franklin J. 1995. Predictive vegetation mapping: geographic modelling of biospatial patterns in relation to environmental gradients. Progress in Physical Geography 19: 474–499.
Frescino T.S., Edwards T.C.J. and Moisen G.G. 2000. Modelling spatially explicit structural attributes using generalized additive models. Journal of Vegetation Science 12: 15–26.
Guisan A. and Theurillat J.-P. 2000. Equilibrium modeling of alpine plant distribution and climate change: how far can we go? Phytocoenologia 30: 353–384.
Guisan A. and Zimmermann N.E. 2000. Predictive habitat distribution models in ecology. Ecological Modelling 135: 147–186.
Hastie T.J. and Tibshirani R.J. 1990. Generalized Additive Models. Chapman & Hall, London, 335 pp.
Heegaard E. and Hangelbroek H.H. 1999. The distribution of Ulota crispa at a local scale in relation to both dispersal- and habitat-related factors. Lindbergia 24: 65–74.
Leathwick J.R. 1998. Are New Zealand’s Nothofagus species in equilibrium with their environment? Journal of Vegetation Science 9: 719–732.
Leathwick J.R. 2001. New Zealand’s potential forest pattern as predicted from current species-environment relationships. New Zealand Journal of Botany 39: 447–464.
Leathwick J.R., Whitehead D. and McLeod M. 1996. Predicting changes in the composition of New Zealand’s indigenous forests in response to global warming: a modelling approach. Environmental Software 11: 81–90.
Lehmann A. 1998. GIS modelling of submerged macrophyte distribution using generalized additive models. Plant Ecology 139: 113–124.
Lehmann A., Overton J.McC. and Leathwick J.R. 2002. GRASP: generalized regression analysis and spatial predictions. Ecological Modelling 157: 187–205.
Margules C.R. and Pressey R.L. 2000. Systematic conservation planning. Nature 405: 243–253.
McCullagh P. and Nelder J.A. 1997. Generalized Linear Models. Monographs on Statistics and Applied Probability. Chapman & Hall, London, 511 pp.
Moisen G.G. and Edwards T.C.J. 1999. Use of generalized linear models and digital data in a forest inventory of Utah. Journal of Agricultural, Biological and Environmental Statistics 4: 372–390.
Oertli B., Anderset Joye D., Castella E., Juge R., Cambin D. and Lachavanne J.-B. 2002. Does size matter? The relationship between pond area and biodiversity. Biological Conservation 104: 59–70.
Osborne P.E., Alonso J.C. and Bryant R.G. 2001. Modelling landscape-scale habitat-use using GIS and remote sensing: a case study with great bustards. Journal of Applied Ecology 38: 458–471.
Overton J.McC., Leathwick J.R. and Lehmann A. 2000. Predict first, classify later – a new paradigm of spatial classification for environmental management: a revolution in the mapping of vegetation, soil, land cover, and other environmental information. 4th International Conference on Integrating GIS and Environmental Modelling (GIS/EM4), Canada (http://www.Colorado.EDU/research/cires/banff/upload/80/).
Pearce J.L., Cherry K., Drielsma M., Ferrier S. and Whish G. 2001. Modelling the relative abundance of flora and fauna species at a regional scale. Journal of Applied Ecology 38: 412–424.
Prendergast J.R., Quinn R.M., Lawton J.H., Eversham B.C. and Gibbons D.W. 1993. Rare species, the coincidence of diversity hotspots and conservation strategies. Nature 365: 335–337.
Reid W.V. 1998. Biodiversity hotspots. TREE 13: 275–280.
Scott J.M. and Jennings M.D. 1997. A description of the National Gap Analysis Program. Biological Research Division, US Geological Survey.
Scott J.M., Heglund P.J., Haufler J.B., Morrison M., Raphael M.G., Wall W.B. and Samson F. (eds) 2002. Predicting Species Occurrences: Issues of Accuracy and Scale. Island Press, Covelo, California.
Teixeira J., Ferrand N. and Arntzen J.W. 2001. Biogeography of the golden-striped salamander, Chioglossa lusitanica: a field survey and spatial modelling approach. Ecography 24: 618–624.
van Jaarsveld A.S., Freitag S., Chown S.L., Muller C., Koch S., Hull H., Bellamy C., Kruger M., Endrody-Younga S., Mansell M.W. and Scholtz C.H. 1998. Biodiversity assessment and conservation strategies. Science 279: 2106–2108.
Wolgemuth T. 1998. Modelling floristic richness on a regional scale: a case study in Switzerland. Biodiversity and Conservation 7: 159–177.
Yee T.W. and Mitchell N.D. 1991. Generalized Additive Models in plant ecology. Journal of Vegetation Science 2: 587–602.
