Linking GIS and reserve selection algorithms: Towards a geospatial data model
David M. Stoms * Institute for Computational Earth System Science, University of California, Santa Barbara, CA 93106-3060 USA
* Mailing address: Donald Bren School of Environmental Science & Management, University of California, Santa Barbara, CA 93106-5131 USA. Tel: + 1.805.893.7655; FAX: +1.805.893.7612 Email addresses:
[email protected]
Abstract
Most reserve selection algorithms used in research or conservation practice are only loosely coupled with geographic information system technology. This paper argues that formalizing a core geospatial data model would benefit algorithm developers, researchers, and practitioners through standardized data management and ease of database development with any reserve selection algorithm.
Keywords: conservation planning, object-oriented method, biodiversity
1. INTRODUCTION
Establishing nature reserves to sustain dwindling biodiversity is a fundamental conservation strategy. Creating credible and effective methods to decide where new reserves should be established, referred to as the reserve selection problem, has been the work of an interesting synergism of experts in conservation biology, ecology, geography and geographic information systems (GIS), operations research, computer science, and economics. There are two basic approaches for addressing this problem, which can be categorized as value-based and target-based. Value-based approaches (e.g., [1, 2, 3, 4]) maximize some (composite) measure of biodiversity benefit for a given budget or area constraint. Target-based approaches minimize the cost of achieving quantitative goals for the amount of each element of biodiversity to be represented in a network of reserves. Most papers in the scientific literature, and most existing software, are based on a target-based approach (although some use a value-based algorithm to find a solution). Consequently, target-based models will be our focus in this paper, but we will speculate in the Discussion section about generalizing the data model to incorporate value-based models as well.
Margules and Pressey [5] provide a useful synthesis of target-based methods and characterize a systematic framework for conservation planning. Their framework involves six stages: 1) choose the elements of biodiversity as surrogates of overall biodiversity; 2) set explicit representation targets or goals for the minimum amount of each element to be protected; 3) assess the extent to which these goals are already met by existing reserves; 4) select additional areas to achieve the goals (i.e., the reserve selection problem); 5) implement (and modify as necessary) the plan; and 6) manage and monitor the network of reserves. The conservation literature is replete with discussions of each of these stages. 
Furthermore, there are many procedures published that outline the steps for implementing each stage, and software is publicly
available to execute the procedures. Over the past two decades, many alternative approaches have been explored to the basic planning problem in stage 4: selecting a set of sites as new nature reserves to achieve explicit conservation goals (see the 2002 Special Issue of Environmental Modeling and Assessment, volume 7, issue 2, for an overview of a variety of approaches). These methods vary in the roles of cost efficiency, spatial configuration, threat, resource tradeoffs, and uncertainty in the model, and in the algorithm used to solve the optimization problem. Some approaches use relatively straightforward, intuitive sets of rules for selecting sites from a pool of candidates. Others are based on a more sophisticated mathematical formulation. I intend to demonstrate here that they share a common view of the basic problem domain. Because of this shared view, it should be feasible to design and implement a single conceptual geospatial data model that describes the concepts and relationships relevant to the reserve selection problem. Incorporating more of the application knowledge directly into the geospatial data model saves each application from having to reinvent it. Thus a common data model could benefit algorithm developers, researchers, and end-users by harnessing the power of GIS technology more effectively for developing and using reserve selection algorithms. This trend of tighter coupling of GIS and environmental management models is emerging in other disciplines [6, 7, 8].
Those involved with reserve selection problems can be usefully classified into three categories: developers, researchers, and practitioners. Developers of reserve selection algorithms are the mathematical programmers who formulate the reserve selection problem in mathematical terms and create algorithms to find (near) optimal solutions. Researchers are the collection of conservation scientists who apply reserve selection algorithms to investigate scientific and policy questions. New questions continually stimulate new problem formulations and algorithms. Practitioners are the end-users in non-profit organizations, consulting firms, and government agencies that utilize reserve selection software to develop and evaluate alternative
conservation scenarios and craft conservation plans. Clearly, these categories overlap. Development teams are often collaborations between developers and researchers. Some researchers develop new algorithms themselves so that they can answer new questions (e.g., [9]). Researchers frequently participate in the planning process to demonstrate new methodological approaches (e.g., [10]). Having a clear conceptual model of a conservation-planning database may facilitate communication between groups and the development and implementation of new algorithms for the benefit of research and real-world planning. Although it is possible to perform reserve selection without a geographic information system database, GIS can certainly facilitate the process of generating the model inputs and displaying and evaluating model results. GIS functions are needed to aggregate biodiversity elements and other relevant attributes from diverse sources to a common spatial framework. Even non-spatial tables such as the list of representation goals (e.g., how much area of each element must be selected for reserves) can be readily derived from GIS data. The goal is often stated as a percentage of the total extent of each element, and this extent is summarized from the GIS database through routine processing. After a model is run, a solution to a reserve scenario can be displayed as a map to communicate where the proposed reserves are located and what other resources or issues occur at those sites. GIS can also be used for evaluating the effects of scenario solutions on resources that were not explicitly modeled, for comparing alternative solutions or scenarios, or even for comparing the efficiency of different algorithms in solving the same scenario. Dealing with GIS data management and analysis can be a distraction from the real tasks of algorithm developers, researchers, and practitioners. 
Because most reserve selection models began as research tools, the emphasis was on the algorithms rather than on simplifying data preparation and formatting for model inputs and outputs. Thus to some extent, each developer must define their data model and write code to handle data exchange between the GIS database and the reserve selection model. Even models that are being used as operational planning support tools sometimes only define the input format requirements and put the burden on users to populate the GIS database and convert it into specified input table formats (e.g., [11]). It would be to the advantage of all three groups to have a standardized data model of the common variables at their disposal. As they apply their models with data sets from various geographic locations, they could use the same set of processing tools to build databases and extract data for executing the model. This would be analogous to the use of the Mathematical Programming System (MPS) in location science, which defines a standard file format that can be read by most general-purpose linear programming software [12]. When researchers conceive of extensions to reserve models to answer new scientific or policy questions, the data model could be extended as needed to accommodate additional objects, attributes, or relationships. Having worked with developers, researchers, and practitioners, and having talked with frustrated end users, I have observed that GIS data processing and management is a significant challenge in conservation planning. Spatial data processing tools are often created for each unique data set and not readily adapted to other geographic locations. I submit that a lack of a coherent conceptual data model and associated implementation tools may be impeding the deployment of reserve selection models into conservation practice. The purpose of this paper is to define a conceptual data model for a generic GIS database that supports the range of current target-based reserve selection algorithms from stage 4 of the conservation planning process. The paper first summarizes the best-known classes and variants of target-based reserve selection models in order to compare and contrast their data requirements. 
Through this summarization we hope to make explicit the underlying, though often unstated, data model of each. Using an object-oriented approach, we then propose a core data model that would support all the reserve selection models from the summary. The proposed data model will hopefully lay a foundation for the next generation of reserve selection algorithms, in which the core could be extended to accommodate new concepts proposed by researchers rather than reinvented.
2. TARGET-BASED RESERVE SELECTION MODELS
There have been a large number of target-based reserve selection models published in the scientific literature (see [13, 14] for reviews). Most have been used primarily as research tools, although a few have developed to a stage where they are being applied in real-world conservation planning activities. Cabeza and Moilanen [13] summarized two general classes of target-based reserve selection problems: minimum cost and maximal covering. The minimum cost problem minimizes the cost (or number of planning units or area) needed to meet explicit representation goals. Adapting from Cabeza and Moilanen [13], let A be a matrix of I elements and J sites (hereafter called planning units [15]) whose cells aij are a measure of the occurrence of element i in planning unit j. Let xj be a {0,1} decision variable that has a value of 1 if planning unit j is selected and 0 otherwise. Each planning unit has a cost cj, and each element is assigned a desired representation target or goal ri. The minimum cost problem then can be formulated as:
Minimize $Z = \sum_{j=1}^{J} c_j x_j$ (1)
Subject to $\sum_{j=1}^{J} a_{ij} x_j \geq r_i$ for each element $i$ (2)
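As an illustration of equations (1) and (2), the following is a minimal Python sketch that solves a tiny instance by exhaustive enumeration. It is not the method used by any of the software cited in this paper; the function name `min_cost_selection` and the sample data are hypothetical, and enumeration is only practical for a handful of planning units.

```python
from itertools import combinations

def min_cost_selection(a, c, r):
    """Enumerate every subset of planning units and return the cheapest one
    that satisfies sum_j a[i][j] * x[j] >= r[i] for every element i."""
    n_units = len(c)
    best_cost, best_units = None, None
    for size in range(n_units + 1):
        for units in combinations(range(n_units), size):
            # Constraint (2): every element must reach its goal r[i].
            meets_goals = all(
                sum(a[i][j] for j in units) >= r[i] for i in range(len(r))
            )
            # Objective (1): total cost of the selected units.
            cost = sum(c[j] for j in units)
            if meets_goals and (best_cost is None or cost < best_cost):
                best_cost, best_units = cost, units
    return best_cost, best_units

# Hypothetical data: 3 elements (rows) by 4 planning units (columns).
a = [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 1]]
c = [1, 1, 1, 1]  # equal costs, so the objective counts planning units
r = [1, 1, 1]     # one occurrence of each element
cost, units = min_cost_selection(a, c, r)
print(cost, units)
```

With equal costs and goals of one occurrence, this instance reduces to the set covering special case discussed later in this section.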
In contrast to the “ideal” reserve system found in the minimum cost problem, it is often of greater interest to determine how much conservation can be achieved in a fixed area (or budget). This is analogous to the maximal covering location problem in location science [12, 14]. In the reserve selection case, the objective is to maximize the number of elements covered or represented in the reserve system (yi = 1 if element i is covered, else = 0), subject to a constraint on the resources available, R, whether expressed as maximum number (or area) of planning units that may be selected or as a budget limit. The maximal covering species problem can be formulated as:
Maximize $Z = \sum_{i=1}^{I} y_i$ (3)
Subject to $\sum_{j=1}^{J} c_j x_j \leq R$ (4)
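Equations (3) and (4) can be sketched the same way. The formulation above does not write out how yi depends on the xj; the sketch below assumes an element counts as covered when at least one selected planning unit contains it. The function name `max_covering_selection` and the data are hypothetical, and exhaustive enumeration stands in for a real solver.

```python
from itertools import combinations

def max_covering_selection(a, c, budget):
    """Enumerate subsets of planning units whose total cost is within the
    budget R and return the one covering the most elements."""
    n_elements, n_units = len(a), len(c)
    best_covered, best_units = -1, ()
    for size in range(n_units + 1):
        for units in combinations(range(n_units), size):
            # Constraint (4): total cost must not exceed the budget.
            if sum(c[j] for j in units) > budget:
                continue
            # Objective (3): count elements with y[i] = 1. Assumption of
            # this sketch: covered means present in any selected unit.
            covered = sum(
                1 for i in range(n_elements) if any(a[i][j] for j in units)
            )
            if covered > best_covered:
                best_covered, best_units = covered, units
    return best_covered, best_units

# Hypothetical data: 3 elements (rows) by 4 planning units (columns).
a = [[1, 0, 1, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 1]]
c = [1, 1, 1, 1]
covered, units = max_covering_selection(a, c, budget=1)
print(covered, units)
```

Raising the budget to 2 in this instance allows all three elements to be covered, which shows the tradeoff between resources and representation that the maximal covering class is meant to explore.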
It is also possible to hybridize the minimum cost and maximal covering problems. For instance, the algorithm used in the Sites software [11] accommodates both representation goals and a budget constraint. Penalties are imposed for failure to meet goals because of the budget constraint or for exceeding the budget to achieve higher levels of representation, allowing explicit assessment of the tradeoffs. Each of these two general problems has many specific variants to accommodate additional scientific or policy questions. One of the most basic options is how the representation goals are expressed. The simplest form specifies the number of occurrences (i.e., sites that contain the element) to protect, while a more complex version uses an areal extent. Early reserve selection algorithms used presence/absence data (i.e., aij = 1 or 0) and set representation goals of ri = 1. If planning units are assumed to have equal cost (cj = 1), this becomes the basic Species Set Covering Problem [14], articulated by Kirkpatrick [16] for the minimum set problem or the Maximal Covering Species Problem presented in Church et al. [12]. ReVelle et al. [14] also describe a variant where representation goals ri ≥ 2, which they label the backup or redundant coverage model. For area goals, ri is expressed in the same areal units as aij, such as hectares rather than occurrences. Models such as the Biodiversity Management Area Selection (BMAS) model [17], C-Plan [18], and Sites [11] were designed to solve this type of problem for
either the minimum set or maximal covering class. Costs, of course, can also be specified in terms of area (minimum area or an area constraint) or as actual land values [19]. Another variation on costs is to consider the opportunity costs of foregone resource use. The most relevant example here is TARGET, the reserve selection component in the BioRap toolbox [20]. TARGET selects planning units to meet the area-representation goals described above but simultaneously minimizes the loss of timber harvest (or other resource) from the selected planning units. The Conflicting Land Use Index [21] is a similar concept that is used as a cost term for minimizing conflicts (and hence opportunity costs) between conservation and resource use. In the maximal covering class, variants have also been developed that vary the weight, wi, of the elements in the objective function by some measure of their conservation importance (e.g., endemism, legal status, rarity) [22] or that weight the best habitat quality of coverage for each element, wiq, in a weighted benefits maximal covering species problem [23]. The latter requires a measure of habitat quality, hij, for each element i in each unit j. In the former variant, one can trade off the total number of species covered against the number of high priority species. Similarly, in the latter a planner can explore the tradeoffs between the total number of species covered and the number covered in their best quality habitat.
To this point, our discussion of reserve selection models has assumed that distribution data (the aij terms) are certain, whereas our knowledge of dynamic species’ distributions is usually uncertain. Zeros in the aij matrix, in particular, may represent ignorance rather than known absences. Non-zero aij values are often based on habitat suitability modeling in which planning units above a threshold probability are assumed to contain element i. 
If we define pij to be the probability that element i occurs in planning unit j, there are two model formulations available to deal with probabilistic covering. In an analog to the maximal availability location problem [14, 24], the constraints require that the probability that each element is (not) covered is greater (less) than some threshold value, termed the alpha-reliability. That is, yi = 1 (covered) only if the
probability of coverage is greater than α. In the Maximal Expected Covering Species Problem (MEXCSP [14, 25]), the model maximizes the expected number of species covered by including the probabilities in the objective function. We could simply allow the aij term to be the probability of occurrence rather than a measure of presence, but for clarity we will retain a separate variable, pij, to indicate probabilities.
In some conservation planning applications, the spatial configuration of selected reserves can also be important. Particularly if planning units are small, clustering selected units together may increase sustainability of ecological processes and make management more efficient than if reserves are small and widely scattered. Recent advances utilize spatial data on the adjacency of planning units to minimize the exterior perimeter length of selected units [26, 27]. Thus the desired spatial configuration is expressed in the objective function, and the degree of clustering is controlled by a weight, wb. To account for boundary in this variation of a covering model, we need to add variables for the boundary length of planning unit j, βj, and the shared boundary between units j and k, bjk. It may not be enough that the reserve system itself is contiguous and compact. In addition, the patches selected for individual elements may need to be of some minimum contiguous area for viability, MinAreai. Here too, information would be required about the adjacency of sites and how large the cumulative area is for each element. A third aspect of spatial configuration for a reserve network is to distribute risk of catastrophic loss of biodiversity by requiring reserves containing element i to be separated by a minimum distance, MinDisti, based on the scale of the ecological process of concern [11]. This requires a variable measuring the distance between all pairs of planning units, djk. Both perimeters and distances are spatial properties accessible through GIS operations. 
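Two of the quantities introduced above lend themselves to a short numerical sketch: the expected number of elements covered, computed from the pij values, and the exterior boundary of a set of selected units, computed from βj and bjk. The code below is illustrative only. It assumes that occurrences are independent across planning units, so that the probability an element is covered is one minus the product of its absence probabilities, and that each shared edge bjk is stored once per pair of units. Because a shared edge is counted in the perimeter of both neighbors, it is subtracted twice. The function names and data are hypothetical.

```python
from math import prod

def expected_elements_covered(p, selected):
    """Expected number of elements present in the selected units.
    p[i][j] is the probability that element i occurs in unit j.
    Assumption of this sketch: occurrences are independent across units."""
    return sum(1 - prod(1 - p_i[j] for j in selected) for p_i in p)

def exterior_boundary(beta, b, selected):
    """Exterior perimeter of the selected units: the sum of their
    perimeters (beta) minus twice the length of every edge shared by
    two selected units (b, keyed by an unordered pair stored once)."""
    chosen = set(selected)
    total = sum(beta[j] for j in chosen)
    shared = sum(length for (j, k), length in b.items()
                 if j in chosen and k in chosen)
    return total - 2 * shared

# Hypothetical data: two elements, three unit squares arranged in a row.
p = [[0.5, 0.5, 0.0],
     [0.0, 0.2, 0.0]]
beta = {0: 4.0, 1: 4.0, 2: 4.0}
b = {(0, 1): 1.0, (1, 2): 1.0}

print(expected_elements_covered(p, [0, 1]))  # about 0.95
print(exterior_boundary(beta, b, [0, 1]))    # 2 x 1 rectangle
print(exterior_boundary(beta, b, [0, 1, 2])) # 3 x 1 rectangle
```

In a clustering variant, the exterior boundary would be multiplied by the weight wb and added to the cost term of the objective function.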
From the descriptions of the classes and variants of reserve selection models above, four types of entities or objects will be employed, which are defined here. First, they all operate on units of biodiversity to be represented to a desired target level in the reserve network. These
can be species, plant communities, ecosystems, biophysical environments, environmental gradients, or any combination. For simplicity, we will refer to this object type as biodiversity elements, or just elements to avoid having to repeat all the possible subtypes. These elements act as surrogates for other units of biodiversity whose distributions are less well known. The second type of object is the set of candidate planning units (also referred to as selection units, assessment units, analysis units, parcels, or sites) from which to craft a reserve network. Planning units contain elements and may have certain other management-related attributes, such as the current land use allocation or ownership, cost of conservation intervention, and suitability for conservation or competing land uses. As with elements, there has been much discussion and variation in the choice of planning units (large or small, regular or irregular, etc.). Our concern here is not with deciding the ideal unit but with providing a framework that will accommodate whatever spatial framework of planning units a planner chooses. For planning or research purposes, a user generally wants to consider strategies for achieving alternative conservation goals or to compare a strategy across different reserve selection models or algorithms. Therefore, we include a third type of object, scenarios, that specify elements, representation goals, preselected reserves, budget, and other constraints of a potential conservation plan. For a given scenario, an algorithm may generate many realizations or solutions, either generating exactly equivalent objective function values with different sets of planning units, or giving different approximations of the optimal solution by stochastic methods. These alternative solutions (the fourth type of object) reveal the degree of flexibility in meeting conservation goals, which is important information for decision makers to possess [28].
3. A PROPOSED DATA MODEL FOR EXISTING RESERVE SELECTION ALGORITHMS
From the descriptions of target-based reserve selection models above, we can derive a conceptual data model of the four types of objects (elements, planning units, scenarios, and solutions) and their relationships that are consistently or at least most frequently used. We can group these into classes depending on whether they are “facts” contained in the GIS database, user-specifications of a scenario (e.g., elements, representation goals, assumptions, weights), or the model outcomes. These objects and their relationships are depicted in the diagram of the conceptual data model (Figure 1) and described below. Figure 2 depicts the spatial relationship of elements and planning units and some of the feature and relationship attributes.
GIS Database Object Classes
Alternative target-based conservation plans cannot be evaluated and compared without a decision about what elements to conserve and the desired level of representation in the reserve system. Thus the BiodiversityElements object class is the most fundamental component of the database (Figure 1 and Table 1). Data are stored in a table with a row for each element in the planning region and columns for current facts about the element, such as the total area of its distribution or the number of occurrences in the planning region. Attributes of elements that are specific to a scenario should be placed in scenario tables as described below. For instance, the representation goals may vary between scenarios to satisfy different stakeholder values. They may be computed from the factual attributes, such as setting a representation goal as a desired percentage of the total habitat area of an element as computed by the GIS. The percentage is a user-specified variable in a scenario, whereas the area of distribution is a fact. Elements with positive representation goals correspond to a query selection on this table of all elements. Similarly the relative rarity of an element, such as its global rank in the Heritage Program [29], is factual and belongs in this table if it will be used to weight the conservation importance of elements. Many different spatial frameworks have been used as planning units. 
Most frameworks have been spatially exhaustive of the planning region, but some studies have been based on
incomplete tessellations, such as isolated habitat patches of forest [30]. The latter can be considered a query selection on the universe of all planning units where some patches are excluded from selection as reserves. In any case, the planning units form a set of “fixed sites” [1] or discrete locations that have a priori boundaries. That is, the reserve selection model does not itself delineate the boundaries of individual planning units. Despite this variety of frameworks, conceptually they can be treated the same in a data model. The facts about planning units are stored in the PlanningUnits object class table (see Table 2), including a unique identifier (i.e., a primary key), area, perimeter (in the spatial clustering variants), its cost to conserve (for acquisition, conservation easement, management, or opportunity costs of other resources) and its current tenure and stewardship. Ratings of experts on the conservation value of each planning unit are used in some models, either to pre-allocate planning units in a scenario or as a component of conservation suitability [31]. Of course, the elements and planning units have little meaning until they are related, that is, that planning units are populated with occurrences or spatial extent of the elements. This process assigns the fundamental conservation benefits to planning units so that their contribution to the representation goals can be assessed. In the conceptual data model, these two objects are related in the ElementOccursIn relationship table (see Table 3). Except in rare cases where the element data are stored in the same spatial framework as the planning units (e.g., atlas data by latitude/longitude units [32]), GIS processing is required to transform source data into the relational table form. This table in the conceptual data model contains keys to the element and planning unit, a measure of the amount (or 0-1 absence/presence) of the element’s occurrence. 
For probabilistic reserve selection models, this measure would be specified as the probability that the element occurs. For weighted-benefits extensions of the MCSP, the quality of the occurrence or habitat for each specific element, hij, is recorded [23]. Spatial clustering in the reserve selection models has been modeled as an objective to
minimize the outside perimeter or boundary of the selected planning units. To calculate the internal boundary length between adjacent planning units that is subtracted from the sum of their overall perimeters [26, 27], the model needs data on the shared edge between planning units. This information is typically implicit in the GIS storage of polygons, but needs to be summarized in the AdjacentTo relationship class table to be useable in reserve selection models (see Table 4). This table contains the ID number of two adjacent planning units and some measure of their common boundary or edge. Typically this measure is simply the geographic length, but it could be a more ecologically meaningful measure such as permeability or habitat contrast [27]. In contrast to clustering reserves in space, planners may desire to distribute risk by separating reserves. This requires a distance matrix for all pairs of planning units, djk in the DistanceFrom relationship table (Table 5). How distance is measured (edge to edge, centroid to centroid, Euclidean, or shortest-path ecological distance) would be at the user’s discretion. Note that inter-unit distance has also been used in a proximity rule to encourage clustering [33].
Scenario Definition Object Classes
The GIS database object classes described above provide factual data for solving the target-based reserve selection problem. Conservation planning, however, involves formulating and evaluating alternative scenarios or plans to meet various societal goals. These scenarios vary not in the facts in the database but in the representation goals, constraints, and assumptions. This variation might reflect the values of different stakeholder groups, or they might simply explore solutions with different levels of potential funding. The Scenario Definition object class consists of three tables, one for the scenario object itself, and one each for the elements and planning units. 
The ScenarioSpecifications object (Table 6) is a table that contains the unique name of the scenario, the constraints on costs (whether in financial terms, area, or number of planning
units), and any other parameters that apply to the scenario as a whole rather than individual planning units or elements. For example, a spatial clustering model controls the degree of clustering through a weight on boundary length. Other descriptive information such as the date the scenario was created and a textual description can be added to document the analysis. Elements have their own initialization table for attributes specific to a given scenario (see Table 7 for ElementInitialization). The first set of attributes in this table contains information on the desired representation goals specific to the scenario, in terms of number of occurrences or area of distribution. Goals could be set to zero for elements that will not be used in selecting new reserves. Other attributes encompass issues such as a ranking of the element by its conservation importance, weights for the quality of habitat, a minimum probability of coverage (for probabilistic algorithms), and a minimum separation distance between reserves, if desired. Similarly, there would be a PlanningUnitsInitialization relationship class table for planning units in scenarios (see Table 8), which contains data that may be varied between scenarios. If some planning units are to be preassigned to the solution, such as existing biodiversity management areas or those identified by expert opinion, this can be set in this table. Note that we include this InitialStatus attribute here rather than in the PlanningUnits feature attribute table because stakeholders may wish to explore different assumptions about what is already (or should be) protected [17]. This table has a 1:1 relationship with the PlanningUnits feature table. The Initial_ConservationValue attribute is an optional field that is calculated at run time in relation to the scenario’s representation goals and the proportion of the goals achieved by the initial set of reserves, based on Initial_Status. 
Examples of this Initial_ConservationValue include irreplaceability [5] and complementarity [5], but the attribute could also be used in value-based approaches. As an iterative algorithm adds new units to the reserve network, this value would be updated.
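The separation of facts from scenario specifications can be sketched in a few lines of Python. The example below derives each element's area goal from a scenario-specific percentage applied to the factual total area, then subtracts what is already held in planning units that the scenario treats as reserved. The dictionaries loosely mirror the BiodiversityElements, ElementInitialization, ElementOccursIn, and PlanningUnitsInitialization tables, but the function name `remaining_goals`, the status values "reserved" and "available", and all data are hypothetical and not part of the proposed schema.

```python
def remaining_goals(total_area, percent_goal, occurs_in, initial_status):
    """Return, for each element, a tuple (area goal, shortfall), where the
    shortfall is the goal minus the area already in units whose initial
    status in this scenario is 'reserved' (never less than zero)."""
    result = {}
    for element, total in total_area.items():
        # Scenario-specific goal: a percentage of the factual total area.
        goal = total * percent_goal.get(element, 0.0) / 100.0
        # Area of the element inside units preassigned to the solution.
        protected = sum(
            amount for (elem, unit), amount in occurs_in.items()
            if elem == element and initial_status.get(unit) == "reserved"
        )
        result[element] = (goal, max(goal - protected, 0.0))
    return result

# Facts (GIS database): total area per element, and area per element per unit.
total_area = {"oak_woodland": 200.0, "vernal_pool": 40.0}
occurs_in = {("oak_woodland", 1): 30.0,
             ("oak_woodland", 2): 170.0,
             ("vernal_pool", 2): 40.0}
# Scenario specifications: percentage goals and initial status of units.
percent_goal = {"oak_woodland": 25.0, "vernal_pool": 50.0}
initial_status = {1: "reserved", 2: "available"}

goals = remaining_goals(total_area, percent_goal, occurs_in, initial_status)
print(goals)
```

Changing only `percent_goal` or `initial_status` yields a different scenario from the same factual tables, which is the reason these attributes are kept out of the GIS database object classes.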
Solution Object Classes
Running a reserve selection algorithm is not the conclusion of a scenario. There is still a need to display the selection of planning units in the GIS, generate summary reports about the scenario, the allocation of planning units, and the representation of elements. Planners use this information to examine a tentative reserve system and bring personal knowledge to bear in making adjustments [31]. Also a scenario may generate many solutions, either equivalent optimal solutions [28] or a set of approximations of an optimal solution [11]. The SolutionSummary relationship table contains a few overall summary measures of the total cost; the number, area, and perimeter of the reserve system; and the number of representation goals met (see Table 9). The ElementRepresentationInSolution table summarizes the total amount of the element in the reserve network in the same units as the goals were expressed (Table 10). The PlanningUnitsAllocation relationship table (Table 11) identifies which units were selected in a solution and a calculated measure of their Final_ConservationValue if they were not selected. If a planning unit was selected, its conservation value has already been contributed to the reserve network so that it has no residual value.
4. DISCUSSION AND CONCLUSIONS
Through this synthesis of the major classes and variants of target-based reserve selection algorithms, we have seen that there is a strong core of similarity in the implicit data models. They all operate on discrete spatial sites (planning units) that contain various elements of biodiversity and are selected in a scenario in order to achieve some quantitative level of representation in a reserve network as efficiently as possible. Much of the difference between reserve selection algorithms reflects their treatment of socioeconomic or ecological concerns. 
For instance, the original maximal covering algorithms minimized the number of planning units selected, followed by those that minimized area, and eventually by those that minimized costs. In
a general data model, these can all be described as a single attribute (or suite of related attributes) of planning units. In the simplest case, the “cost” is 1 where all planning units are considered equal. For area, the measurements are in two-dimensional spatial units such as hectares, while for costs the measurement units might be currency or as an index of suitability for other uses [21]. The point is that different applications of these models often use different definitions of related attributes. Suitability is another concept that can be derived in many ways from many spatial features in the database. Thus structuring these attributes in the data model to the level of a schema with data types and field length must wait for the specific application. This differs from other geospatial data models in which the features and attributes are relatively standardized within the disciplinary practice [7, 8]. Researchers always conceptualize environmental problem domains ahead of developers’ ability to make the concepts operational. The data model presented in this paper was consciously limited to concepts in algorithms of the recent past, corresponding to the state of the practice. Ideally, a conservation planning data model would be as visionary as the concepts in the state of the science and give impetus to advances in the algorithms. What are some of the newer concepts that have not yet been integrated into operational reserve selection algorithms and how might they affect the core data model? A few of these are investigated here. One major conceptual advance in conservation planning is to move beyond representation as a performance measure to measures of viability and persistence [13, 32]. Current models address viability through minimum viable area constraints, which describe the minimal area of an element’s distribution in a contiguous set of selected planning units [11]. 
But viability and persistence are also a function of habitat condition, environmental and demographic stochasticity, threats, and landscape configuration. Further, they are species-specific, so universal parameters will not suffice. In Figure 2, one of the patches of element i spans two planning units. If that patch constituted a viable occurrence of a population of a rare species, conserving only unit j or k would be insufficient to maintain viability. Nor would simply summing the area of the partial patch in unit k with the area of the smaller patch suffice, assuming they represent isolated populations. To accommodate this view, the data model would also need to include spatial objects for the individual patches or occurrences. A reserve selection algorithm would have to be devised that selects planning units but does not count achievement towards a representation goal until the entire occurrence, or a viable portion, was selected. No published algorithms currently formulate the problem at this occurrence level of detail (unless the occurrences are the planning units).

An innovative approach to minimum viable area has been devised through the concept of “planning patches”, which are clusters of planning units [34]. These planning patches would be a new object class in the proposed data model, with relationships to the constituent planning units. The selection algorithm would operate on these super-objects.

A recent variant of the MEXCSP considers the probabilities of element persistence rather than occurrence [32], deriving the probability of persistence from the probability of occurrence, threats to the element at a site, and the vulnerability of the element to those threats. Although this “persistence” model has a fundamentally different conceptual basis than the probability of occurrence, it would only require a modest extension to the core data model. A threat variable would be needed for the PlanningUnits feature class table, and the BiodiversityElements class would have a VulnerabilitytoThreat variable.

The covering models described in Section 2 could be considered static. They are based on a premise that the entire reserve system could be implemented instantaneously and that all planning units are available for acquisition or will still contain the species when the lands (and funds) do become available. Possingham et al. [35] and Costello and Polasky [9] have rethought the reserve selection problem as a dynamic “scheduling” problem in which planning units not acquired in a given time period are at risk (with some probability) of being lost to conversion to incompatible land uses. One interesting variation that this model introduces is that species can be considered covered either in reserves or in planning units that are left in a natural condition, that is, where the probability of land use conversion is low. In the classes of models described above, only species represented in reserves are counted in the performance measures. To accommodate the dynamic selection model, the data model would need to be extended in the Scenario Definition section. A relationship table would be created that contained the cost and probability of development (threat) for each planning unit in each time period. A simpler approach to this problem does the scheduling as a second phase, following the initial selection phase. After the planning units are selected, they are ranked by their relative threat and irreplaceability, with the most irreplaceable and most threatened given highest priority [5].

The core data model presented here is only a first step towards designing a spatial database to support both the development and deployment of conservation planning software tools. Therefore, I consciously bounded the scope to include only target-based reserve selection algorithms as they have been and are being employed. This is a narrow definition of conservation planning, which must ultimately consider land use in the non-reserve planning units and their cumulative effects, formulate and simulate conservation policy options such as alternative economic incentives [36], address tradeoffs between competing conservation actions, such as the choice between expanding reserves, connecting reserves, or adding new reserves [37, 38], and permit alternative conservation actions such as acquisition or restoration [3]. We are currently working on a broader conceptual data model and the implementation tools as part of a conservation planning support system that will meet most of the needs of model developers, researchers, and practitioners.
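The second-phase scheduling idea discussed above, in which selected planning units are ranked by threat and irreplaceability, can be illustrated with a minimal sketch. It assumes each selected unit carries a threat score and an irreplaceability score between 0 and 1; the class, the field names, and the rule of ranking by the product of the two scores are hypothetical choices for illustration and are not the procedure given in [5].

```python
from dataclasses import dataclass


@dataclass
class SelectedUnit:
    planning_unit_id: int
    threat: float            # assumed probability of conversion, 0..1
    irreplaceability: float  # assumed index, 0..1


def schedule(units):
    """Order selected units so the most threatened and irreplaceable come first."""
    # Assumed priority rule: product of the two scores; ties broken by unit ID.
    return sorted(
        units,
        key=lambda u: (-u.threat * u.irreplaceability, u.planning_unit_id),
    )


units = [
    SelectedUnit(1, threat=0.2, irreplaceability=0.9),
    SelectedUnit(2, threat=0.8, irreplaceability=0.9),
    SelectedUnit(3, threat=0.8, irreplaceability=0.1),
]
order = [u.planning_unit_id for u in schedule(units)]
print(order)
```

Other combination rules (for example, sorting on irreplaceability first and threat second) would fit the same structure; only the sort key would change.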
In the introduction of this paper, the target-based approach was contrasted with one that is value-based. The former specifically addresses the conservation goal of achieving a representative reserve network, usually at some time in the future. However, decision makers often have additional conservation objectives besides representation, and it may not be practical to set (or achieve) representation targets for them (e.g., [10, 39]). Decision makers may also need to prioritize actions in the short term, not just envision an ideal long-term conservation plan. A value-based approach is useful in these situations and could be considered a more general form of the conservation-planning problem. Examples of value-based approaches have been appearing in the literature [2, 3, 4]. Our purpose here is not to debate the merits of the two approaches but to consider the implications of a generic value-based approach on the geospatial data model.

A preliminary inspection of value-based models shows that their implicit data model shares many aspects with the one proposed here. They also are based on a set of planning units from which reserves (or restoration sites) are to be selected. These units are also associated with biodiversity elements, costs, and threats. As with target-based models, there would be scenarios and solutions with similar attributes and relationships. The main extension to the data model appears to be additional attributes for other biodiversity objectives, the synthetic indices that measure the conservation benefits, and the relative weights among competing objectives. Thus, the data model presented here should be extensible to accommodate a more general framework for conservation planning as well. Ideally, this extended data model can be developed in tandem with this general planning framework to facilitate the development and deployment of new algorithms.

5. ACKNOWLEDGMENTS

Funding that supported the writing of this paper came from the Doris Duke Charitable Foundation (to NatureServe) and from the California Resources Agency (to the National Center for Ecological Analysis and Synthesis). The views expressed in the paper do not necessarily reflect those of any of these organizations.
This paper synthesizes the experiences and insights of many developers, researchers, and practitioners of reserve selection algorithms with whom I have had the pleasure to interact. In particular, I wish to thank Sandy Andelman, Rick Church, Frank Davis, Ross Gerrard, Bill Langford, Elia Machado, Josh Metz, David Theobald, and the staff from NatureServe and The Nature Conservancy. Any shortcomings of the synthesis are those of the author.

6. REFERENCES

[1] K.D. Cocks and I.A. Baird, Using mathematical programming to address the multiple reserve selection problem: an example from the Eyre Peninsula, South Australia, Biological Conservation 49 (1989) 113-130.
[2] T.G. Olson, Biodiversity and private property: Conflict or opportunity?, in: Biodiversity and the Law, ed. W.J. Snape III, Island Press, Washington, 1996, pp. 67-79.
[3] J.B. Hyman and S.G. Leibowitz, A general framework for prioritizing land units for ecological protection and restoration, Environmental Management 25 (2000) 23-35.
[4] P. Siitonen, A. Tanskanen and A. Lehtinen, Method for selection of old-forest reserves, Conservation Biology 16 (2002) 1398-1408.
[5] C.R. Margules and R.L. Pressey, Systematic conservation planning, Nature 405 (2000) 243-253.
[6] J.L. Mennis and A.G. Fountain, A spatio-temporal GIS database for monitoring alpine glacier change, Photogrammetric Engineering and Remote Sensing 67 (2001) 967-975.
[7] D.R. Maidment, ArcHydro: GIS for Water Resources, ESRI Press, Redlands, California, 2002.
[8] D.C. McKinney and X. Cai, Linking GIS and water resources management models: an object-oriented method, Environmental Modelling & Software 17 (2002) 413-425.
[9] C. Costello and S. Polasky, Dynamic reserve site selection, Resource and Energy Economics (2003) in press.
[10] R.F. Noss, C. Carroll, K. Vance-Borland and G. Wuerthner, A multicriteria assessment of the irreplaceability and vulnerability of sites in the Greater Yellowstone Ecosystem, Conservation Biology 16 (2002) 895-908.
[11] S. Andelman, I. Ball, F. Davis and D. Stoms, Sites V 1.0: An Analytical Toolbox for Designing Ecoregional Conservation Portfolios, Manual, University of California, Santa Barbara, 1999. [Online at http://www.biogeog.ucsb.edu/projects/tnc/toolbox.html]
[12] R.L. Church, D.M. Stoms and F.W. Davis, Reserve selection as a maximal covering location problem, Biological Conservation 76 (1996) 105-112.
[13] M. Cabeza and A. Moilanen, Design of reserve networks and the persistence of biodiversity, Trends in Ecology & Evolution 16 (2001) 242-248.
[14] C.S. ReVelle, J.C. Williams and J.J. Boland, Counterpart models in facility location science and reserve selection science, Environmental Modeling and Assessment 7 (2002) 71-80.
[15] F.W. Davis and D.M. Stoms, A spatial analytical hierarchy for Gap Analysis, in: Gap Analysis: A Landscape Approach to Biodiversity Planning, eds. J.M. Scott, T.H. Tear and F.W. Davis, American Society for Photogrammetry and Remote Sensing, Bethesda, MD, 1996, pp. 15-24.
[16] J.B. Kirkpatrick, An iterative method for establishing priorities for selection of nature reserves: an example from Tasmania, Biological Conservation 25 (1983) 127-134.
[17] F.W. Davis, D.M. Stoms, R.L. Church, W.J. Okin and K.N. Johnson, Selecting biodiversity management areas, in: Sierra Nevada Ecosystem Project: Final Report to Congress, vol. II, Assessments and scientific basis for management options, University of California, Centers for Water and Wildlands Resources, Davis, 1996, pp. 1503-1528. [Online at http://ceres.ca.gov/snep/pubs/web/PDF/VII_C58.PDF]
[18] New South Wales National Parks and Wildlife Service, C-Plan Conservation Planning Software, User Manual for C-Plan Version 3.06 (April 12th, 2001), Armidale, NSW, Australia, 2001, 178 pp. [Online at http://www.ozemail.com.au/~cplan, accessed July 15, 2002]
[19] A. Ando, J. Camm, S. Polasky and A. Solow, Species distributions, land values, and efficient conservation, Science 279 (1998) 2126-2128.
[20] D.P. Faith, C.R. Margules and P.A. Walker, A biodiversity conservation plan for Papua New Guinea based on biodiversity trade-offs analysis, Pacific Conservation Biology 6 (2001) 304-324.
[21] P. Nantel, A. Bouchard, L. Brouillet and S. Hay, Selection of areas for protecting rare plants with integration of land use conflicts: A case study for the west coast of Newfoundland, Canada, Biological Conservation 84 (1998) 223-234.
[22] R.A. Gerrard, R.L. Church, D.M. Stoms and F.W. Davis, Selecting conservation reserves using species covering models: Adapting the ARC/INFO GIS, Transactions in GIS 2 (1997) 45-60.
[23] R. Church, R. Gerrard, A. Hollander and D. Stoms, Understanding the tradeoffs between site quality and species presence in reserve site selection, Forest Science 46 (2000) 157-167.
[24] R.G. Haight, C.S. Revelle and S.A. Snyder, An integer optimization approach to a probabilistic reserve site selection problem, Operations Research 48 (2000) 697-708.
[25] S. Polasky, J.D. Camm, A.R. Solow, B. Csuti, D. White and R. Ding, Choosing reserve networks with incomplete species information, Biological Conservation 94 (2000) 1-10.
[26] D.T. Fischer, Clustering and Compactness in Reserve Site Selection: An Extension of the Biodiversity Management Area Selection Model, Unpublished Masters thesis, Department of Geography, University of California, Santa Barbara, 2001.
[27] M. McDonnell, H.P. Possingham, I.R. Ball and E. Cousins, Mathematical methods for spatially cohesive reserve design, Environmental Modeling and Assessment 7 (2002) 107-114.
[28] J.L. Arthur, M. Hachey, K. Sahr, M. Huso and A.R. Kiester, Finding all optimal solutions to the reserve site selection problem: Formulation and computational analysis, Environmental and Ecological Statistics 4 (1997) 153-165.
[29] L.L. Master, Assessing threats and setting priorities for conservation, Conservation Biology 5 (1991) 559-563.
[30] M. Saetersdal, J.M. Line and H.J.B. Birks, How to maximize biological diversity in nature reserve selection: Vascular plants and breeding birds in deciduous woodlands, western Norway, Biological Conservation 66 (1993) 131-138.
[31] F.W. Davis, D.M. Stoms and S. Andelman, Systematic reserve selection in the USA: An example from the Columbia Plateau ecoregion, Parks 9 (1999) 31-41.
[32] M.B. Araújo and P.H. Williams, Selecting areas for species persistence using occurrence data, Biological Conservation 96 (2000) 331-345.
[33] A.O. Nicholls and C.R. Margules, An upgraded reserve selection algorithm, Biological Conservation 64 (1993) 165-169.
[34] R.L. Church, R.A. Gerrard, M. Gilpin and P. Stine, Constructing cell-based habitat patches useful in conservation planning, Annals of the Association of American Geographers (2003), submitted.
[35] H. Possingham, J. Day, M. Goldfinch and F. Salzborn, The mathematics of designing a network of protected areas for conservation, in: Proceedings of 12th National Australian Operations Research Conference, Adelaide, 1993, pp. 536-545.
[36] D.M. Stoms, K.M. Chomitz and F.W. Davis, TAMARIN: A landscape framework for evaluating economic incentives for rainforest restoration, Landscape and Urban Planning (2003), submitted.
[37] H.P. Possingham, S.J. Andelman, B.R. Noon, S. Trombulak and H.R. Pulliam, Making smart conservation decisions, in: Conservation Biology: Research Priorities for the Next Decade, eds. M. Soule and G. Orians, Island Press, Washington, 2001, pp. 225-244.
[38] F. van Langevelde, F. Claasen and A. Schotman, Two strategies for conservation planning in human-dominated landscapes, Landscape and Urban Planning 58 (2002) 281-295.
[39] C. Moritz, Strategies to protect biological diversity and the evolutionary processes that sustain it, Systematic Biology 51 (2002) 238-254.
Table 1. Attributes of the BiodiversityElements Object Class.

| Field Name | Variable | Notes |
|---|---|---|
| ElementID | $i$ | Primary key |
| TotalAmount | $TotA_i$ | $\sum_{j=1}^{J} a_{ij} x_j$, area or number of occurrences |
| MinViableArea | $MinArea_i$ | |
| Rarity Rank | | e.g., GRank [29] |
| Name | | for completeness |
Table 2. Attributes of the PlanningUnits Feature Class.

| Field Name | Variable | Notes |
|---|---|---|
| PlanningUnitID | $j$ | Primary key |
| UnitArea | $A_j$ | See Figure 2 |
| UnitPerimeter | $\beta_j$ | See Figure 2 |
| ConservationCost | $C_j$ | could be cost for fee title acquisition, conservation easement, restoration, plus management or opportunity costs |
| ExpertOpinionRank | | optional for preallocating units |
Table 3. Attributes of the ElementOccursIn Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| ElementID | $i$ | key |
| PlanningUnitID | $j$ | key |
| HabitatQuality | $h_{ij}$ | |
| ProbabilityOccur | $p_{ij}$ | |
| AmountInPU | $a_{ij}$ | See Figure 2 |
Table 4. Attributes of the AdjacentTo Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| LeftPlanningUnitID | $j$ | key |
| RightPlanningUnitID | $k$ | key |
| SharedLength | $b_{jk}$ | See Figure 2 |
Table 5. Attributes of the DistanceFrom Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| LeftPlanningUnitID | $j$ | key |
| RightPlanningUnitID | $k$ | key |
| Distance | $d_{jk}$ | |
Table 6. Attributes of the ScenarioSpecifications Object Class.

| Field Name | Variable | Notes |
|---|---|---|
| ScenarioName | $S$ | Primary key |
| BudgetLimit | $R_S$ | e.g., budget, number of units, or maximum area |
| MinRequiredProbOccurInReserve | $\alpha_S$ | Alpha-reliability |
| ClusteringWeight | $w_{bS}$ | Optional for spatial clustering algorithms |
| Description/Comment | | For completeness |
| DateofScenario | | For completeness |
Table 7. Attributes of the ElementInitialization Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| ElementID | $i$ | Primary key |
| RepresentationGoal | $r_{is}$ | e.g., number of occurrences, area, or percentage of historical distribution |
| ConservationWeight | $w_{is}$ | |
| HabitatQualityWeight | $w_{iq}$ | Need separate weight for each quality level, $q$ (e.g., High/Medium/Low/None) |
| MinRequiredProbElemOccurInReserve | $\alpha_{is}$ | |
| MinSeparationDistance | $MinDist_i$ | |
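The alpha-reliability fields in Tables 6 and 7 can be read as thresholds on the probability that an element occurs in at least one selected planning unit. The sketch below assumes occurrences are independent across planning units, so that this probability is $1 - \prod_j (1 - p_{ij})$ taken over the selected units. Both the independence assumption and the function names are illustrative and are not taken from the paper.

```python
import math


def prob_in_reserve(p_occurs, selected):
    """Probability that one element occurs in at least one selected unit.

    p_occurs: dict PlanningUnitID -> ProbabilityOccur (p_ij) for a single element i
    selected: set of selected PlanningUnitIDs
    Assumes occurrences are independent between planning units.
    """
    p_absent_everywhere = math.prod(
        1.0 - p for j, p in p_occurs.items() if j in selected
    )
    return 1.0 - p_absent_everywhere


def meets_reliability(p_occurs, selected, alpha):
    """True if the element reaches the required probability alpha."""
    return prob_in_reserve(p_occurs, selected) >= alpha


p_i = {1: 0.5, 2: 0.5, 3: 0.9}
print(prob_in_reserve(p_i, {1, 2}))
print(meets_reliability(p_i, {1, 2, 3}, alpha=0.9))
```

With no units selected the product is empty and the function returns 0, so an element can never meet a positive reliability threshold before any unit is chosen.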
Table 8. Attributes of the PlanningUnitsInitialization Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| PlanningUnitID | $j$ | Primary key |
| InitialStatus | $x_j$ | e.g., 0 if not allocated, 1 if existing reserve, 2 if pre-allocated by user |
| Initial_ConservationValue | $ICV_j$ | e.g., irreplaceability or complementarity |
Table 9. Attributes of the SolutionSummary Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| ScenarioName | $S$ | key |
| SolutionID | $s_S$ | |
| TotalCost | $TotC_s$ | $\sum_{j=1}^{J} c_j x_j$ |
| TotalAreaSelected | $TotASel_s$ | $\sum_{j=1}^{J} A_j x_j$ |
| NumberOfUnitsSelected | $Sel_s$ | $\sum_{j=1}^{J} x_j$ |
| TotalBoundaryLength | $B[x]_s$ | (see [27] for calculation) |
| NumberRepGoalsMet | $GoalMet_s$ | $\sum_{i=1}^{I} y_i$, number of elements whose goal is met |
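The cost, area, and count fields in Table 9 are simple sums over the selected planning units. As a rough illustration, the sketch below computes TotalCost, TotalAreaSelected, and NumberOfUnitsSelected from hypothetical in-memory records. It treats $x_j$ as a 0/1 selection indicator, which is an assumption here because the status codes in Tables 8 and 11 take other values as well.

```python
def solution_summary(units, selected):
    """Summarize a solution.

    units:    dict PlanningUnitID -> (ConservationCost, UnitArea)
    selected: set of selected PlanningUnitIDs
    """
    # x_j = 1 if unit j is in the solution, otherwise 0 (assumed indicator form)
    x = {j: 1 if j in selected else 0 for j in units}
    return {
        "TotalCost": sum(cost * x[j] for j, (cost, _area) in units.items()),
        "TotalAreaSelected": sum(area * x[j] for j, (_cost, area) in units.items()),
        "NumberOfUnitsSelected": sum(x.values()),
    }


units = {1: (100.0, 10.0), 2: (250.0, 40.0), 3: (75.0, 5.0)}
summary = solution_summary(units, {1, 3})
print(summary)
```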
Table 10. Attributes of the ElementRepresentationInSolution Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| ScenarioName | $S$ | key |
| ElementID | $i$ | key |
| Final_Representation | $Rep_{iS}$ | $\sum_{j=1}^{J} a_{ij} x_j$, same units as representation goal |
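Final_Representation in Table 10 can be computed by joining the ElementOccursIn records (Table 3) with the set of selected units, and comparing the result with each RepresentationGoal (Table 7) gives NumberRepGoalsMet (Table 9). The following sketch assumes $x_j$ is a 0/1 indicator and that a goal counts as met when the representation is at least as large as the goal; the record layout and function names are hypothetical.

```python
from collections import defaultdict


def final_representation(occurs_in, selected):
    """Sum AmountInPU (a_ij) over selected units for each element.

    occurs_in: iterable of (ElementID, PlanningUnitID, AmountInPU) records
    selected:  set of selected PlanningUnitIDs
    """
    rep = defaultdict(float)
    for element_id, unit_id, amount in occurs_in:
        if unit_id in selected:
            rep[element_id] += amount
    return dict(rep)


def goals_met(rep, goals):
    """Count elements whose representation reaches the goal (y_i = 1)."""
    return sum(1 for i, goal in goals.items() if rep.get(i, 0.0) >= goal)


occurs_in = [("A", 1, 30.0), ("A", 2, 20.0), ("B", 2, 5.0), ("B", 3, 15.0)]
goals = {"A": 40.0, "B": 10.0}
rep = final_representation(occurs_in, {1, 2})
print(rep, goals_met(rep, goals))
```

Elements with no occurrence in any selected unit are absent from the returned dictionary and are treated as having zero representation when goals are checked.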
Table 11. Attributes of the PlanningUnitsAllocation Relationship Class.

| Field Name | Variable | Notes |
|---|---|---|
| ScenarioName | $S$ | key |
| PlanningUnitID | $j$ | key |
| FinalStatus | $x_j$ | As in InitialStatus, plus 3 if selected by algorithm |
| Final_ConservationValue | $FCV_j$ | Remaining value after network selected; is zero if $x_j$ greater than zero |
Figure 1. Conceptual data model diagram for target-based reserve selection problems. The numbers refer to the corresponding table numbers in the text.
Figure 2. Diagram of planning unit object j and its area and perimeter attributes and relationships with patches of element i and with adjacent unit k.
[Figure labels: Planning Unit j (Area = $A_j$, Perimeter = $\beta_j$); Planning Unit k; Patch of element i (Area = $a_{ij}$ within unit j, Area = $a_{ik}$ within unit k); Shared perimeter = $b_{jk}$]