International Forestry Review 5(1), 2003
Communicating complexity and uncertainty in decision making contexts: Bayesian approaches to forest research
J. GHAZOUL1 and M. McALLISTER2
1 Department of Environmental Science and Technology, Faculty of Life Sciences, Imperial College London, Silwood Park, Ascot, Berkshire SL5 7PY
2 Department of Environmental Science and Technology, Faculty of Life Sciences, Imperial College London, Royal School of Mines, Prince Consort Road, London SW7 2BP
Email:
[email protected];
[email protected]
SUMMARY

Ineffective communication of scientific research to decision makers and the public has often proved a barrier to uptake of knowledge by relevant stakeholders. One difficulty in communicating scientific information lies with the non-intuitive analytical language of Frequentist statistical procedures commonly used by scientists. The more intuitive alternative, Bayesian inference, is not widely known among forest scientists. In contrast to the Frequentist approach, Bayesian results are given in terms of the probability of a hypothesis being true, and are therefore considerably more accessible to non-scientists. Additionally, and of particular benefit to scientists working in socially and ecologically complex forest environments, Bayesian inference allows the simultaneous consideration of multiple hypotheses and the integration of different types of information from many sources, reflecting scientific judgement as well as existing empirical data. Furthermore, the analysis proceeds by building on existing knowledge, and as such Bayesian inference is very well suited to adaptive management and decision making under uncertainty.

Keywords: adaptive management, Bayes theorem, biometrics, Frequentist statistics, probability distribution.
INTRODUCTION

In recent decades research in tropical forests has developed from silviculturally orientated stock assessment to integration of forest science and practice with ecological and social requirements of diverse stakeholders. A parallel trend has been the transformation of forest resource use and management from frontier-style non-regulated extraction to state-managed systems of production and, most recently, to sustainable management approaches implemented by local stakeholders to secure a wide variety of products and services. The forest research community has had a central role in the promotion of social and ecological sensitivities and in the development of sustainable concepts and practices. Changing perceptions and improved understanding of the causes of and solutions to forest problems have been brought about by an international research effort, the broad objective of which has been the sustainable use of forest goods and services for current and future generations. Despite the widespread recognition of the necessity of sustainability concepts in natural resource management there remain problems in implementing effective management practices to achieve such goals. We propose that these problems are a symptom of the inadequate
communication of scientific information by academic researchers to managers, stakeholders and the general public, rather than a lack of knowledge about the underlying functions of natural systems. Considerable progress has been made, particularly in the last twenty years, towards the accumulation of what is now an immense body of knowledge describing tropical forest ecology, structure and function, and the variety of human impacts on tropical forests and their biodiversity. In essence we now know enough of natural systems to manage forests for the sustainable production of timber and the variety of other products and services, though not, perhaps, about how to simultaneously enhance biodiversity (Putz et al. 2001). There are no conceptual barriers to the conservation of habitats of prime conservation importance, and to a very large extent we now have the biophysical answers to the environmental problems associated with tropical forestry that were first clearly and widely articulated in the 1970s. Yet despite extensive knowledge of, and apparent solutions to, many environmental problems, forest destruction in the tropics continues unabated. Social, economic and political circumstances undoubtedly constrain proposed solutions (and their adoption), but even so, numerous interdisciplinary research programs have generated strategies that are, apparently, both economically and
ecologically sound. Nevertheless, the general inability of scientists to win the interest and cooperation of all relevant stakeholders in the implementation of proposed solutions and the subsequent failure of many of them are due in part to two different shortcomings of mainstream forestry science. The first is a failure to integrate social needs and perceptions into analytical frameworks applied to biophysical data. The second is a failure to communicate scientific information and predictions to stakeholders and decision makers using everyday interpretations of uncertainty and probability rather than the intricate logic required to correctly interpret the conventional Frequentist statistical procedures commonly used by research scientists. The inclusion of science and scientists in environmental policy development has been problematic owing to the demands of policy makers and the public for clear answers to complex social and biological problems that have inherent uncertainty. Integrating diverse views and perceptions, communicating uncertainty and presenting research outputs in unambiguous ways are crucial for the transparency and trust that are necessary to achieve harmony in resource management. Such issues are becoming increasingly important in a world undergoing global environmental change that is likely to impact upon people differentially in an already unequal world. This paper concerns communication among scientists, decision makers and the public, and proposes an alternative analytical approach, Bayesian inference, to integrating, interpreting and presenting information that reconciles the many different perceptions surrounding particular environmental issues. Research scientists are frequently asked, and expected, to engage in the decision making process. Therefore, they are increasingly in need of an analytical framework that clearly identifies the management actions that have the greatest likelihood of achieving the management objectives. 
The Bayesian approach, while philosophically quite different to the Popperian Frequentist analytical paradigm, is just as rigorous while being much more appropriate to the social-environmental scenarios that forest researchers and managers encounter. To illustrate the differences and similarities between Bayesian and Frequentist data analyses, this paper provides a simple, accessible example that can be implemented in a spreadsheet. It also reviews collective experience in the different approaches to data analysis and the communication of scientific results in forestry, fisheries and other fields to bolster support for increased application of Bayesian methods in forestry research.
COMMUNICATING SCIENTIFIC UNCERTAINTY

The results of pure and applied scientific research are mostly communicated through technical journals that are not easily accessible to either decision makers or the public, although they do form the collective repository of scientific knowledge that, over time, shapes the views of scientists and through advocacy or commissioned advice begins to mould policy. The forest science community is judged by its ability to deliver outputs that demonstrably improve environmental quality and/or human welfare. In this respect the communication of research results to decision makers and stakeholders should be an essential part of the activities of forest scientists.

Issues of uncertainty and scientific inference influence public debate and policy more now than ever. In the UK, for example, science is playing an increasing role in discussions in local and national government, perhaps to a greater extent than in most other countries around the world. The proportion of questions, motions and debates in the British Parliament related to science and technology has increased six-fold in the past decade (Padilla and Gibson 2000), with those concerned with environmental and life sciences accounting for most of the growth. This trend is set to continue. The recent Bovine Spongiform Encephalopathy and Foot and Mouth crises, as well as issues such as genetically modified foods and Common Agricultural Policy reform, have raised calls for ethical and trustworthy science that is perceived as such by a concerned public. Many of these issues are of global relevance, concerning international trade and standards, food security and human welfare. Moral, legal and ethical questions are being increasingly asked about product certification, climate change, carbon sequestration, agricultural production and sustainable development in the face of demographic change, poverty and increasing demand. Hence there is a need for clear communication of science to governments, pressure groups, business and the public.

An underlying problem for any scientific issue, but particularly for a complex and value-laden one such as forestry, is the requirement for a general understanding that science rarely provides absolute answers; there has to be a clear admission of the uncertainties involved with any conclusion or action.
Yet the public and the policy process seek, and have often come to expect, unambiguous and precise answers that they can accept as being correct (Ludwig et al. 2001). If scientists overstate the reliability of their results they may cause a lack of public trust in scientific advice (Haerlin and Parr 1999).

Frequentist approaches to communicating research results

A major difficulty in communicating research results to the general public is the way in which results are analysed and presented using the prevailing statistical paradigm of the Frequentist school. A common output of Frequentist analysis is the maximum likelihood estimate, i.e., the best estimate of the quantity of interest, given the available data. Focus by scientists and resource managers on the single best estimate of the state of the resource, often provided by the maximum likelihood estimate, is common and completely ignores uncertainty. In fisheries, the emphasis on the single best estimate of abundance coupled with the failure to take account of uncertainty has been largely responsible for catastrophic fishery collapses such as that of Northern Atlantic cod in 1992 off eastern Canada (Hilborn and Walters 1992).
The formal way to convey scientific uncertainty is to make probability statements after statistical modelling has been applied. Conventional approaches to statistics, however, have some severe limitations when it comes to using probability statements to convey uncertainty. Frequentist statistical analyses typically generate a measure of uncertainty called the ‘p-value’ that is used by researchers to determine whether experimental results are ‘significant’ or simply a product of chance. Yet many people, including some with statistical training, incorrectly equate the p-value with the probability that a null hypothesis is correct and that the alternative hypothesis is incorrect.

Scientific uncertainty in parameter estimates is often communicated with the use of 95% confidence intervals. Such confidence intervals are commonly interpreted to indicate that there is a 95% chance that the true value of interest lies within the confidence bounds given. While this seems to be a perfectly reasonable and intuitive interpretation, conventional Frequentist statistical methods do not actually allow such interpretation. Instead they allow probability statements only about the observations that could be made if the sampling process were to be repeated indefinitely (or frequently). Thus the textbook definition of a 95% confidence interval, that if the experiment were to be repeated indefinitely the bounds computed would be expected to overlap with the true but unknown value 95% of the time, is not intuitive and does not lend itself to clear interpretation. In controversial and commercial settings such as the determination of sustainable timber harvests and the evaluation of fish stock abundance, the upper and lower bounds themselves have been misapplied.
Commercial interests have often chosen to believe the upper bound and conservationists the lower bound, as was the case with Peruvian anchoveta, once the most abundant fish stock on Earth (Hilborn and Walters 1992). In the latter instance, a variety of estimates of maximum sustainable yield were provided as the fishery developed in the 1960s. Optimism prevailed and the fisheries managers allowed catches in line with the upper limits of the estimates until, with the onset of the El Niño oceanographic regime in 1972, the fishery collapsed. In an attempt to develop more flexible and intuitive approaches to accounting for uncertainty, conventional Frequentist statistics have, in the last few decades, been extended with the development of computer intensive methods such as Monte Carlo simulation and bootstrapping (Efron 1981; Manly 1990, 1997, 1998; Francis and Manly 2001). These methods have been applied in stochastic time series modelling of exploited resources to produce probability distributions for estimated parameters and model quantities of interest such as population abundance and harvesting mortality rates (Restrepo et al. 1992). The probability distributions obtained have been used to convey uncertainty in the quantities of interest and in the potential outcomes of alternative resource management options. This has provided a major step forward in the use of dynamic models
of exploited resources to evaluate resource status, model the potential outcomes of alternative management approaches, and take into account uncertainties and convey them graphically to decision makers and resource stakeholders. However, this modelling approach still lacks conceptual refinement because the Frequentist statistical framework applied permits the distributions to be interpreted only as sampling distributions for the “estimates” obtained and not as probability distributions for the actual quantities underlying the estimates (Howson and Urbach 1991; Gelman et al. 1995; Malakoff 1999). As such, it is not possible to apply these methods to obtain conceptually consistent statements about risk, that is, the probability of something bad happening. Moreover, bootstrapping methods do not work when data are few and have large sampling errors, as is often the case with ecological data.

The Bayesian approach to communicating uncertainty

The alternative to the Frequentist paradigm is Bayesian inference (see Table 1 for a glossary of Bayesian terms). By presenting the outputs of analyses as the probability that some hypothesis is true, Bayesian analyses are conceptually more accessible and intuitive. Additionally, in an uncertain world where information is incomplete and decisions still have to be made, Bayesian approaches allow decision makers to identify how much confidence to place in the information that is available. Most importantly, in allowing information to be updated and by including and building upon existing knowledge, Bayesian inference and decision theory provide a quantitative analytical framework that is well suited to the requirements of adaptive management procedures (Prato 2000). In the last decade, Bayesian probability, a notion of probability fundamentally different from the Frequentist one, has spread in the field of applied statistics (Howson and Urbach 1991; Ellison 1996; Malakoff 1999).
Unlike Frequentist probability, which can be applied only to observable quantities, Bayesian probability conveys credibility statements for possible unobservable states of nature given current knowledge. In contrast, conventional Frequentist analyses calculate the probability of observing data given a specific value for a parameter or set of parameters, usually the null hypothesis. In the context of resource ecology, Bayesian probability statements can be made for alternative possible values for the actual abundance and status of a natural population; for example, there is a probability of 0.1 that actual abundance is below some critical threshold value such as 20% of carrying capacity. This allows conceptually consistent statements about risk to be computed. Bayesian probability can also provide clear statements about the plausibility of alternative ecological hypotheses for processes that may be structuring ecological communities, such as the responses of populations to exploitation (McAllister and Kirchner 2002). Bayesian methods have been applied, for example, to assess the credibility of the four alternative hypotheses that harvesting in combination with either intra-specific competition, two alternative forms of inter-specific competition or habitat degradation has caused the decline in commercially valuable fish species (Sainsbury 1988). They have also been applied to provide synthetic and intuitive results when complex resource dynamics models with multiple parameters have been statistically fitted to large and complex datasets and to quantify uncertainty over the structural formulation of these models (Patterson 1999; McAllister and Kirchner 2002; Parma 2002).

TABLE 1 Glossary of commonly used Bayesian and Frequentist terms

| Term | Definition |
|---|---|
| Bayesian probability | A number between zero and 1 that conveys a strength of belief or weight of evidence for some particular conjecture or hypothesis. |
| Frequentist probability | A number between zero and 1 that conveys the chance of some observable event occurring if some particular hypothesis is assumed to be true and if the experiment was to be repeated indefinitely (or frequently). |
| Probability distribution | A specification for the chance of each conceivable event occurring (Frequentist notion) or being true (Bayesian notion). |
| Confidence interval | A Frequentist concept that provides a lower and upper bound for some quantity of interest. The accompanying percentage indicates the percentage of times that the unknown true value would be included in the interval if the sample or experiment were to be carried out an infinite number of times. |
| Probability interval | A Bayesian concept that provides a lower and upper bound for some quantity of interest. The accompanying percentage gives the probability or strength of belief that the true value lies between the bounds indicated. |
| Prior probability | A Bayesian probability reflecting the credibility of some conjecture or hypothesis before statistically analysing some new set of data. |
| Likelihood function of the data | A probability density function that specifies the exact chance of each particular potential outcome of the dataset given some particular set of values for the underlying statistical model parameters. This is used in both Frequentist and Bayesian statistics. |
| Posterior probability | A Bayesian probability reflecting the credibility of some conjecture or hypothesis after statistically analysing some new set of data in light of previous information. |
| Bayesian data analysis | The analysis of data using Bayesian statistical methods which typically requires the formulation of prior probability functions and the computation of Bayesian posterior distributions for the model parameters of interest. |
| WinBUGS | Software for conducting Bayesian data analysis and Bayesian integration. BUGS stands for Bayesian inference Using Gibbs Sampling. The software is freely available at http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/contents.shtml. |
| Bayesian belief network (BBN) | Software (e.g., http://www.hugin.com/) that can be used for the purposes of decision analysis under uncertainty when there are many different variables that can affect the outcomes of a decision and there is uncertainty over the particular inter-relationship among the variables in determining the outcome of a particular course of actions. One of the key outputs of such software can be a prescribed optimal action to take given the uncertainties over the hypothesised inter-relationships and the utility function of the outcome variables. |

Applications of Bayesian statistical methods are rapidly proliferating in the fields of resource management, most notably in fisheries, but more recently in forestry, forest ecology and conservation (Crome et al. 1996; Wade 2000). In the context of forestry, Bayesian procedures have proved
far more informative than Frequentist analyses for assessing impacts of rainforest logging on birds and mammals in Queensland, Australia (Crome et al. 1996), guiding environmental policy decision making (Ellison 1996; Wolfson et al. 1996), resolving conflicts in natural resource management (Anderson et al. 1999), evaluating wildlife population viability under alternative management scenarios (Marcot et al. 2001), informing adaptive resource management (Prato 2000), and incorporating uncertainty into forest process models (Macfarlane et al. 2000).

Application of Bayesian Inference

Bayesian probabilities are computed using the same statistical probability functions of data as conventional statistical analysis (Arnold 1990), except that the Bayesian (posterior) probability distribution is the combination of the likelihood distribution obtained for observed data with a prior probability distribution derived from information available prior to the research (Gelman et al. 1995). Statistical inference is made from this combined ‘posterior’ distribution that integrates prior knowledge (or beliefs) with new information. Analysis of new data generates a likelihood function as in Frequentist inference, i.e. the probability of observing the data given different values of the parameter (e.g., the hypothesised true value for resource abundance) (Figure 1a). Probability distributions of “prior” knowledge (or prior probabilities) may be constructed using expert judgement, local knowledge, existing datasets, published literature or other sources of information for the same (or similar) species in ecologically equivalent contexts (Ellison 1996; Punt and Hilborn 1997). If little or nothing is known, then “non-informative” prior probabilities are adopted. These are typically relatively flat distributions that are intended to imply ignorance or a large degree of uncertainty about the quantities of interest (Figure 1a).
The posterior distribution upon which conclusions are based is the product of the prior distribution and the likelihood function scaled to be a probability distribution. This posterior distribution provides both the best estimate of abundance and the associated uncertainty, and in effect expresses how prior beliefs have been altered by the availability of new data. Should further research generate new information then the previous posterior distribution can be used as a revised prior, which is updated by application of the new observations to form a new posterior probability distribution (Figure 1b) that conveys all that is known about the model quantities following the data analysis. This integration of previous research with new data can be done without the requirement of having identical experimental set-ups. The formalism used to obtain posterior probabilities from prior probabilities and the probability of obtaining the data (i.e., the likelihood function) is Bayes’ theorem. This simple theorem of conditional probability was developed by Reverend Thomas Bayes in the mid 18th century but was overlooked until interest in mathematical approaches to decision making under uncertainty began to develop in the mid 20th century. The Bayesian idea of probability is appealing because it is conceptually intuitive but also mathematically and statistically rigorous.
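The updating cycle described above can be sketched numerically. The following is a minimal illustration only, assuming a flat prior over a grid of candidate abundance values and two hypothetical surveys with normal likelihoods. The survey values (1500 and 1300, with standard deviations of 300 and 200) and all function names are invented for illustration and are not taken from the paper's figures.

```python
import math


def normal_density(observed, true_value, sd):
    """Normal probability density of an observation given a hypothesised true value."""
    z = (observed - true_value) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_update(prior, likelihood):
    """Posterior = prior x likelihood, rescaled so the probabilities sum to one."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def mean_and_sd(grid, dist):
    """Mean and standard deviation of a discrete distribution on the grid."""
    mean = sum(x * p for x, p in zip(grid, dist))
    var = sum((x - mean) ** 2 * p for x, p in zip(grid, dist))
    return mean, math.sqrt(var)


# Candidate values for resource abundance (hypothetical range and spacing).
abundance_grid = list(range(0, 3001, 10))

# Flat ("non-informative") prior: every candidate value equally credible.
flat_prior = [1.0 / len(abundance_grid)] * len(abundance_grid)

# Likelihood of each hypothetical survey result under every candidate abundance.
likelihood_1 = [normal_density(1500.0, a, 300.0) for a in abundance_grid]
likelihood_2 = [normal_density(1300.0, a, 200.0) for a in abundance_grid]

# First analysis: flat prior updated by survey 1.
posterior_1 = bayes_update(flat_prior, likelihood_1)
# Second analysis: the earlier posterior serves as the revised prior.
posterior_2 = bayes_update(posterior_1, likelihood_2)

for label, dist in [("prior", flat_prior),
                    ("after survey 1", posterior_1),
                    ("after survey 2", posterior_2)]:
    mean, sd = mean_and_sd(abundance_grid, dist)
    print(f"{label}: mean = {mean:.0f}, SD = {sd:.0f}")
```

With these invented inputs the spread of the distribution narrows after each survey, which is the behaviour the text describes: each new dataset revises, rather than replaces, what was previously believed.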
FIGURE 1a A Bayesian analysis with a uniform (uninformative) distribution as the prior and a normal likelihood function derived from new observations. [Plot of probability or likelihood against resource abundance (0 to 3000), showing the prior (no data) and the likelihood and posterior distribution.]

FIGURE 1b A second analysis with the posterior distribution from (a) now used as the prior distribution. The likelihood function resulting from new data is combined with the prior to get a new posterior distribution that is the product of the two. [Plot of probability or likelihood against resource abundance (0 to 3000), showing the prior, likelihood and posterior.]

Example application of the Bayesian and Frequentist methods

TABLE 2 Hypothetical data on total natural Brazil nut production over a 10 year period in some watershed

| Year | Total production (Ty) (kg km-2) | Ty - T0 |
|---|---|---|
| 0 | 380.3 | 0.0 |
| 1 | 373.3 | -7.0 |
| 2 | 340.9 | -39.4 |
| 3 | 393.3 | 13.0 |
| 4 | 404.5 | 24.2 |
| 5 | 410.3 | 30.0 |
| 6 | 330.9 | -49.4 |
| 7 | 368.7 | -11.6 |
| 8 | 373.9 | -6.5 |
| 9 | 375.0 | -5.3 |
| 10 | 321.0 | -59.3 |

To illustrate the differences in application and interpretation between Bayesian and Frequentist methods we constructed a hypothetical example that uses simulated data. In a small watershed in the Brazilian rainforest, data on the natural production and annual extraction of Brazil nuts each year have been compiled for the last 10 years, although extraction of nuts has continued for at least thirty years (see Table 2 for the simulated data). To evaluate whether the current rate of extraction is sustainable, forestry scientists fit a linear model to the natural production data. To simplify the analysis, they transform the data by taking
the difference between the initial year and each subsequent year Y to give DY and then applying a linear model with the y-intercept (at year 0) fixed at 0 kg km-2 yr-1.

Frequentist hypothesis test. The scientists test the null hypothesis that the slope is larger than or equal to zero and set the value of alpha, the chance of a Type I error, to be 0.05. If they reject the null hypothesis, they will conclude that there has been a decreasing trend in total production over the last decade. If they do not reject the null hypothesis, they will conclude that the results are inconclusive. They also calculate a 95% confidence interval for the slope parameter.

Bayesian data analysis. The scientists set out to calculate the probability that the slope is less than 0 kg km-2 yr-1, i.e. that there is a decreasing trend in production over the last 10 years. To proceed, the scientists need to specify a prior probability distribution for the slope and a specific probability distribution for the data. A relatively flat prior probability distribution, P(b), is chosen for the slope (b). This is a normal distribution with a mean of 0 kg km-2 yr-1 and a standard deviation (SD) of 10,000 kg km-2 yr-1 (Figure 2). The probability distribution for the data, or the “likelihood function”, is the same as that used in standard least squares regression analysis. This is a normal distribution for each observation DY, with the probability density of DY determined by each particular value for b. For each value for b, the expected value for DY is set equal to the regression prediction, bY, and the SD is obtained from the maximum likelihood estimate (MLE) of the SD in regression residuals (Eq. 1). The use of the MLE for the SD in the likelihood function for each value for b is equivalent to treating the SD as an uncertain random variable with an uninformative prior distribution for it.

FIGURE 2 Prior and posterior distributions for the slope parameter to assess whether there has been a linear decrease in total natural production of Brazil nuts in an isolated watershed. The data used are provided in Table 2. The normalised likelihood function of the data is, in this case, identical to the posterior distribution. [Plot of probability density for the prior and the posterior against the hypothesized value for slope (kg km-2 yr-1).]

The SD in the likelihood function given a particular value for b is given by:
Eq. 1
$$\sigma(b) = \sqrt{\frac{1}{N}\sum_{Y=1}^{N}\left(D_Y - bY\right)^{2}}$$

where σ(b) is the SD in the normal likelihood function of the data given the value for b, DY is the total estimated natural production in year Y minus that in year 0, and N is the number of years (10 in this case). The normal likelihood function of the dataset is given by:

Eq. 2
$$L(D \mid b) = \prod_{Y=1}^{N}\frac{1}{\sigma(b)\sqrt{2\pi}}\exp\left(-\frac{1}{2}\left(\frac{D_Y - bY}{\sigma(b)}\right)^{2}\right)$$

where D is the set of observations of the total production minus that in year 0. Using Bayes’ rule, the posterior probability for each value for b is directly proportional to the product of the prior probability and the likelihood function:

Eq. 3
$$P(b \mid D) \propto P(b)\,L(D \mid b)$$

All Bayesian calculations were carried out in a Microsoft Excel spreadsheet using a grid-based approach (McAllister and Kirkwood 1998) that calculates the product in Eq. 3 for a large range of values for b (i.e., between -10 and +10 at steps of 0.25 kg km-2 yr-1).
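The grid-based calculation can also be sketched outside a spreadsheet. The Python sketch below is one possible reading of Eqs. 1 to 3 applied to the Table 2 differences, not the authors' spreadsheet. The function names are illustrative, the 1/N form of Eq. 1 is taken from the description of the SD as a maximum likelihood estimate, and the Frequentist part assumes N - 1 = 9 residual degrees of freedom for a line fitted through the origin.

```python
import math

# Table 2 differences D_Y = T_Y - T_0 for years 1..10 (kg km-2).
YEARS = list(range(1, 11))
D = [-7.0, -39.4, 13.0, 24.2, 30.0, -49.4, -11.6, -6.5, -5.3, -59.3]
N = len(D)


def residual_ss(b):
    """Sum of squared residuals for a line through the origin with slope b."""
    return sum((d - b * y) ** 2 for y, d in zip(YEARS, D))


def frequentist_fit():
    """Least-squares slope through the origin, its standard error and t statistic."""
    sum_yy = sum(y * y for y in YEARS)
    slope = sum(y * d for y, d in zip(YEARS, D)) / sum_yy
    resid_var = residual_ss(slope) / (N - 1)  # assumption: one fitted parameter
    std_err = math.sqrt(resid_var / sum_yy)
    return slope, std_err, slope / std_err


def log_likelihood(b):
    """Eq. 2 on the log scale, with sigma(b) taken from Eq. 1."""
    sigma = math.sqrt(residual_ss(b) / N)
    return sum(
        -math.log(sigma * math.sqrt(2.0 * math.pi))
        - 0.5 * ((d - b * y) / sigma) ** 2
        for y, d in zip(YEARS, D)
    )


def log_prior(b, mean=0.0, sd=10_000.0):
    """Nearly flat normal prior on the slope (mean 0, SD 10,000)."""
    return -math.log(sd * math.sqrt(2.0 * math.pi)) - 0.5 * ((b - mean) / sd) ** 2


def grid_posterior(lo=-10.0, hi=10.0, step=0.25):
    """Eq. 3 evaluated on a grid of slopes and normalised to sum to one."""
    n_points = int(round((hi - lo) / step)) + 1
    grid = [lo + i * step for i in range(n_points)]
    log_post = [log_prior(b) + log_likelihood(b) for b in grid]
    peak = max(log_post)  # subtract the peak before exponentiating to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


slope, std_err, t_stat = frequentist_fit()
grid, posterior = grid_posterior()
prob_decline = sum(p for b, p in zip(grid, posterior) if b < 0)
post_mean = sum(b * p for b, p in zip(grid, posterior))

print(f"slope = {slope:.2f}, SE = {std_err:.2f}, t = {t_stat:.2f}")
print(f"P(b < 0 | D) = {prob_decline:.3f}, posterior mean = {post_mean:.2f}")
```

Run as written, the sketch gives a slope and t statistic that round to -2.26 and -1.54, and a probability of decline of roughly 0.91. Differences in the third decimal place from the spreadsheet figures are to be expected, since they depend on details such as how the grid point at zero is treated.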
RESULTS

Under the Frequentist analysis, the maximum likelihood estimate of the slope is -2.26 kg km-2 yr-1 and the 95% confidence interval for this slope lies between -5.6 and +1.0 kg km-2 yr-1. The t-test statistic of -1.54 lies above the critical value for t, -1.83, and thus the null hypothesis cannot be rejected at alpha = 0.05. The p-value obtained is 0.079, indicating that there is about an 8% chance of obtaining a value more extreme than the one obtained if the null hypothesis were true and the experiment were to be repeated indefinitely.

The results of the Bayesian analysis are summarised by the posterior distribution for the slope (Figure 2). The Frequentist component of the Bayesian calculation, the normalised likelihood function of the data, is also shown and is practically identical to the posterior distribution. In this case, only the data and not the prior influence the posterior, since the Bayesian results are identical to the normalised likelihood function. The posterior mean estimate is practically the same as the MLE, -2.26 kg km-2 yr-1. The probability that the slope is less than 0 kg km-2 yr-1 is 0.913, or about 91%. This implies that, given the data, there is a 91% chance that the alternative hypothesis, i.e., that there has been a decreasing trend in natural nut production over the last decade, is true. Notice that the probability that the slope is zero or positive (0.087) is similar to but not equal to the p-value of 0.079. In contrast to the p-value, the Bayesian probability value of 0.087 properly gives the probability that the null hypothesis is true. The 95% probability interval for the slope is similar but not identical to the Frequentist confidence interval and is given by the values -5.7 and +1.1 kg km-2 yr-1. The interpretation of this is that there is a 95% probability, in terms of strength of belief or weight of evidence, that the true value for the slope lies between -5.7 and +1.1 kg km-2 yr-1.

The example demonstrates that numerical results can be very similar between the Bayesian and Frequentist methods when applied to the same data. For example, the MLE and posterior mean for the slope were very similar. The values for the 95% Bayesian probability and Frequentist confidence intervals were similar but not identical. Also, the Frequentist p-value was similar but not identical to the Bayesian probability for the null hypothesis, i.e., that the slope is equal to or larger than 0.

A key difference is that the axioms of Bayesian probability permit probabilities to be assigned to the hypotheses of interest, e.g., a probability can be computed and assigned to the hypothesis that there has been a linear decrease in nut production over the last 10 years. In other words, the Bayesian method allows for a posterior probability distribution to be defined for the values of the parameter of interest, such as the slope, and for probability statements about particular hypotheses of interest. The Frequentist method, however, only allows probability statements to be made about the data (i.e., the chance of obtaining an empirical result more extreme than the one obtained, should the experiment be repeated indefinitely and the null hypothesis be true).

The intuitive interpretations of the Bayesian probability results lend themselves to decision making. For example, a decision rule could be implemented that required a reduction in annual harvests if it is deemed that there is at least a 90% chance that natural production has been decreasing over the last 10 years.
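A rule of this kind, together with a probability interval, can be read directly off a posterior distribution held on a grid. The sketch below is illustrative only: the posterior is a hypothetical stand-in (a discretised normal curve centred near the reported slope estimate, with an assumed SD of 1.5), the interval is assumed to be equal-tailed, and the function names and the two action labels are invented for the example.

```python
import math


def probability_below(grid, posterior, threshold=0.0):
    """Posterior probability that the parameter lies strictly below `threshold`."""
    return sum(p for value, p in zip(grid, posterior) if value < threshold)


def probability_interval(grid, posterior, mass=0.95):
    """Equal-tailed interval read off the cumulative posterior on the grid."""
    tail = (1.0 - mass) / 2.0
    lower = upper = None
    cumulative = 0.0
    for value, p in zip(grid, posterior):
        cumulative += p
        if lower is None and cumulative >= tail:
            lower = value
        if upper is None and cumulative >= 1.0 - tail:
            upper = value
    return lower, upper


def harvest_decision(prob_decline, required_probability=0.90):
    """Hypothetical rule: cut harvests once a decline is judged likely enough."""
    if prob_decline >= required_probability:
        return "reduce harvest"
    return "maintain harvest"


# Stand-in posterior for the slope (hypothetical values, not the worked example's).
grid = [-10.0 + 0.25 * i for i in range(81)]
weights = [math.exp(-0.5 * ((b + 2.26) / 1.5) ** 2) for b in grid]
total = sum(weights)
posterior = [w / total for w in weights]

prob_decline = probability_below(grid, posterior)
lower, upper = probability_interval(grid, posterior)
print(f"95% probability interval: {lower} to {upper}")
print(f"P(slope < 0) = {prob_decline:.3f} -> {harvest_decision(prob_decline)}")
```

The only quantity the rule needs is the posterior probability of the hypothesis itself, so the threshold can be set in the same terms that managers and stakeholders use when discussing acceptable risk.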
Such a rule can be implemented without the complication of having to specify a value for alpha (the chance of a Type I error) and without being restricted to probability statements about the data only. It should be noted that the hypothetical case given here is an artificially simple univariate example with uncertainty in only one key parameter, the slope. In most ecological situations there is uncertainty in large numbers of model parameters and more complex models apply. In such instances, methods to integrate the joint posterior distribution for the vector of model parameters are required. Markov chain Monte Carlo (MCMC) methods have been developed specifically for this purpose and are now commonly applied in fisheries and ecological sciences (Gelman et al. 1995; Patterson 1999; Parma 2002) but not, as yet, in the forest sciences. Relatively simple worked examples in Bayesian estimation and decision analysis can be found in Hilborn and Mangel (1997) and McAllister and Kirkwood (1998).

Some advantages of Bayesian analysis

In addition to the conceptual appeal of Bayesian probability statements, Bayesian methods offer unique
advantages for several other reasons. Firstly, they permit a wider range of sources of scientific information to be included in statistical analysis than just the data that would ordinarily be used in a Frequentist analysis (Howson and Urbach 1991; Malakoff 1999; Wade 2000). When the data are sparse or relatively uninformative about some key model parameters, informative prior probabilities can be developed for these parameters that incorporate expert judgement, empirical data from other similar situations, or both. Such priors permit integration of local and traditional knowledge with scientific methods of enquiry, an approach that is particularly well suited to environmental problems. The inclusion of prior information via expert knowledge can give forest stakeholders an opportunity to contribute to the forest assessment and management process. Furthermore, Bayesian hierarchical modelling methods (Gelman et al. 1995) have recently been developed and applied to utilise data from related systems to formulate prior probability distributions for parameters in the system of interest (where no prior information is available). The basic idea is that the value of the parameter for the system of interest may be treated as a draw from the distribution of values across the available set of similar populations. This provides a powerful objective tool for incorporating information from similar systems into the analysis of the system of interest and, in the context of forest science, a protocol for integrating and synthesising information from many broadly similar case studies. Secondly, in stochastic time series modelling of resource dynamics, Bayesian “state-space” models provide a conceptually and methodologically elegant approach to dealing with both observation error and process error in the model components (McAllister and Kirchner 2002; Parma 2002).
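As a hedged sketch of what such a state-space model can look like, consider a generic abundance state $N_t$ observed through an index $I_t$. The symbols, the multiplicative error structure and the normal error terms are illustrative assumptions only and are not taken from the cited papers; the Bayesian formulation does not require the errors to be normal.

```latex
\begin{align}
N_{t+1} &= f(N_t;\theta)\, e^{\varepsilon_t},
  & \varepsilon_t &\sim \mathcal{N}(0,\sigma_p^2)
  && \text{(process equation)} \\
I_t &= q\, N_t\, e^{\eta_t},
  & \eta_t &\sim \mathcal{N}(0,\sigma_o^2)
  && \text{(observation equation)} \\
p(\theta, N_{1:T} \mid I_{1:T}) &\propto
  p(\theta)\, p(N_1 \mid \theta)
  \prod_{t=1}^{T-1} p(N_{t+1} \mid N_t, \theta)
  \prod_{t=1}^{T} p(I_t \mid N_t, \theta)
  &&&& \text{(joint posterior)}
\end{align}
```

Here $f$ stands for whatever population dynamics model is adopted, $\theta$ collects its parameters together with $q$, $\sigma_p$ and $\sigma_o$, and the two products separate the contribution of process error from that of observation error.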
Uncertainty in the observation process can be modelled through the likelihood function of the data, as is typically done in statistical modelling. In addition, informative prior probability density functions can be used to determine the extent and form of the process error in model equations for population processes such as births and deaths. As such, Bayesian methods facilitate the development of estimation models for exceedingly complex systems with multiple uncertain parameters, typical of those encountered in natural resource management. Non-Bayesian approaches to modelling process error and observation error also exist, such as the Kalman filter (Harvey 1989). However, unlike the Bayesian state-space approach, the Kalman filter requires that the error terms are Gaussian. Thirdly, Bayesian methods offer a rigorous and conceptually intuitive approach to dealing with model uncertainty (e.g. Kass and Raftery 1995; Crome et al. 1996; Wade 2000; Marcot et al. 2001; McAllister and Kirchner 2002; Parma 2002). Bayesian data analysis permits the relative credibility of each alternative model to be evaluated against the data, taking into account uncertainty over the range of values for the parameters in each model. Prior probabilities must be assigned to each alternative model to start with but these are often uninformative probabilities
of equal value to represent uncertainty over the alternative model structures identified (see above). Due to computational limitations, only a handful of plausible alternative models are typically considered, usually those expected to yield the most divergent outcomes for each of the management actions considered. Combining the prior for each model with the data analysis yields a posterior probability for each of the alternative models evaluated. These posterior probabilities indicate how probable it is that each model is true given the data, and can be exceedingly useful in deciding which model results should be taken into account in decision making, especially when the consequences of alternative management actions depend strongly on the model structure adopted (McAllister and Kirchner 2002). Fourthly, Bayesian statistical analysis produces the key probabilistic inputs required by statistical decision analysis (Berger 1985; Punt and Hilborn 1997). Statistical decision analysis provides a systematic approach to using the best available information in the making of decisions under uncertainty. It is a formal modelling technique that permits the potential consequences of alternative actions to be evaluated against recognised objectives.
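The posterior model probabilities described above, and the way they feed into an evaluation of alternative actions, can be sketched as follows. This is a minimal illustration, not a published analysis: the two models, their marginal likelihoods, the candidate actions and the outcome scores are all hypothetical numbers, and the function names are invented for the example.

```python
def posterior_model_probs(priors, marginal_likelihoods):
    """Bayes' rule over a discrete set of alternative models."""
    joint = {m: priors[m] * marginal_likelihoods[m] for m in priors}
    total = sum(joint.values())
    return {m: j / total for m, j in joint.items()}


def expected_outcomes(model_probs, consequences):
    """Probability-weighted outcome of each action across the models."""
    return {
        action: sum(model_probs[m] * by_model[m] for m in model_probs)
        for action, by_model in consequences.items()
    }


# Hypothetical inputs, for illustration only.
priors = {"stable": 0.5, "declining": 0.5}            # equal prior weights
marginal_likelihoods = {"stable": 0.02, "declining": 0.08}
consequences = {                                      # outcome score per model
    "current harvest": {"stable": 100.0, "declining": 20.0},
    "reduced harvest": {"stable": 70.0, "declining": 60.0},
}

probs = posterior_model_probs(priors, marginal_likelihoods)
scores = expected_outcomes(probs, consequences)
best = max(scores, key=scores.get)
print(probs)
print(scores)
print("preferred action:", best)
```

In a full decision analysis the single expected score would usually be replaced by a probability distribution of outcomes for each action, so that decision makers can also see the risk of poor outcomes.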
Statistical decision analysis requires several formalised steps: (1) Formulate the objectives to be achieved and the measures of policy performance based on these objectives; (2) Formulate the alternative actions that could be taken; (3) Formulate the alternative hypotheses and scenarios that could determine the potential outcomes of each management action; (4) Evaluate the plausibility of the alternative hypotheses using Bayesian statistical analysis; (5) Evaluate the potential consequences of each alternative action under each alternative hypothesis; (6) Summarise the results in the form of probability distributions for the potential outcomes of each alternative action; and (7) Present the results to the decision makers.

Bayesian inference in forest science

Bayesian methods hold particular promise for forestry contexts. The outputs of a Bayesian analysis are simple to interpret, which facilitates understanding and decision making. More importantly, the Bayesian approach is ideally suited as an analytical framework for adaptive management, where new data and information are used to update existing thinking and approaches. Furthermore, its ability to combine information from a variety of sources and formats suits the integration of traditional and scientific knowledge with local perceptions, allowing for the development of scientifically based but locally guided management systems. Bayesian decision theory can integrate social and ecological qualitative and quantitative information to determine the extent to which the current state and uses of an ecosystem are sustainable. The approach is flexible and adaptive in that the likelihood of occurrence of several potential or desired outcomes can be assessed under alternative
management scenarios. These likelihoods can be readily updated given new knowledge, objectives, management or resource states. The application of Bayesian inference is most advanced in fisheries management science but, owing to many similarities between the settings of fisheries and forest management, Bayesian methods for estimation and decision making under uncertainty hold considerable promise for forestry, and inroads have recently been made in this area (Crome et al. 1996; Ellison 1996; Wolfson et al. 1996; Anderson et al. 1999; Macfarlane et al. 2000; Marcot et al. 2001). Crome et al. (1996), for example, addressed the impact of logging on birds and small mammals in the Queensland rainforest using both conventional Frequentist and Bayesian analyses. The Frequentist analysis was largely uninformative owing to insufficient statistical power, that is, the probability of detecting statistically significant impacts was low even if such impacts existed. Thus, even after the implementation of the large-scale study, the failure to detect effects could not be taken as evidence that there were no effects. In the Bayesian analysis of the same data, widely divergent opinions about logging impacts at the start of the study (elicited from foresters, conservation activists and members of the public, and represented by Bayesian prior probability distributions of logging outcomes) converged on an informative consensus across opinion groups: the negative effects of logging over the whole study area were not likely to be greater than the extent of canopy opening, and for some species effects would be positive. When microhabitats and species were analysed separately, however, it was clear that consensus among the three groups was not achieved for some of the species, indicating that the study had not provided sufficient information for detailed unambiguous conclusions to be reached on the more specific issues.
However, even this remains a much more informative conclusion than that of the uninformative Frequentist analysis. The application of these methods to more complex scenarios that include a variety of stakeholders and decision-makers has yet to be attempted in the forestry context. The abundance of information about resource use, management and human behaviour in forest environments from numerous case studies and research programmes, coupled with the urgent need to communicate this information in appropriate formats to decision makers, makes the application of Bayesian approaches for synthesising and presenting information particularly relevant and timely.

Problems with Bayesian inference

The advantages of Bayesian over Frequentist inference include the ability to combine information from different sources, the simultaneous consideration of multiple hypotheses and the designation of probability as a measure of belief that facilitates estimation and prediction. Bayesian inference has, however, been criticised for its apparent lack of explanatory power and for its subjectivity (Dennis 1996).
The central feature of the Popperian scientific approach is its power to explain emergent patterns by challenging existing mechanistic models and rejecting poor models in favour of improved ones. Bayesian inference draws its strength from its ability to estimate and predict outcomes, and recent methodological developments have boosted its explanatory power (Gelman et al. 1995; Jensen 1996). Bayesian analysis methods have recently been developed that evaluate the credibility of multiple alternative explanatory models using wide-ranging sources of data. The methods provide a conceptually appealing approach for model selection and the quantification of uncertainty over model choice. They support the explanation of detailed ecological observations by helping to formulate and identify experiments that can provide powerful tests of ecological hypotheses, and by providing statistical methodology to quantify rigorously the probabilistic credibility of each alternative model-hypothesis given the available data (Sainsbury 1988; Hilborn and Mangel 1997; McAllister and Kirchner 2002). We do not suggest that Bayesian inference replace Frequentist approaches in science generally, but rather emphasise that Bayesian inference can be a considerably more powerful and intuitive tool in environmental decision-making contexts, particularly where decisions are based on information from a wide variety of sources. Scientists favouring Frequentist approaches have also objected to the introduction of subjective opinions in Bayesian analysis (Efron 1986; Dennis 1996). It is true that prior distributions have been used to reflect prior subjective beliefs of experts or stakeholders but this is not the only way to use Bayesian methods. Prior probability distributions could equally be constructed from evidence drawn from existing published empirical studies.
The fact that Bayesian analyses can and often do use subjective beliefs drawn from practitioners, experts and stakeholders represents not a weakness but a strength when dealing with the socially and politically complex issues commonly associated with resource use management (Crome et al. 1996; Kuikka et al. 1999). It is precisely this ability of the Bayesian approach to incorporate the beliefs of both resource users and managers that makes it so appropriate for dealing with resource use conflicts and different perceptions of sustainability and environmental values.
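How divergent subjective priors can be drawn together by shared data may be illustrated with a conjugate normal update. The sketch below is a simplification under stated assumptions and is not the method of Crome et al. (1996): each stakeholder belief and the data summary are treated as normal distributions, and the stakeholder groups, prior values and survey estimate are hypothetical.

```python
def normal_update(prior_mean, prior_sd, data_mean, data_se):
    """Conjugate normal-normal update; returns posterior mean and sd.

    The posterior mean is a precision-weighted average of the prior
    mean and the data estimate.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / data_se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * data_mean)
    return post_mean, post_var ** 0.5


# Hypothetical prior beliefs (mean, sd) about the percentage change in
# abundance of a species after logging.
stakeholder_priors = {
    "forester": (0.0, 10.0),
    "conservationist": (-40.0, 10.0),
}
DATA_MEAN, DATA_SE = -15.0, 3.0   # hypothetical survey estimate

posteriors = {
    name: normal_update(mean, sd, DATA_MEAN, DATA_SE)
    for name, (mean, sd) in stakeholder_priors.items()
}

prior_gap = abs(stakeholder_priors["forester"][0]
                - stakeholder_priors["conservationist"][0])
post_gap = abs(posteriors["forester"][0] - posteriors["conservationist"][0])
print(f"gap between group means: prior {prior_gap:.1f}, posterior {post_gap:.1f}")
```

The less informative the data (a larger `DATA_SE`), the more of the initial disagreement survives in the posteriors, which mirrors the lack of consensus reported when species and microhabitats were analysed separately.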
INTEGRATING ACROSS DISCIPLINES

Training given to environmental scientists must be broad just as it should be deep. Environmental science has progressed by reductionism, but progress in this manner has had costs in that we have often lost sight of the context within which we work. It is futile, for example, to conserve species in protected areas if conditions leading to human encroachment are ignored. Within the ecological research community there needs to be integration across pure and applied aspects of the discipline such that new insights
relating ecosystem structure to function are used to inform applied ecological studies. A Bayesian approach to assessing outcomes of particular human-environment interactions provides a methodology to integrate the predictions of ecological theory with empirical field studies and local knowledge and experience. Comprehensive studies that seek realistic improvements in human conditions and environmental states need to incorporate not only objective information belonging to a diverse range of disciplines, but also the subjective perceptions of stakeholders and decision makers. Bayesian analytical methods can facilitate the integration of results across disciplines by providing tools for the combination of information from a variety of sources and disciplines. Summarising information as a probability distribution permits the combination of results from several different protocols and research methodologies. Bayesian analysis therefore provides a quantitative tool that can consolidate complex information into a single decision support system. Forest researchers are well placed to undertake such integrated studies as they have access to information traditionally belonging to geographers, historians, economists and anthropologists. Geography has long been concerned with human impacts and their consequences, and it is clear that all conservation and development problems require a geographic perspective. Developments in remote sensing and GIS are the latest technological expressions of this. Ecologists and environmental managers need historians to interpret the underlying reasons behind the state of crisis in the global environment, as the source of the crisis lies in “how ethical systems function rather than how ecosystems function” (Worster 1993, quoted in Ludwig et al. 2001).
Additionally, the relationship between economics and ecology, despite being fraught with tensions, is complementary as both address questions of scarcity and competition, and each has provided insights to the other. Economic models have proved remarkably useful for ecological interpretation, and as economists begin to challenge the mechanistic approach of their own discipline it is likely that new synergies with ecology and environmental science will unfold.
CONCLUSIONS

Although the points made in this paper are applicable to a range of environmental issues, they are particularly pertinent to forestry in its broadest sense. If researchers are to continue to contribute to and inform management decisions in the forestry sector they will need to integrate the theoretically complex advances of pure ecology with the socially complex realism of applied ecology, and then communicate the outputs of this integration in a manner that is easily accessible and comprehensible to stakeholders and policy-makers. Environmental problems challenge concepts of ‘experts’ and decision-making approaches because while environmental data are generally accurate
they tend to be insufficiently precise. Owing to demands for reliable knowledge and prediction in a universe of uncertainty we need to consider novel ways of interpreting and presenting data. As we begin to cross traditional disciplinary boundaries we also need an analytical framework that allows for the integration of different information types. The Bayesian approach has been applied in a variety of settings, including forest management, to deal with the difficulty of making decisions under uncertainty, particularly when views differ among stakeholders about the status of the resource and how the resource should be managed. These settings include new developing markets where there is enormous uncertainty over resource abundance and demand (McAllister and Kirchner 2002). Overall, Bayesian decision analysis has provided a systematic and intuitive approach to guiding the decision-making process by allowing use of the best available information in a rigorous statistical framework, involving stakeholders at several stages of the evaluation, taking into account the key uncertainties affecting management decisions, and conveying explicitly the uncertainties in the potential decision outcomes with the use of Bayesian probability statements. Bayesian inference can facilitate the conceptual and analytical marriage of the natural and social sciences to achieve truly comprehensive and widely acceptable solutions.
ACKNOWLEDGEMENTS

We thank Catherine Michielsens and two anonymous reviewers for comments on earlier versions of the manuscript.
REFERENCES

ANDERSON, D.R., BURNHAM, K.P., FRANKLIN, A.B., GUTIERREZ, R.J., FORSMAN, E.D., ANTHONY, R.G., WHITE, G.C. and SHENK, T.M. 1999. A protocol for conflict resolution in analyzing empirical data related to natural resource controversies. Wildlife Society Bulletin 27: 1050–1058.
ARNOLD, S.F. 1990. Mathematical Statistics. 1st edition, Prentice Hall Inc., Englewood Cliffs, New Jersey. 636pp.
BERGER, J. 1985. Statistical Decision Theory and Bayesian Analysis. 2nd edition, Springer, New York. 617pp.
CROME, F.H.J., THOMAS, M.R. and MOORE, L.A. 1996. A novel Bayesian approach to assessing impacts of rain forest logging. Ecological Applications 6: 1104–1123.
DENNIS, B. 1996. Discussion: Should ecologists become Bayesians? Ecological Applications 6: 1095–1103.
EFRON, B. 1981. Nonparametric estimates of standard error – the jackknife, the bootstrap and other methods. Biometrika 68: 589–599.
EFRON, B. 1986. Why isn’t everyone a Bayesian. American Statistician 40: 1–5.
ELLISON, A.M. 1996. An introduction to Bayesian inference for ecological research and environmental decision-making. Ecological Applications 6: 1036–1046.
FRANCIS, R. and MANLY, B.F.J. 2001. Bootstrap calibration to improve the reliability of tests to compare sample means and variances. Environmetrics 12: 713–729.
GELMAN, A., CARLIN, J., STERN, H. and RUBIN, J. 1995. Bayesian Data Analysis. 1st edition, Chapman and Hall, London. 552pp.
HAERLIN, B. and PARR, D. 1999. How to restore public trust in science. Nature 400: 499–499.
HARVEY, A.C. 1989. Forecasting, structural time series models and the Kalman filter. Cambridge University Press, Cambridge. 570pp.
HILBORN, R. and MANGEL, M. 1997. The Ecological Detective: Confronting models with data. Monographs in Population Biology No. 28. Princeton. 315pp.
HILBORN, R. and WALTERS, C.J. 1992. Quantitative fisheries stock assessment: choice, dynamics, and uncertainty. Chapman and Hall, New York, NY. 570pp.
HOWSON, C. and URBACH, P. 1991. Bayesian reasoning in science. Nature 350: 371–374.
JENSEN, F.V. 1996. An Introduction to Bayesian Networks. 1st edition, UCL Press. 178pp.
KASS, R.E. and RAFTERY, A.E. 1995. Bayes factors. Journal of the American Statistical Association 90: 773–795.
KUIKKA, S., GISLASON, H., HANSSON, S., HILDÉN, M., SPARHOLT, H. and VARIS, O. 1999. Modelling environmentally driven uncertainties in Baltic cod management by Bayesian influence diagrams. Can. J. Fish. Aquat. Sci. 56: 629–641.
LUDWIG, D., MANGEL, M. and HADDAD, B. 2001. Ecology, conservation and public policy. Annual Review Ecology and Systematics 32: 481–517.
MACFARLANE, D.W., GREEN, E.J. and VALENTINE, H.T. 2000. Incorporating uncertainty into the parameters of a forest process model. Ecological Modelling 134: 27–40.
MALAKOFF, D. 1999. Bayes offers a ‘new’ way to make sense of numbers. Science 286: 1460–1464.
MANLY, B.F.J. 1990. Randomisation and Monte Carlo methods in biology. 1st edition, Chapman and Hall. 296pp.
MANLY, B.F.J. 1997. A method for the estimation of parameters for natural stage-structured populations. Researches on Population Ecology 39: 101–111.
MANLY, B.F.J. 1998. Testing for latitudinal and other body-size gradients. Ecology Letters 1: 104–111.
MARCOT, B.G., HOLTHAUSEN, R.S., RAPHAEL, M.G., ROWLAND, M.M. and WISDOM, M.J. 2001. Using Bayesian belief networks to evaluate fish and wildlife population viability under land management alternatives from an environmental impact statement. Forest Ecology and Management 153: 29–42.
MCALLISTER, M. and KIRCHNER, C. 2002. Accounting for structural uncertainty to facilitate precautionary fishery management: Illustration with Namibian orange roughy. Bulletin of Marine Science 70: 499–540.
MCALLISTER, M.K. and KIRKWOOD, G.P. 1998. Bayesian stock assessment: a review and example application using the logistic model. Journal of Marine Science 55: 1031–1060.
PADILLA, A. and GIBSON, I. 2000. Science moves to centre stage. Nature 403: 357–359.
PARMA, A.M. 2002. In search of robust harvest rules for Pacific halibut in the face of uncertain assessments and decadal changes in productivity. Bulletin of Marine Science 70: 423–453.
PATTERSON, K.R. 1999. Evaluating uncertainty in harvest control law catches using Bayesian Markov Chain Monte Carlo virtual population analysis with adaptive rejection sampling and including structural uncertainty. Can. J. Fish. Aquat. Sci. 56: 208–221.
PRATO, T. 2000. Multiple attribute evaluation of landscape management. Journal of Environmental Management 60: 325–337.
PUNT, A.E. and HILBORN, R. 1997. Fisheries stock assessment and decision analysis: The Bayesian approach. Reviews in Fish Biology and Fisheries 7: 35–63.
PUTZ, F.E., BLATE, G.M., REDFORD, K.H., FIMBEL, R. and ROBINSON, J. 2001. Tropical forest management and conservation of biodiversity: an overview. Conservation Biology 15: 7–20.
RESTREPO, V.R., HOENIG, J.M., POWERS, J.E., BAIRD, J.W. and TURNER, S.C. 1992. A simple simulation approach to risk and cost-analysis, with applications to swordfish and cod fisheries. Fishery Bulletin 90: 736–748.
SAINSBURY, K. 1988. The ecological basis of multispecies fisheries, and management of a demersal fishery in tropical Australia. In: Gulland, J. (ed.) Population Dynamics. 1st edition, John Wiley and Sons Ltd, London. 422pp.
WADE, P.R. 2000. Bayesian methods in conservation biology. Conservation Biology 14: 1308–1316.
WOLFSON, L.J., KADANE, J.B. and SMALL, M.J. 1996. Bayesian environmental policy decisions: Two case studies. Ecological Applications 6: 1056–1066.