Flexible hierarchical mark-recapture modeling for open populations using WinBUGS

Matthew R. Schofield1, Richard J. Barker2, Darryl I. MacKenzie3

November 9, 2007

1 University of Otago, Department of Mathematics and Statistics, P.O. Box 56, Dunedin, New Zealand, email: [email protected]

2 University of Otago, Department of Mathematics and Statistics, P.O. Box 56, Dunedin, New Zealand, email: [email protected]

3 Proteus Wildlife Research Consulting, P.O. Box 5193, Dunedin, New Zealand, email: [email protected]

Abstract

Hierarchical mark-recapture models offer three advantages over classical mark-recapture models: (i) they allow expression of complicated models in terms of simple components; (ii) they provide a convenient way of modeling missing data and latent variables, so that relationships involving latent variables can be expressed in the model; (iii) they provide a convenient way of introducing parsimony into models involving many nuisance parameters. Expressing models using the complete data likelihood, we show how many of the standard mark-recapture models for open populations can be readily fitted using the software WinBUGS. We include examples that illustrate fitting the Cormack-Jolly-Seber model, multi-state and multi-event models, models including auxiliary data, and models including density dependence.

Keywords: Bayesian; Hierarchical Modeling; WinBUGS.

1 Introduction

In the last 10-15 years, Bayesian inference has undergone a renaissance, largely due to the development of computer-intensive methods such as Markov chain Monte Carlo (MCMC) for fitting complex hierarchical models. Hierarchical models can be difficult to fit using frequentist techniques because finding expressions for the (marginal) likelihood function requires complex, high-dimensional integrals. By obviating the need for explicit integration, Bayesian methods overcome these difficulties. Hierarchical models have been developed in many areas of ecological modeling, including capture-recapture methods (for example, Johnson and Hoeting 2003, Link and Barker 2005).

Hierarchical models are of interest because of three appealing features:

1. Models can be expressed in a way that is natural for ecologists. Often complex marginal distributions can be split into conditional distributions that are both simpler and more natural to work with. For example, the negative binomial distribution arises when a random variable Y is distributed as a Poisson, P(λ), with the parameter λ distributed as a gamma distribution. In this example, models for the mean of the marginal distribution of Y can be easily expressed in terms of the parameter λ. Furthermore, the choice of distribution for λ need not be restricted to the gamma distribution; for example, one could choose the log-normal distribution, creating a greater variety of potential models for Y than the negative binomial alone.

2. Hierarchical models provide a useful framework for models with missing data and latent variables. This can facilitate inference about complex biological processes by allowing ecologists to model latent variables in the same way that they would if they were observed directly. For example, ecologists are often interested in the relationship between parameters such as survival probabilities and abundance N. If the data available concerning N are from an open population mark-recapture model, this is difficult because N does not appear in the likelihood. However, as we show below, embedding the mark-recapture model into a hierarchical model with abundance included as a latent variable allows expression of such density dependence relationships.

3. Hierarchical models can be used as a device for parsimonious expression of models involving many nuisance parameters.

A particularly useful way of representing hierarchical models is as a graphical model. Graphical models represent dependencies among random variables by a graph in which each random variable is a node and the lines between the nodes represent conditional probabilities. For example, the well-known linear regression model Y ∼ N(Xβ, σ²) has two nodes, X and Y, with the line connecting X and Y representing the conditional relationship Y|X as a normal distribution for Y with mean Xβ. Here, X is said to be the ‘parent’ of Y. This particular graphical model (Figure 1) is ‘directed’ in that the line connecting X to Y, representing the stochastic relationship Y|X, has a different meaning to the absent line connecting Y to X, representing the stochastic relationship X|Y. It is also ‘acyclical’ in the sense that there are no pathways that lead back to any particular node.

Directed acyclical graphs (DAGs) have been exploited in the computer program BUGS (Spiegelhalter et al. 2003), which uses MCMC to fit Bayesian hierarchical models that can be expressed via DAGs. BUGS can also be accessed via a graphical user interface (GUI) called WinBUGS (Spiegelhalter et al. 2003) that can write the BUGS model based on a DAG drawn by the user. The advent of BUGS means that complicated hierarchical models can now be fitted by ecologists within a Bayesian inference framework.
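As a numerical illustration of the Poisson-gamma example given under the first feature above, the following Python sketch compares the closed-form negative binomial probabilities with a direct numerical integration of the Poisson probability over a gamma-distributed λ. The function names, the shape/rate parameterization of the gamma distribution, and the values 3.0 and 0.8 are assumptions chosen for illustration and do not come from the paper.

```python
import math


def poisson_pmf(y, lam):
    """P(Y = y | lambda) for a Poisson random variable (lam > 0)."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def gamma_pdf(lam, shape, rate):
    """Density of a gamma distribution with the given shape and rate."""
    return math.exp(shape * math.log(rate) + (shape - 1) * math.log(lam)
                    - rate * lam - math.lgamma(shape))


def negbin_pmf(y, shape, rate):
    """Closed-form marginal of Y when lambda ~ Gamma(shape, rate)."""
    q = rate / (rate + 1.0)
    return math.exp(math.lgamma(y + shape) - math.lgamma(shape)
                    - math.lgamma(y + 1)
                    + shape * math.log(q) + y * math.log(1.0 - q))


def mixture_pmf(y, shape, rate, upper=40.0, n=20000):
    """Integrate Poisson(y | lambda) * Gamma(lambda) over (0, upper)
    with the midpoint rule; the tail beyond `upper` is assumed negligible."""
    h = upper / n
    total = 0.0
    for i in range(n):
        lam = (i + 0.5) * h
        total += poisson_pmf(y, lam) * gamma_pdf(lam, shape, rate)
    return total * h


# Largest discrepancy between the two routes over the first few counts.
max_gap = max(abs(mixture_pmf(y, 3.0, 0.8) - negbin_pmf(y, 3.0, 0.8))
              for y in range(6))
```

Replacing `gamma_pdf` with a log-normal density in `mixture_pmf` would give the alternative marginal model mentioned in the text, for which no comparably simple closed form is available.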
The cost of adopting the Bayesian inference framework is that priors need to be specified for unknowns (e.g., parameters, latent variables or predictions) that have no parents. A major benefit, however, is that all inference is carried out using conditional probability. Thus, the inference problem is reduced to one of finding the correct posterior distribution for unknowns of interest. A second benefit of Bayesian inference is that inference for hierarchical models is simplified by the fact that it is based on marginal posterior distributions, which can be found using techniques such as MCMC. In contrast, the high-dimensional integrals usually needed to find the joint likelihood function for maximum likelihood estimation prove intractable except for simple problems.

In this paper we illustrate the use of BUGS to fit complex mark-recapture models for open populations. Mark-recapture models can be thought of fundamentally as models for missing data, or equivalently, corrupted data. In the Cormack-Jolly-Seber model, inference about survival probabilities is based on right-censored survival times for a marked sample of animals. The approach we take is to express each model in terms of what is known as the complete-data likelihood (CDL) and show how the model can be fitted in BUGS. The CDL is obtained by using data augmentation (Tanner and Wong 1987) to write a likelihood that explicitly includes missing data components that are usually integrated out of the commonly used observed-data likelihood. For example, the interval-censored times of death are usually integrated out of capture-recapture models. We instead choose to include the interval-censored times of death and write our models in terms of the CDL.

Using the CDL allows us to have a clear separation of nuisance parts of the model and parts that are of biological interest. The CDL also facilitates hierarchical modeling and makes some useful extensions of the model relatively easy to fit using Bayesian modeling methods. For example, the multi-state model (Brownie et al. 1993, Schwarz et al. 1993) or models including auxiliary data (Burnham 1993, Barker 1997) can be fitted in BUGS without the need for specialized likelihood functions (see 2.2.1 and 2.3). A further advantage of the CDL is that latent components, such as population size, are available for use in the analysis; population size, for example, can be used to fit density-dependence models (see 2.4).

2 Modeling using the CDL

We outline how to fit some commonly used mark-recapture models using the CDL in BUGS, starting with the Cormack-Jolly-Seber model (Cormack 1964, Jolly 1965, Seber 1965). We generalize to more complex models by multiplying the CDL by additional conditional likelihood components (Schofield and Barker 2008). If the additional likelihood components have missing data, we follow the same procedure of including the missing data in the model so that we are modeling in terms of the CDL. Unless otherwise stated, we used vague priors on parameters in each of the examples. Copies of our WinBUGS code are available at www.maths.otago.ac.nz/~rbarker/WinBUGS.

2.1 Cormack-Jolly-Seber model

The data we obtain from a standard $k$-period capture-recapture study is the $u_\cdot$ by $k$ matrix $X$, consisting of 0's and 1's, where $u_\cdot$ is the total number of observed individuals. A value $X_{ij} = 1$ denotes capture of individual $i$ in sample $j$, with $X_{ij} = 0$ otherwise. The data also contain information on the time of death for each individual. In order to write simple WinBUGS code we choose to express this through the matrix $a$. The value $a_{ij} = 1$ means that individual $i$ was alive at the time of sample $j$, with $a_{ij} = 0$ otherwise. The matrix $a$ comprises missing and observed components. As we observed the individual alive when captured, the values of $a$ are observed to be 1 from the sample of first capture up to and including the sample of last capture for each individual.

Using a classical approach and defining the survival probability between sample $j$ and $j + 1$ as $S_j$, we sum over the missing components of $a$ using

$$\chi_j = (1 - S_j) + S_j (1 - p_{j+1}) \chi_{j+1}, \qquad j = 1, \ldots, k - 1,$$

with $\chi_k = 1$, to obtain the observed data likelihood (ODL). Including the unknown values of $a$ allows us to model in terms of the CDL. The CDL for the CJS model (Figure 2) can be factored to give the conditional likelihood component for survival multiplied by the conditional likelihood component for capture given the time of death,

$$[X \mid p, a, t_1]\,[a \mid S, t_1], \tag{1}$$

where $p$ is the probability of capture and $t_1$ is a vector indicating the time of first release for each animal. Usually, the CJS model is parameterized in terms of apparent survival $\phi_j = S_j F_j$, where $F_j$ is the probability that an animal in the marked population at occasion $j$ has not emigrated by occasion $j + 1$. Under the usual assumption of permanent emigration, $F_j$ is wholly confounded with $S_j$, hence the use of $\phi_j$; however, under an alternative assumption it is wholly confounded with $p_{j+1}$ (Burnham 1993). For clarity, we prefer to write the CJS model as in (1), because extending it to explicitly include movement is straightforward.

Whether or not an individual is alive at a given sampling period is the outcome of a single Bernoulli trial, so the conditional likelihood component for survival is

$$[a \mid S, t_1] = \prod_{i=1}^{u_\cdot} \prod_{j=t_{1i}+1}^{k} [a_{ij} \mid a_{i,j-1}, S], \tag{2}$$

where $[a_{ij} \mid a_{i,j-1}, S] = \mathrm{Bern}(a_{i,j-1} S_{j-1})$, $i = 1, \ldots, u_\cdot$, $j = t_{1i} + 1, \ldots, k$, and $t_{1i}$ is the sample of first release for individual $i$. The term $a_{i,j-1}$ is required because an individual, once dead, cannot return to life. Conditional on knowing $a$, the capture process is a series of Bernoulli trials,

$$[X \mid p, a, t_1] = \prod_{i=1}^{u_\cdot} \prod_{j=t_{1i}+1}^{k} [X_{ij} \mid p, a], \tag{3}$$

where $[X_{ij} \mid p, a] = \mathrm{Bern}(a_{ij} p_j)$, $i = 1, \ldots, u_\cdot$, $j = t_{1i} + 1, \ldots, k$.

The term $a_{ij}$ is required because an individual is available for capture only while it is alive. As can be seen with the CJS model, specifying the problem in terms of the CDL allows us to factorize the model naturally, with the survival process of interest separate from the capture process, which is a nuisance aspect of the model.

Coding the CDL into BUGS requires three steps:

1. Specifying the model for the partially observed alive matrix as in (2).
2. Specifying the model for the captures conditional on individuals being alive as in (3).
3. Specifying the required data in order to fit the model. The data for the CJS model are the matrix of observations $X$ and the partially observed matrix $a$. All missing values of $a$ (before first capture and after last capture) are specified as NA.

The WinBUGS code used for the CJS component is similar to that used by Gimenez et al. (2007).
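The data step and the link between the CDL and the ODL can be sketched in Python (used here only for illustration; this is not the authors' WinBUGS code). The hypothetical helper `alive_matrix` builds the partially observed alive matrix from capture histories, with `None` standing in for NA. The hypothetical function `cdl_sum_after_last_capture` sums the Bernoulli terms of the survival and capture components over every possible time of death after the last capture, which should reproduce the χ recursion of the classical approach. Indices are 0-based, `S[j]` is assumed to be survival from sample j to sample j + 1, and `p[j]` is assumed to be the capture probability at sample j; all names and numerical values are assumptions.

```python
def alive_matrix(X):
    """Partially observed alive matrix: 1 between first and last capture
    (inclusive), None (NA) elsewhere. Assumes every row has a capture."""
    a = []
    for row in X:
        caught = [j for j, x in enumerate(row) if x == 1]
        first, last = caught[0], caught[-1]
        a.append([1 if first <= j <= last else None
                  for j in range(len(row))])
    return a


def chi(S, p):
    """chi[j]: probability of never being caught after sample j,
    given alive at sample j (backward recursion, chi[k-1] = 1)."""
    k = len(p)
    out = [1.0] * k
    for j in range(k - 2, -1, -1):
        out[j] = (1 - S[j]) + S[j] * (1 - p[j + 1]) * out[j + 1]
    return out


def cdl_sum_after_last_capture(S, p, last):
    """Sum the complete-data terms over all latent alive/dead paths
    for an animal last caught at sample `last`."""
    k = len(p)
    total = 0.0
    for d in range(last, k):           # d = final sample at which alive
        term = 1.0
        for j in range(last + 1, d + 1):
            term *= S[j - 1]           # survived from j-1 to j
            term *= 1 - p[j]           # alive at j but not caught
        if d < k - 1:
            term *= 1 - S[d]           # died between d and d+1
        total += term
    return total


# Illustrative values only: k = 4 samples, two capture histories.
X = [[1, 0, 1, 0], [0, 1, 0, 0]]
S = [0.8, 0.7, 0.9]
p = [0.0, 0.4, 0.5, 0.6]               # p[0] is unused
a = alive_matrix(X)
gaps = [abs(cdl_sum_after_last_capture(S, p, j) - chi(S, p)[j])
        for j in range(len(p))]
```

In an MCMC fit the sum over latent paths is never computed explicitly; the sampler instead updates the `None` entries of the alive matrix, which is what makes the CDL formulation convenient.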

2.2 Models for time-varying covariates

Adding fully observed covariates to the CJS model requires two modifications to the BUGS code. First, the data input step must be modified to include the covariates. Second, the relevant parameters must be expressed as functions of the covariates in the model statement.

Often the covariates are individual-specific and partially observed, because the covariate value cannot be observed on occasions when the individual was not observed. Common partially observed covariates include individual length and weight, as well as covariates such as location or breeding status.

The inclusion of partially observed time-varying covariates in mark-recapture models has been the subject of considerable research over the past 15 years. The inclusion of a categorical covariate, $z$, in a capture-recapture model is a defining feature of the multi-state model (Schwarz et al. 1993, Brownie et al. 1993). In the multi-state mark-recapture model $z$ is partially observed, being known for each animal only in the samples in which the animal is caught. Accounting for the missing data in the covariate $z$ has spurred the development of computer programs such as M-SURGE (Choquet et al. 2004) and MARK (White and Burnham 1999), designed to model multi-state data using a specialized observed data likelihood function. Bonner and Schwarz (2006) have solved the equivalent problem for continuous time-varying covariates.

The approach we adopt is to extend the CDL for the CJS model to allow partially observed covariates by adding the distribution for the covariate $z$ and modifying the conditional distributions for $X$ and $a$ to include conditioning on $z$:

$$[X \mid p, a, z, t_1]\,[a \mid S, z, t_1]\,[z \mid \psi, t_1], \tag{4}$$

with the DAG representation in Figure 3. To include partially observed covariates in WinBUGS we need to (i) include the $u_\cdot \times k$ structure $z$ in the data file, with all unknown values of $z$ specified as NA, and (ii) specify the distribution for $z$.

2.2.1 Multi-state model

In the Schwarz-Arnason model (Schwarz et al. 1993), $z$ is a first-order Markov chain with transition matrix from sample $j$ to $j + 1$ denoted $\Psi_j$. To specify the distribution for the covariate we add to the BUGS model statement:

    for(i in 1:udot){
      # states are modeled from the sample after first release onwards
      for(j in (first[i]+1):k){
        # state at sample j depends only on the state at sample j-1
        z[i,j] ~ dcat(psi[z[i,j-1],1:ns,j-1])
      }
    }

where psi[j,h,l] is the probability of moving from state j to state h between samples l and l + 1, and ns is the number of states.
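A minimal Python sketch of the same first-order Markov structure may help clarify what the `dcat` statement encodes and how the partially observed state matrix is prepared. It is an illustration under stated assumptions, not the authors' code: `simulate_states` and `mask_unobserved` are hypothetical helpers, states and samples are 0-based, `None` stands in for NA, and the transition array is indexed `psi[l][g][h]` (sample first), which differs from the BUGS ordering used above.

```python
import random


def simulate_states(first, first_state, k, psi, rng):
    """Draw one individual's state history as a first-order Markov chain.

    psi[l][g][h] is the assumed probability of moving from state g to
    state h between samples l and l + 1. Entries before the sample of
    first release are left as None.
    """
    z = [None] * k
    z[first] = first_state
    for j in range(first + 1, k):
        probs = psi[j - 1][z[j - 1]]   # row of the transition matrix
        z[j] = rng.choices(range(len(probs)), weights=probs)[0]
    return z


def mask_unobserved(z, x):
    """Keep the state only at samples where the animal was caught."""
    return [state if caught == 1 else None for state, caught in zip(z, x)]


# Illustrative values only: two states, five samples, constant transitions.
rng = random.Random(1)
stay_or_move = [[0.7, 0.3], [0.2, 0.8]]
psi = [stay_or_move] * 4
z_full = simulate_states(first=1, first_state=0, k=5, psi=psi, rng=rng)
z_data = mask_unobserved(z_full, [0, 1, 0, 1, 0])
```

Here `z_data` plays the role of the data-file structure with unknown values set to NA, while `z_full` corresponds to the latent states that the sampler would update.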

2.2.2 Continuous covariate example

Bonner and Schwarz (2006) included partially observed body weight as a continuous covariate when examining the survival probabilities of the meadow vole, Microtus pennsylvanicus, at the Patuxent Wildlife Research Center in Laurel, Maryland. Using BUGS and following Bonner and Schwarz (2006), we modeled the effect of body weight on survival and capture probability as:

$$\mathrm{logit}(S_{ij}) = \beta_0 + \beta_1 w_{ij}, \qquad i = 1, \ldots, u_\cdot,\; j = 1, \ldots, k - 1,$$

$$\mathrm{logit}(p_{ij}) = \gamma_0 + \gamma_1 w_{ij}, \qquad i = 1, \ldots, u_\cdot,\; j = 1, \ldots, k - 1,$$

where $w_{ij}$ is the standardized weight (note that Bonner and Schwarz (2006) did not standardize the covariate values). The body weights are modeled through time as

$$w_{ij} \sim N(w_{i,j-1} + \Delta_{j-1}, \sigma^2), \qquad i = 1, \ldots, u_\cdot,\; j = t_{1i} + 1, \ldots, k,$$

recalling that $t_{1i}$ is the first release occasion for individual $i$. The BUGS code we used to express the covariate relationship was:

    for(i in 1:udot){
      for(j in (first[i]+1):k){
        w[i,j] ~ dnorm(mu[i,j],tau)
        mu[i,j]
