Bayesian Estimation of Age-specific Bird Nest Survival Rates With Categorical Covariates

Jing Cao1, Chong Z. He2 and Timothy D. McCoy3

1. Department of Statistical Science, Southern Methodist University
2. Department of Statistics, Virginia Tech
3. Nebraska Game and Parks Commission

e-mail: [email protected]

Summary.

The populations of many North American landbirds are showing signs of decline. Gathering information on breeding productivity allows critical assessment of population performance and helps identify good habitat-management practices. He (2003) proposed a Bayesian model to estimate age-specific nest survival rates. The model allows irregular visiting schedules under the assumption that the observed nests have homogeneous nest survival. Because nest survival studies are often conducted at different sites and in different time periods, it is not realistic to assume homogeneous nest survival. In this paper, we extend He’s model by incorporating these factors as categorical covariates. The simulation results show that the Bayesian hierarchical model can produce satisfactory estimates of nest survival and capture different factor effects. Finally, the model is applied to a Missouri red-winged blackbird data set.

Key words: Bird nest survival; Age-specific survival rate; Bayesian estimation; Hierarchical model; Noninformative prior.

1 INTRODUCTION

In a typical nest survival study, investigators periodically search for active nests in a marked area and revisit them until they have failed (destroyed or abandoned) or succeeded (at least one egg hatches or one young bird fledges, as defined in the study). Note that nests that failed before they could be discovered are not included in the data.

Mayfield (1961, 1975) noticed that the apparent nest survival, defined as the proportion of successful nests in a sample, is often positively biased because nest losses early in incubation are underrepresented. He proposed an ad hoc method to estimate the daily survival rate using exposure days (the cumulative number of days the nests are under observation) and the number of losses. Early statistical methods for nest survival commonly assume that the daily survival rates are constant throughout the nesting period (Mayfield, 1961; Mayfield, 1975; Miller and Johnson, 1978; Johnson, 1979; Hensler and Nichols, 1981; Bart and Robson, 1982). More recent methods relax this assumption but still require other constraints, such as daily visits or a known nest age when the nest is first encountered. See He (2003) for a more detailed account of the literature on nest survival.

He, Sun, and Tra (2001) developed a Bayesian model for age-specific nest survival studies in which nests are revisited daily. In reality, a schedule of daily visits is hard to maintain because of bad weather, schedule conflicts, and similar obstacles. Furthermore, biologists believe that the act of observation can influence nest survival. For example, the direct disturbance of nests may cause the parent birds to abandon them, or may alert predators to the locations of nests they might otherwise have overlooked (Cornelius 1993). He (2003) extended the model to deal with general irregular visiting schedules, accommodating any mixture of visiting schedules. Because this model gives estimates of age-specific daily survival rates, it is useful for biologists studying bird nest survival.

These methods assume that all nests have a homogeneous survival pattern, whether constant or age-specific. In practice, observations are usually made at different sites and in different years and time periods. For example, the Missouri Red-winged Blackbird data, which we analyze for illustration, were collected from two counties in three years.
In each year, there were three observation periods (15 May to 14 June, 15 June to 14 July, and 15 July to 7 August). It is not reasonable to assume homogeneous nest survival across years and sites.

Recently, approaches to estimating nest survival that incorporate covariates (biological factors) have been proposed. Dinsmore, White, and Knopf (2002) introduced a logistic nest survival model, now available in the program MARK (White and Burnham, 1999), that can incorporate covariates. Shaffer (2004) proposed a logistic-exposure nest survival model that uses maximum likelihood to estimate parameters. Instead of treating each observed nest as an observation, Shaffer treated each visitation interval for a nest as an observation. The daily survival rates were assumed to be homogeneous within each visitation interval. The average nest age and the average values of the covariates during each interval provide the data for Shaffer’s model. Both methods allow irregular visiting schedules, but they require that nests be aged correctly at first encounter, an assumption that may be difficult to meet in field studies.

In this paper, we extend He’s model (2003) to incorporate categorical covariates. We assume a hierarchical logistic prior for the survival hazard rate instead of the independent beta prior in He’s model. The prior includes four covariates (site, year, observation period, and nest age). This model requires neither the correct nest age at first encounter nor the date of the nest outcome (success or failure). We also compare two different priors for the nest age effect. The model has the potential to incorporate additional covariates of interest.

In Section 2, we introduce the notation of He’s Bayesian model (2003). In Section 3, we extend He’s model by changing the priors for the survival hazard rates to include categorical covariates. In Section 4, a simulation study is conducted to evaluate the performance of the models. In Section 5, the estimation results for the Missouri red-winged blackbird data are presented, and the Deviance Information Criterion (DIC) is employed for model selection. A discussion of the model is provided in Section 6.
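As a point of reference for the constant-survival methods reviewed in this section, the Mayfield estimator can be sketched in a few lines. This is only an illustration of the exposure-days idea described above: the function names and the example counts are hypothetical and do not come from any data set in this paper.

```python
def mayfield_daily_survival(losses, exposure_days):
    """Mayfield estimate of a constant daily survival rate.

    losses: number of nests lost while under observation.
    exposure_days: cumulative number of days the nests were observed.
    """
    if exposure_days <= 0:
        raise ValueError("exposure_days must be positive")
    return 1.0 - losses / exposure_days


def mayfield_nest_success(losses, exposure_days, nesting_period_days):
    """Probability of surviving the whole nesting period,
    assuming the same daily survival rate on every day."""
    dsr = mayfield_daily_survival(losses, exposure_days)
    return dsr ** nesting_period_days


# Hypothetical counts, for illustration only.
dsr = mayfield_daily_survival(losses=12, exposure_days=400)
success = mayfield_nest_success(12, 400, nesting_period_days=24)
```

The constant-rate assumption is exactly what the age-specific models discussed in this paper relax.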

2 NOTATIONS IN HE’S MODEL

Suppose that $J$ is the number of days a nest is required to survive to be considered successful, $n$ is the number of observed nests (sample size), and $k$ designates the $k$th observed nest. We define $u_k$, $z_k$ and $t_k$ to be the nest age at the first encounter, the number of days from the first encounter to the outcome, and the nest age at the outcome of the $k$th nest, respectively. Define

$$y_k = \begin{cases} 1, & \text{if the $k$th nest is a success,} \\ 0, & \text{otherwise.} \end{cases}$$

When $y_k = 1$, we can see that $t_k = J$. Under an irregular visiting schedule, we know $z_k$ only up to an interval $[ZL_k, ZR_k]$, where $ZL_k$ is the lower bound and $ZR_k$ is the upper bound of $z_k$. Similarly, we know $u_k$ only up to $[UL_k, UR_k]$.

The nests in the study area can be classified into 3 groups: undiscoverable nests ($u = \infty$), whose locations are inaccessible; truncated nests ($t < u \le J$), which failed before they could be discovered; and observed nests ($u \le t \le J$). Note that truncated nests and observed nests are discoverable ($u \le J$). Let

$$\delta_i = P(u = i \mid u \le J), \quad i = 1, \dots, J,$$

$$q_j = P(t = j, \, y = 0), \quad j = 1, \dots, J.$$

Here $\delta_i$ is the conditional probability that the nest age at first encounter is $i$, given that the nest is discoverable, and $q_j$ is the probability that the nest age at failure is $j$. Note that $\delta_1 + \cdots + \delta_J = 1$ and $0 \le q_1 + \cdots + q_J \le 1$. Let $q_{J+1} = 1 - (q_1 + \cdots + q_J)$. Then $q_{J+1} = P(y = 1)$ is the nest success rate.

Because the ranges of these parameters are irregular, He (2003) reparameterized $\delta_i$ and $q_j$ as discrete hazard rates so that the ranges of the new parameters are unit intervals. Define the hazard rates corresponding to $\delta_i$ and $q_j$ by

$$\alpha_i = \frac{\delta_i}{\delta_i + \cdots + \delta_J}, \quad i = 1, \dots, J - 1,$$

$$\xi_j = \frac{q_j}{q_j + \cdots + q_{J+1}}, \quad j = 1, \dots, J,$$

and $\alpha_J = \xi_{J+1} = 1$. Note that the hazard rates are conditional probabilities,

$$\alpha_i = P(u = i \mid i \le u \le J), \quad i = 1, \dots, J - 1,$$

$$\xi_j = P(t = j, \, y = 0 \mid j \le t \le J), \quad j = 1, \dots, J.$$

Finally, the age-specific daily survival rates are

$$s_j = P(t > j \mid t > j - 1) = 1 - \xi_j, \quad j = 1, \dots, J.$$

See He (2003) for more details of the model.
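The relationship between the failure probabilities $q_j$, the hazard rates $\xi_j$, the daily survival rates $s_j$ and the nest success rate $q_{J+1}$ can be checked numerically. The sketch below is a minimal illustration under the definitions in this section; the function names and the example values of $q$ are hypothetical. It relies on the fact that the product $s_1 \cdots s_J$ telescopes to $q_{J+1}$.

```python
def hazards_from_failure_probs(q):
    """Convert q = [q_1, ..., q_J] into hazard rates [xi_1, ..., xi_J].

    xi_j = q_j / (q_j + ... + q_{J+1}), where q_{J+1} = 1 - sum(q).
    """
    if any(qj < 0 for qj in q) or sum(q) > 1.0:
        raise ValueError("q must be nonnegative and sum to at most 1")
    xi = []
    remaining = 1.0  # q_j + ... + q_{J+1}; equals 1 when j = 1
    for qj in q:
        if remaining <= 0:
            raise ValueError("no probability mass left at this age")
        xi.append(qj / remaining)
        remaining -= qj
    return xi


def daily_survival(xi):
    """Age-specific daily survival rates s_j = 1 - xi_j."""
    return [1.0 - x for x in xi]


def nest_success(s):
    """Nest success rate as the product of the daily survival rates."""
    p = 1.0
    for sj in s:
        p *= sj
    return p


# Hypothetical failure probabilities for J = 3 (illustration only).
q = [0.1, 0.2, 0.3]
s = daily_survival(hazards_from_failure_probs(q))
success = nest_success(s)  # agrees with q_{J+1} = 1 - sum(q)
```

Working on the hazard scale is convenient because each $\xi_j$ ranges freely over the unit interval, whereas the $q_j$ are tied together by their sum.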

3 MODEL DEVELOPMENT

He (2003) dealt with only one nest population and assumed independent beta priors for the daily survival hazard rates. In practice, data are often collected at different sites and in different time periods. To incorporate these factors, we develop a Bayesian hierarchical model with categorical covariates. To expand the support of the survival hazard rate from $[0, 1]$ to $(-\infty, +\infty)$, we use a logit transformation. Define

$$v_{bscj} = \log\left(\frac{\xi_{bscj}}{1 - \xi_{bscj}}\right) = \beta_b + \varphi_s + \gamma_c + \theta_j + \varepsilon_{bscj}, \qquad (1)$$

where $\xi_{bscj}$ is the survival hazard rate at age $j$ in site $b$, year $s$ and observation period $c$; and $(\beta_b, \varphi_s, \gamma_c, \theta_j)$ represent the site effect, year effect, observation period effect, and age effect, respectively. Other effects are accounted for as unexplained variation $\varepsilon_{bscj}$. Because investigators follow established field methodology for collecting nesting data and they tend to follow the same visiting schedule during a study, it is reasonable to assume that the encounter rates are the same for all the cells (each cell is a combination of site, year, and observation period). As in He (2003), the conditional encounter hazard rates have one-stage beta priors,

$$(\alpha_i \mid c_i, c_i^*) \sim \text{Beta}(c_i, c_i^*), \quad i = 1, \dots, J - 1, \qquad (2)$$

where $c_i$ and $c_i^*$ are positive constants. Here and in the following, we use $(x \mid \cdot)$ to denote the conditional distribution of $x$ given its argument. The error term $\varepsilon_{bscj}$ in (1) is assumed to have a two-stage prior,

$$(\varepsilon_{bscj} \mid \delta_0) \sim N(0, \delta_0), \quad \delta_0 \sim IG(a_0, b_0), \qquad (3)$$

where $N(u, s)$ denotes a normal distribution with mean $u$ and variance $s$, and $IG(a_0, b_0)$ denotes an inverse gamma distribution. Here $a_0$ and $b_0$ are constants.

Usually, a small number of sites are included in a nesting study, and the sites may not be adjacent. It is unnecessary to put a spatial structure on the site effect. A two-stage normal prior is sufficient. The years included in the data collection period are usually consecutive, but there is no obvious influence from the previous year. Climate, which contributes the most to the year factor, is rather random. We also assume a two-stage normal prior for the year factor. As for the observation period factor, biologists suggest that nests with early-initiated incubations tend to have higher survival rates. Usually there are only two or three time periods of primary and secondary nesting activity during the breeding season. These can hardly form a time series, so we assume a two-stage normal prior. These three effects are taken as random effects, and the priors are:

$$(\beta_b \mid \delta_1) \sim N(0, \delta_1), \quad \delta_1 \sim IG(a_1, b_1), \qquad (4)$$

$$(\varphi_s \mid \delta_2) \sim N(0, \delta_2), \quad \delta_2 \sim IG(a_2, b_2), \qquad (5)$$

$$(\gamma_c \mid \delta_3) \sim N(0, \delta_3), \quad \delta_3 \sim IG(a_3, b_3), \qquad (6)$$

where $a_i$ and $b_i$ ($i = 1, 2, 3$) are constants. Next, we introduce two age-effect priors based on different assumptions.

3.1 Age-Effect Prior 1 (Normal)

In some situations, researchers may have very limited knowledge about the birds’ survival pattern. For example, there may be several different stages in the nesting period. The iid normal prior is a natural choice because it provides more flexibility, and it is very simple to implement. With the other priors remaining the same, we assume

$$\theta_j \sim N(u, s), \quad j = 1, \dots, J,$$

where $u$ and $s$ are constants. The age effect is a fixed effect in the model.
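To make the structure of the hierarchical prior concrete, the following sketch draws one set of survival hazard rates from the logit model (1) with the two-stage normal priors for the site, year and period effects and the iid normal prior for the age effect. It is a prior simulation only, not the Gibbs sampler used for estimation. The function names, the default hyperparameters $(a, b) = (2, 1)$, $u = 0$ and $s = 1$, and the cell layout in the example are assumptions made for illustration.

```python
import math
import random


def inv_gamma(rng, a, b):
    """Draw from IG(a, b), with density proportional to x**(-a-1) * exp(-b/x)."""
    return 1.0 / rng.gammavariate(a, 1.0 / b)


def inv_logit(v):
    """Numerically stable inverse of the logit transformation."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def draw_hazards(rng, n_sites, n_years, n_periods, J,
                 a=2.0, b=1.0, u=0.0, s=1.0):
    """Draw hazards xi[(site, year, period, age)] from the prior in (1)."""
    # Variance components delta_0, ..., delta_3 (assumed to share a and b).
    d0, d1, d2, d3 = (inv_gamma(rng, a, b) for _ in range(4))
    beta = [rng.gauss(0.0, math.sqrt(d1)) for _ in range(n_sites)]
    phi = [rng.gauss(0.0, math.sqrt(d2)) for _ in range(n_years)]
    gamma = [rng.gauss(0.0, math.sqrt(d3)) for _ in range(n_periods)]
    theta = [rng.gauss(u, math.sqrt(s)) for _ in range(J)]  # iid normal age effect
    xi = {}
    for bi in range(n_sites):
        for si in range(n_years):
            for ci in range(n_periods):
                for j in range(J):
                    eps = rng.gauss(0.0, math.sqrt(d0))
                    v = beta[bi] + phi[si] + gamma[ci] + theta[j] + eps
                    xi[(bi, si, ci, j)] = inv_logit(v)
    return xi


# Example layout: 2 sites, 3 years, 2 observation periods, J = 19.
xi = draw_hazards(random.Random(1), 2, 3, 2, 19)
```

Note that `random.gauss` takes a standard deviation, so each variance is passed through `math.sqrt`, in keeping with the convention that $N(u, s)$ is parameterized by its variance.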

3.2 Age-Effect Prior 2 (AR(1))

It is reasonable to believe that the survival rates in the nesting period are autocorrelated. A first-order autoregressive time series model (AR(1)) is useful in accounting for sequential dependence and provides a means of smoothing. Sun, Speckman and Tsutakawa (2000) incorporated an AR(1) process into a generalized linear mixed model. In our case, we assume that

$$(\theta_1 \mid \mu, \rho, \delta_4) \sim N\left(\mu, \frac{\delta_4}{1 - \rho^2}\right),$$

$$(\theta_j \mid \theta_{j-1}, \mu, \rho, \delta_4) \sim N(\rho(\theta_{j-1} - \mu) + \mu, \, \delta_4), \quad \text{for } j = 2, \dots, J,$$

$$\mu \sim N(u, s), \quad \rho \sim \text{Uniform}(-1, 1), \quad \delta_4 \sim IG(a_4, b_4),$$

where $u$, $s$, $a_4$, and $b_4$ are constants. The joint prior distribution of $\theta = (\theta_1, \dots, \theta_J)'$ is

$$(\theta \mid \mu, \rho, \delta_4) \sim N(\mu \mathbf{1}, \tau \Sigma), \qquad (7)$$

where $\mathbf{1}$ is a $J \times 1$ vector with all elements equal to 1, $\tau = \delta_4 / (1 - \rho^2)$, and $\Sigma$ is a $J \times J$ matrix with elements $\sigma_{ij} = \rho^{|i-j|}$. Similar to He’s proof (2003), we can show that the posterior distribution is proper.
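The link between the sequential AR(1) specification and the joint covariance $\tau \Sigma$ in (7) can be illustrated with a short sketch. It builds the matrix with elements $\tau \rho^{|i-j|}$ and draws one path from the conditional form. The function names and the example values of $\rho$ and $\delta_4$ are hypothetical.

```python
import math
import random


def ar1_covariance(J, rho, delta4):
    """Covariance matrix tau * Sigma with Sigma[i][j] = rho ** |i - j|."""
    tau = delta4 / (1.0 - rho ** 2)
    return [[tau * rho ** abs(i - j) for j in range(J)] for i in range(J)]


def sample_ar1(rng, J, mu, rho, delta4):
    """Draw (theta_1, ..., theta_J) from the sequential AR(1) prior."""
    # theta_1 starts from the stationary distribution N(mu, delta4 / (1 - rho^2)).
    theta = [rng.gauss(mu, math.sqrt(delta4 / (1.0 - rho ** 2)))]
    for _ in range(1, J):
        mean = rho * (theta[-1] - mu) + mu
        theta.append(rng.gauss(mean, math.sqrt(delta4)))
    return theta


# Hypothetical values chosen so that tau = 1.
cov = ar1_covariance(4, rho=0.5, delta4=0.75)
path = sample_ar1(random.Random(1), 19, mu=0.0, rho=0.5, delta4=0.75)
```

Starting $\theta_1$ from the stationary variance is what keeps the marginal variance constant across ages: $\rho^2 \tau + \delta_4 = \tau$, so every diagonal element of the covariance equals $\tau$.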

4 SIMULATION STUDY

In the simulation study, we assume that there are 2 sites, 3 years and 2 observation periods (12 cells altogether). The cell sample size varies from 1 to 43, and the total sample size is 285. The true survival rates in the cells follow the pattern given by the logit linear mixed model (1). All the cells have the ‘S-shaped’ survival curve suggested by biologists. Assuming $J = 19$ and nest visits every two days, we generate 100 samples, each with sample size 285. Before implementing Gibbs sampling, noninformative priors are assigned to the parameters. For example, the variance parameters ($\delta_i$) follow an inverse gamma distribution with hyperparameters $(a_i, b_i) = (2.0, 1.0)$, which has mean 1.0 and infinite variance.

Note that the figure and the tables in this section present the mean (and the $\sqrt{\text{mse}}$) of the estimates over the 100 samples. Figure 1 shows the simulation results for Cell 1. It includes two sets of estimates from the Bayesian hierarchical model, one for each of the two age-effect priors. The nest success rates of all the cells are listed in Table 1. As shown in Figure 1, both sets of estimates follow the true survival curve closely. By comparison, the estimated survival curve under the AR(1) prior is smoother and more stable, because the AR(1) prior provides a smoothing effect in the estimation. Table 2 presents the true values and the estimates of the three factor effects. The magnitude of the effects varies from 0.07 to 0.3. The estimates successfully capture the factor effects on nest survival.

We let the cell sample size vary from 1 to 43 because the Missouri red-winged blackbird data set has similarly unbalanced cell sizes. We have tried other balanced and unbalanced cell sample sizes. Our experience is that, given the same total sample size, the cell sample sizes have little effect on the results. To test the robustness of the model, we tried different shapes of survival curve in the simulation study. In addition, we considered true survival patterns that do not follow the logit linear mixed model (1); instead, the survival rates in a cell are parallel to, proportional to, or a mixture of the two relative to the rates in the other cells. The true survival curves and factor effects are estimated with accuracy similar to that shown in this simulation study.
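The summary statistics reported for the simulation (the mean of the estimates and the square root of the mean squared error over the replicated samples) can be computed as in the sketch below. The function name and the three example estimates are hypothetical; only the true value 0.34663 is taken from Cell 1 in Table 1.

```python
import math


def mean_and_rmse(estimates, true_value):
    """Mean of the estimates and root mean squared error about the true value."""
    n = len(estimates)
    mean = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return mean, math.sqrt(mse)


# Three made-up replicate estimates; the study itself uses 100 replicates.
mean, rmse = mean_and_rmse([0.30, 0.35, 0.40], true_value=0.34663)
```

Because the mean squared error is taken about the true value rather than about the mean of the estimates, it reflects both the bias and the variability of an estimator.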

5 APPLICATION

We applied the model to the data from McCoy’s study (1996), which was partly funded by the Missouri Department of Conservation. The project studied whether avian abundance and composition, avian nesting success, and vegetation characteristics differ between different kinds of fields in northern Missouri. The study areas were located in two counties (Macon and Linn) in north-central Missouri. The same areas were sampled from 1993 to 1995. In each year, there were three observation periods (15 May to 14 June, 15 June to 14 July, and 15 July to 7 August), making 18 cells. During each observation period, teams of searchers walked abreast (approximately 1 m apart), actively scanning for nests and flushing birds until the nesting plots had been systematically traversed. They marked the nests and revisited them every two to three days to determine outcomes. The success of a nest was defined as the nest producing at least one fledgling. There are 246 nests recorded in the data, but the cell sample size varies from 1 to 43.

There are three factors (year, site, and observation period) in the study. To select the proper subset of the factors, we use the Deviance Information Criterion (DIC) (Spiegelhalter et al., 2002). For model $M_i$, let $\theta_i$ be the unknown parameters, $f_i(y \mid M_i, \theta_i)$ be the likelihood function, and $\bar{\theta}_i$ denote the posterior means of $\theta_i$. The DIC for model $M_i$ is given by

$$\text{DIC}_i = \bar{D}_i + p_{D_i}, \qquad (8)$$

where $\bar{D}_i$ is the posterior mean deviance, suggested as a measure of fit, and $p_{D_i}$ is the effective number of parameters, calculated as the difference between the posterior mean deviance and the deviance at the posterior means of the parameters. The quantity $p_{D_i}$ serves as a penalty term that measures the complexity of the model. Thus the DIC provides a Bayesian measure of model fit and complexity. Smaller values of DIC indicate better models.

Table 3 summarizes the DIC scores of the candidate models with the AR(1) prior. $\text{Model}_{yo}$, the model with the year and observation period factors, has the smallest DIC. Its fit is close to that of the global model ($\text{Model}_{syo}$), and its effective number of parameters is about one unit smaller. From Table 3 we can see that the observation period factor is the most important of the three factors. The incorporation of the site factor does not improve the estimation.

Figure 2 shows the estimates of red-winged blackbird nest survival in 1993 during the first observation period. For the red-winged blackbird, there are thirteen days in the incubation stage and eleven days in the nestling stage. In Figure 2, one can observe an obvious drop in the survival rate starting at the transition from incubation to nestling. This may be explained by the fact that there is no vocalization by the young prior to hatching. The begging call, which is structurally simple and quiet initially, develops soon after hatching. It becomes louder and clearly audible later in the nestling stage, and thus may draw attention from predators (Yasukawa and Searcy, 1995). As the nestlings grow older, they tend to get stronger, and the survival rate recovers.

Table 4 presents the estimated nest success rates from $\text{Model}_{yo}$ with the AR(1) prior. It shows a declining trend through the years. The nest success rate was much lower in the third observation period than in the previous periods, indicating that nest survival was significantly lower late in the breeding season.
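The DIC in (8) is straightforward to compute once deviance values are available from the posterior draws. The sketch below is a minimal illustration: the function name and the toy deviance values are hypothetical, while the final check uses the $\bar{D}$ and $p_D$ values reported for $\text{Model}_{yo}$ in Table 3.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, D_bar, p_D) following equation (8).

    deviance_draws: deviance evaluated at each posterior draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior
        means of the parameters.
    """
    d_bar = sum(deviance_draws) / len(deviance_draws)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, d_bar, p_d


# Toy deviance values, for illustration only.
toy_dic, toy_d_bar, toy_p_d = dic([10.0, 12.0, 14.0], 9.0)

# Arithmetic check against Table 3 (Model_yo): D_bar + p_D.
dic_yo = 1400.079 + 12.719
```

Using the Table 3 values, the sum reproduces the reported DIC of 1412.798 for $\text{Model}_{yo}$.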

6 DISCUSSION

In this paper, we develop a Bayesian hierarchical model to include categorical covariates that may influence nest survival. With this model, we can study the effects of various factors and may be able to give suggestions on improving habitat- and land-management practices. Compared with other current methods for nest survival, our model requires the fewest assumptions. One underlying assumption in the model is that nest survival is homogeneous within each combination of levels of the categorical variables. In this study, for example, red-winged blackbird nest survival is assumed to be homogeneous within each cell. If some other biological factor is also a categorical variable, it can be easily incorporated into the model. Increasingly, nest survival data sets include GPS coordinates and continuous covariate measurements, such as vegetation height at the nest and distance to woody cover. In this case, homogeneous nest survival in the area may no longer be a valid assumption. We are developing another Bayesian model to incorporate continuous covariates in nest studies.

REFERENCES

Bart, J. and Robson, D.S. (1982), “Estimating survivorship when the subjects are visited periodically”, Ecology, 63, 1078-1090.

Cornelius, W.L. (1993), An Avian Nest Survival Modeling Scheme and Comparison of Nest Survival Probability Estimation Methods, Ph.D. Thesis, North Carolina State University.

Dinsmore, S.J., White, G.C., and Knopf, F.L. (2002), “Advanced techniques for modeling avian nest survival”, Ecology, 83 (12), 3476-3488.

He, Z., Sun, D., and Tra, Y. (2001), “Bayesian modeling age-specific survival in nesting studies under Dirichlet priors”, Biometrics, 57, 1059-1066.

He, Z. (2003), “Bayesian modeling of age-specific survival in bird nesting studies under irregular visits”, Biometrics, 59, 962-973.

Hensler, G.L. and Nichols, J.S. (1981), “The Mayfield method of estimating nesting success: a model, estimators and simulation results”, Wilson Bulletin, 93, 42-53.

Johnson, D.H. (1979), “Estimating nest success: the Mayfield method and an alternative”, The Auk, 96, 651-661.

Mayfield, H. (1961), “Nesting success calculated from exposure”, Wilson Bulletin, 73, 255-261.

Mayfield, H. (1975), “Suggestions for calculating nest successes”, Wilson Bulletin, 87, 456-466.

McCoy, T.D. (1996), Avian Abundance, Composition, and Reproductive Success on Conservation Reserve Program Fields in Northern Missouri, Master Thesis, University of Missouri-Columbia.

Miller, H.W. and Johnson, D.H. (1978), “Interpreting the results of nesting studies”, Journal of Wildlife Management, 42, 471-476.

Shaffer, T.L. (2004), “A unified approach to analyzing nest success”, The Auk, 121, 526-540.

Spiegelhalter, D.J., Best, N.G., Carlin, B.P., and van der Linde, A. (2002), “Bayesian measures of model complexity and fit”, Journal of the Royal Statistical Society, Series B, Methodological, 64, 583-616.

Sun, D., Speckman, P.L. and Tsutakawa, R.K. (2000), “Random effects in generalized linear mixed models”, in Dey, D.K., Ghosh, S.K. and Mallick, B.K., editors, Generalized Linear Models: A Bayesian Perspective, Marcel Dekker, New York, pages 23-39.

White, G.C. and Burnham, K.P. (1999), “Program MARK: survival estimation from populations of marked animals”, Bird Study, 46, 120-139.

Yasukawa, K. and Searcy, W.A. (1995), “Red-winged Blackbird”, The Birds of North America, 184.

Biographical Sketches

Dr. Jing Cao is an assistant professor in the Department of Statistical Science at Southern Methodist University. She received a Ph.D. in statistics from the University of Missouri-Columbia. Her research interests include Bayesian analysis, survival analysis, spatiotemporal models, and nonparametric inference.

Dr. Chong He is an associate professor in the Department of Statistics at Virginia Tech. She received a Ph.D. in statistics from Purdue University. Her research interests include Bayesian analysis, small area estimation, survival analysis, survey sampling and spatiotemporal models. She is especially interested in developing and applying statistical methods in ecology, conservation, and environmental research.

Dr. Timothy (Tim) McCoy has a Ph.D. in Fisheries and Wildlife from the University of Missouri, Columbia and is currently the Agricultural Program Manager with the Nebraska Game and Parks Commission. His research and current work have focused on identifying the impacts of federal agriculture conservation programs on wildlife populations, and improving wildlife benefits resulting from those programs.

[Figure 1. Left panel: age-specific daily survival rate versus age. Right panel: sqrt(mse) versus age.]

Figure 1: Bayesian estimates of age-specific daily survival rates and √mse for Cell 1 in the simulation study. Solid line: true daily survival rates; dotted line: estimates under the AR(1) prior; dashed line: estimates under the normal prior.

Table 1: Nest Success Rate Estimates (Simulation Study)

                           Nest Success Rate Estimate
Cell  Cell Size  True      Normal             AR(1)
1     30         0.34663   0.33497 (0.05634)  0.35414 (0.05272)
2     43         0.15174   0.16004 (0.04328)  0.16564 (0.04706)
3     21         0.44288   0.40871 (0.06961)  0.44171 (0.06844)
4     1          0.23289   0.22252 (0.04321)  0.23955 (0.04351)
5     15         0.35733   0.33977 (0.06345)  0.36784 (0.06167)
6     40         0.15998   0.16695 (0.04586)  0.17730 (0.05110)
7     35         0.24274   0.23359 (0.04721)  0.24662 (0.04632)
8     7          0.08201   0.09386 (0.03804)  0.09477 (0.04281)
9     25         0.33595   0.29734 (0.06282)  0.33238 (0.05691)
10    38         0.14370   0.13933 (0.03618)  0.14839 (0.03673)
11    17         0.25271   0.24801 (0.05001)  0.26022 (0.05053)
12    13         0.08788   0.10004 (0.04022)  0.10230 (0.04251)

Note: estimates’ √mse are included in the parentheses.

Table 2: Factor Effect Estimates (Simulation Study)

                  Estimate
Factor  True      Normal             AR(1)
Site-1  -0.15     -0.1782 (0.0776)   -0.1781 (0.0797)
Site-2   0.15      0.1403 (0.0802)    0.1409 (0.0791)
Year-1   0.1       0.0889 (0.0956)    0.0890 (0.0979)
Year-2  -0.17     -0.1657 (0.1212)   -0.1660 (0.1225)
Year-3   0.07      0.0467 (0.1012)    0.0491 (0.1013)
Obs-1   -0.3      -0.3170 (0.0896)   -0.3173 (0.0860)
Obs-2    0.3       0.2774 (0.0868)    0.2794 (0.0870)

Note: estimates’ √mse are included in the parentheses.

Table 3: DIC Results

Model      DIC       D̄         pD
Model_o    1417.780  1406.586  11.194
Model_y    1453.470  1441.524  11.945
Model_s    1455.387  1445.028  10.359
Model_yo   1412.798  1400.079  12.719
Model_so   1420.620  1408.306  12.314
Model_sy   1452.679  1440.356  12.323
Model_syo  1413.691  1399.915  13.776

s: site, y: year, o: observation period.

[Figure 2. Left panel: age-specific daily survival rate versus age. Right panel: standard deviation versus age.]

Figure 2: Bayesian estimates of age-specific daily survival rates and their standard deviations for the Missouri Red-winged Blackbird data. Solid line: estimates with the AR(1) prior; dotted line: indicates age = 13.

Table 4: Nest Success Rate Estimates

Year  Obs. Period  Nest Success Rate Estimate
1993  1            0.47701 (0.07186)
1993  2            0.53687 (0.07337)
1993  3            0.25326 (0.05453)
1994  1            0.43201 (0.06773)
1994  2            0.49175 (0.08817)
1994  3            0.22161 (0.05717)
1995  1            0.37939 (0.05669)
1995  2            0.43837 (0.07782)
1995  3            0.18412 (0.03688)

Note: estimates’ mse are included in the parentheses.