Spatial Distance Sampling Modeling of Cetaceans Observed from Platforms of Opportunity

By Peter Henrys April 2005


Abstract

In this paper I outline the standard methods of distance sampling and how they are used to obtain estimates of density and abundance for species of interest. I then develop these methods following the approach of Hedley (2000), whereby waiting distances between detections are modelled to produce a density map of the area of interest. In doing so, standard distance sampling, multi-covariate distance sampling and generalised additive models are all discussed and combined to achieve the density surface. The methods presented are then applied to a data set of fin whale sightings provided by the Biscay Dolphin Research Programme (BDRP). Using the spatial model produced, the BDRP's a priori beliefs about the location of fin whales and trends in their numbers are assessed. Both the density map produced (showing locations of high densities) and the within-season abundance estimates (together with 95% confidence intervals) support the claims set out by the BDRP.


Acknowledgements For this project I am very much indebted to the help, support and advice of my supervisor, Professor Stephen T. Buckland, without whom this project would not have been possible. I am also very grateful to Dr. Louise Burt for all her help and information that she kindly provided me with whenever necessary. I would also like to thank the Biscay Dolphin Research Programme who very kindly supplied me with their data which I use in the application of my methods in chapter 6 and also invited me to observe the survey work they carry out on board the ‘Pride of Bilbao’. For that I am very grateful. Finally I would like to thank my parents for all their support throughout this project and through all of my time at university. Without their support and guidance I know I would not be the person I am today. Thank you all. Peter.


Contents

Introduction

Conventional Distance Sampling
  Introduction
  Line Transect Methods
  Likelihood
  Variance
  Objects occurring in clusters
  Assumptions

Multi-Covariate Distance Sampling
  Introduction
  Likelihood
  Incorporating covariates into the detection function
  Density and Abundance estimation
  Variance
  Objects occurring in clusters
  Discussion

Generalised Additive Models
  Introduction
  GAM framework
  Smooth functions
  Assumptions
  Summary

Spatial Modelling
  Introduction
  Waiting distances
  Constant encounter rate model
  Use of the GAM
  Variable encounter rate model
  Summary

Application of methods
  Data
  Distance Sampling
  Waiting distance model
  Estimating density
  Estimating abundance
  Discussion

Discussion


1. Introduction

Over recent years the job of a wildlife manager has become increasingly demanding. The first, and most obvious, statistic he wishes to extract from a population is how many animals there are. By observing this over a given time frame one can examine trends in population numbers and judge whether the population is stable or whether there is cause for concern about its abundance. Stock assessment, as this is generally known, is crucial when managing a population or when one wishes to draw any inference from it. Methods for calculating such estimates have been around for many years, but this has not been the case for whale stocks, because whales can spend long periods of time underwater and thus go undetected when surveys are conducted. Recently, however, methods have been formulated to take account of this uncertain detection, and estimates of whale abundance and density can now be made. It is distance sampling methods of this kind, particularly line transect sampling, proposed by Seber (1982) and developed by Buckland et al (1993) and Buckland et al (2001), which have become acknowledged as the standard tool with which to assess whale stocks. These robust methods provide good estimates of abundance and density in the region of interest, together with their associated variance and precision.

The wildlife manager, however, now wishes to draw more information from his survey than just abundance and density. If the area surveyed is large, for example, he wishes to know where the high- and low-density regions are. This is often related to environmental factors affecting the population, which for whale populations can include ocean depth, sea surface temperature and other topographical measures. Knowing where high densities of the population are, and which environmental factors relate to or can explain this, enables the wildlife manager to control, maintain and monitor the population much more easily.

The standard distance sampling methods are, however, fairly restricted when trying to extract this sort of information from the population. The standard analyses produce estimates of average density over the region, not spatially specific densities throughout the region. To obtain the latter, the region is often stratified into sub-regions of interest and separate analyses are conducted in each stratum. This method falls down when there are limited observations in particular sub-regions, so that no analysis can be conducted in those areas. Other methods use gridded sample points throughout the area of interest and geographic information systems to provide detailed environmental maps from which the spatial distribution of a species can be obtained. However, a grid of sample points, although acceptable for terrestrial surveys, is not practical for cetacean surveys. Spatial distance sampling methods combine standard line transect analysis with other environmental factors in order to model the distribution of cetaceans throughout the region. The spatial models do not require a large sample size throughout the region and can be used to estimate abundance in any sub-region of interest by numerically integrating under the fitted surface; from this, average density in the sub-region can be estimated if required.

The main problem with cetacean surveys in the past has been their cost. It is much more expensive to conduct at-sea surveys, requiring a suitable survey vessel, than it is to observe terrestrial animals. It has therefore been beneficial for survey teams to conduct their surveys from so-called “platforms of opportunity”. These “platforms of opportunity”, often commercial vessels, merchant navy vessels or ferries, can provide the basis for large surveys and can run for long periods of time at a relatively low cost. This low-cost way of collecting large quantities of data has meant that surveys conducted from “platforms of opportunity” are very popular amongst cetacean research groups. When used with standard distance sampling methods, the non-random placement of transect lines, due to the predetermined route of the vessel, violates one of the main assumptions (randomly placed lines) and thus results in biased estimates of density and abundance. The spatial models, however, by using model-based methods rather than design-based ones, result in a much lower bias, because they use knowledge of the variable of interest and its relationship with auxiliary variables (Hedley 2000). Another solution, proposed by Marques (2001), is to incorporate any covariates that may affect the detection of an animal into the standard distance sampling framework.


In this paper I will outline the basics of distance sampling for line transects and how recent developments have enabled cetacean population estimates to be obtained. I will then go on to discuss “platforms of opportunity” and how the conventional distance sampling methods are developed in order to account for non-random transects. In the fourth chapter I will cover statistical models, in particular the generalised additive models used in the spatial modelling principles reviewed in chapter five. In chapter six I apply the methodology of the opening chapters to fin whales observed in the English Channel and Bay of Biscay. Finally, chapter seven reviews the methods and their application and offers future directions in the field of spatial modelling of animal distributions.


2. Conventional Distance Sampling

Introduction

In order to estimate density and abundance in a given region, the first, and most obvious, methods developed were what are now known as quadrat sampling and strip transect sampling. Quadrat sampling is based on randomly positioning m quadrats throughout the survey region and counting all objects that fall inside them. To obtain an estimate of density in the whole region, the number of objects counted is simply divided by the area of the surveyed region. This gives the simple formula:

D̂ = n / a    (2.1)

where n is the number of objects observed and a is the sum of the areas of the m quadrats. Strip transect sampling, an example of which is shown in Figure 2.1, is very similar to quadrat sampling with the slight adjustment that instead of counting in squares, we now travel down a line and count everything to a distance w either side of the line. The m lines are, like the quadrats, placed at random, or, more commonly, placed systematically, an equal distance apart with a random start point. The area surveyed, a, now has the following form:

a = 2wL,   where L = Σ_{i=1}^{m} l_i (the sum of the lengths of the transect lines)    (2.2)
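As a toy illustration of equations 2.1 and 2.2, the sketch below (with invented counts and transect lengths, not data from any real survey) computes a strip transect density estimate.

```python
# Hypothetical strip-transect survey: n objects counted within m strips of
# half-width w (km); density follows equations 2.1 and 2.2.
w = 0.5                                    # strip half-width in km (assumed value)
lengths = [12.0, 15.5, 9.8, 14.2, 11.0]    # transect lengths l_i in km (invented)
n = 47                                     # objects counted inside the strips (invented)

L = sum(lengths)        # total line length
a = 2 * w * L           # surveyed area, equation 2.2
D_hat = n / a           # estimated density (objects per km^2), equation 2.1
print(f"a = {a:.1f} km^2, D_hat = {D_hat:.3f} objects per km^2")
```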

The problem with the two approaches mentioned above is that they both assume that every object within the surveyed region is counted, an assumption that is often violated in practice, particularly with cetacean surveys.


Figure 2.1: A typical strip transect survey, with 5 transects of equal length l_i and width 2w, showing observed objects (red dots) within the strips and unobserved objects (black dots) outside them. The dark blue line shows the path taken by the observer along each transect.

Line transect methods

In line transect methods the assumption of observing everything within the strip is relaxed and instead the distance from the transect line to the observed object is recorded. The standard methods allow for objects away from the line to go undetected, but objects on the line should have probability of detection equal to one. This is due to the fundamental principle of distance sampling, which states that the further away from the transect line an object is, the smaller its detection probability. Like strip transect sampling, line transect sampling typically consists of randomly placed transects throughout the survey region, or more usually, a systematic grid of transects with a random start point. This random positioning of transects is crucial because it enables us to reasonably assume that objects are uniformly distributed with respect to distance from the transect line. This allows the proportion of missed objects to be estimated, and thus an estimate of total density or abundance can be obtained. Introducing P_a, the probability that a randomly selected object within the surveyed area is detected, and combining 2.1 and 2.2, we get:

D̂ = n / (2·w·L·P̂_a)    (2.3)

Because we no longer assume that all objects within the transect strip are observed, we wish to model the detection probability of an object at a perpendicular distance x away from the line, known as g(x). In order to obtain the function g(x), a probability density function, f(x), is fitted to the histogram of frequencies of objects at perpendicular distance x away from the transect line. Instead of using all the data obtained in the survey, Buckland et al (2001) suggest truncating around 5% of observations from the right-hand tail. This is often preferred so that the extreme tail of the data does not affect the overall fit of the model. Having plotted a histogram of observed distances from the line with appropriate truncation, a probability density function can now be fitted. To fit a probability density function to the histogram, there are standard general models to choose from, which combine a key function and a series expansion as shown in 2.4. The standard models are shown in Table 2.1 and all are suitable choices of model due to the criteria set out in Burnham et al (1979, 1980:44) – model robustness, pooling robustness, shape criterion and estimator efficiency. Note that there are no sine terms or odd polynomials in the series expansions because we require an even function (one that is symmetric about the line), as the distribution is folded from (−w, w) onto (0, w).

f̃(x) = key(x)[1 + series(x)]    (2.4)

where f̃(x) = f(x) rescaled so that it integrates to one.

The models suggested can all be fitted using the software program Distance (Thomas et al, 2004), which has become the accepted standard tool for distance sampling analysis. To choose which of the fitted models is most suitable, it is recommended to look at the goodness-of-fit of the model to the data, the fitted model plotted on the histogram of data, and Akaike's Information Criterion (AIC) (Akaike, 1973). The fit of the model is judged by the chi-squared goodness-of-fit test, and the model giving the minimum AIC value, given in 2.5, is generally preferred, although the suitability of the chosen model should also be considered. For example, if the negative exponential model provides the minimum AIC value, the spiked nature of the model would lead us to choose a different model if this was not thought appropriate.

AIC = −2·log_e(L) + 2q    (2.5)

where L is the likelihood function evaluated at its maximum and q is the number of parameters fitted. Having fitted a model to the histogram of observed objects, it is now crucial to understand how f(x) and g(x) are related so that the detection function, g(x), can also be fitted. To look at this relation, we can simply express each function in terms of what it represents probabilistically.

Key function                            Series expansion
Uniform, 1/w                            Cosine: Σ_{j=1}^{m} a_j cos(jπx/w)
Uniform, 1/w                            Simple polynomial: Σ_{j=1}^{m} a_j (x/w)^{2j}
Half-normal, exp(−x²/2σ²)               Cosine: Σ_{j=2}^{m} a_j cos(jπx/w)
Half-normal, exp(−x²/2σ²)               Hermite polynomial: Σ_{j=2}^{m} a_j H_{2j}(x/σ)
Hazard-rate, 1 − exp(−(x/σ)^{−b})       Cosine: Σ_{j=2}^{m} a_j cos(jπx/w)
Hazard-rate, 1 − exp(−(x/σ)^{−b})       Simple polynomial: Σ_{j=2}^{m} a_j (x/w)^{2j}

Table 2.1: The general models used to fit a pdf to the histogram of observed objects at a given distance, x, from the transect line.
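To make this fitting step concrete, the sketch below fits one of the candidate models from Table 2.1, the half-normal key with no adjustment terms, to simulated perpendicular distances by maximum likelihood and reports its AIC (equation 2.5). The distances and truncation distance are invented; in practice this fitting is done in program Distance.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
w = 2.0                                   # truncation distance in km (assumed)
x = np.abs(rng.normal(0.0, 0.8, 80))      # simulated perpendicular distances
x = x[x <= w]                             # right-truncate at w

def neg_log_lik(log_sigma):
    sigma = np.exp(log_sigma)
    # mu = integral of exp(-x^2 / (2 sigma^2)) from 0 to w, in closed form
    mu = sigma * np.sqrt(2.0 * np.pi) * (norm.cdf(w / sigma) - 0.5)
    f = np.exp(-x**2 / (2.0 * sigma**2)) / mu        # f(x) = g(x) / mu
    return -np.sum(np.log(f))

fit = minimize_scalar(neg_log_lik)        # single parameter, so q = 1
aic = 2.0 * fit.fun + 2.0 * 1             # AIC = -2 log L + 2q, equation 2.5
print(f"sigma_hat = {np.exp(fit.x):.3f}, AIC = {aic:.1f}")
```

Refitting the other key functions in the same way and comparing their AIC values is the model-selection step described above.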


f(x)dx = pr{object in (x, x + dx) | object detected}

       = pr{object in (x, x + dx) and object detected} / pr{object detected}

       = [pr{object detected | object in (x, x + dx)} · pr{object in (x, x + dx)}] / P_a

       = [g(x) · (dx·L)/(w·L)] / P_a

⟹  f(x) = g(x) / (w·P_a)    (2.6)

where P_a is the probability that a randomly selected object within the surveyed area is detected.

Recalling that f(x) is a pdf, ∫_0^w f(x)dx = 1 and therefore

∫_0^w g(x)dx = w·P_a    (2.6b)

This result, assuming that objects are uniformly distributed with distance from the line and that g(0) = 1 (i.e. detection of an object on the line is certain), is intuitively obvious, since it gives ∫_0^w g(x)dx / (w·1) = P_a (i.e. the probability that a randomly selected object within the surveyed area is detected is equal to the area under the detection function divided by 1·w, corresponding to complete detection out to distance w).

Let us now introduce the notation

µ = w·P_a = ∫_0^w g(x)dx    (2.6c)

From equation 2.3, a·P_a = 2wLP_a = 2µL, and from equation 2.6, µ = g(x)/f(x), which, at distance x = 0, gives

µ = 1/f(0),    (2.6d)

because g(0) is assumed to be 1. Thus g(x) is the same shape as f(x), but rescaled so that g(0) = 1. The value µ is called the effective strip (half-)width because the expected number of objects missed when located at distances less than µ is equal to the expected number seen at distances greater than µ. This is shown in Figure 2.2.

Figure 2.2: The top figure shows the pdf, f(x), fitted to perpendicular distances, which have been truncated at distance w , and grouped into intervals for displaying as a histogram. The bottom figure shows the corresponding fitted detection curve, g(x), with g(0)=1. The shaded regions are equal in area; µ is thus the effective strip (half-) width.


Equation 2.3 can now be re-written as:

D̂ = n·f̂(0) / (2L)    (2.7)
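As a quick numerical illustration (all values invented), equation 2.7 and the relation µ̂ = 1/f̂(0) can be applied as follows.

```python
# Toy numbers only: convert a fitted f(0) into the effective strip half-width
# and a density estimate via equations 2.6d and 2.7.
n = 54            # number of detections (invented)
L = 120.0         # total transect length in km (invented)
f0_hat = 0.71     # fitted pdf of perpendicular distances at zero, per km (invented)

mu_hat = 1.0 / f0_hat              # effective strip half-width, equation 2.6d
D_hat = n * f0_hat / (2.0 * L)     # objects per km^2, equation 2.7
print(f"mu_hat = {mu_hat:.2f} km, D_hat = {D_hat:.3f} objects per km^2")
```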

Likelihood

Maximum likelihood estimation is often used to estimate the parameters of a model. In line transect sampling we are particularly interested in N, the number of objects in the entire region, and so, having estimated other parameters, we can maximise the likelihood with respect to N and thus achieve our objective. In strip transects the likelihood we wish to maximise is fairly straightforward due to the major assumption that everything within the strip is seen. The likelihood to maximise, if groups are detected independently, is simply a binomial function with the number observed, n, equalling the number seen within the strips, the total number, N, equalling the number in the whole region and the probability of an object being within the strip, π_c = a/A.

L(N) = (N choose n)·(π_c)^n·(1 − π_c)^(N−n)    (2.8)

In line transect sampling the full likelihood becomes more complicated because it is no longer assumed that all objects within the transect strip are detected. Because of this the likelihood now consists of two components: the likelihood for N as in (2.8) and the likelihood for the vector of parameters θ in the detection function, g(x), and pdf, f(x). The first component of the likelihood, as in (2.8), known as the encounter rate component (Borchers et al, 2002), has a different binomial probability associated with it than that shown in (2.8). This is because instead of the probability being whether the animal was in the strip, it is whether it was in the strip and detected. So, introducing the notation P_a = detection probability of an object within the surveyed region, the encounter rate likelihood becomes

L(N | P_a) = (N choose n)·(π_c·P_a)^n·(1 − π_c·P_a)^(N−n)    (2.9)

The second component of the full likelihood is simply the likelihood, given the observed data, of the parameters of the detection function, obtained by using the pdf of the data, f(x). Using results we have already proved in this chapter, the likelihood of the vector of parameters, θ, is given by

L(θ) = ∏_{i=1}^{n} f(x_i) = ∏_{i=1}^{n} g(x_i)/(w·P_a) = (1/(w·P_a))^n ∏_{i=1}^{n} g(x_i)    (2.10)

The full likelihood is the product of equations 2.9 and 2.10:

L(N; θ) = (N choose n)·(π_c)^n·(1 − π_c·P_a)^(N−n) ∏_{i=1}^{n} g(x_i)/w    (2.11)

If we maximise the conditional likelihood in (2.10) to obtain estimates of the parameters, in particular P_a, and then substitute these values into the likelihood for N, equation (2.9), we obtain a maximum likelihood estimate of density,

D̂ = n / (2·µ̂·L),

which is identical to the density estimate described in equation (2.7).

In practice the full likelihood is rarely used, as it forces the assumption that objects are uniformly distributed through the whole survey region. Instead we maximise the conditional likelihood, given in (2.10), then estimate abundance design-based by multiplying up from the strips to the entire area.

Variance

A single estimate of density or abundance from a survey is not enough on its own. Any statistician would be interested in the variance of the estimates and/or the associated confidence intervals for population size N, because these can tell you more about the population than the estimate alone does. To obtain an estimate of variance from line transect surveys, the bootstrap (Efron, 1979; Efron and Tibshirani, 1993; Davison and Hinkley, 1997; Manly, 1997) is often used. The bootstrap is a straightforward method to implement and provides fairly robust results. There are two forms of the bootstrap – the nonparametric bootstrap and the parametric bootstrap.

The nonparametric bootstrap involves re-sampling units from those already observed, with replacement. The units are assumed to be independent and identically distributed, and in line transect surveys are usually taken to be the transect lines. Suppose that there are n transect lines in the survey region; the bootstrap then resamples n lines from these n with replacement. If we take B = 999 such resamples, we can follow the same procedure as with the original observations, refitting the original model to each resample, and thus obtain 999 estimates of D̂ or N̂. If these estimates are placed in ascending numerical order, a 95% confidence interval is given by simply reading off the 25th and 975th smallest values. More generally, a 100(1 − 2α)% confidence interval is given by the (B + 1)α-th and the (B + 1)(1 − α)-th smallest values (Buckland, 1984). The variance estimate of D̂ or N̂ from the nonparametric bootstrap is given simply as the sample variance of the estimates D̂ or N̂ from the B resamples obtained.

The parametric bootstrap involves generating new observations by using the data already obtained to tell you about the underlying population. A new population of objects is generated from some spatial state model, and new detection distances are then generated from the detection function fitted to the original data. If this is repeated B times, there will be B new samples and the standard procedure can again be carried out in order to obtain estimates of D̂ or N̂. Variance and confidence intervals are obtained as in the nonparametric bootstrap. In practice the nonparametric bootstrap is often favoured, taking the transect lines as sampling units, because the parametric bootstrap relies too heavily on the fitted detection function and the spatial state model, which can be inappropriate.
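A minimal sketch of the nonparametric bootstrap described above, with transect lines as the sampling units. The line lengths, detections and the fixed µ̂ are all invented; a full implementation would refit the detection function to each resample rather than holding µ̂ constant.

```python
import numpy as np

rng = np.random.default_rng(7)

# Each transect line: (length in km, perpendicular distances of its detections)
lines = [(10.0, [0.2, 0.9]), (12.5, [0.4]), (8.0, []),
         (15.0, [0.1, 0.6, 1.1]), (11.0, [0.8])]
mu_hat = 1.0    # effective strip half-width in km (held fixed for simplicity)

def est_density(sample):
    n = sum(len(dists) for _, dists in sample)
    L = sum(length for length, _ in sample)
    return n / (2.0 * mu_hat * L)          # D = n / (2 mu L)

B = 999
boot = np.sort([est_density([lines[j] for j in rng.integers(0, len(lines), len(lines))])
                for _ in range(B)])
lo, hi = boot[int((B + 1) * 0.025) - 1], boot[int((B + 1) * 0.975) - 1]
print(f"95% CI: ({lo:.3f}, {hi:.3f}), bootstrap variance = {boot.var(ddof=1):.6f}")
```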


Objects occurring in Clusters

For objects that occur in groups, a new component is incorporated into equation 2.7. Instead of n being the number of objects in the survey region, it is the number of groups or ‘clusters’. So, in order to obtain an estimate of density, a new term for expected cluster size is included in the defining equation. Thus equation 2.7 is re-written as

D̂ = n·Ê[s]·f̂(0) / (2L)    (2.12)

where Ê[s] is the expected size of clusters in the population. Unbiased estimates of cluster size can often be hard to obtain because, for example, larger clusters can be seen out to a greater distance than smaller clusters (Drummer and McDonald, 1987; Drummer et al, 1990; Otto and Pollock, 1990). Therefore the mean of observed cluster sizes is rarely taken as Ê[s]; instead, a regression of cluster size (or log_e of cluster size) against its associated detection probability is made, and E[s] is estimated where detection is certain (i.e. at g(x) = 1, where size bias should not occur), as sketched below.
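A hedged sketch of one such size-bias regression (invented data): log cluster size is regressed on the estimated detection probability of each cluster and the fitted line is evaluated at g(x) = 1. The smearing-type back-transformation used here is just one of several options in the literature.

```python
import numpy as np

rng = np.random.default_rng(3)
g_x = rng.uniform(0.2, 1.0, 80)            # estimated detection probability per cluster
s = np.maximum(1.0, np.round(np.exp(0.3 + 0.5 * g_x + rng.normal(0, 0.3, 80))))

X = np.column_stack([np.ones_like(g_x), g_x])      # intercept + g(x)
beta, *_ = np.linalg.lstsq(X, np.log(s), rcond=None)
resid = np.log(s) - X @ beta
E_s = np.exp(beta[0] + beta[1] * 1.0) * np.mean(np.exp(resid))   # predict at g(x) = 1
print(f"estimated E[s] where detection is certain: {E_s:.2f}")
```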

Assumptions

Throughout this chapter assumptions have been made that have enabled formulation of key results. Below is a list of these main assumptions and the effect of violating them.

1. Animals that occur on the transect line are detected with certainty (g(0) = 1). If this assumption does not hold then all estimates will be negatively biased. Cases where this does not hold and one wishes to estimate g(0) are considered in Buckland et al (2004) and Borchers et al (2002). However, in most surveys one can take care to ensure that g(0) is as close as possible to 1 just by the way in which an observer conducts his survey.

2. Objects are uniformly and independently distributed within the survey region. This is assumed for the full likelihood approach of estimating abundance given in equation (2.11), but not for using the conditional likelihood and design-based approach. Estimating the proportion of missed objects assumes uniformity with distance from the transect line (i.e. on (0, w)), but this has been ensured through the original design. If objects are not independently distributed in the region then confidence intervals can also be biased. However, in practice the methods have been found to be remarkably robust to violation of this assumption.

3. Objects do not move prior to detection. Responsive movement of the object to the observer can cause problems in analysis. If objects move prior to detection then this will introduce a positive bias in estimates of D̂ and N̂. However, providing that the movement of the object is slow relative to the movement of the observer, this bias is small (Hiby, 1986). Buckland et al (2001) introduce a strategy for when object movement is fast.

4. Distances are measured accurately. The main violation of this assumption occurs when the perpendicular distances are rounded, and especially when many small distances are rounded to zero. This makes it very hard to fit the detection function to the data accurately, as the histogram will appear "spiked" at x = 0 and fluctuate up and down where distances have been rounded to and from respectively. For more information on dealing with bias caused by violating this assumption see T. Marques (2004).

For a more detailed overview of line transect sampling the reader is referred to Seber (1982) and Buckland et al (2001), which have become acknowledged as the standard texts in this field.


3. Multi-Covariate Distance Sampling

Introduction

Within the conventional distance sampling framework, the probability of detecting an object is purely a function of its distance, x, from the transect line. This means that in the previous chapter we assumed that each object is just as detectable as any other, provided it is the same distance away from the transect line. The models used for the detection function are pooling robust, which means that density can be accurately estimated even when heterogeneity caused by covariates other than distance is ignored. If detection on the line is certain then this property holds. If stratum-specific estimates of density are required and sightings data are pooled, then un-modelled heterogeneity can cause large bias (Buckland et al, 2004). This can often cause problems, as there may be environmental factors, survey factors or object characteristics that affect detectability. When cetacean surveys are conducted from so-called “platforms of opportunity” this is a particular problem because there can be substantial heterogeneity between sighting conditions, observers and survey vessels. To overcome the problems that violation of this assumption causes, there are two possible approaches: design-based or model-based.

The design-based approach minimises the bias caused by this violation by post-stratifying the region (Anganuzzi and Buckland, 1993) into strata within which the variable of interest is expected to be similar. Having stratified, estimates of f(0), often the quantity of interest, can be obtained in each stratum. From this an overall estimate is obtained as an area-weighted mean of the estimates from all strata. The idea of this method is that it minimises the bias caused by allocating more effort to different areas within the region. This method, however, relies on there being a sufficient sample size within each stratum. The model-based approach is to incorporate covariates into the estimation of f(0). This can be done by directly incorporating covariates into the estimation procedure via a multi-covariate detection function (Ramsey et al, 1987) and into the “key + series adjustment” methodology outlined in chapter 2.


Because of the problem of sample size within strata that occurs when surveys are conducted from platforms of opportunity, I concentrate for the remainder of this chapter on the multi-covariate detection function approach of Marques (2001) and Marques and Buckland (2003).

Likelihood

If we wish to incorporate covariates into the standard distance sampling framework, we must look at the joint density between the perpendicular distances, x, and associated covariates z (z = z_1, ..., z_p). Conditioning on the object being detected and assuming x and z are independent, we have:

pr[object at x and has covariate values z | object detected]
  = pr[object detected at x and has covariate values z] / pr[object detected]
  = g(x, z)·π(x)·π(z) / [∫_z ∫_x g(x, z)·π(x)·π(z) dx dz]
  = f(x, z)

Suppose we have no knowledge of π(z); then it is useful to look at f(z) as derived by Borchers (1996):

f(z) = ∫_X f(x, z) dx = [π(z) ∫_X g(x, z)·π(x) dx] / [∫_Z ∫_X g(x, z)·π(x)·π(z) dx dz]

and if π(x) = 1/w, i.e. lines are placed randomly, then

f(x | z) = f(x, z)/f(z) = g(x, z)/µ(z)    (3.1)

where µ(z) = ∫_X g(x, z) dx.

Therefore the conditional likelihood is given by:

L(θ; x, z) = ∏_{i=1}^{n} f(x_i | z_i) = ∏_{i=1}^{n} g(x_i, z_i)/µ(z_i)    (3.2)

Incorporating covariates into the detection function

The usual assumptions of line transect sampling, as outlined in chapter 2, are made, and, as well as the perpendicular distance of each detected object away from the transect line being recorded, a set of q explanatory covariates, z_i (z_i = z_{1i}, ..., z_{qi}), associated with each of the n objects is recorded (i = 1, ..., n). These covariates can now be included in the detection function using Buckland's (1992) ‘key + series adjustment’ methods, as described in equation 2.4, to give

g(x | z) = k(x, z)·[1 + Σ_{j=1}^{m} α_j·p_j(x_s)]    (3.3)

where k(x, z) is a key function, p_j(·) is an adjustment term with respective coefficients α_j, and x_s is a standardised x to avoid numerical problems. Using the substitution in equation 3.1, the conditional pdf can be written as

f(x | z) = [k(x, z)/µ(z)]·[1 + Σ_{j=1}^{m} α_j·p_j(x_s)]    (3.4)

If the scale term has the form

σ_i = exp(β_0 + Σ_{k=1}^{q} β_k·z_{ik})    (3.5)

and the standardised value x_si was taken to be x_i/σ_i, then the covariate values will only affect the detection function by adjusting the scale and thus the shape will remain constant. As in conventional distance sampling, adjustment terms and model selection are based on minimum AIC.
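The sketch below (simulated data, an assumed half-normal key with no adjustment terms) shows how the covariate-dependent scale of equation 3.5 enters the conditional likelihood of equation 3.2.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(11)
w = 2.0                                         # truncation distance (assumed)
z = rng.choice([0.0, 1.0], size=90)             # e.g. a sea-state indicator (invented)
x = np.abs(rng.normal(0.0, np.where(z == 1, 0.5, 0.9)))
keep = x <= w
x, z = x[keep], z[keep]

def mu(sigma):
    # closed-form integral of the half-normal g(x, z) over (0, w)
    return sigma * np.sqrt(2.0 * np.pi) * (norm.cdf(w / sigma) - 0.5)

def neg_log_lik(beta):
    sigma = np.exp(beta[0] + beta[1] * z)       # scale per detection, equation 3.5
    f = np.exp(-x**2 / (2.0 * sigma**2)) / mu(sigma)   # f(x|z) = g(x,z)/mu(z)
    return -np.sum(np.log(f))

fit = minimize(neg_log_lik, x0=np.zeros(2))
print("beta_hat =", fit.x, " AIC =", 2.0 * fit.fun + 2 * fit.x.size)
```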


Density and Abundance estimation

To obtain an expression for abundance and density, the Horvitz-Thompson estimator (Horvitz and Thompson, 1952) is seen as the most appropriate method. We wish to look at the probability of observing object i conditional on it being within the surveyed area and having corresponding vector of covariates z_i. If we define this as P_a(z_i), then, by comparison with equation 2.6b, it is given as

P_a(z_i) = (1/w) ∫_0^w g(x, z_i) dx = 1/(w·f(0 | z_i))    (from 2.6c & 2.6d)    (3.6)

The Horvitz-Thompson estimate of abundance of objects in the surveyed area is therefore given by

N_a = Σ_{i=1}^{n} 1/P_a(z_i) = w·Σ_{i=1}^{n} f(0 | z_i)    (3.7)

However, we must estimate P_a(z_i) and hence f(0 | z_i), leading to the Horvitz-Thompson-like estimator (Borchers, 1996)

N̂_a = w·Σ_{i=1}^{n} f̂(0 | z_i)    (3.8)

To estimate abundance in the whole region from this, a likelihood approach can be taken, as in equation 2.9, or a more design-based ‘scaling up’ of the estimate can be made. Here I look at the design-based ‘scaling up’ as outlined in Borchers (1996):

N̂ = (A/a)·N̂_a = (A/(2Lw))·N̂_a = (A/(2L))·Σ_{i=1}^{n} f̂(0 | z_i)    (3.9)

To obtain an estimate of density via this Horvitz-Thompson approach, equation 3.9 is simply divided by A. The notation used above is the same as that for conventional distance sampling outlined in the previous chapter.
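A toy numerical illustration of equations 3.8 and 3.9 (all values invented, not taken from the data analysed later):

```python
import numpy as np

w = 2.0            # truncation distance (km)
L = 850.0          # total transect length (km)
A = 9.0e4          # area of the study region (km^2)

f0_given_z = np.array([0.9, 1.4, 0.7, 1.1, 1.0, 0.8])   # f_hat(0 | z_i) per detection

N_a_hat = w * f0_given_z.sum()            # abundance in the covered strips, equation 3.8
N_hat = (A / (2.0 * L * w)) * N_a_hat     # scaled up to the whole region, equation 3.9
print(f"N_a_hat = {N_a_hat:.1f}, N_hat = {N_hat:.0f}, D_hat = {N_hat / A:.5f} per km^2")
```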

Variance

As for conventional distance sampling, the variance of density or abundance can be estimated via the bootstrap (Efron and Tibshirani, 1993). Transect lines are usually taken as the sampling units to be re-sampled because of the assumption that sampling units are independently and identically distributed. Having sampled the transect lines with replacement B times, the method described above can produce estimates of abundance and density for all B resamples. Confidence intervals and variance estimates can be obtained from these very efficiently. Details of further variance and confidence interval estimation can be found in Marques (2001) and Marques and Buckland (2003), to which the reader is referred if a more rigorous formulation is required. For most practical applications the robust method of the bootstrap should be more than adequate.

Objects occurring in clusters

For objects occurring in clusters, the formulation is quite different from that outlined for conventional distance sampling. Recall that for objects occurring in clusters, the conventional distance sampling estimate of abundance is given by

N̂ = A·D̂ = A·n·Ê[s]·f̂(0) / (2L)    (3.10)

An overall estimate of cluster size is used instead of the individual cluster sizes per observation. Hence, some information is lost in using this expectation. Using the Horvitz-Thompson estimator as outlined above, because the individual detection probabilities are used, we can include individual cluster sizes in the formulation. The Horvitz-Thompson estimator of the number of individuals within the surveyed region, when the objects are in clusters, is given by:

N̂_a = Σ_{i=1}^{n} s_i/P̂_a(z_i) = w·Σ_{i=1}^{n} s_i·f̂(0 | z_i)    (3.11)

And, as in 3.9, abundance in the whole area is ‘scaled up’ from abundance in the surveyed region:

N̂ = (A/a)·N̂_a = (A/(2wL))·N̂_a = (A/(2L))·Σ_{i=1}^{n} s_i·f̂(0 | z_i)    (3.12)

Discussion

In all surveys, wherever possible, data on all covariates that may influence detection should be collected. If covariate data are collected then heterogeneity in the population can be modelled, and bias can thus be reduced when detection on the line is not certain. In this scenario, multi-covariate distance sampling is preferred to conventional distance sampling. For cases where detection on the line is certain, covariate models are still useful for reducing the reliance on pooling robustness. Program DISTANCE (Thomas et al, 2004) includes multi-covariate distance sampling as a standard tool.


4. Generalised Additive Models (GAMs)

Introduction

The aim of a statistical model is to explain data already obtained and to predict values that are not included in the data set. One of the most popular and frequently used tools available to the applied statistician is the linear regression model. This model, be it with one predictor X (simple linear regression) or a set of p predictor variables or covariates X_i (i = 1, ..., p) (multiple linear regression), is usually written as

E[Y_i] = β_0 + β_1·x_i1    (4.1a)

and

E[Y_i] = β_0 + β_1·x_i1 + ... + β_p·x_ip    (4.1b)

respectively. The model, however, requires the strong assumption that the dependence between the response variable and the predictor variables (covariates) is linear in nature and that the errors are normally distributed. It is easy to envisage situations and data where these assumptions would not hold. The right-hand side of equations (4.1a) and (4.1b) is known as the linear predictor, and here the response variable is on the same scale as this linear predictor. The formulation of the Generalised Linear Model (McCullagh and Nelder, 1989) is that the response variable need no longer be equal to the linear predictor; rather some function, g(·), known as a link function, of the response variable is equal to the linear predictor. We therefore have that the framework for the Generalised Linear Model, comparing with (4.1b), is given as

g(E[Y_i]) = g(µ_i) = β_0 + β_1·x_i1 + ... + β_p·x_ip    (4.2)

where β_0 + β_1·x_i1 + ... + β_p·x_ip is the linear predictor, η_i. Generalised Linear Models also allow the distribution of the response variable to differ from the normal. In simple and multiple linear regression, the distribution is assumed normal because of the assumption of a normal (Gaussian) error distribution. With Generalised Linear Models the error distribution can be assumed to be any distribution within the exponential family (e.g. Normal, Binomial, Poisson or Gamma). The GLM, though, although offering much more flexibility than simple or multiple regression, is still restricted to the linear form of the predictor.

GAM framework

The generalised additive model uses similar techniques to the GLM but is not restricted to the linear form of the predictor as in 4.2, and thus has the advantage of being more flexible in fitting the data. This is because the GAM introduces smoothing functions into the model rather than a straightforward additive linear form (Hastie and Tibshirani, 1990). The construction of the generalised additive model is

g(µ_i) = η_i = β_0 + Σ_{k=1}^{q} f_k(u_ki),    i = 1, ..., n    (4.3)

using the same notation as with 4.2, and where q is the number of covariates (u) in the model, f_k(·) is the one-dimensional smooth function for the kth covariate and n is the number of observations. As with GLMs, the link functions used in the model are usually chosen from a group of commonly used transformations, which are used with specific error distributions and will be suitable for most data sets (McCullagh and Nelder, 1989; Hastie and Tibshirani, 1990). A table of link functions together with the corresponding error distributions is shown in table 4.1.

Link Name    Link function: g(µ) = η    Inverse link: µ = g⁻¹(η)    Error Distn
Identity     µ                          η                           Normal; Gamma; Poisson
Log          log(µ)                     e^η                         Normal; Gamma; Poisson
Inverse      1/µ                        1/η                         Normal; Gamma; Poisson
Logit        log(µ/(1 − µ))             e^η/(1 + e^η)               Binomial

Table 4.1: A table showing commonly used link functions and the error distributions with which they are associated.
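To make the link-function machinery concrete, the sketch below fits a GLM with a log link and Gamma errors (the combination used for the waiting-distance model in chapter 5) by iteratively reweighted least squares; the data are simulated and the hand-rolled fit stands in for what a statistics package would do.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
x1 = rng.uniform(0.0, 1.0, n)
mu_true = np.exp(0.5 + 1.2 * x1)
y = rng.gamma(shape=2.0, scale=mu_true / 2.0)     # Gamma responses with mean mu_true

X = np.column_stack([np.ones(n), x1])
beta = np.zeros(2)
for _ in range(25):                               # iteratively reweighted least squares
    eta = X @ beta                                # linear predictor
    mu = np.exp(eta)                              # inverse of the log link
    z = eta + (y - mu) / mu                       # working response for the log link
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)  # IRLS weights are 1 for Gamma + log
print("beta_hat =", beta)
```

Other family/link pairs would use non-unit weights in the regression step, which is where the choice of link and error distribution in table 4.1 enters the fitting.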


One of the unique properties of generalised additive models is the nonparametric form of the functions f_k(·) of the covariates. Hastie and Tibshirani (1990) suggest various scatterplot smoothers that can be used on the observed values and covariates.

Smooth Functions

The intuitive idea behind the scatterplot smoother is to let the data show us the appropriate functional form. The types of smoothers used as the non-parametric functions f_k(·) in Generalised Additive Models are outlined by Hastie and Tibshirani (1990) and are summarised below.

Bin smoother: The bin smoother works by dividing the predictor values into disjoint sets that cover the entire data set, then averaging the response in each set.

Running mean: The principle behind this smoother is to take the k points to either the left or right of each observation y_i and to average these k values. Thus with every new observation the running mean smoother changes value. This principle is also known as the moving average and commonly appears in time series analyses.

Kernel smoothers: This smoother produces a weighted average around a specified target value, with weights defined by some function known as the kernel. The kernel is usually specified as a function that decreases smoothly as it moves further from the target value; the Gaussian density is often used for this reason.

Splines: The fit is represented as a piecewise polynomial, where the defining points of the pieces are regarded as knots. The polynomials are forced to join smoothly at these knots by having continuous first and second derivatives. The majority of splines are based on cubic polynomials, and particularly cubic smoothing splines, which are obtained by minimising the penalised residual sum of squares (PRSS). In the remainder of this paper I shall use cubic smoothing splines because they give a smoother fit than the smoothers based on neighbourhood averaging. The extent of smoothing for each covariate depends on its number of parameters or degrees of freedom (Hastie and Tibshirani, 1990), which can be equal to 1 for a linear representation of the covariate term or equal to the number of observations to interpolate the data. The number of degrees of freedom is equal to the number of knots minus 1.
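A small sketch of the cubic smoothing spline smoother using SciPy (illustrative data; the smoothing parameter s plays the role of the degrees-of-freedom choice discussed above).

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 10.0, 80))
y = np.sin(x) + rng.normal(0.0, 0.3, 80)

interp = UnivariateSpline(x, y, k=3, s=0.0)            # s = 0 interpolates every point
smooth = UnivariateSpline(x, y, k=3, s=80 * 0.3**2)    # larger s gives a smoother fit

print("fit at x = 5:", float(interp(5.0)), float(smooth(5.0)))
```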

Assumptions

When modelling data using generalised additive models there are various assumptions made in the process. They are much the same as those that must be assumed for generalised linear modelling. They are:

• The n observations are statistically independent.
• The response variable Y_i follows a distribution from the exponential family.
• The link function is correctly specified.
• The variance function is correctly specified.

There is, however, no assumption that the relationships are linear, unlike in generalised linear modelling, which is often why GAMs are preferred. In order to assess these assumptions it is necessary to investigate residual plots, to look closely at the smooth plots and to examine the overall fit of the model produced. The overall fit of a model can be judged by comparing the Generalised Cross Validation (GCV) scores: the model with the lowest GCV score provides the better fit to the data. If terms in the model are examined individually then we can see which to include in the model. Terms with estimated degrees of freedom close to 1 should be dropped from the list of smooth functions and considered for inclusion as linear terms in the model. If the confidence band on the plot of a smooth includes 0 everywhere, then dropping that term should be considered, especially if upon removal the GCV score decreases.


Summary

Generalised Additive Models allow much more flexibility than Generalised Linear Models because the predictor is equal to some non-parametric form which is evaluated by use of smooth functions rather than a linear form. This allows the data themselves to dictate the modelled relationship, rather than for the user to specify a functional form. The model requires no further assumptions than those associated with Generalised Linear Models and is therefore often preferred. However, one has to be careful when fitting a GAM that the model does not ‘over-fit’ the data. This is because the added flexibility allows a smoother with as many degrees of freedom as there are data points and such a smoother would tend to be a poor predictor of where new data points will lie. As mentioned, it is the GCV score that is often used to determine the most appropriate model. This is a good measure to use because it assesses how well a model predicts observations that are not used in the fit. Models with too many or too few parameters will tend to predict poorly and thus the GCV will choose a model that fits the data well but also has good predictive power.


5. Spatial Modelling

Introduction

Within the distance sampling framework one of the major assumptions that has to be made is that the transect lines are randomly placed. When surveys are conducted from platforms of opportunity, this is clearly not the case. Our approach to obtaining estimates of density and abundance therefore has to change. Instead of producing a design-based estimate of overall density or abundance, which requires the assumption of randomly placed transects, we can produce a spatial model throughout the region. In particular we model density throughout the region and obtain a density surface, which gives an estimate of abundance for any region of interest by numerical integration. The advantages of this method are clear to see, as the non-random placement of transects would lead us to make unreasonable assumptions if the data were analysed using standard distance sampling. The density maps produced also provide a good visual representation of ‘hot spots’ and of the distribution of objects in the region.

One of the simpler ways in which to incorporate this spatial variation into the general framework was proposed by Hedley (2000). The method involves dividing the transect line into separate cells and counting the number of objects within each cell. Given that information on relevant covariates is available for the model (e.g. latitude and longitude), the number of objects can be modelled throughout the region, and if an appropriate offset term is included in the model then estimated densities are obtained. The problem with this method is that it relies on a rather subjective initial choice of the cell size within which to count objects. Another problem is that platforms of opportunity usually have long periods of time where nothing is observed along the route. This results in zero-inflated data sets, and more complicated modelling, such as Generalised Linear Mixed Modelling (GLMM) or Generalised Additive Mixed Modelling (GAMM), must be carried out to compensate for this. Hedley (2000) also proposes a method based on modelling distances between detections, which has more theoretical appeal as the subjective choice of cell size is no longer needed. It is this method that I look at more closely.


Waiting Distances

The waiting distance is defined as the along-transect distance between observed objects i and i + 1. It is sometimes referred to as the inter-detection distance, but I shall refer to it in this paper as the waiting distance. Waiting distance models can be used to estimate density because there exists a relationship between the two: in areas of high density the waiting distance between detections is short (Buckland et al, 2004). If x_i is defined as the distance travelled along the transect until object i is reached, then the waiting distances l_i are defined as x_i − x_{i−1} (for i = 2, ..., n). l_1 is the distance surveyed until the first object is detected and is therefore equal to x_1, and l_{n+1} is the distance surveyed after the last object has been detected. An example with 3 observations is shown in figure 5.1. It is worth noting that the waiting distance is the distance between successive detections irrespective of whether they are detected on the same transect or not.

Figure 5.1: An example of the waiting distance model, showing the notation used with 3 detected objects.
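A minimal sketch (invented positions) of turning cumulative along-track detection positions x_i into the waiting distances defined above:

```python
import numpy as np

x = np.array([3.2, 7.9, 8.4])     # along-track distance (km) at each detection (invented)
total_effort = 12.0               # total distance surveyed (km)

l = np.diff(np.concatenate(([0.0], x, [total_effort])))
# l[0]  = x_1 (distance surveyed before the first detection)
# l[-1] = distance surveyed after the last detection
print("waiting distances:", l)
```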


Constant encounter rate model

In this first, rather simplistic model we assume that density, the expected encounter rate and the vector of available spatial covariates (Z) all remain constant between detections, but can change once a detection has occurred. If the position of a detected object is defined as (x, y), with x regarded as the along-trackline distance to the object and y the perpendicular distance from the transect line to the object, then the locations of objects are assumed to follow an inhomogeneous Poisson process with rate parameter D(x, y), which varies spatially (Hedley 2000). If two successive detections are made at (x_i, 0) and (x_{i+1}, 0), so that the waiting distance is l_{i+1} as defined earlier, then in this model the density of objects between the two detections is given by D(x_{i+1}, 0). Using this, the expected number of detected objects within this segment is given by:

2µ̂·l_{i+1}·D(x_{i+1}, 0)    (5.1)

where µ̂ is the estimated effective strip width for that particular segment. We define the cumulative distribution function of the random variable M_i, the along-trackline distance surveyed from start point (x_i, 0) until the next detection, evaluated at l_{i+1}, to be:

F_{M_i}(l_{i+1} | x_i) = P(M_i ≤ l_{i+1} | x_i) = 1 − P(M_i > l_{i+1} | x_i)    (5.2)

Since M_i > l_{i+1} if and only if there were no detections in the strip of half-width w, then

F_{M_i}(l_{i+1} | x_i) = 1 − P(no detections in the strip)
                      = 1 − [2µ̂·l_{i+1}·D(x_{i+1}, 0)]^0 exp{−2µ̂·l_{i+1}·D(x_{i+1}, 0)} / 0!
                      = 1 − exp{−2µ̂·l_{i+1}·D(x_{i+1}, 0)}    (5.3)

which is an exponential distribution with rate 2µ̂·D(x_{i+1}, 0), so that the expected waiting distance is inversely proportional to the density of detections. The density D(x_{i+1}, 0) is therefore estimated by

1 / (2µ̂·l_{i+1})    (5.4)

The way the waiting distances have been set up makes this result intuitively obvious, since the density is given by the one detection made divided by the effective area surveyed. If we model the waiting distances l_i and obtain our estimate µ̂ of µ, then the densities throughout the region can be evaluated.
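A toy illustration of equation 5.4 (all values invented): each segment's density is one detection per ‘waiting area’ 2µ̂·l.

```python
import numpy as np

mu_hat = np.array([1.2, 1.4, 1.1])     # effective strip half-width per segment (km, invented)
l = np.array([4.0, 2.5, 6.0])          # waiting distances (km, invented)

D_hat = 1.0 / (2.0 * mu_hat * l)       # objects per km^2 along each segment, equation 5.4
print("segment densities:", D_hat)
```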

Use of the GAM

The Generalised Additive Model, as discussed in the previous chapter, can now be used to model the response variable of interest – the waiting distances, l_i. Remembering that in the previous chapter we had the formula

g(µ_i) = η_i = β_0 + Σ_{k=1}^{q} f_k(u_ki),    i = 1, ..., n    (5.5)

we only need adapt this slightly to model the response variable l_i as a function of the set of spatial covariates z:

g[E(l_i)] = β_0 + Σ_k f_k(z_ki),    i = 1, ..., n    (5.6)

with notation as previously defined. The waiting distances in this model follow an exponential distribution with rate proportional to the fitted density in that interval (Buckland et al, 2004), and therefore the gamma error distribution is used. All modelled waiting distances are required to be positive, and so a link function such as the log is used to ensure that all fitted responses are indeed positive. If the fitted values of the model are denoted l̂_i then the estimated density in the interval (x_{i−1}, 0) to (x_i, 0) is 1/(2µ̂·l̂_i).

Variable encounter rate model

The model just proposed is often not viable in practice because one would expect the density of objects to vary smoothly along the transect line, whereas in the previous model density was only allowed to change when an object was detected. We therefore introduce an iterative procedure that alters the observed waiting distances to the waiting distances that would have occurred if the underlying Poisson process had been homogeneous and density between detections constant (Hedley, 2000). First, though, the spatial model of the constant encounter rate method described above is applied to the observed waiting distances. If the adjusted waiting distances are defined as l̃_i (i = 1, ..., n), then they satisfy the equation:

1 − exp{−2µ̂·∫_{x_i}^{x_{i+1}} D̂(x, 0) dx} = 1 − exp{−2µ̂·l̃_i·D̂(x_i, 0)}    (5.7)

which implies that

∫_{x_i}^{x_{i+1}} D̂(x, 0) dx = l̃_i·D̂(x_i, 0)    (5.8)

So the adjusted waiting distances l̃_i are given by

l̃_i = ∫_{x_i}^{x_{i+1}} D̂(x, 0) dx / D̂(x_i, 0)    (5.9)

The spatial modelling methods can now be applied to the adjusted waiting distances, which should give a lower bias than the original waiting distances. The method above can then be repeated using the adjusted waiting distances in place of the originals, and another new set of waiting distances obtained with even lower bias. This procedure is iterated until the waiting distances converge. Hedley (2000) comments on the cases when convergence cannot be reached.
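A hedged sketch of one pass of the adjustment in equations 5.7–5.9 (the stand-in density function and detection positions below are invented; in the real procedure D̂ comes from the fitted GAM).

```python
import numpy as np
from scipy.integrate import quad

def D_hat(x):
    # arbitrary stand-in for the GAM-fitted density along the trackline
    return 0.05 + 0.02 * np.sin(x / 10.0)

x_det = np.array([0.0, 6.0, 13.5, 22.0])      # detection positions along the trackline (km)

l_adj = np.array([quad(D_hat, x_det[i], x_det[i + 1])[0] / D_hat(x_det[i])
                  for i in range(len(x_det) - 1)])
print("adjusted waiting distances:", l_adj)
# The GAM would now be refitted to l_adj, a new density surface obtained, and the
# adjustment repeated until the waiting distances converge.
```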

Summary

In this chapter the methodology that enables objects to be modelled spatially has been explained. Using the distance sampling approach of chapters 2 and 3, estimates of the effective strip width, µ̂, can be obtained throughout the region. The generalised additive model framework explained in chapter 4 can then be used to model the waiting distances between detections. The density surface is then obtained by calculating the inverse of the ‘waiting area’ (twice the effective strip width times the waiting distance). From the density surface, estimates of abundance can be obtained for any sub-region of interest simply by numerically integrating under the fitted surface.


6. Application of Methods

Data

The data used in this example are observations of fin whale (Balaenoptera physalus) recorded by the Biscay Dolphin Research Programme (BDRP) on board the ‘Pride of Bilbao’ ferry, which operates from Portsmouth to Bilbao. The surveys that the BDRP carry out are conducted every month on board the same ferry, which follows a (more or less) fixed route. The survey vessel travels at a speed of between 15 and 22 knots and the surveyors' platform is located approximately 30 m above sea level. A 45° angle centred on a bearing of 000 degrees is surveyed, with concentration on the centre line to ensure g(0) = 1. All observations are recorded in the same fashion and entered on the same sheet, based on a standardised Sea Watch Foundation sighting pro forma. Effort data are also collected by recording the ship's position every half an hour and by an automated device called ‘logger’ which collects information as the ship travels. All observers are highly experienced and the surveys are made at the same time each month. The fin whale, shown in figure 6.1, is the second largest animal in the world and because of this, and its distinctive dorsal fin, is quite easy to identify.

Figure 6.1: An artist’s impression of a fin whale, showing its distinctive dorsal fin.


The fin whale occurs in small, well-defined clusters of between 1 and 7 individuals and is most often seen alone. Fin whales are not disturbed by the survey vessel and neither approach nor avoid the ship, which makes them an easy species to survey. There were 281 observations of fin whale made by the BDRP between July 1995 and August 2002, each having information available on latitude, longitude, depth, sea state, time, day, month, year and cluster size. The locations of the 281 observations are plotted below on a map of the area. The BDRP comment that, although sightings are made all year round, fin whales are sighted with less frequency from October to May, which they suggest is due to seasonal migration. The fin whale is also more often seen in deeper waters, and from past knowledge the BDRP suggest that fin whales occur below 46 degrees north (latitude). These are two statements that can be tested by the spatial model.

Figure 6.2: The 281 observations of fin whale plotted on a map of the area.


Distance Sampling

The first step in the analysis was to obtain an estimate of the effective strip width, µ, and the expected cluster size, E[s]. To do this I used the software program Distance (Thomas et al, 2004). Distance data, however, were not available for all of the 281 observations, and so a subset of 64 observations, which all had distance data, was used to obtain the detection function. Distance gave the expected cluster size as

Ê[s] = 1.8646    (6.1)

Distance was used to fit a multiple covariate model for the detection function. Minimum AIC and goodness-of-fit showed the chosen model to contain covariates of sea state and time of day. Having obtained the model, the covariate values for all 281 observations can be substituted into the detection function, and using the following formula 281 estimates of the effective strip width can be obtained:

µ = ∫_0^w g(x) dx    (6.2)

This therefore gives a spatial representation of the effective strip width that changes at the 281 locations along the route, rather than being assumed constant along it. The average effective strip width throughout the whole region, for interest, was

µ̂ = 1.4076 km
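A sketch of the per-sighting effective strip width calculation in equation 6.2, assuming a half-normal detection function whose scale depends on sea state as in equation 3.5 (the coefficients and covariate values below are invented, not the fitted BDRP model).

```python
import numpy as np
from scipy.integrate import quad

w = 2.0                                  # truncation distance in km (assumed)
beta = np.array([0.1, -0.15])            # intercept and sea-state coefficient (invented)
sea_state = np.array([1, 3, 2, 4])       # covariate values for four sightings (invented)

def g(x, sigma):
    return np.exp(-x**2 / (2.0 * sigma**2))

sigma = np.exp(beta[0] + beta[1] * sea_state)             # per-sighting scale
mu_hat = np.array([quad(g, 0.0, w, args=(s,))[0] for s in sigma])
print("per-sighting effective strip widths (km):", np.round(mu_hat, 3))
```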

Waiting distance model

The 281 waiting distances (l_i, i = 1, ..., n) were calculated as the distances between successive on-effort detections, irrespective of whether they were on the same transect or not. Any off-effort time was ignored. The transects are defined by the on-effort search time for any journey; a particular journey can therefore have more than one transect. For example, one trip will have search effort on day 1, then off-effort time during the night, and then search effort on day 2. This would give 2 transects. Using this definition of the transect lines, they are plotted on a map of the area and shown below in figure 6.3.

Figure 6.3: A plot of the transect lines throughout the region covered by the survey vessel


Once the waiting distances have been obtained, they can be modelled spatially using the GAM framework. Two spatial covariates, latitude (lat) and longitude (lon), were available for inclusion in the model, along with depth (depth) and season (season), with March, April and May as spring; June, July and August as summer; September, October and November as autumn; and December, January and February as winter. A generalised additive model with a logarithmic link function was fitted to the waiting distances and a gamma error distribution was assumed. Latitude, longitude and depth were considered for inclusion in the model as cubic smoothing splines with anything up to 10 degrees of freedom, or as linear terms. The GCV score was used in model selection, with the lowest being preferred. The final model chosen is given as

E[l̂_i] = exp{β_0 + s(lat_i, 5) + lon_i + as.factor(season_i)},    i = 1, ..., n    (6.3)

Depth failed to enter the model as a covariate because the information it provided was adequately explained by the latitude and longitude terms, which entered as a smooth on 5 degrees of freedom and as a linear term respectively. Including season as a factor covariate allows different estimates of abundance for each season, so that migration can be assessed. A plot of the GAM smooths can be seen in figure 6.4.

To estimate density throughout the region of interest, the model was used to predict the waiting distances over a grid of spatial points in each of the four seasons. The grid was taken as the average journey along the route, extrapolated out to the widest points covered on either side. The resolution of the grid was such that each cell was approximately a 9 km by 9 km square.
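Continuing the sketch above, predicted waiting distances on such a grid could be obtained as follows. The grid construction here is simplified to a regular latitude–longitude rectangle rather than a buffer around the average route, and the coordinate ranges and spacings are illustrative.

```r
# Minimal sketch: predict waiting distances on a spatial grid for each season.
# The real grid followed the average route with roughly 9 km by 9 km cells;
# a regular lat-lon grid with illustrative limits stands in for it here.

grid <- expand.grid(lat = seq(44.0, 50.8, by = 0.08),
                    lon = seq(-6.0, -1.0, by = 0.12))

seasons <- c("Spring", "Summer", "Autumn", "Winter")
pred_l <- sapply(seasons, function(s) {
  newdat <- grid
  newdat$season <- factor(s, levels = seasons)
  predict(fit, newdata = newdat, type = "response")  # predicted waiting distance
})
# pred_l: one column of predicted waiting distances (km) per season
```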


Figure 6.4: A plot of the GAM smooths for each of the spatial covariates in the model, showing the shape of the functional form of latitude and the linear form of longitude. Observations are plotted as dashes along the horizontal axis.


I have now obtained estimates of the waiting distances throughout the region of interest using the GAM in equation 6.3. To convert these into estimates of density, I need the ‘waiting area’, defined as twice the effective strip width times the waiting distance. The effective strip widths along the trackline were obtained, as previously described, for all 281 observations, and these estimates were used to give an estimate of µ for each point in the grid of interest. Multiplying the predicted waiting distances by the effective strip widths, and then by 2, gave the waiting area at each point throughout the region. The density of clusters is the reciprocal of this waiting area; since the estimated expected cluster size is 1.8646, the density of individuals throughout the region is obtained by dividing this cluster size by the waiting area. This relies on the assumption that expected cluster size is constant throughout the region. The fitted density surface and the seasonal scales are given in figures 6.5 and 6.6 respectively. The shape of the density surface is identical for each season because season was included as a factor covariate, which changes only the scale of the model, not its shape.

Estimating Abundance

To estimate abundance in the given strip, the fitted density surface is integrated over the area. This can be done for each of the four seasons, giving abundance estimates in the region of interest for Spring, Summer, Autumn and Winter. If abundance were required for a sub-region, the integration would simply be taken over that sub-region instead; this illustrates one of the main advantages of the spatial model over conventional distance sampling.

The non-parametric bootstrap was used to obtain confidence intervals for the abundance estimates. The trips, not the transects, were used as the sampling units and were sampled with replacement until a new set of p trips had been obtained (p being the number of trips in the original sample). Waiting distances were then obtained, as before, for the new data set and, conditioning on the original model, predicted waiting distances for the grid were produced. The same steps as before were carried out to give a bootstrap estimate of abundance, N̂_B1, which was stored. The whole process was repeated 99 times to give a set of 100 estimates of abundance. Ordering these estimates, the confidence limits were read off as the 2.5th and 97.5th percentile values. This was done for all four seasons.
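Putting the pieces together, the conversion to density, the summation over the grid to give a seasonal abundance estimate, and a trip-level bootstrap of the kind just described might look like the sketch below. The cell area, the use of the average effective strip width in place of a predicted strip-width surface, the trip column in wd, and the interpretation of "conditioning on the original model" as refitting the chosen model form to each resample are all assumptions.

```r
Es       <- 1.8646                    # estimated expected cluster size, (6.1)
mu_grid  <- rep(1.4076, nrow(grid))   # ESW per grid point; the regional average
                                      # stands in for the predicted ESW surface
cell_km2 <- 9 * 9                     # approximate area of one grid cell (km^2)

# density of individuals: E[s] divided by the waiting area 2 * mu * l
dens  <- Es / (2 * mu_grid * pred_l)  # one column per season
N_hat <- colSums(dens * cell_km2)     # abundance in the covered strip, by season

# trip-level non-parametric bootstrap for confidence intervals
boot_abundance <- function() {
  trips <- sample(unique(wd$trip), replace = TRUE)   # resample p trips
  wd_b  <- do.call(rbind, lapply(trips, function(tr) wd[wd$trip == tr, ]))
  fit_b <- gam(l ~ s(lat, bs = "cr", k = 6) + lon + season,
               family = Gamma(link = "log"), data = wd_b)
  sapply(seasons, function(s) {
    newdat <- grid
    newdat$season <- factor(s, levels = seasons)
    l_b <- predict(fit_b, newdata = newdat, type = "response")
    sum(Es / (2 * mu_grid * l_b) * cell_km2)
  })
}

set.seed(1)
boot_N <- replicate(99, boot_abundance())      # 99 resamples, as in the text
N_all  <- cbind(N_hat, boot_N)                 # 100 estimates per season
ci     <- apply(N_all, 1, quantile, probs = c(0.025, 0.975))
```

With the predicted strip-width values and the route-based grid in place of these simplifications, the same steps yield seasonal estimates of the kind reported in table 6.1.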

Figure 6.5: The fitted density surface of fin whale in the region of interest between Portsmouth and Bilbao.


Figure 6.6: The scales of the fitted density surface for each of the four seasons. The density surface itself looks identical for all four seasons, which is why only one plot is given, in figure 6.5.

Season     Abundance Estimate (N̂)     95% Confidence Interval
Spring           183                    (142, 211)
Summer          1463                    (1267, 1702)
Autumn           594                    (569, 653)
Winter            38                    (26, 46)

Table 6.1: Abundance estimates of fin whale in the region together with 95% confidence intervals, obtained by the non-parametric bootstrap.

Discussion

The highlighted area on the density map is an area to which the model was extrapolated slightly. To ensure that this region was not dominating the plot and hiding any high-density areas that may occur elsewhere, a second density map of the area was produced using just the average route as the prediction grid. This map is shown below in figure 6.7. As can be seen from this map, the area of high density has not changed, so we can be confident that the extrapolated region is not over-dominating the plot.


Figure 6.7: A second density map of the region predicted for the average route travelled

It is also worth noting that models fitting nearly as well as the chosen model were used to produce density maps of the region, and all maps produced from these similarly well-fitting models were very alike. The two statements made by the BDRP, as mentioned earlier, are that fin whales occur below 46 degrees north and that seasonal migration occurs, with higher abundance in the summer. Looking at the density map produced, it is clear that few fin whales occur above 46 degrees latitude, while below this high densities are present. This supports the a priori beliefs of the BDRP. The abundance estimates show a


clear seasonal trend that peaks in summer and is at its lowest in winter. Again this corresponds with the hypothesised trend and supplies good evidence of seasonal migration.

In my analysis I used a single value of estimated cluster size in order to obtain the density surface, thereby assuming that mean cluster size did not vary across the region. If mean cluster size were believed to vary according to some spatial trend, it could itself be modelled throughout the region to produce a second fitted surface corresponding to expected cluster size.

In the non-parametric bootstrap used to estimate confidence intervals for the seasonal abundance estimates, I produced only 99 re-samples of the trips, giving 100 estimates of abundance. This is a small number of re-samples for reliable bootstrap confidence intervals, but the computational expense of the method meant that 500 or 1000 re-samples were not feasible. The confidence intervals presented here therefore give an indication of the actual confidence limits, but should not be taken as precise. Variance estimation for this procedure is one of the major areas of ongoing research and is discussed further in the following chapter.


7. Discussion

The application of the method presented shows how the wildlife manager can easily identify areas of high and low density, and the covariates included in the model can suggest relationships with environmental features. Knowing where the density hot spots occur, from the density map produced, the wildlife manager can control, maintain and monitor the population much more easily, and the areas of frequent sightings are known. Surveys from platforms of opportunity often violate the assumptions of standard distance sampling, and so the spatial representation supplies a solution which, although rather more complex, provides less biased results and more useful information for ecologists.

The method presented works well for the example given because of the species concerned and the covariates used in the detection function and spatial model; it may not work as well for other species. For example, if sighting-specific covariates are included in the detection function, then the effective strip widths corresponding to the observations cannot be used to predict effective strip widths for points in the grid. If the species occurs in less well-defined clusters, then care has to be taken when estimating the expected cluster size, because this could be highly variable. And if expected cluster size cannot be assumed constant throughout the region, then it too must be modelled to give a spatial representation of cluster size.

Much literature has been produced (Gotway and Stroup, 1997; Augustin et al, 1998) on the topic of variance for spatial methods of this kind. Problems can arise when there is evidence of spatial autocorrelation, due either to missing covariates or to genuine autocorrelation. To overcome this, care should be taken to measure all relevant covariates that could account for spatial autocorrelation, or to model the autocorrelation by defining a variance-covariance matrix, as in Gotway and Stroup (1997). Researchers are continuing to develop methods to tackle the problem of variance estimation. The non-parametric and parametric bootstraps are liable to bias caused by un-modelled spatial autocorrelation, and the non-parametric bootstrap can cause particular problems because


of the assumption of independent, identically distributed (i.i.d.) sampling units (transects). The non-parametric bootstrap also requires the transects to have good spatial coverage so that, on resampling, whole sub-regions are not missed. In the example presented here the transects provided good spatial coverage, and the assumption of i.i.d. sampling units was reasonable because all trips sampled followed the same route. For a review of variance problems with this method the reader is referred to Buckland et al (2004): 66-68.

Any extrapolation away from the transect line needs to be considered carefully: too much extrapolation can completely change the appearance of the plot and affect any inference. To overcome this it would be valuable to have information on the species at other points in the area. In my example the area of interest is covered by more than the one ferry route, and if information were available from other vessels then further development of the methods could possibly enable complete spatial coverage of the area. Using the method presented, it is suggested that only very limited extrapolation beyond the surveyed strip should be performed.

There is much scope for methods of this kind to develop, and Buckland et al (2004) “expect significant methodological development in this field over the next few years”. Developments like those presented here, but applicable to any species in any situation and including spatial representations of cluster size and effective strip width, are needed. Given sufficient data through time, and using the density maps, trends can be analysed by seeing how the hot spots move through time and whether further environmental features become prominent in the model.

The methods presented, although developed with the use of standard statistical software in mind, are at present very complicated to carry out. Obtaining confidence intervals via the non-parametric bootstrap, for example, required a lot of time and complex code to be written. I would imagine that in future years a more user-friendly way of performing this sort of analysis will become available, so that field biologists and ecologists, not just statisticians, can achieve these results. Incorporation into the standard software program Distance (Thomas et al, 2004) is the obvious future direction for this technique.


Overall, I think that as the methods become more robust, and as calculation and analysis are simplified by the addition of appropriate software, more surveys will be analysed in this way. Analyses that currently perform standard distance sampling alone will instead use the distance sampling results as a component of a much more detailed spatial model.


References:

Akaike, H. (1973) Information theory and an extension of the maximum likelihood principle, in International Symposium on Information Theory, 2nd edn (eds B. N. Petran and F. Csaaki), Akademiai Kiado, Budapest, Hungary, pp. 267-81.

Anganuzzi, A. A. and Buckland, S. T. (1993) Post-stratification as a bias reduction technique. Journal of Wildlife Management, 57, 827-34.

Augustin, N. H., Mugglestone, M. A. and Buckland, S. T. (1998) The role of simulation in modeling spatially-correlated data. Environmetrics, 9, 175-96.

Borchers, D. L. (1996) Line Transect Estimation with Uncertain Detection on the Trackline, Ph.D. thesis, University of Cape Town.

Borchers, D. L., Buckland, S. T. and Zucchini, W. (2002) Estimating Animal Abundance: Closed Populations, Springer-Verlag, London.

Buckland, S. T., Anderson, D. R., Burnham, K. P. and Laake, J. L. (1993) Distance Sampling: Estimating Abundance of Biological Populations, Chapman and Hall, London.

Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L. and Thomas, L. (2001) Introduction to Distance Sampling, Oxford University Press, Oxford.

Buckland, S. T., Anderson, D. R., Burnham, K. P., Laake, J. L., Borchers, D. L. and Thomas, L. (2004) Advanced Distance Sampling, Oxford University Press, Oxford.

Buckland, S. T. (1984) Monte Carlo confidence intervals. Biometrics, 40, 811-17.

Buckland, S. T. (1992) Fitting density functions using polynomials. Applied Statistics, 41, 63-76.

Burnham, K. P., Anderson, D. R. and Laake, J. L. (1979) Robust estimation from line transect data. Journal of Wildlife Management, 43, 992-6.

Burnham, K. P., Anderson, D. R. and Laake, J. L. (1980) Estimation of density from line transect sampling of biological populations. Wildlife Monographs, 72, 1-202.

Davison, A. C. and Hinkley, D. V. (1997) Bootstrap Methods and their Application, Cambridge University Press, Cambridge.

Drummer, T. D. and McDonald, L. L. (1987) Size bias in line transect sampling. Biometrics, 43, 13-21.


Drummer, T. D., Degange, A. R., Pank, L. L. and McDonald, L. L. (1990) Adjusting for group size influence in line transect sampling. Journal of Wildlife Management, 54, 511-14.

Efron, B. and Tibshirani, R. J. (1993) An Introduction to the Bootstrap, Chapman and Hall, London.

Efron, B. (1979) Bootstrap methods: another look at the jackknife. Annals of Statistics, 7, 1-16.

Gotway, C. A. and Stroup, W. W. (1997) A generalized linear model approach to spatial data analysis and prediction. Journal of Agricultural, Biological, and Environmental Statistics, 2, 157-78.

Hastie, T. J. and Tibshirani, R. J. (1990) Generalized Additive Models, Chapman and Hall, London.

Hedley, S. L. (2000) Modelling Heterogeneity in Cetacean Surveys, Ph.D. thesis, University of St Andrews.

Hedley, S. L., Buckland, S. T. and Borchers, D. L. (1999) Spatial modelling from line transect data. Journal of Cetacean Research and Management, 1, 255-64.

Hiby, A. R. (1986) Results of a hazard rate model relevant to experiments on the 1984/85 IDCR minke whale assessment cruise. Report of the International Whaling Commission, 36, 497-8.

Horvitz, D. G. and Thompson, D. J. (1952) A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663-85.

Manly, B. F. J. (1997) Randomization, Bootstrap and Monte Carlo Methods in Biology, 2nd edn, Chapman and Hall, London.

Marques, F. F. C. (2001) Estimating Wildlife Distribution and Abundance from Line Transect Surveys Conducted from Platforms of Opportunity, Ph.D. thesis, University of St Andrews.

Marques, F. F. C. and Buckland, S. T. (2003) Incorporating covariates into standard line transect analyses. Biometrics, 59, 924-35.

Marques, T. A. (2004) Predicting and correcting bias caused by measurement error in line transect sampling using multiplicative error models. Biometrics, 60, 757-63.


McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd edn, Chapman and Hall, London.

Otto, M. C. and Pollock, K. H. (1990) Size bias in line transect sampling: a field test. Biometrics, 46, 239-45.

Ramsey, F. L., Wildman, V. and Engbring, J. (1987) Covariate adjustments to effective area in variable-area wildlife surveys. Biometrics, 43, 1-11.

Seber, G. A. F. (1982) The Estimation of Animal Abundance and Related Parameters, Macmillan, New York.

Thomas, L., Laake, J. L., Strindberg, S., Marques, F. F. C., Buckland, S. T., Borchers, D. L., Anderson, D. R., Burnham, K. P., Hedley, S. L., Pollard, J. H. and Bishop, J. R. B. (2004) Distance 4.1, Release 2, Research Unit for Wildlife Population Assessment, University of St Andrews. http://www.ruwpa.st-and.ac.uk/distance/

