ICES CM 2007/O:36

Estimating uncertainty in nonlinear models: applications to survey-based assessments

Coby L. Needle* (*Corresponding author)
FRS Marine Laboratory, Aberdeen
[email protected]
Tel: +44 (0) 1224 295456
Fax: +44 (0) 1224 295511

Richard M. Hillary
Imperial College, London
[email protected]
Tel: +44 (0) 20 7594 9311
ICES Annual Science Conference 2007: Theme Session O

Abstract

Stock assessments based on research vessel surveys or other fishery-independent sources of information are becoming increasingly important as drivers of fisheries management advice, in Europe and elsewhere. In some cases this approach has arisen as a consequence of stringent management measures which have led to less reliable commercial catch and effort data; in others, the stock trends indicated by fishery-independent data are used as informative counterparts to more traditional catch-based assessment methods. An important feature of any survey-based method should be the estimation of the variances or confidence intervals (CIs) of population metrics such as mortality and abundance. There are many ways to estimate these variances. In this paper, we explore five such methods, namely the analytic delta method, residual and data bootstraps, parameter simulation, and a Bayesian approach. We use each method to analyse the variance properties of model fits to simple bivariate data, before considering the implications for survey-based stock assessment models. We conclude that variance estimators need to be considered carefully for such models, as the incorrect choice can result in misleading fisheries management advice.

1 Introduction

A number of survey-based (or fishery-independent) assessment methods have become available in recent years. These can be invaluable for providing management advice in situations where fisheries are developing or under moratoria, or where commercial catch information is questionable: they can also provide useful supplementary information to more standard catch-at-age analyses. Examples of such methods include catch-survey analyses (CSA; Mesnil 2003), biomass random-effect models (BREM; Trenkel 2007), year-class curves (YCC; Cotter et al. 2007), extensions to time-series analysis (TSA; Gudmundsson 1994), and separable survey-based models (SURBA; Needle 2003, Needle 2004, Beare et al. 2005). Many other such models exist or are in development.

Of key importance to the use and understanding of survey-based models is the estimation of uncertainty. Survey data are noisy and variable, due to small sample sizes and natural variation in fish distribution, and it can be difficult to ascertain the underlying signal contained in such data. Without an estimate of uncertainty, the output from any such model will indicate much more precision and confidence than is probably justified: but the wrong estimate of uncertainty can also lead to incorrect management action, with potentially serious consequences for sustainability.

In this paper, we have used simulation analyses to explore the uncertainty estimation methods that might be appropriate for survey-based assessment models. This follows a similar exercise carried out at the 2007 meeting of the ICES Working Group on Methods of Fish Stock Assessments (ICES 2007), where comparisons were made of the uncertainty in simple surplus-production models as estimated by bootstrap or Bayes approaches. Here, we fit two simple models (linear and Ricker) to simulated data, and estimate model-fit uncertainty using five different methods: the delta method, data and residual bootstrapping, parameter simulation from (potentially multivariate) Normal distributions, and a Bayes approach. The analyses are coded throughout in R (R Development Core Team 2006). Conclusions are drawn about the suitability of each of these methods, and we discuss the implications of our findings for future work on survey-based assessment methods.

2 Methods

2.1 Confidence interval estimation

In this paper, we are primarily interested in the confidence intervals (CIs) of lines (or curves) fitted to data by loglinear regression, and (more specifically) the different CIs generated by different CI-estimation methods. We apply five such methods:

Delta: This is an analytic method in which variances (and thus CIs) of fitted lines are approximated via functions of the variances on estimated parameters (see Appendix B).

Residual bootstrap: Here multiplicative residuals to the model fit are resampled and applied to the original data, and the model is fitted to the new dataset. As this is done many times, an empirical distribution of the underlying fitted line is generated, from which best fits and CIs can be drawn.

Data bootstrap: In this approach, the original data are resampled before proceeding as for the residual bootstrap.

Parameter simulation: Here parameter values are simulated from a univariate (or multivariate) Normal distribution based on the variance-covariance matrix of the original model fit. Repeating this many times leads to a distribution of fitted lines as before.

Bayes: In the Bayesian approach, noninformative prior distributions of the model parameters are updated to posterior distributions using the available data and Bayes' theorem. Again, this provides a distribution of fitted lines from which inferences can be drawn about confidence intervals (or, in this case, probability intervals, which are equivalent for our purposes).

Details on the implementation of these methods are given below (Sections 2.2 to 2.6).

We applied these CI-estimation methods to two simple case studies. The first is the linear model

    R = γS,                                                  (1)

in which γ gives the slope of a straight line passing through the origin (Figure 1). R and S are the dependent and independent variables, although in the context of fish stock assessment they are used specifically to denote, respectively, recruitment to the fished population, and reproductive potential (the latter is usually approximated by parental spawning-stock biomass). The second case study is the Ricker model (Ricker 1958)

    R = αS e^{−βS},                                          (2)

where α and β index density-independent and density-dependent effects, respectively. When drawn on an (S, R)-scatterplot, the Ricker model produces a dome-shaped curve with slope at the origin given by α and maximum recruitment at S = 1/β (Figure 1).

For both case studies, and for each of m simulated datasets, n (S, R)-points were generated using

    Linear:  R_k = γ S_k e^{ε_k},                            (3)
    Ricker:  R_k = α S_k e^{−β S_k + ε_k},                   (4)

where k = 1, …, n, S_k ∼ U(S_min, S_max) and ε_k ∼ N(0, σ²). Parameter values used in the case studies are given in Appendix A. Model parameters were estimated using loglinear regression, via

    Linear:  ln(R_k / S_k) = b_0,                            (5)
    Ricker:  ln(R_k / S_k) = b_1 + b_2 S_k,                  (6)

from which parameters were derived using

    Linear:  γ̂ = e^{b̂_0},                                   (7)
    Ricker:  α̂ = e^{b̂_1},  β̂ = −b̂_2.                       (8)
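The paper's analyses were coded in R; purely as an illustration, the simulation and loglinear fitting steps above (Equations 3 to 8) can be sketched in Python with NumPy. The parameter values below are arbitrary placeholders, not the Appendix A values.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_ricker(n, alpha, beta, sigma, s_min, s_max):
    """Generate n (S, R) points from the Ricker model with lognormal noise:
    R_k = alpha * S_k * exp(-beta * S_k + eps_k), eps_k ~ N(0, sigma^2)."""
    s = rng.uniform(s_min, s_max, n)
    eps = rng.normal(0.0, sigma, n)
    return s, alpha * s * np.exp(-beta * s + eps)

def fit_ricker(s, r):
    """Loglinear regression ln(R/S) = b1 + b2*S, then back-transform
    via alpha = exp(b1), beta = -b2."""
    y = np.log(r / s)
    design = np.column_stack([np.ones_like(s), s])
    (b1, b2), *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.exp(b1), -b2

s, r = simulate_ricker(200, alpha=5.0, beta=0.02, sigma=0.3,
                       s_min=1.0, s_max=100.0)
alpha_hat, beta_hat = fit_ricker(s, r)
```

The linear model (Equation 5) is the special case with the slope column dropped, so that b_0 is simply the mean of ln(R/S).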

Confidence intervals were then estimated using the five methods described in more detail below (Sections 2.2 to 2.6).

We summarise the results of the analyses in two ways. Firstly, we denote the estimated upper and lower 95% confidence limits of γ̂ (for example) by γ̂⁺ and γ̂⁻, respectively. We tally the number m_o of the m datasets for which the true parameter value γ lies within the estimated limits (γ̂⁻, γ̂⁺). The ratio m_o/m, which can also be written as P(γ ∈ [γ̂⁻, γ̂⁺]), summarises the frequency with which the estimated confidence interval overlaps the true parameter value. It will be conditional on the combination of the model, the data and the CI-estimation approach. If P(γ ∈ [γ̂⁻, γ̂⁺]) ≳ 0.95, then we can conclude that the combination is appropriate: in other words, that the CI-estimation method results in valid confidence intervals when the given model is fitted to data with the same distributional assumptions as those used in the simulations. More general inferences cannot be made.

Secondly, we let R̂_{i,j} denote the ith point (where i = 1, …, i_max) on the fitted line for the jth run (where j = 1, 2, …, m), so that

    Linear:  R̂_{i,j} = γ̂ S_{i,j},                           (9)
    Ricker:  R̂_{i,j} = α̂ S_{i,j} e^{−β̂ S_{i,j}},           (10)

where (for some τ ≪ 1.0) i_max = (1/τ)(S_max − S_min) and S_{i,j} ∈ [S_min, S_min + τ, S_min + 2τ, …, S_max]. The approximate pointwise 95% confidence limits for the fitted lines are then estimated by calculating percentiles (2.5% for lower, 97.5% for upper) of R̂_{i,j} for each i and j (and hence for each S_{i,j}). These fitted-line confidence limits are denoted by R̂⁺_{i,j} and R̂⁻_{i,j}. We develop a measure of the overall area of the confidence interval by approximating the integral of the lower and upper confidence intervals for the jth run through area summation:

    ∫R̂⁺_j ≈ Σ_{i=1}^{i_max} τ R̂⁺_{i,j},                     (11)
    ∫R̂⁻_j ≈ Σ_{i=1}^{i_max} τ R̂⁻_{i,j}.                     (12)

Then the average area of the confidence interval over m realisations is

    Ĩ = (1/m) Σ_{j=1}^{m} ( ∫R̂⁺_j − ∫R̂⁻_j ).               (13)

Examples of the areas covered by estimated CIs are given in Figure 1.
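The two summary statistics, the coverage proportion m_o/m and the mean CI area of Equations 11 to 13, are straightforward to compute once the per-run interval limits are available. A Python sketch (function names are ours, not from the paper):

```python
import numpy as np

def coverage_proportion(true_value, lower, upper):
    """m_o / m: fraction of the m runs whose estimated confidence
    interval (lower_j, upper_j) contains the true parameter value."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    return np.mean((lower <= true_value) & (true_value <= upper))

def mean_ci_area(r_upper, r_lower, tau):
    """Average CI area over runs: each row of r_upper / r_lower holds one
    run's pointwise fitted-line limits on an S-grid of spacing tau; each
    integral is approximated by area summation, then averaged over runs."""
    diff = np.asarray(r_upper) - np.asarray(r_lower)
    return np.mean(tau * np.sum(diff, axis=1))

print(coverage_proportion(1.5, [1.2, 1.6, 1.0], [1.8, 2.0, 1.4]))  # 1 of 3 intervals covers 1.5
```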

2.2 The delta method

A general introduction to the delta method can be found in Appendix B. To apply the method to the linear model R = γS, we take natural logs of both sides to give

    ln R = ln(γS),                                           (14)

and hence

    ln(R/S) = ln γ = b_0,                                    (15)

where we define b = (b_0). Then the transformation function G (see Appendix B) is written as

    G(S; b) = [G_0(b_0)] = e^{b_0}.                          (16)

In other words, to recover the required parameter γ from the estimated parameter b_0, the necessary transformation is γ = e^{b_0}. The derivative of G with respect to the estimated parameters is then

    G′(S; b) = [∂G_0/∂b_0] = [e^{b_0}] = G′(S; b)^T.         (17)

From Equation 45 we get

    Var[γ̂] = e^{2b̂_0} Var[b̂_0].                             (18)

Similarly, the formula for the variance of the fitted line (Equation 46) becomes

    Var[ln R̂] = Var[γ̂] / γ̂²,                                (19)

which simplifies to

    Var[ln R̂] = Var[b̂_0].                                   (20)

To apply the delta method to the Ricker model, we note firstly via Equation 6 that the transformation function G can be vectorised as

    G(b) = [ G_1(b_1) ] = [ e^{b_1} ]
           [ G_2(b_2) ]   [ −b_2   ].                        (21)

[Figure 1: Examples of simulated data (points), fitted lines (solid black lines) and estimated confidence intervals (shaded area) for the linear (top) and Ricker (bottom) models, plotted as recruitment (R) against SSB (S). In each case the solid red line gives the true model from which the points are generated. The shaded area is computed as Ĩ = ∫R̂⁺ − ∫R̂⁻ (see text for details).]

Then

    G′(b) = [ ∂G_1/∂b_1   ∂G_1/∂b_2 ] = [ e^{b_1}    0 ]
            [ ∂G_2/∂b_1   ∂G_2/∂b_2 ]   [ 0         −1 ] = G′(b)^T,        (22)

and Equation 45 yields

    Var[G(b)] = [ Var[α̂]       Cov[α̂, β̂] ]
                [ Cov[α̂, β̂]   Var[β̂]     ]

              = [ e^{2b̂_1} Var[b̂_1]          −e^{b̂_1} Cov[b̂_1, b̂_2] ]
                [ −e^{b̂_1} Cov[b̂_1, b̂_2]    Var[b̂_2]                ].   (23)

To derive the formula for the confidence limits on the fitted line, we first rewrite Equation 46 as

    Var[ln R̂] = (∂G/∂α)² Var[α̂] + 2 (∂G/∂α)(∂G/∂β) Cov[α̂, β̂] + (∂G/∂β)² Var[β̂].   (24)

From Equation 6 we see that

    ∂G/∂α = 1/α̂,                                            (25)
    ∂G/∂β = −S,                                              (26)

and thus

    Var[ln R̂] = (e^{2b̂_1}/α̂²) Var[b̂_1] − (2S/α̂)(−e^{b̂_1}) Cov[b̂_1, b̂_2] + S² Var[b̂_2],   (27)

which simplifies to

    Var[ln R̂] = Var[b̂_1] + 2S Cov[b̂_1, b̂_2] + S² Var[b̂_2].   (28)

Confidence limits on the fitted curve follow from Equation 47.
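Equation 28 translates directly into code: given the coefficient variance-covariance matrix from the loglinear fit, the pointwise limits follow by exponentiating ln R̂ ± z·sd. A Python sketch, assuming a Normal-approximation 95% interval with z = 1.96 and an illustrative covariance matrix:

```python
import numpy as np

def ricker_delta_ci(s_grid, b1_hat, b2_hat, cov_b, z=1.96):
    """Pointwise delta-method CI for the fitted Ricker curve.
    Var[ln R_hat] = Var[b1] + 2*S*Cov[b1, b2] + S^2*Var[b2]  (Equation 28),
    with ln R_hat = b1 + ln(S) + b2*S from the loglinear fit (Equation 6)."""
    s = np.asarray(s_grid, dtype=float)
    var_log = cov_b[0][0] + 2.0 * s * cov_b[0][1] + s**2 * cov_b[1][1]
    log_fit = b1_hat + np.log(s) + b2_hat * s
    half = z * np.sqrt(var_log)
    return np.exp(log_fit - half), np.exp(log_fit + half)

lo, hi = ricker_delta_ci([10.0, 50.0], b1_hat=np.log(5.0), b2_hat=-0.02,
                         cov_b=[[0.01, -0.0001], [-0.0001, 1e-6]])
```

Working on the log scale and exponentiating keeps the limits positive and respects the lognormal error assumption.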

2.3 Residual bootstrap

A simple approach to uncertainty estimation is to bootstrap the residuals to the fitted model, apply these residuals to the original data, and refit the model to the now-perturbed data. When this is done many times, an empirical frequency distribution of the estimated parameters is constructed, and percentiles of this distribution can be used as approximations to the confidence limits of the estimated parameters or the fitted line. The algorithm used can be written as follows:

1. Fit the model (Equation 5 or 6) to the data (S, R) to generate estimated parameters b = (b̂_1, b̂_2, …) and corresponding residuals r_i = ln(R_i / R̂_i). In the case studies discussed below, these residuals are assumed to be lognormally distributed, since that is the assumption implicit in the choice of the estimating equations; that need not be the case, however.

2. For j = 1, …, J:
   (a) For k = 1, …, n:
       i. Randomly sample from the residuals r to get r*_{k,j}.
       ii. Apply this residual to the kth data point, so that R*_{k,j} = e^{r*_{k,j}} R_k.
   (b) Fit the appropriate model (linear or Ricker) to the dataset of points (S_k, R*_{k,j}) to give new fitted parameters γ̂*_j or (α̂*_j, β̂*_j) via transformed parameters b*_j. The new fitted model is then R̂*_{k,j} = γ̂*_j S_k (linear) or R̂*_{k,j} = α̂*_j S_k e^{−β̂*_j S_k} (Ricker).

3. For each S_i ∈ [S_min, S_max], calculate percentiles (2.5% and 97.5%) of the set of points given by the re-estimated models. These percentiles give approximate 95% confidence limits about the fitted line generated by the model (as described in Section 2.1). Confidence intervals for the estimated parameters can also be derived.

We note that, for a one-parameter model, the confidence interval on the plotted line will have a monotonic relationship with the confidence interval on the estimated parameter. However, for models with two or more parameters, the lower confidence bounds (say) on all the parameters will not necessarily yield the equivalent lower confidence bound on the fitted line. For example, the lower confidence bound on the fitted Ricker curve is approximated by the combination of the lower bound on the α parameter and the upper bound on the β parameter. With more parameters the issue becomes more complicated, and this is why percentiles of the fitted line itself must be calculated (see Section 2.1).
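For the one-parameter linear model the algorithm above is compact. A Python sketch (data values, seed and replicate count are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(7)

def fit_gamma(s, r):
    """Loglinear fit of R = gamma*S: b0 is the mean of ln(R/S), gamma = exp(b0)."""
    return np.exp(np.mean(np.log(r / s)))

def residual_bootstrap_gamma(s, r, n_boot=1000):
    """Step 1: fit and form log-scale residuals r_k = ln(R_k / R_hat_k).
    Step 2: resample residuals with replacement, apply them to the
    original data, and refit. Returns the bootstrap gamma distribution."""
    gamma_hat = fit_gamma(s, r)
    resid = np.log(r / (gamma_hat * s))
    out = np.empty(n_boot)
    for j in range(n_boot):
        r_star = r * np.exp(rng.choice(resid, size=r.size, replace=True))
        out[j] = fit_gamma(s, r_star)
    return out

s = rng.uniform(1.0, 100.0, 50)
r = 3.0 * s * np.exp(rng.normal(0.0, 0.3, 50))
lo, hi = np.percentile(residual_bootstrap_gamma(s, r), [2.5, 97.5])
```

Percentiles of the bootstrap distribution then give the approximate 95% confidence limits on γ̂, as in step 3.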

2.4 Data bootstrap

An alternative bootstrap method is to resample points with replacement from the original data and refit the model each time. The algorithm for this approach is as follows:

1. For j = 1, …, J:
   (a) For k = 1, …, n:
       i. Randomly sample (with replacement) from the (S, R) data points to get new data (S*_{k,j}, R*_{k,j}).
   (b) Fit the appropriate model (linear or Ricker) to the dataset of points (S*_{k,j}, R*_{k,j}) to give new fitted parameters γ̂*_j or (α̂*_j, β̂*_j). The new fitted model is then R̂*_{k,j} = γ̂*_j S*_{k,j} (linear) or R̂*_{k,j} = α̂*_j S*_{k,j} e^{−β̂*_j S*_{k,j}} (Ricker).

2. Calculate percentiles (2.5% and 97.5%) of the set of re-estimated models to give approximate 95% confidence limits about the fitted line.
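The data bootstrap for the same linear model differs from the residual bootstrap only in what is resampled: whole (S, R) pairs rather than residuals. A Python sketch with illustrative data:

```python
import numpy as np

rng = np.random.default_rng(11)

def data_bootstrap_gamma(s, r, n_boot=1000):
    """Resample (S, R) pairs with replacement and refit R = gamma*S each
    time; returns the bootstrap distribution of gamma."""
    n = s.size
    out = np.empty(n_boot)
    for j in range(n_boot):
        idx = rng.integers(0, n, size=n)   # paired resampling keeps each (S, R) together
        out[j] = np.exp(np.mean(np.log(r[idx] / s[idx])))
    return out

s = rng.uniform(1.0, 100.0, 50)
r = 3.0 * s * np.exp(rng.normal(0.0, 0.3, 50))
lo, hi = np.percentile(data_bootstrap_gamma(s, r), [2.5, 97.5])
```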

2.5 Parameter simulation

In this approach, an assumed distribution (we use Normal distributions) about each fitted parameter is generated, using the estimates of mean and variance from the model fit. If there is more than one parameter, the distribution is multivariate Normal and is determined by the estimated variance-covariance matrix: in this way, parameter inter-correlations are maintained. We then simulate parameters many times from this distribution, each time generating the corresponding model fit. Percentiles of the frequency distribution of these generated lines can then be considered to be approximations to the confidence interval of the underlying line, as before. The algorithm for this approach is similar to those for the residual and data bootstraps:

1. For j = 1, …, J:
   (a) Randomly sample from the multivariate Normal distribution N(b, Var[b]) of the estimated parameters b to give b*_j = (b̂*_1, b̂*_2, …).
   (b) Transform to parameters γ̂*_j or (α̂*_j, β̂*_j) of the underlying model. The new fitted model is then R̂*_{k,j} = γ̂*_j S_k (linear) or R̂*_{k,j} = α̂*_j S_k e^{−β̂*_j S_k} (Ricker), where k = 1, …, n.

2. Calculate percentiles (2.5% and 97.5%) of the set of re-estimated models to give approximate 95% confidence limits about the fitted model.

Parameter simulation for the Ricker model makes use of the mvrnorm function (from the MASS library; Venables and Ripley 2002) to simulate (α̂, β̂) pairs from a multivariate Normal distribution specified by the variance-covariance matrix of the model parameters.
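The same step can be sketched in Python, with NumPy's multivariate_normal standing in for MASS::mvrnorm; the fitted values and covariance matrix below are illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(13)

def simulate_ricker_lines(b_hat, cov_b, s_grid, n_sim=1000):
    """Draw (b1, b2) from N(b_hat, cov_b), transform to
    (alpha, beta) = (exp(b1), -b2), and evaluate the Ricker curve on the
    S-grid for every draw (one row per simulated line)."""
    b = rng.multivariate_normal(b_hat, cov_b, size=n_sim)
    alpha, beta = np.exp(b[:, 0]), -b[:, 1]
    s = np.asarray(s_grid)
    return alpha[:, None] * s * np.exp(-beta[:, None] * s)

lines = simulate_ricker_lines([np.log(5.0), -0.02],
                              [[0.01, -5e-5], [-5e-5, 1e-6]],
                              np.linspace(1.0, 100.0, 100))
lower, upper = np.percentile(lines, [2.5, 97.5], axis=0)
```

Simulating on the (b_1, b_2) scale, as the paper's loglinear fit suggests, keeps the Normal assumption on the scale where the regression was estimated; drawing (α̂, β̂) directly would be the literal reading of the mvrnorm step.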

2.6 Bayes

The Bayesian approach to statistical modelling is a two-step model for parameter probability distributions, given descriptive data. The first part of the process involves describing a probability model for how the data were generated, given the parameters (usually called the likelihood function). The second part is the inclusion of any distributional information we might have on the parameters external to the information in the descriptive data (normally called the prior distribution). The probability distribution of the parameters, θ, and the process variables, X, given the data/observations, Y, is given by the following statement of Bayes' rule:

    π(θ, X | Y, D) ∝ π(Y | θ, X, D) × π(θ),                  (29)

where π(θ, X | Y, D) is the posterior distribution of the parameters and process variables; π(Y | θ, X, D) is the likelihood, the probability model for the data given the parameters; and π(θ) is the prior distribution of the parameters. The set D is called the knowledge: the set of assumed-known variables used in the probability model.

One of the main advantages of the Bayesian paradigm is that we can define the probability distribution of the parameters of interest, given the data. All the information on the uncertainty in the estimated parameters is therefore contained in this distribution, whereas frequentist methods must either make approximating assumptions about the form of the distribution of the parameters, or resample the data to obtain confidence intervals for the parameters or variables of interest. In all but the simplest of cases, the posterior distribution of a set of parameters, θ, is rarely of a known form, and many methods exist with which we can generate a sample from this distribution of interest, from MCMC (Markov chain Monte Carlo) methods to various types of importance sampling algorithms (Gilks et al. 1996, Kloek and Van Dijk 1978). With a suitably representative sample (θ_1, …, θ_n) from the distribution of interest we can then infer all of the relevant statistics of θ.

One point to note is that we refer to the resulting intervals as probability intervals or credible sets (Berger 1985). These are different from confidence intervals, which in the frequentist sense relate to the frequency with which an interval will contain the (resampled) mean. Probability or credible intervals can be interpreted as the interval within which the parameter of interest has a certain probability of being situated.

For the purposes of this paper, we work with quasi-noninformative prior distributions (Jeffreys 1961, Box and Tiao 1973), as we are trying to compare the uncertainty estimates between a variety of different methods: having almost noninformative priors means that at all times the data, not the priors, dictate the estimates.

To apply the Bayesian approach to the linear model, we need to define the likelihood function for the data, given γ, and the prior distribution for γ. From the manner in which the noise is generated in the data (Equation 3), a lognormal likelihood function is the sensible choice:

    π(R | S, γ, σ²) = Π_{i=1}^{n} (1/√(2πσ²)) exp( −(ln R_i − (ln γ + ln S_i))² / (2σ²) ).   (30)

For a prior model for γ, we model γ* = ln γ, and assume that

    π(γ*) = N(μ_{γ*}, σ²_{γ*}),                              (31)
    π(σ²) = IG(λ_σ, ψ_σ),                                    (32)

where IG() is the inverse-gamma distribution. The reason for modelling γ* as a priori Normal and the variance as inverse-gamma is that these priors are conjugate to the likelihood function: by this we mean that the prior combines with the likelihood to yield a posterior distribution for γ* which is of a known form. In fact, the conditional posterior for γ* is Normally distributed as follows:

    π(γ* | R, S, σ²) = N( (μ_{γ*}/σ²_{γ*} + (1/σ²) Σ_i ln(R_i/S_i)) / (1/σ²_{γ*} + n/σ²),
                          1 / (1/σ²_{γ*} + n/σ²) ),          (33)

while the conditional posterior of σ² is again inverse-gamma distributed:

    π(σ² | R, S, γ*) = IG( λ_σ + n/2, ψ_σ / (1 + ψ_σ κ) ),   (34)

where

    κ = 0.5 Σ_{i=1}^{n} (ln R_i − γ* − ln S_i)².             (35)

For the prior parameters we have μ_{γ*} = 0, σ²_{γ*} = 1000, λ_σ = 2.0 and ψ_σ = 1.0, which are effectively noninformative priors, so all the information should be coming from the data. The two parameters are resampled directly from their conditional posteriors.

For the prior parameters we have µγ ∗ = 0, σγ2∗ = 1000, λσ = 2.0 and ψσ = 1.0, which are effectively non-informative priors, so all the information should be coming from the data. The two parameters are resampled directly from their conditional posteriors. For the Bayesian Ricker formulation, we assume again a normal-log likelihood function for the data: π(R | S, α, β, σ 2 ) =

n Y i=1



µ ¶ (ln Ri − (ln α + ln Si − βSi ))2 exp − . 2σ 2 2πσ 2 1

(36)

We model, as with the linear case, ln α = α∗ and assume again normal priors for the α∗ and β parameters. Again, we assume an inverse-gamma prior for σ 2 : ¡ ¢ π(α∗ ) =N µα∗ , σα2 ∗ , ¡ ¢ π(β) =N µβ , σβ2 , π(σ 2 ) =IG (λσ , ψσ ) . 9

(37) (38) (39)

All of these priors are conjugate to the likelihood, so that the (conditional) posteriors of both α∗ , β and σ 2 are normally distributed as follows: P   ln Ri −ln Si +β∗Si µα∗ + 2 2 σ 1 σ ∗ , π(α∗ | β, σ 2 , R, S) =N  α , 1 (40) 1 + σn2 + σn2 2 2 σα σ ∗ α∗ P   µβ Si (ln Ri −α∗ −ln Si ) − 2 2 σ σ 1 β P 2 P 2 , π(β | α∗ , σ 2 , R, S) =N  , (41) Si Si 1 1 + + 2 2 σ2 σ2 σβ σβ ¶ µ ψσ n 2 ∗ , (42) π(σ | R, S, α , β) =IG γσ + , 2 1 + ψσ κ where κ = 0.5

n X

2

(ln Ri − α ˆ − ln Si + βSi )

(43)

i=1

Having known forms for each of the conditional posterior distributions of each parameter makes it then very simple to use the Gibb’s sampler (Gilks et al. 1996) to generate a sample from the joint posterior for all the parameters and then compute the relevant confidence intervals.
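For the linear model, the two conditional posteriors (Equations 33 to 35) give a two-step Gibbs sampler. A Python sketch, assuming the near-noninformative prior settings quoted above (μ_{γ*} = 0, σ²_{γ*} = 1000, λ_σ = 2, ψ_σ = 1); the IG(a, b) parameterisation used here is the one in which 1/σ² ∼ Gamma(shape a, scale b), matching the posterior scale ψ_σ/(1 + ψ_σ κ):

```python
import numpy as np

rng = np.random.default_rng(17)

def gibbs_linear(s, r, n_iter=2000, mu0=0.0, tau2=1000.0, lam=2.0, psi=1.0):
    """Gibbs sampler for R = gamma*S with lognormal errors: alternate
    gamma* = ln(gamma) from its Normal conditional (Equation 33) and
    sigma^2 from its inverse-gamma conditional (Equations 34-35)."""
    y = np.log(r / s)                 # y_i = gamma* + eps_i
    n = y.size
    gstar, sig2 = 0.0, 1.0            # starting values
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        prec = 1.0 / tau2 + n / sig2                    # posterior precision
        mean = (mu0 / tau2 + np.sum(y) / sig2) / prec   # posterior mean
        gstar = rng.normal(mean, np.sqrt(1.0 / prec))
        kappa = 0.5 * np.sum((y - gstar) ** 2)
        # sigma^2 ~ IG(lam + n/2, psi / (1 + psi*kappa)), drawn via 1/Gamma
        sig2 = 1.0 / rng.gamma(lam + n / 2.0, psi / (1.0 + psi * kappa))
        draws[t] = gstar, sig2
    return draws

s = rng.uniform(1.0, 100.0, 100)
r = 2.0 * s * np.exp(rng.normal(0.0, 0.2, 100))
post = gibbs_linear(s, r)[500:]                 # discard burn-in
gamma_ci = np.exp(np.percentile(post[:, 0], [2.5, 97.5]))
```

Back-transforming the γ* draws through the exponential gives the probability interval for γ itself.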

3 Results

3.1 The linear model

Figures 2 and 3 show that, at least for the single realisation shown, the model fits and related CIs are similar for all CI-estimation methods. Combining the results from many simulated datasets leads to the same conclusion: Figure 4 indicates that the area covered by the CIs is very similar for all five methods. In terms of the medians of these distributions, Table 1 shows that the residual bootstrap provides slightly tighter CIs overall (the lowest value of Ĩ), with the Bayes method giving the widest CIs. On the other hand, the Bayes-generated CIs do provide the best coverage of the true parameter, with the highest value of P(γ ∈ [γ̂⁻, γ̂⁺]), although this metric is also greater than 0.95 (our notional threshold) for the delta and parameter simulation methods. Any differences between the methods are small, however, and there is little to choose between them in this case study.

    Method                 P(γ ∈ [γ̂⁻, γ̂⁺])    Ĩ
    Delta                  0.967                4729
    Residual bootstrap     0.946                4268
    Data bootstrap         0.943                4271
    Parameter simulation   0.955                4389
    Bayes                  0.969                5032

Table 1: Summary of the linear model analysis. P(γ ∈ [γ̂⁻, γ̂⁺]) gives the proportion of datasets for which the true parameter value lies within the estimated confidence interval of the estimated parameter. Ĩ denotes the area covered by the confidence interval between S_min and S_max, averaged over all m runs.


[Figure 2: Linear model fits for a single realisation of the data simulation. One panel per method (delta, residual bootstrap, data bootstrap, parameter simulation, Bayes), each showing the true line, the fitted line or median, and the estimated CI, with SSB (S) on the x-axis and recruitment (R) on the y-axis.]

[Figure 3: Standardised frequency distributions of γ̂ estimates for the linear model, one panel per CI-estimation method, with the true value marked in each panel.]

[Figure 4: Histograms of the area Ĩ = ∫R̂⁺ − ∫R̂⁻ of the confidence interval for the linear model fitted to J simulated datasets, one panel per CI-estimation method.]

3.2 The Ricker model

Differences between the methods are no more apparent when applied to the Ricker model. The single-realisation results in Figures 5, 6 and 7 suggest that all five CI-estimation methods produce very similar CIs, as well as median estimates. This is confirmed by the summary plots in Figure 8, where the histograms of CI area are comparable for all methods (the Bayes method has perhaps a tighter distribution of CI area, but at the cost of a higher median value); and by Table 2, which shows that the Bayes method results in better parameter coverage, but wider CIs overall (the residual bootstrap gives the tightest CIs from these analyses).

    Method                 P(α ∈ [α̂⁻, α̂⁺])    P(β ∈ [β̂⁻, β̂⁺])    Ĩ
    Delta                  0.939                0.939                8264
    Residual bootstrap     0.899                0.906                7220
    Data bootstrap         0.898                0.909                7529
    Parameter simulation   0.919                0.921                7646
    Bayes                  0.970                0.969                8779

Table 2: Summary of the Ricker model analysis. P(α ∈ [α̂⁻, α̂⁺]) and P(β ∈ [β̂⁻, β̂⁺]) give the proportion of datasets for which the true parameter values lie within the estimated confidence intervals of the estimated parameters. Ĩ denotes the area covered by the confidence interval between S_min and S_max, averaged over all m runs.

4 Discussion and conclusions

In this paper we have applied five different methods to the estimation of parameter variances and fitted-line confidence intervals for two bivariate models (linear and Ricker). The results for these two cases are very similar: there is little to choose between the different methods of estimating uncertainty. CIs estimated by the Bayes method tend to give better coverage of the true parameters, but this may only be because they also tend to be wider.

With regard to implications for survey-based fisheries assessment and advice, it is important to guard against over-generalisation from these results. They are (to a large extent) conditional on a) the model used, b) the parameter estimation approach implemented, and c) the characteristics of the data-simulation paradigm. All the methods perform comparably well, but we should not necessarily extrapolate from these simple examples to more complicated survey-based assessment models. We have not yet had the opportunity to explore these ideas fully in the context of assessment models, although we do not anticipate difficulties in doing so and intend to complete the analyses in the near future.

However, it is possible to hypothesise that the width and utility of confidence intervals for fishery metrics, as generated by survey-based assessment models, will depend more on the data and the assessment model used than on the uncertainty-estimation method. The simple examples presented in this paper would certainly support this hypothesis, and it is important to know whether or not it is true. If it is, it will cast doubt on the ability to manage fish stocks successfully through survey-based assessment approaches. Figure 9 gives an example of the confidence intervals about total mortality estimates from SURBA, which

[Figure 5: Ricker model fits for a single realisation of the data simulation. One panel per method (delta, residual bootstrap, data bootstrap, parameter simulation, Bayes), each showing the true curve, the fitted curve or median, and the estimated CI, with SSB (S) on the x-axis and recruitment (R) on the y-axis.]

[Figure 6: Standardised frequency distributions of α̂ estimates for the Ricker model, one panel per CI-estimation method, with the true value marked in each panel.]

[Figure 7: Standardised frequency distributions of β̂ estimates for the Ricker model, one panel per CI-estimation method, with the true value marked in each panel.]

[Figure 8: Histograms of the area Ĩ = ∫R̂⁺ − ∫R̂⁻ of the confidence interval for the Ricker model fitted to J simulated datasets, one panel per CI-estimation method.]

uses the delta method in its current implementation (version 3.0). It is possible to draw a horizontal line through this plot which lies entirely within the confidence intervals (with the exception of one or two years), which would indicate that there has been no significant change in North Sea cod mortality for the last 24 years. This seems unlikely, and scientists and managers often raise the concern that models like SURBA are unable to detect trends in mortality because the confidence intervals are wide. Our hypothesis, which suggests that such wide intervals are not a function of any particular uncertainty-estimation method, but rather are determined by the combination of variable survey data and assumptions in the assessment model itself, would mean that this combination of model and data would be inappropriate for use as the basis of management advice, no matter what uncertainty-estimation method was used.

The hypothesis will be explored in future work, in which we will investigate which of the uncertainty-estimation methods most faithfully captures the underlying uncertainty in simulated fish-population datasets when modelled using SURBA (and other models). From this, we expect to be able to conclude whether survey-based assessment models can (or cannot) provide useful information on population dynamics to fisheries managers.

ACKNOWLEDGEMENTS

We would like to express our thanks to Rob Fryer and Liz Clarke (both FRS Marine Laboratory, Aberdeen) for their insightful observations on statistical methods.

Figure 9: Time-series of total mortality (mean Z over ages 2–4) for North Sea cod, estimated with SURBA from first and third quarter IBTS survey indices. Dotted lines give the approximate 95% confidence interval as estimated using the delta method. [Plot omitted; vertical axis spans 0–2, horizontal axis 1985–2005.]

REFERENCES

Beare, D. J., Needle, C. L., Burns, F. and Reid, D. G. (2005). Using survey data independently from commercial data in stock assessment: An example using haddock in ICES Division VIa, ICES Journal of Marine Science 62: 996–1005.

Berger, J. (1985). Statistical Decision Theory and Bayesian Analysis, Springer-Verlag, New York.

Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading MA.

Cotter, A. J. R., Mesnil, B. and Piet, G. (2007). Estimating stock parameters from trawl CPUE-at-age series using year-class curves, ICES Journal of Marine Science.

Gilks, W. R., Richardson, S. and Spiegelhalter, D. J. (1996). Markov Chain Monte Carlo in Practice, Chapman and Hall.

Gudmundsson, G. (1994). Time series analysis of catch-at-age observations, Applied Statistics 43: 117–126.

ICES (2007). Report of the Working Group on Methods of Fish Stock Assessment. ICES CM 2007/RMC:04.

Jeffreys, H. S. (1961). Theory of Probability, OUP, Oxford.

Kloek, T. and Van Dijk, H. K. (1978). Bayesian estimates of equation system parameters: An application of integration by Monte Carlo, Econometrica 46(1): 1–19.

Mesnil, B. (2003). The catch-survey analysis (CSA) method of fish stock assessment: An evaluation using simulated data, Fisheries Research 63: 193–212.

Needle, C. L. (2003). Survey-based assessments with SURBA. Working Document to the ICES Working Group on Methods of Fish Stock Assessment, Copenhagen, 29 January – 5 February 2003.

Needle, C. L. (2004). Absolute abundance estimates and other developments in SURBA. Working Document to the ICES Working Group on Methods of Fish Stock Assessment, IPIMAR, Lisbon, 10–18 February 2004.

Oehlert, G. W. (1992). A note on the delta method, American Statistician 46: 27–29.

R Development Core Team (2006). R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

Ricker, W. E. (1958). Handbook of computations for biological statistics of fish populations, Bulletin of the Fisheries Research Board of Canada 119: 1–300 (full issue).

Trenkel, V. M. (2007). A biomass random effects model (BREM) for stock assessment using only survey data: Application to Bay of Biscay anchovy. ICES CM 2007/O:03.

Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S, fourth edn, Springer, New York.

A   APPENDIX: CASE-STUDY PARAMETER VALUES

The data simulations used in the case studies in this paper were based on the following parameter values:

Parameter   Value   Description
---------   -----   -----------
m           1000    Number of simulated datasets
n           20      Number of points in each dataset
σ           0.5     Standard error of simulated points
S_min       0       Minimum spawning-stock biomass
S_max       100     Maximum spawning-stock biomass
τ           0.1     Step-size between successive values of S; used in plotting and integral approximations
J           1000    Number of bootstraps, parameter simulations and Bayes resamples carried out for each CI-estimation method and each dataset
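As a concrete illustration, the simulation set-up in the table above could be sketched as follows. The Ricker stock-recruit curve with multiplicative lognormal error matches the case studies, but the curve parameters a and b used here are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Case-study settings from the table above
m, n, sigma = 1000, 20, 0.5        # datasets, points per dataset, error SD
s_min, s_max = 0.0, 100.0          # spawning-stock biomass range

# Ricker curve parameters: illustrative assumptions only
a, b = 2.5, 0.03

rng = np.random.default_rng(1)

def simulate_dataset():
    """One simulated (S, R) dataset: Ricker recruitment with lognormal error."""
    s = rng.uniform(s_min, s_max, size=n)
    r_det = a * s * np.exp(-b * s)                           # deterministic Ricker
    r_obs = r_det * np.exp(rng.normal(0.0, sigma, size=n))   # multiplicative noise
    return s, r_obs

datasets = [simulate_dataset() for _ in range(m)]
```

Each of the m = 1000 datasets would then be fitted and its confidence interval estimated by each of the five methods, yielding distributions such as those in Figure 8.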

B   APPENDIX: THE DELTA METHOD

The delta method (Oehlert 1992) is a technique used to obtain approximations to the variances and covariances of functions of estimated parameters. Consider a model relating variables y and x, with y = F(x; p) for parameter vector p = (p_1, p_2, ...). Let G be a transformation of the model, so that

$$G(y) = G(F(x; p)) = G(x; b) \qquad (44)$$

where b = (b_1, b_2, ...) are parameters of the transformed model. Denote further the variance-covariance matrix of the estimated parameters of the transformed model by Var[b], and let G' be the matrix of partial derivatives of G with respect to the estimated parameters. The delta method then uses a first-order Taylor expansion to approximate the variance-covariance matrix of p via

$$\operatorname{Var}[p] = \operatorname{Var}[G(b)] = G'(b)\,\operatorname{Var}[b]\,G'(b)^{T}. \qquad (45)$$

This approximation will be close in general if the variance of the assumed lognormal distribution is small: otherwise, it may be misleading. Standard errors for the components of p are obtained by taking square roots of the diagonal elements of this matrix.

We can also consider subsequently the confidence limits of the fitted line or curve. Once the parameter variances are obtained, the line can be plotted along with approximate pointwise 95% confidence bounds. Consider (without loss of generality) a two-parameter model fitted by y = F(x; p_1, p_2). Then the variance of fitted ŷ (where hats denote estimated values) is given by

$$\operatorname{Var}[\hat{y}] = \operatorname{Var}[\hat{p}_1]\left(\frac{\partial G}{\partial p_1}\right)^{2} + 2\operatorname{Cov}[\hat{p}_1, \hat{p}_2]\,\frac{\partial G}{\partial p_1}\frac{\partial G}{\partial p_2} + \operatorname{Var}[\hat{p}_2]\left(\frac{\partial G}{\partial p_2}\right)^{2}, \qquad (46)$$

and the bounds on the fitted curve are then

$$\hat{y}^{\pm} = \hat{y} \pm t_{0.975,\,n-2}\,\sqrt{\operatorname{Var}[\hat{y}]}. \qquad (47)$$
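In code, equations (46)–(47) amount to a quadratic form in the gradient of the fitted curve with respect to the parameters, evaluated pointwise along the curve. A minimal numerical sketch using finite-difference gradients; the parameter estimates and covariance matrix below are made-up illustrative values, not results from the paper:

```python
import numpy as np

def delta_ci(x, f, p_hat, cov_p, t_crit=2.101, eps=1e-6):
    """Approximate pointwise CI for a fitted curve y = f(x; p) via the
    delta method: equation (46) for Var[y-hat], equation (47) for bounds.

    t_crit defaults to t_{0.975,18}, matching n = 20 points and two
    parameters; cov_p is the estimated parameter covariance matrix.
    """
    p_hat = np.asarray(p_hat, dtype=float)
    y_hat = f(x, p_hat)
    # Central finite-difference partial derivatives of f w.r.t. each parameter
    grads = []
    for i in range(p_hat.size):
        dp = np.zeros_like(p_hat)
        dp[i] = eps * max(1.0, abs(p_hat[i]))
        grads.append((f(x, p_hat + dp) - f(x, p_hat - dp)) / (2.0 * dp[i]))
    g = np.stack(grads)                                # shape (n_par, n_x)
    # Pointwise Var[y_hat] = g(x)^T Cov[p] g(x), as in equation (46)
    var_y = np.einsum('ix,ij,jx->x', g, np.asarray(cov_p, dtype=float), g)
    half = t_crit * np.sqrt(var_y)                     # equation (47)
    return y_hat - half, y_hat + half

# Illustrative use with a Ricker curve and an assumed parameter covariance
ricker = lambda s, p: p[0] * s * np.exp(-p[1] * s)
s = np.linspace(1.0, 100.0, 25)
lo, hi = delta_ci(s, ricker, p_hat=[2.5, 0.03],
                  cov_p=[[4e-2, -1e-4], [-1e-4, 1e-6]])
```

Note that the gradient is recomputed at every x, so the interval widens where the curve is most sensitive to the parameters; this is exactly why delta-method bounds can be very wide for survey-based models even when the fit looks tight.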
