Environ Ecol Stat (2014) 21:161–187 DOI 10.1007/s10651-013-0250-7
A permutation-based combination of sign tests for assessing habitat selection Lorenzo Fattorini · Caterina Pisani · Francesco Riga · Marco Zaccaroni
Received: 30 January 2012 / Revised: 18 January 2013 / Published online: 14 May 2013 © Springer Science+Business Media New York 2013
Abstract The analysis of habitat selection in radio-tagged animals is approached by comparing the portions of use against the portions of availability observed for each habitat type. Since data are linearly dependent with singular variance-covariance matrices, standard multivariate statistical tests cannot be applied. To bypass the problem, compositional data analysis is customarily performed via log-ratio transform of sample observations. The procedure is criticized in this paper, emphasizing the several drawbacks which may arise from the use of compositional analysis. An alternative nonparametric solution is proposed in the framework of multiple testing. The habitat use is assessed separately for each habitat type by means of the sign test performed on the original observations. The resulting p values are combined in an overall test statistic whose significance is determined permuting sample observations. The theoretical findings of the paper are checked by simulation studies. Applications to case studies previously considered in literature are discussed. Keywords Compositional data analysis · Johnson’s second order selection · Johnson’s third order selection · Monte Carlo studies · Multiple testing · Random habitat use Handling Editor: Ashis SenGupta. L. Fattorini (B) · C. Pisani Department of Economics and Statistics, University of Siena, Piazza S. Francesco 8, 53100 Siena, Italy e-mail:
[email protected] F. Riga Italian Institute for Environmental Protection and Research (ISPRA), Via Ca’ Fornacetta, 9, 40064 Ozzano Emilia, Italy M. Zaccaroni Department of Biology, University of Florence, Via Romana, 17, 50125 Florence, Italy
Abbreviations
RHU   Proportional or random habitat use
PAT   Portion of animal trajectory
PAHR  Portion of animal home range
CODA  Compositional data analysis
1 Introduction
The analysis of habitat selection by animals is a crucial issue in wildlife management and conservation. Habitat selection is now a burning theme of ecological research owing to recent advances in GPS technology, which make considerable amounts of telemetry data available. Manly et al. (2002) provide a general introduction to habitat selection analysis, while the special issue of the Journal of Wildlife Management (Strickland and McDonald 2006) gives a more recent review of habitat selection issues. More recently, general frameworks for the statistical analysis of habitat selection have been provided by Johnson et al. (2008), Kooper and Manseau (2009) and Kneib et al. (2011) through the use of weighted distributions, generalized estimating equations and categorical regression, respectively. The first, and probably the main and simplest, question to be addressed in habitat selection studies is whether habitat types are all used in proportion to their availability (the so-called proportional or random habitat use, henceforth RHU) or whether some habitat types are preferred or avoided. As pointed out by Johnson (1980), the analysis can be performed at different levels of choice. In this framework, Aebischer et al. (1993) give a procedure to compare: a) the portion of each habitat within the home range versus the available portion within a delineated study area (Johnson’s second order selection); b) the portion of each habitat use versus the corresponding portion within the home range (Johnson’s third order selection). Despite the rise of a plethora of sophisticated models to analyse habitat selection, the procedure by Aebischer et al. (1993) is still in wide use, as shown by the number of citations in journals with an impact factor, which increased continuously from 1993 to 2007 and has remained stable at about seventy to eighty citations in recent years. The pioneering approach by Aebischer et al.
(1993) has the merit of viewing habitat selection analysis as the assessment of a system of statistical hypotheses regarding the animal population under study. As such, it proceeds at animal level, i.e. taking animals rather than radio locations as sample units and considering the portion of animal trajectory (PAT) or the portion of animal home range (PAHR) within each habitat type as the interest variables. Since the trajectory of a single animal is unknown and is approximated by the sequence of radio-tracking data achieved for the animal at discrete times, if radio-tracking times are sufficiently frequent and suitably distributed throughout the monitoring time, the relative frequency of radio locations in each habitat constitutes an unbiased estimator of PAT in the habitat. At the same time, the areal distribution of radio locations, extrapolated by suitable statistical techniques (e.g. kernel smoothing, bivariate normal ellipses or minimum convex hull) constitutes an estimator of the animal home range from which PAHRs can be subsequently derived. In this context, serial
correlation among radio-tracking data of single animals may constitute a problem only for the estimation of PATs and PAHRs. Following Aebischer et al. (1993), the actual values of PATs and PAHRs are not distinguished from their estimates obtained from the radio-tracking data, supposing that the number of radio locations adopted for each animal is sufficiently large to give stable and accurate estimates of these quantities. Accordingly, if the radio-tracked animals act independently (e.g. they do not belong to the same flock or herd), the approach completely removes any correlation problem among data which would instead be present if radio locations were used as sample units. Despite these appealing features, the procedure by Aebischer et al. (1993) suffers from some drawbacks which are likely to render any conclusion about habitat selection unreliable. The main problems are induced by the use of compositional data analysis (henceforth CODA), adopted by the authors in order to handle the fact that PAT and PAHR data recorded from a sample of radio-collared animals are vectors of positive components subject to a unit-sum constraint. Thus, data are linearly dependent and give rise to singular variance-covariance matrices which, in turn, preclude the use of standard multivariate procedures such as MANOVA or other likelihood ratio tests (e.g. Aitchison 1986, 1994). On the other hand, by means of CODA, log-ratio transforms are used instead of the original data, thus achieving variance-covariance matrices which are positive definite with probability one and allowing for standard multivariate analysis. However, as pointed out by Aitchison (1994), hypotheses regarding compositional data should be consistently reformulated in terms of log-ratios before applying the standard tests.
Unfortunately, in the framework of habitat selection analysis, the RHU hypothesis cannot be generally reformulated in terms of log-ratio expectations and then assessed by the familiar likelihood ratio test as actually proposed by Aebischer et al. (1993). As a consequence, the likelihood ratio test performed on log-ratio data does not necessarily assess the RHU hypothesis. Besides this main problem, the whole procedure tacitly presumes, at least, the symmetry of the distributions of log-ratios around their expectations, which does not necessarily hold. Moreover, in presence of null values of PATs and PAHRs, the use of log ratios necessitates the introduction of very arbitrary solutions. The purpose of this paper is to propose a nonparametric statistical procedure which avoids the use (and the problems) of CODA. The proposed procedure is simply based on the sign test performed on the original data. While the sign test is adopted for assessing RHU for each single habitat, the permutation procedure by Pesarin (2001) is applied to combine the p values resulting from the single tests for obtaining an overall statistic adopted for the simultaneous RHU assessment in all habitat types. The proposed procedure readily overcomes the problems entailed by the use of CODA only presuming a minimal set of assumptions on PAT and PAHR data.
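As a rough illustration of the per-habitat step just described, the following Python sketch computes a two-sided sign-test p value from the number of animals whose use of a habitat exceeds its availability. The function name `sign_test_p_value`, its arguments and the capping of the p value at 1 are assumptions made for this illustration only; zero differences are assumed to be absent, and the permutation-based combination of the p values across habitats is not shown.

```python
from math import comb


def sign_test_p_value(n_plus, n):
    """Two-sided sign-test p value under a Binomial(n, 1/2) null.

    n_plus: number of animals with a strictly positive use-minus-availability
            difference for the habitat (hypothetical argument name).
    n:      number of animals for which the habitat is available.
    """
    # Test statistic: the larger of the counts of positive and negative signs.
    t = max(n_plus, n - n_plus)
    # Upper binomial tail, counted in units of 2**(-n).
    tail = sum(comb(n, k) for k in range(t, n + 1))
    # Doubling gives the two-sided p value; the cap at 1 is an assumption of
    # this sketch, needed when the signs split exactly evenly.
    return min(1.0, 2.0 * tail / 2 ** n)


if __name__ == "__main__":
    print(sign_test_p_value(12, 15))
```

For instance, with 15 animals, 12 of which use a habitat more than its availability, the sketch gives a p value of about 0.035.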
2 Preliminaries and notation
Given $K$ habitat types, denote by $\mathbf{X}_U = [X_{U1}, \ldots, X_{UK}]^T$ the random vector in which the random variable $X_{Uj}$ is the portion of the individual’s use of habitat $j$ and
denote by $\mathbf{X}_A = [X_{A1}, \ldots, X_{AK}]^T$ the random vector in which $X_{Aj}$ is the portion of the availability of habitat $j$ ($j = 1, \ldots, K$). If Johnson’s second order selection is analysed, then $\mathbf{X}_U$ is the $K$-dimensional random vector of PAHRs while $\mathbf{X}_A$ is a degenerate $K$-dimensional random vector of constants $\mathbf{a} = [a_1, \ldots, a_K]^T$ in which $a_j > 0$ represents the portion of habitat $j$ available in the whole study area. On the other hand, if Johnson’s third order selection is under study, $\mathbf{X}_U$ is the $K$-dimensional random vector of PATs while $\mathbf{X}_A$ is the $K$-dimensional random vector of PAHRs. In both cases, the difference between use and availability is given by the random vector $\mathbf{D}_X = \mathbf{X}_U - \mathbf{X}_A = [D_{X1}, \ldots, D_{XK}]^T$, where $D_{Xj} = X_{Uj} - X_{Aj}$. As positive values of $D_{Xj}$ should mean the animal’s preference for habitat $j$ while negative values should mean avoidance, the use of $\mathbf{D}_X$ should be, in our opinion, the most natural way of analysing habitat selection. Owing to the compositional nature of $\mathbf{X}_U$ and $\mathbf{X}_A$, their components are subject to the unit-sum constraints $\mathbf{1}^T \mathbf{X}_U = \mathbf{1}^T \mathbf{X}_A = 1$, where $\mathbf{1}$ is the vector of ones of adequate dimension. Accordingly, the components of $\mathbf{D}_X$ are subject to the zero-sum constraint
\[ \mathbf{1}^T \mathbf{D}_X = 0 \tag{1} \]
As to the nature of the random variables $X_{Uj}$ and $X_{Aj}$, they may virtually take all the values in the closed interval [0,1] but do not generally constitute continuous random variables in [0,1]. For example, when $X_{Uj}$ represents the PAT in habitat $j$, which is customarily estimated by the relative frequency of the animal’s radio locations in the habitat, then $X_{Uj}$ necessarily takes discrete fractional values in the set $\{0/r, 1/r, \ldots, r/r\}$, where $r$ is the number of radio locations adopted to approximate the animal’s trajectory. Moreover, when $X_{Uj}$ or $X_{Aj}$ represents the PAHR in habitat $j$, which is customarily obtained by spatial smoothing techniques performed on the animal’s radio locations (see e.g.
Worton 1989), then it may happen that $X_{Uj} = 0$ or $X_{Aj} = 0$ if no location of the animal is observed in habitat $j$. On the other hand, the constants $a_j$ may take all the values in the open interval (0,1), as no available habitat proportion can obviously be 0 (which would mean absence of the habitat) or 1 (which would mean presence of a unique habitat). As a consequence of these considerations, the $D_{Xj}$s are not necessarily continuous random variables in $[-1, 1]$. Now suppose a sample of $n$ radio-collared animals and denote by $\mathbf{x}_{Ui} = [x_{U1i}, \ldots, x_{UKi}]^T$ the vector in which $x_{Uji}$ is the portion of the use of habitat $j$ for animal $i$ and by $\mathbf{x}_{Ai} = [x_{A1i}, \ldots, x_{AKi}]^T$ the vector in which $x_{Aji}$ is the portion of the availability of habitat $j$ for animal $i$ ($i = 1, \ldots, n$), in such a way that $\mathbf{d}_{Xi} = \mathbf{x}_{Ui} - \mathbf{x}_{Ai} = [d_{X1i}, \ldots, d_{XKi}]^T$, where $d_{Xji} = x_{Uji} - x_{Aji}$, constitutes the difference vector. Obviously, in the case of Johnson’s second order selection, $\mathbf{x}_{Ai} = \mathbf{a}$ for all $i$. Owing to relation (1), $\mathbf{1}^T \mathbf{d}_{Xi} = 0$ for all $i = 1, \ldots, n$, i.e. the $\mathbf{d}_{Xi}$s lie in a $(K-1)$-dimensional hyperplane. Accordingly, their mean vector, say $\bar{\mathbf{d}}_X = [\bar{d}_{X1}, \ldots, \bar{d}_{XK}]^T$, is such that $\mathbf{1}^T \bar{\mathbf{d}}_X = 0$ while the variance-covariance matrix, say $\mathbf{S}_X$, is of rank smaller than $K$, i.e. $\det(\mathbf{S}_X) = 0$. In order to avoid constrained variables and singular variance-covariance matrices, CODA is based on the arbitrary choice of a reference habitat, say $h$, and on the use of the log-ratio transforms $\mathbf{Y}_U = \mathrm{lrt}_h(\mathbf{X}_U)$ and $\mathbf{Y}_A = \mathrm{lrt}_h(\mathbf{X}_A)$, where $\mathbf{Y}_U = [Y_{U1}, \ldots, Y_{UK}]^T$ and $\mathbf{Y}_A = [Y_{A1}, \ldots, Y_{AK}]^T$ are $(K-1)$-vectors having as
components the log-ratios $Y_{Uj} = \ln(X_{Uj}/X_{Uh})$ and $Y_{Aj} = \ln(X_{Aj}/X_{Ah})$, respectively ($j \neq h = 1, \ldots, K$). In this case, the habitat selection analysis proceeds by means of the difference vector $\mathbf{D}_Y = \mathbf{Y}_U - \mathbf{Y}_A = [D_{Y1}, \ldots, D_{YK}]^T$, where $D_{Yj} = Y_{Uj} - Y_{Aj}$, even if the differences are less straightforwardly interpretable. Indeed, $D_{Yj} > 0$ is equivalent to $X_{Uj}/X_{Aj} > X_{Uh}/X_{Ah}$, which means that, with respect to their availabilities, habitat $j$ is used more intensively than the reference habitat $h$. It is at once apparent that $\mathbf{Y}_U$ and $\mathbf{Y}_A$ depend on the choice of $h$. However, for simplicity of notation, throughout the paper any mention of the reference habitat is avoided if not essential. As the $X_{Uj}$s and $X_{Aj}$s are random variables on [0,1] or constants on (0,1), the $Y_{Uj}$s and $Y_{Aj}$s are random variables or constants on the real axis. Moreover, no linear relation exists among them, in such a way that the $D_{Yj}$s constitute a set of linearly independent random variables. Thus, given a sample of $n$ radio-collared animals, denote by $\mathbf{y}_{Ui} = [y_{U1i}, \ldots, y_{UKi}]^T$ the transformed vector $\mathbf{y}_{Ui} = \mathrm{lrt}_h(\mathbf{x}_{Ui})$ in which $y_{Uji} = \ln(x_{Uji}/x_{Uhi})$ and by $\mathbf{y}_{Ai} = [y_{A1i}, \ldots, y_{AKi}]^T$ the transformed vector $\mathbf{y}_{Ai} = \mathrm{lrt}_h(\mathbf{x}_{Ai})$ in which $y_{Aji} = \ln(x_{Aji}/x_{Ahi})$, in such a way that $\mathbf{d}_{Yi} = \mathbf{y}_{Ui} - \mathbf{y}_{Ai} = [d_{Y1i}, \ldots, d_{YKi}]^T$, where $d_{Yji} = y_{Uji} - y_{Aji}$, constitutes the difference vector. Owing to the linear independence among the components of the $\mathbf{d}_{Yi}$s, these vectors lie in the full $(K-1)$-dimensional Euclidean space, in such a way that their mean vector, say $\bar{\mathbf{d}}_Y = [\bar{d}_{Y1}, \ldots, \bar{d}_{YK}]^T$, is unconstrained while the variance-covariance matrix, say $\mathbf{S}_Y$, is of full rank with a strictly positive determinant $\det(\mathbf{S}_Y)$.
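The two kinds of difference vectors introduced in this section can be sketched in Python for a single animal. The use and availability values below are invented for illustration, the function names are hypothetical, and the reference habitat `h` is given as a zero-based index; none of this comes from the data sets analysed in the paper.

```python
import math


def differences(use, avail):
    """Raw differences use - availability, one per habitat (they sum to zero)."""
    return [u - a for u, a in zip(use, avail)]


def log_ratio_differences(use, avail, h):
    """Log-ratio differences with reference habitat h (zero-based index).

    Returns the K - 1 values ln(u_j / u_h) - ln(a_j / a_h) for j != h.
    All proportions must be strictly positive for the logarithms to exist.
    """
    return [
        math.log(use[j] / use[h]) - math.log(avail[j] / avail[h])
        for j in range(len(use))
        if j != h
    ]


# Invented proportions for K = 3 habitats; each vector sums to one.
use = [0.50, 0.30, 0.20]
avail = [0.40, 0.40, 0.20]

d_x = differences(use, avail)                 # zero-sum constraint (1)
d_y = log_ratio_differences(use, avail, h=2)  # K - 1 unconstrained values

if __name__ == "__main__":
    print(d_x, sum(d_x))
    print(d_y)
```

In this invented example the first habitat has a positive log-ratio difference because its use-to-availability ratio (0.50/0.40) exceeds that of the reference habitat (0.20/0.20), while the second habitat has a negative one.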
3 A critical look at compositional analysis
3.1 Theoretical considerations
Usually, statistical hypotheses deal with some aspects of the statistical distribution generating the quantities of interest (e.g., expectation, median, distribution function), which are assessed on the basis of a random sample of individuals from the population. In the present case, the hypothesis to be assessed is that the average member of the population (in the parlance of Aebischer et al. 1993) uses habitats in proportion to their availability. In a more formal framework, the null hypothesis (even if never explicitly mentioned by the authors) should be $H_{X0} : E(\mathbf{X}_U) = \mathbf{a}$ if PAHRs are compared with the constant vector of available proportions, or $H_{X0} : E(\mathbf{X}_U) = E(\mathbf{X}_A)$ if PATs are compared with PAHRs. In both cases, the null hypothesis can be expressed as
\[ H_{X0} : \boldsymbol{\mu}_X = \mathbf{0} \tag{2} \]
where $\boldsymbol{\mu}_X = E(\mathbf{D}_X)$ and $\mathbf{0}$ denotes the vector of zeros of adequate dimension. On the other hand, Aebischer et al. (1993) propose a CODA-based procedure in which the hypothesis
\[ H_{Y0} : \boldsymbol{\mu}_Y = \mathbf{0} \tag{3} \]
is assessed by means of the likelihood ratio test statistic $-2 \ln \lambda$, where $\boldsymbol{\mu}_Y = E(\mathbf{D}_Y)$ and $\lambda = \det(\mathbf{S}_Y)/\det(\mathbf{S}_Y + \bar{\mathbf{d}}_Y \bar{\mathbf{d}}_Y^T)$. Under $H_{Y0}$ and under the assumption that $\mathbf{D}_Y$ has a multivariate normal distribution, $-2 \ln \lambda$ is asymptotically ($n$ large) distributed as a Chi-square with $K - 1$ degrees of freedom. Thus, $H_{Y0}$ is rejected at a level $\alpha$ if $1 - F_{K-1}(-2 \ln \lambda) \le \alpha$, where $F_m$ denotes the Chi-square distribution function with $m$ degrees of freedom. The fact that a reference habitat $h$ is used as divisor in log-ratios does not cause problems, as the likelihood ratio test (like other multivariate techniques) is invariant under the choice of $h$ (Aitchison 1986, Chapter 6). However, as proven in “Appendix 1”, (3) does not coincide with the RHU hypothesis of type (2). There are some peculiar situations in which (2) and (3) are equivalent. The first situation occurs in second order selection, when the components of $\mathbf{X}_U$ are identically distributed random variables and the components of $\mathbf{a}$ are all equal to $1/K$; another situation occurs in third order selection when both vectors $\mathbf{X}_U$ and $\mathbf{X}_A$ are constituted by identically distributed random variables. In more general (and more realistic) situations, $\boldsymbol{\mu}_Y \neq \mathbf{0}$ even if $\boldsymbol{\mu}_X = \mathbf{0}$. In these cases, the likelihood ratio test based on the $\mathbf{d}_{Yi}$s gives rise to an uncontrollable increase of the probability of rejecting (2) when it is true over the nominal level $\alpha$ at which the assessment of (3) is performed. Obviously, such a probability tends to inflate as $\boldsymbol{\mu}_Y$ departs from $\mathbf{0}$. Accordingly, the unreliability of assessing (2) via the assessment of (3) can be roughly quantified by the Euclidean norm of $\boldsymbol{\mu}_Y$, say $\|\boldsymbol{\mu}_Y\|$, when $\boldsymbol{\mu}_X = \mathbf{0}$. However, since $\boldsymbol{\mu}_Y$ varies with the choice of the reference habitat $h$, while the probability of rejecting (3) does not depend on $h$ (as the likelihood ratio test is invariant with respect to $h$), a more objective measure of the unreliability of the CODA-based procedure is the averaged norm
\[ \left( \frac{1}{K} \sum_{h=1}^{K} \|\boldsymbol{\mu}_{Y/h}\|^2 \right)^{1/2} \tag{4} \]
where, with obvious notation, $\boldsymbol{\mu}_{Y/h}$ denotes the expectation of $\mathbf{D}_Y$ when the reference habitat is $h$. Henceforth, this quantity will be referred to, for brevity, as the unreliability measure of the CODA-based procedure. A further problem of the CODA-based procedure is that the determination of p values by means of the Chi-square distribution holds asymptotically only if the $\mathbf{d}_{Yi}$s come from a multivariate normal distribution. As nothing ensures multivariate normality of $\mathbf{D}_Y$, the authors propose a permutation procedure which (tacitly) presumes $\mathbf{D}_Y$ symmetrically distributed around $\boldsymbol{\mu}_Y$. If symmetry holds and if $\boldsymbol{\mu}_Y = \mathbf{0}$, then $\mathbf{D}_Y$ and $-\mathbf{D}_Y$ are identically distributed, in such a way that each difference $\mathbf{d}_{Yi}$ can be randomized by attaching the scalar 1 or −1 with probability 1/2 (or, equivalently, by permuting $\mathbf{y}_{Ui}$ with $\mathbf{y}_{Ai}$). Thus, for each data set $\mathbf{d}_{Y1}, \ldots, \mathbf{d}_{Yn}$ there are $Q = 2^n$ permutations of these data which may occur with the same probability, from which the permutation distribution of $-2 \ln \lambda$ can be determined. Then the p value of the test statistic achieved on the real data set can be obtained from the permutation distribution. Since for large $n$ the $2^n$ permutations may be prohibitively many, the permutation distribution is usually estimated by a random sample of $q$ permutations out of the $2^n$. However, once again, nothing ensures that $\mathbf{D}_Y$ is symmetrically distributed around $\boldsymbol{\mu}_Y$. A case in which symmetry occurs is when $\mathbf{X}_U$ and $\mathbf{X}_A$ are identically
and independently distributed. In this case the two vectors are exchangeable, in such a way that $\mathbf{D}_Y$ and $-\mathbf{D}_Y$ are identically distributed. Thus, even if less restrictive than the procedure based on the assumption of multivariate normality, the permutation procedure may give unreliable evaluations of the p values. As already pointed out in the Introduction, practical problems occur for the CODA-based procedure in the presence of 0s. Indeed, as emphasized in the previous section, the $X_{Uj}$s and $X_{Aj}$s are customarily quantified in the field by radio-tracking data, in such a way that they may be 0 when no location of the animal is observed in habitat $j$. In these cases, Aebischer et al. (1993) suggest substituting zeros with a “small positive value, less than the smallest recorded non zero proportion, as a zero numerator or denominator in the log-ratio transformation is invalid”. The solution seems quite arbitrary and is likely to have a heavy impact on the assessment results when the presence of 0s is non-negligible.
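The sign-flip randomization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statistic used here, the squared norm of the mean difference vector, is a simple stand-in and not the likelihood ratio statistic $-2 \ln \lambda$; the function names, the seed and the convention of counting the observed data set among the permutations are likewise choices made for this sketch.

```python
import random


def squared_norm_of_mean(vectors):
    """Stand-in statistic: squared Euclidean norm of the mean vector."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[k] for v in vectors) / n for k in range(dim)]
    return sum(m * m for m in means)


def sign_flip_p_value(diffs, statistic, q=1000, seed=0):
    """Monte Carlo p value from q random sign flips of the difference vectors.

    Each difference vector is multiplied by +1 or -1 with probability 1/2,
    which is legitimate only if the differences are symmetric around zero.
    """
    rng = random.Random(seed)
    observed = statistic(diffs)
    at_least_as_large = 0
    for _ in range(q):
        flipped = []
        for d in diffs:
            s = 1.0 if rng.random() < 0.5 else -1.0
            flipped.append([s * v for v in d])
        if statistic(flipped) >= observed:
            at_least_as_large += 1
    # Counting the observed data set itself keeps the p value above zero.
    return (at_least_as_large + 1) / (q + 1)


if __name__ == "__main__":
    toy_diffs = [[1.0, 0.5]] * 10  # invented, strongly one-sided differences
    print(sign_flip_p_value(toy_diffs, squared_norm_of_mean))
```

Replacing the stand-in by the likelihood ratio statistic would only change the `statistic` argument; the flipping scheme itself would stay the same.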
3.2 Simulation studies
In order to confirm these theoretical considerations, two Monte Carlo studies were carried out. Firstly, in the framework of second order selection, $K = 5$ habitat types were presumed to partition the study area in accordance with a constant vector $\mathbf{a}$. Five different situations were considered, ranging from a completely even partition of the study area into habitats of equal availability to a very unbalanced partition with a dominant habitat covering 70 % of the study area and the remaining ones covering small percentages of 10 and 5 % (see Table 1). As the Dirichlet distribution represents the most familiar model for handling compositional data (see “Appendix 2”), the vector $\mathbf{X}_U$ was presumed to follow a Dirichlet distribution with parameter $\delta_U \mathbf{a}$, where $\delta_U = 1, 10, 100$ was an inverse index of variability of the marginal distributions of $\mathbf{X}_U$ (see “Appendix 2”). In this way, $E(\mathbf{X}_U) = \mathbf{a}$ irrespective of $\delta_U$, i.e. the RHU hypothesis of type (2) was satisfied for each $\delta_U$. Then, a sample of $n = 15$ radio-collared animals was presumed and, for each of the five situations and for each value of $\delta_U$, 100,000 samples of size 15 were generated from the Dirichlet distribution with parameter $\delta_U \mathbf{a}$. For each sample, the likelihood ratio test statistic $-2 \ln \lambda$ was computed. The function compana (with parameters nrep = 1,000 and rnv = $10^{-18}$) of the package adehabitat (version 1.8.3) available in the R software (version 2.12.1) was used to assess $H_{Y0}$ at the nominal levels $\alpha = 0.10, 0.05, 0.01$ by means of both parametric and permutation procedures (Calenge 2006). Accordingly, $H_{Y0}$ was rejected if $1 - F_4(-2 \ln \lambda) \le \alpha$ when the likelihood ratio test statistic was compared with the Chi-square distribution (parametric test) or if $-2 \ln \lambda$ was greater than the $1 - \alpha$ quantile of the permutation distribution based on $q = 1,000$ permutations (permutation test).
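Under the Dirichlet model just described, the expectations of the log-ratios have a closed form, because a Dirichlet vector with parameter $\delta_U \mathbf{a}$ satisfies the standard identity $E(\ln X_{Uj}) = \psi(\delta_U a_j) - \psi(\delta_U)$, with $\psi$ the digamma function. The Python sketch below uses this identity to evaluate the averaged norm (4) for second order selection. It is only a sketch: relations (17) and (18) of “Appendix 2” are not reproduced in this section, and the function names and the digamma approximation (recurrence plus asymptotic series) are assumptions of this illustration.

```python
import math


def digamma(x):
    """Digamma function for x > 0 (recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def unreliability_second_order(a, delta_u):
    """Averaged norm (4) when X_U is Dirichlet(delta_u * a) and X_A = a.

    With reference habitat h, the expected log-ratio difference for habitat j
    is psi(delta_u * a_j) - psi(delta_u * a_h) - ln(a_j / a_h).
    """
    k = len(a)
    c = [digamma(delta_u * aj) - math.log(aj) for aj in a]
    total = sum(
        (c[j] - c[h]) ** 2 for h in range(k) for j in range(k) if j != h
    )
    return math.sqrt(total / k)


if __name__ == "__main__":
    print(unreliability_second_order([0.70, 0.10, 0.10, 0.05, 0.05], 1))
```

With a = (0.70, 0.10, 0.10, 0.05, 0.05) and $\delta_U = 1$ the sketch returns about 20.11, in line with the last row of Table 1, while an even partition gives 0 for every $\delta_U$.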
Finally, the probability of rejecting $H_{X0}$ was empirically determined as the fraction of times $H_{Y0}$ was rejected. As the likelihood ratio test statistic was invariant with respect to the choice of the reference habitat, results did not depend on this choice. A similar Monte Carlo study was repeated in the framework of third order selection. Also in this case, $K = 5$ habitat types were used. Then the vector $\mathbf{X}_A$ was presumed to follow a Dirichlet distribution with parameter $\delta_A \mathbf{a}$, where $\delta_A = 100$ and $\mathbf{a}$ varies in
Table 1 Type 1 error rates of the hypothesis of random habitat use $H_{X0}$ in the case of Johnson’s second order selection for the CODA-based parametric and permutation tests, in terms of habitat availability vector ($\mathbf{a}$), variability index ($\delta_U$), unreliability measure and nominal type 1 error rates ($\alpha$)

$\delta_U$  $\mathbf{a}$                      Unreliability   Parametric test, α        Permutation test, α
                                              measure         0.10    0.05    0.01      0.10    0.05    0.01
100         (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.25, 0.25, 0.20, 0.15, 0.15)     0.02           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.40, 0.25, 0.15, 0.10, 0.10)     0.05           0.21    0.12    0.04      0.11    0.06    0.01
100         (0.60, 0.10, 0.10, 0.10, 0.10)     0.05           0.23    0.14    0.04      0.13    0.07    0.01
100         (0.70, 0.10, 0.10, 0.05, 0.05)     0.12           0.28    0.17    0.06      0.16    0.09    0.02
10          (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.19    0.11    0.03      0.11    0.05    0.01
10          (0.25, 0.25, 0.20, 0.15, 0.15)     0.22           0.22    0.14    0.04      0.13    0.07    0.01
10          (0.40, 0.25, 0.15, 0.10, 0.10)     0.58           0.38    0.26    0.09      0.25    0.14    0.04
10          (0.60, 0.10, 0.10, 0.10, 0.10)     0.62           0.58    0.43    0.19      0.42    0.28    0.09
10          (0.70, 0.10, 0.10, 0.05, 0.05)     1.46           0.83    0.70    0.39      0.70    0.53    0.22
1           (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.21    0.13    0.04      0.13    0.07    0.02
1           (0.25, 0.25, 0.20, 0.15, 0.15)     3.25           0.38    0.25    0.09      0.26    0.16    0.05
1           (0.40, 0.25, 0.15, 0.10, 0.10)     8.40           0.89    0.77    0.46      0.77    0.63    0.32
1           (0.60, 0.10, 0.10, 0.10, 0.10)     8.97           1.00    0.98    0.85      0.98    0.94    0.74
1           (0.70, 0.10, 0.10, 0.05, 0.05)    20.11           1.00    1.00    0.99      1.00    0.99    0.93
accordance with the five situations considered in the previous experiment (see Table 2), while the vector $\mathbf{X}_U$ was presumed to be independent of $\mathbf{X}_A$, with a Dirichlet distribution with parameter $\delta_U \mathbf{a}$, where $\delta_U = 1, 10, 100$. In this way, $E(\mathbf{X}_U) = E(\mathbf{X}_A) = \mathbf{a}$ irrespective of $\delta_A$ and $\delta_U$, i.e. the RHU hypothesis of type (2) was satisfied for each pair $(\delta_A, \delta_U)$, even if for $\delta_U = 1, 10$ the variables quantifying habitat use had a greater variability than those quantifying habitat availability. Then, a sample of $n = 15$ radio-collared animals was presumed and, for each $\mathbf{a}$ and for each value of $\delta_U$, 100,000 samples of size 15 were generated from the Dirichlet distribution with parameter $100\mathbf{a}$ (availabilities) and coupled with samples of the same size independently generated from the Dirichlet distribution with parameter $\delta_U \mathbf{a}$ (uses). For each pair of samples, the likelihood ratio test statistic $-2 \ln \lambda$ was computed. Once again the probability of rejecting $H_{X0}$ was empirically determined as the fraction of times $H_{Y0}$ was rejected. During the simulation, Dirichlet random vectors were generated using the function rdirichlet available in the MCMCpack package (version 1.0–11) of the R software (version 2.12.1). For each combination of $\mathbf{a}$ and $\delta_U$, Tables 1 and 2 report the unreliability measure, theoretically determined by means of relations (17) or (18) respectively, as well as the rejection rate of (2) corresponding to the type 1 error rates $\alpha = 0.10, 0.05, 0.01$ at which the assessment of (3) is performed for both parametric and permutation tests. As expected, the simulation results completely confirm the concerns about the CODA-based procedure:
Table 2 Type 1 error rates of the random habitat use hypothesis $H_{X0}$ in the case of Johnson’s third order selection for the CODA-based parametric and permutation tests, in terms of expected habitat use and availability vector ($\mathbf{a}$), variability index ($\delta_U$), unreliability measure and nominal type 1 error rates ($\alpha$)

$\delta_U$  $\mathbf{a}$                      Unreliability   Parametric test, α        Permutation test, α
                                              measure         0.10    0.05    0.01      0.10    0.05    0.01
100         (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.25, 0.25, 0.20, 0.15, 0.15)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.40, 0.25, 0.15, 0.10, 0.10)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.60, 0.10, 0.10, 0.10, 0.10)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
100         (0.70, 0.10, 0.10, 0.05, 0.05)     0.00           0.18    0.11    0.03      0.10    0.05    0.01
10          (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.19    0.11    0.03      0.10    0.05    0.01
10          (0.25, 0.25, 0.20, 0.15, 0.15)     0.20           0.21    0.13    0.04      0.12    0.06    0.01
10          (0.40, 0.25, 0.15, 0.10, 0.10)     0.53           0.34    0.22    0.08      0.21    0.12    0.03
10          (0.60, 0.10, 0.10, 0.10, 0.10)     0.57           0.50    0.35    0.14      0.34    0.21    0.06
10          (0.70, 0.10, 0.10, 0.05, 0.05)     1.34           0.74    0.59    0.29      0.59    0.41    0.15
1           (0.20, 0.20, 0.20, 0.20, 0.20)     0.00           0.21    0.13    0.04      0.13    0.07    0.02
1           (0.25, 0.25, 0.20, 0.15, 0.15)     3.23           0.38    0.25    0.09      0.25    0.16    0.05
1           (0.40, 0.25, 0.15, 0.10, 0.10)     8.35           0.88    0.77    0.45      0.76    0.62    0.31
1           (0.60, 0.10, 0.10, 0.10, 0.10)     8.92           1.00    0.98    0.85      0.98    0.94    0.73
1           (0.70, 0.10, 0.10, 0.05, 0.05)    19.99           1.00    1.00    0.99      1.00    0.99    0.92
i) when the unreliability measure is 0, i.e. hypotheses (2) and (3) are equivalent, the rejection probabilities of (2) tend to be quite similar to the nominal type 1 error rates at which (3) is assessed, even if some discrepancies are still observed when the parametric test is used, owing to the lack of multivariate normality of the $\mathbf{d}_{Yi}$s (see lines 1, 6, 11 of Table 1 and lines 1–5, 6 and 11 of Table 2); this problem is considerably reduced by the use of the permutation test, but discrepancies still remain owing to the lack of symmetry in the $\mathbf{d}_{Yi}$s (see lines 6, 11 of Table 1 and line 11 of Table 2); as theoretically argued, the rejection probabilities of (2) coincide with the nominal type 1 error rate for (3) when $\mathbf{X}_U$ and $\mathbf{X}_A$ are independently and identically distributed (as for the first five cases of Table 2);
ii) apart from these cases, when the unreliability measure differs from 0, as generally happens in practical situations, the rejection probabilities of (2) turn out to be considerably greater than the nominal type 1 error rate of (3), and the differences become more and more marked as the unreliability measure increases; practically speaking, when the availability of habitat types (fixed or expected) is uneven and when $\mathbf{X}_U$ and/or $\mathbf{X}_A$ show a marked variability (as may occur when a limited number of radio locations are adopted to quantify PATs and/or PAHRs), (3) is rejected nearly all the time even if RHU is true (see the last lines of Tables 1 and 2).
The whole simulation study was repeated for $n = 30$ radio-collared animals and for $K = 10$ habitat types. As to these choices, it should be considered that radio-collars usually constitute very expensive devices. For this reason, in most habitat selection studies the number of radio-collared animals varies from 10 to 20. Accordingly, the
sample size was initially fixed at 15 animals, as the intermediate value between 10 and 20. However, in order to check the behavior of the type 1 error rates as $n$ increases, we have considered samples of $n = 30$ animals, which in the framework of radio-tracking analysis may be viewed as an upper bound for realistic sample sizes. As to the number of habitats, it is worth noting that in most applications the number of habitats ranges from 4 to 12. Rarely (in order to avoid infrequent uses) is a number of habitats greater than 15 considered. Accordingly, the whole simulation study was repeated for $K = 10$ habitats, halving the availabilities and the expectations of the five habitats considered in the first part of the simulation study. Simulation results achieved with $n = 30$ confirm that the inflated type 1 error rates are mainly due to the divergence between hypotheses (2) and (3). Indeed, for samples of size 30, the type 1 error rates improve (in the sense that they are closer to the nominal rates) when hypotheses (2) and (3) are equivalent (unreliability measure equal to 0) or nearly equivalent (unreliability measure close to 0), while when the two hypotheses markedly differ the type 1 error rates inflate even further toward one. That is quite obvious since, as $n$ increases, the likelihood ratio test becomes more and more powerful in rejecting hypothesis (3) when it is false. As to the simulation results achieved for $K = 10$ habitats, they are quite similar to those achieved for $K = 5$, with type 1 error rates increasing with the unreliability measure. Simulation results achieved for $n = 30$ and $K = 10$ are not reported for brevity. Tables of these results are available from the authors.
4 A simple permutation solution
4.1 Theoretical background
The problems induced by the CODA-based procedure suggest using alternative assessments of the RHU hypothesis operating directly on $\mathbf{D}_X$. For this purpose a multivariate nonparametric test for assessing (2) is required, which also avoids unrealistic distributional assumptions on $\mathbf{D}_X$.
To the best of our knowledge no test of this type is available in the literature, as nonparametric assessments on mean vectors invariably involve the symmetry of distributions around the mean vector as a minimal requirement (e.g. Pesarin (2001, Section 3.5) emphasizes that these tests actually constitute multivariate tests of symmetry). In order to avoid distributional assumptions, the RHU hypothesis must be rephrased in such a way as to require only a minimal set of realistic assumptions. As to these assumptions, it is worth noting that in the case of second order selection, the $X_{Uj}$s represent the PAHRs quantified by spatial smoothing techniques performed on the animal’s radio locations. As previously pointed out, they may be 0 when no radio location is found in habitat $j$, but they are quite unlikely to coincide with the available portion $a_j > 0$. Accordingly it can be realistically assumed that
\[ \Pr(X_{Uj} = a_j) = \Pr(D_{Xj} = 0) = 0 \tag{5} \]
On the other hand, in the case of third order selection, the $X_{Uj}$s represent the PATs quantified by the relative frequency of the animal’s radio locations in the habitats while
PAHRs play in this case the role of the $X_{Aj}$s. Thus, if $X_{Aj} > 0$, it is quite unlikely to coincide with the used portion $X_{Uj}$. Accordingly it can be realistically assumed that
\[ \Pr(X_{Uj} = X_{Aj} \mid X_{Aj} > 0) = \Pr(D_{Xj} = 0 \mid X_{Aj} > 0) = 0 \tag{6} \]
Conversely, if no location is observed in the habitat, it may happen that $X_{Aj} = 0$, in which case $X_{Uj} = 0$ as well. Hence, $\Pr(D_{Xj} = 0 \mid X_{Aj} = 0) = 1$. On the basis of these considerations, a suitable hypothesis to be used for both second and third order selection is given by
\[ H_0 : \bigcap_{j=1}^{K} (\pi_j = 0.5) \tag{7} \]
where $\pi_j = \Pr(D_{Xj} > 0 \mid X_{Aj} > 0)$ and, in the case of second order selection, the event $X_{Aj} > 0$ has probability one. Since $\pi_j$ represents the probability that habitat j, if available, is used more intensively than its availability, the $\pi_j$s are quantities between 0 and 1 with $\pi_j > 0.5$ when habitat j is preferred, $\pi_j < 0.5$ when habitat j is avoided and $\pi_j = 0.5$ in case of random use. Thus, the meaning of (7) is that each habitat type, when available, is used for a portion which has the same probability of being greater or less than the available portion. Even if (7) does not coincide in general with (2), no habitat selection or avoidance can be claimed for any habitat type if (7) is true. Thus (7) can be suitably taken as the RHU hypothesis to be assessed.

4.2 Combination of sign tests

Since (7) is given by the intersection of the K partial hypotheses regarding each habitat use, say $H_{0j} : \pi_j = 0.5$, the assessment of the partial hypotheses can be straightforwardly performed by means of the sign test, without any assumptions except (5) or (6). Thus, for each habitat j denote by $n_j$ the number of animals for which $x_{Aji} > 0$ (note that in the case of second order selection the $n_j$s are invariably equal to n) and by $n_j^+$ the number of $d_{Xji}$s strictly greater than 0, and adopt the quantity $t_j = \max(n_j^+, n_j - n_j^+)$ as the test statistic. Under $H_{0j}$, $n_j^+$ is the realization of a binomial random variable with parameters $n_j$ and 1/2, in such a way that $t_j$ ranges from $n_j/2$ to $n_j$ for $n_j$ even and from $(n_j + 1)/2$ to $n_j$ for $n_j$ odd, while large values of $t_j$ denote failure of $H_{0j}$. Accordingly the p value corresponding to each $t_j$ is given by

$$p_j = 2^{-n_j+1} \sum_{t=t_j}^{n_j} \binom{n_j}{t} \tag{8}$$

in such a way that $H_{0j}$ is rejected at level α when $p_j \le \alpha$. Since the test statistic $t_j$ is discrete, it has a finite number of available p values, usually referred to as natural p
values of the test. Actually, if $H_{0j}$ is rejected when $p_j \le \alpha$, the test is conservative, in the sense that the true level at which the test is performed coincides with the nearest natural p value smaller than or equal to α. By performing the randomization of the test, any α-level of interest could be achieved. However, as pointed out by Randles and Wolfe (1979), “this would not be a desirable practice”. It is also worth noting that the fraction $f_j = n_j^+/n_j$ constitutes an unbiased and consistent (as $n_j$ increases) estimator of $\pi_j$. Indeed, the sign test based on $t_j$ is equivalent to the test based on the statistic $|f_j - 0.5|$. Now the key problem is the assessment of the whole hypothesis $H_0$ at the same prefixed significance level α at which each $H_{0j}$ has been assessed. Westfall and Young (1993) investigate the use of the minimum p value, say

$$p = \min(p_1, \ldots, p_K) \tag{9}$$
as an overall test statistic to assess $H_0$. Subsequently, Pesarin (2001) proposes a more general procedure for multiple testing, considering a wide class of combining functions and referring to (9) as the Tippett combination algorithm. Accordingly, using the Tippett combination, the crucial point reduces to determining the distribution of the minimum p value under $H_0$. The analytical determination is prohibitive owing to the unknown dependence structure existing among the partial tests. Pesarin (2001, section 5.3) suggests the use of a permutation approach. The approach considers an equally likely random choice of the sign to be attributed to each difference vector $d_{Xi}$, in such a way that the random sign affects in the same way all the K differences related to the same animal, thus preserving their dependence relations. Also in this case, under $H_0$ there are $Q = 2^n$ possible sign choices with the same probability. Accordingly, denote by $t_{jv}^*$ the value of the sign test statistic adopted for assessing the partial hypothesis $H_{0j}$ computed on the v-th choice of signs, from which the corresponding p value, say $p_{jv}^*$, can be obtained by means of (8). Then, the sequence of minimum p values $p_v^* = \min(p_{1v}^*, \ldots, p_{Kv}^*)$ for $v = 1, \ldots, Q$ determines the permutation distribution of (9), from which the overall p value for assessing $H_0$ turns out to be

$$\tilde{p} = \frac{1}{Q} \sum_{v=1}^{Q} I(p \ge p_v^*)$$

where $I(\cdot)$ is equal to 1 if its argument is true and 0 otherwise. Accordingly, $H_0$ is rejected at level α if $\tilde{p} \le \alpha$. When Q is too large, $\tilde{p}$ can be approximated by using the same procedure performed on a random sample of q permutations out of Q.

4.3 Simulation studies

In order to check the performance of the procedure based on the combination of sign tests as well as to perform comparisons with the CODA-based procedure, two Monte Carlo studies were carried out. In the framework of second order selection, K = 5 habitat types were presumed with the same availability vectors a considered in the
previous simulations. Thus, the random vector $X_U$ was generated having a as the vector of expectations and medians of the $X_{Uj}$s, in such a way that both the RHU hypotheses $H_{X0}$ and $H_0$ were true. Since this feature cannot be ensured by Dirichlet distributions, $X_U$ was generated as $a + U$ where $U = [U_1, \ldots, U_K]^T$ was a random vector in which the first K − 1 components were independent Beta random variables symmetrically distributed in the range (−w, w) with

$$w = \min\left(a_1, \ldots, a_{K-1}, \frac{a_K}{K-1}\right) \tag{10}$$
and shape parameter β = 0.10, 0.25, 1, which constitutes an inverse index of variability, while the last component was given by $U_K = -(U_1 + \cdots + U_{K-1})$ (see “Appendix 3”). During the simulation, Beta random variables were generated using the function rbeta available in the stats package (version 2.12.1) of the R software (version 2.12.1). Then, a sample of n = 15 radio-collared animals was presumed and, for each of the five situations and for each value of β, 100,000 samples of size 15 were generated. For each sample, the likelihood ratio test statistic $-2 \ln \lambda$ was used (as in the previous simulation studies) to assess $H_{Y0}$ at the nominal levels α = 0.10, 0.05, 0.01 by means of both parametric and permutation procedures, and the probability of rejecting $H_{X0}$ was empirically determined as the fraction of times $H_{Y0}$ was rejected. At the same time, for each sample, the p values of the sign tests performed for each partial hypothesis $H_{0j}$ were computed by means of (8), together with the overall p value $\tilde{p}$ determined on the basis of a random sample of q = 1,000 permutations out of $Q = 2^{15}$.

A similar Monte Carlo study was repeated in the framework of third order selection. Once again, K = 5 habitat types were presumed to partition the study area and the vector $X_A$ was presumed to follow a Dirichlet distribution with parameter $\delta_A a$, where $\delta_A = 100$ and a varies in accordance with the five situations considered in the previous simulations, while the vector $X_U$ was obtained as $X_U = X_A + U$ where U was the vector of Beta variables adopted in the previous simulation with shape parameters β = 0.10, 0.25, 1. The only exception was the range of the Beta variables, which in this case was given by the random variable

$$W = \min\left(X_{U1}, \ldots, X_{U,K-1}, \frac{X_{UK}}{K-1}\right) \tag{11}$$
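As a rough illustration of the second order generation scheme built on equation (10), the following Python sketch draws use vectors $X_U = a + U$. It is not the authors' R code: the function names are hypothetical, and it assumes that a symmetric Beta variable with shape parameter β means a Beta(β, β) variable on (0, 1) rescaled to (−w, w).

```python
import numpy as np


def beta_range(a):
    """Half-width w of equation (10): min(a_1, ..., a_{K-1}, a_K / (K - 1))."""
    a = np.asarray(a, dtype=float)
    K = a.size
    return min(a[:-1].min(), a[-1] / (K - 1))


def simulate_second_order_use(a, beta, n, rng):
    """Draw n use vectors X_U = a + U under random habitat use (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    K = a.size
    w = beta_range(a)
    # Assumption: Beta(beta, beta) on (0, 1), rescaled to the symmetric range (-w, w).
    u = w * (2.0 * rng.beta(beta, beta, size=(n, K - 1)) - 1.0)
    # Last component closes the composition: U_K = -(U_1 + ... + U_{K-1}).
    u_last = -u.sum(axis=1, keepdims=True)
    return a + np.hstack([u, u_last])


rng = np.random.default_rng(0)
x_use = simulate_second_order_use([0.40, 0.25, 0.15, 0.10, 0.10], beta=0.25, n=15, rng=rng)
print(x_use.sum(axis=1))  # every row is a composition summing to one
```

The choice of w in (10) is what keeps every generated component non-negative: the first K − 1 components move by at most w, and the last one by at most (K − 1)w, which never exceeds $a_K$.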
As shown in “Appendix 3”, both the vectors XU and X A had a as the vector of expectations and each component XU − X A has median 0 in such a way that both the RHU hypotheses H X 0 and H0 were true. Then, a sample of n = 15 radio-collared animals was presumed and, for each of the five situations and for each value of β, 100,000 samples of size 15 were generated from the Dirichlet distribution with parameter 100a (availabilities) and coupled with samples of the same size generated by adding the U j s to the X A j s (uses). For each couple of samples, the likelihood ratio test statistic −2 ln λ was computed and the probability of rejecting H X 0 was empirically determined as the fraction of times HY 0 was rejected. Moreover, for each sample the p values of the sign tests performed for
Table 3 Type 1 error rates of the random habitat use hypotheses $H_{X0}$ and $H_0$ in the case of Johnson’s second order selection for the CODA-based parametric and permutation tests and for the combination of sign tests, in terms of habitat availability vector (a), variability index (β), unreliability measure and nominal type 1 error rates (α)

| a | β | Unreliability measure | Parametric α = 0.10 | Parametric α = 0.05 | Parametric α = 0.01 | Permutation α = 0.10 | Permutation α = 0.05 | Permutation α = 0.01 | Sign tests α = 0.10 | Sign tests α = 0.05 | Sign tests α = 0.01 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.20, 0.20, 0.20, 0.20, 0.20) | 1.00 | 0.05 | 0.20 | 0.12 | 0.04 | 0.10 | 0.05 | 0.01 | 0.03 | 0.03 | 0.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1.00 | 0.04 | 0.20 | 0.13 | 0.04 | 0.11 | 0.06 | 0.01 | 0.03 | 0.03 | 0.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 1.00 | 0.40 | 0.30 | 0.19 | 0.07 | 0.18 | 0.10 | 0.02 | 0.03 | 0.03 | 0.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 1.00 | 0.36 | 0.30 | 0.19 | 0.06 | 0.19 | 0.10 | 0.03 | 0.03 | 0.03 | 0.00 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 1.00 | 0.43 | 0.29 | 0.18 | 0.06 | 0.17 | 0.09 | 0.02 | 0.03 | 0.03 | 0.00 |
| (0.20, 0.20, 0.20, 0.20, 0.20) | 0.25 | 0.13 | 0.20 | 0.12 | 0.04 | 0.11 | 0.05 | 0.01 | 0.04 | 0.04 | 0.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.25 | 0.11 | 0.22 | 0.14 | 0.05 | 0.12 | 0.06 | 0.01 | 0.04 | 0.04 | 0.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.25 | 2.24 | 0.60 | 0.42 | 0.17 | 0.43 | 0.26 | 0.07 | 0.04 | 0.04 | 0.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.25 | 1.93 | 0.69 | 0.51 | 0.20 | 0.58 | 0.40 | 0.14 | 0.04 | 0.04 | 0.00 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 0.25 | 2.33 | 0.56 | 0.38 | 0.14 | 0.39 | 0.22 | 0.05 | 0.03 | 0.04 | 0.00 |
| (0.20, 0.20, 0.20, 0.20, 0.20) | 0.10 | 0.26 | 0.20 | 0.12 | 0.04 | 0.11 | 0.06 | 0.01 | 0.04 | 0.04 | 0.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.10 | 0.23 | 0.23 | 0.14 | 0.05 | 0.13 | 0.07 | 0.02 | 0.04 | 0.04 | 0.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.10 | 6.61 | 0.78 | 0.60 | 0.25 | 0.72 | 0.52 | 0.17 | 0.04 | 0.04 | 0.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.10 | 5.55 | 0.92 | 0.80 | 0.40 | 0.88 | 0.75 | 0.38 | 0.04 | 0.04 | 0.00 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 0.10 | 6.77 | 0.71 | 0.51 | 0.20 | 0.62 | 0.40 | 0.10 | 0.04 | 0.04 | 0.00 |
each partial hypothesis were computed together with the overall p value determined on the basis of a random sample of q = 1,000 permutations. For each combination of a and β, Tables 3 and 4 report the unreliability measure together with the rejection rate of (2) corresponding to the type 1 error rates α = 0.10, 0.05, 0.01 at which the assessment of (3) is performed for both parametric and permutation tests, as well as the rejection rate of (7) for the same type 1 error rates. As to the unreliability measure, since its analytical determination was prohibitive, it was empirically determined by the Monte Carlo counterpart of (4). Moreover, since the rejection rates of the partial hypotheses by means of the sign test turn out to be very similar to those of the overall hypothesis (with differences at the third decimal digit), they are omitted for brevity. While simulation results prove the adequacy of the procedure based on the combination of sign tests, they once again confirm the unreliability of the CODA-based procedure. Indeed: i) since the unreliability measure is invariably greater than zero, the CODA-based procedure shows rejection probabilities usually greater than the nominal levels, with discrepancies which tend to increase markedly with the unreliability measure; ii) the procedure based on the combination of sign tests turns out to be conservative, showing rejection rates for both overall and partial hypotheses invariably smaller than the nominal type 1 error rate; it is worth noting that the discrepancies between
Table 4 Type 1 error rates of the random habitat use hypotheses $H_{X0}$ and $H_0$ in the case of Johnson’s third order selection for the CODA-based parametric and permutation tests and for the combination of sign tests, in terms of expected habitat use and availability vector (a), variability index (β), unreliability measure and nominal type 1 error rates (α)

| a | β | Unreliability measure | Parametric α = 0.10 | Parametric α = 0.05 | Parametric α = 0.01 | Permutation α = 0.10 | Permutation α = 0.05 | Permutation α = 0.01 | Sign tests α = 0.10 | Sign tests α = 0.05 | Sign tests α = 0.01 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| (0.20, 0.20, 0.20, 0.20, 0.20) | 1.00 | 0.01 | 0.19 | 0.12 | 0.04 | 0.11 | 0.05 | 0.01 | 0.04 | 0.04 | 0.01 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1.00 | 0.01 | 0.20 | 0.12 | 0.04 | 0.11 | 0.06 | 0.01 | 0.04 | 0.04 | 0.01 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 1.00 | 0.24 | 0.25 | 0.15 | 0.05 | 0.15 | 0.08 | 0.02 | 0.04 | 0.04 | 0.01 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 1.00 | 0.18 | 0.23 | 0.14 | 0.04 | 0.14 | 0.07 | 0.02 | 0.04 | 0.04 | 0.01 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 1.00 | 0.27 | 0.25 | 0.14 | 0.04 | 0.15 | 0.08 | 0.02 | 0.04 | 0.04 | 0.01 |
| (0.20, 0.20, 0.20, 0.20, 0.20) | 0.25 | 0.12 | 0.20 | 0.12 | 0.04 | 0.11 | 0.06 | 0.01 | 0.04 | 0.04 | 0.01 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.25 | 0.12 | 0.22 | 0.14 | 0.04 | 0.13 | 0.06 | 0.02 | 0.04 | 0.04 | 0.01 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.25 | 1.04 | 0.36 | 0.23 | 0.07 | 0.24 | 0.13 | 0.03 | 0.03 | 0.03 | 0.01 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.25 | 0.70 | 0.34 | 0.21 | 0.06 | 0.25 | 0.14 | 0.04 | 0.03 | 0.03 | 0.01 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 0.25 | 1.24 | 0.38 | 0.23 | 0.06 | 0.26 | 0.14 | 0.03 | 0.04 | 0.04 | 0.01 |
| (0.20, 0.20, 0.20, 0.20, 0.20) | 0.10 | 0.24 | 0.20 | 0.12 | 0.04 | 0.11 | 0.05 | 0.01 | 0.03 | 0.03 | 0.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.10 | 0.22 | 0.22 | 0.14 | 0.04 | 0.13 | 0.07 | 0.01 | 0.04 | 0.04 | 0.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.10 | 2.66 | 0.43 | 0.27 | 0.09 | 0.29 | 0.16 | 0.04 | 0.03 | 0.03 | 0.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.10 | 1.68 | 0.43 | 0.26 | 0.07 | 0.31 | 0.18 | 0.05 | 0.04 | 0.04 | 0.00 |
| (0.70, 0.10, 0.10, 0.05, 0.05) | 0.10 | 3.33 | 0.47 | 0.29 | 0.09 | 0.31 | 0.17 | 0.04 | 0.03 | 0.03 | 0.00 |
nominal and actual levels are only due to the discrete nature of the sign test statistic; indeed, the whole simulation was repeated by using the randomized version of the sign test and the resulting rejection rates (rounded to the second decimal digit) turned out to be invariably equal to the nominal type 1 error rates. Once again, the whole simulation was repeated for n = 30 animals and K = 10 habitats. The simulation results, not reported for brevity, confirm the results of Tables 3 and 4, with the type 1 error rates of CODA-based tests increasing with the unreliability measure over the nominal rates, and those achieved by the combination of the sign tests remaining invariably smaller than the nominal levels. Tables of these results are available from the authors.

5 A tentative power comparison

Some problems arise in comparing the power of CODA-based tests versus the combination of sign tests. Obviously, any power investigation has a statistical meaning only when the actual type 1 error rate at which the test is performed is equal to or below the nominal level. As theoretically argued and empirically confirmed by simulations, the similarity between actual and nominal type 1 error rates does not generally occur for
CODA-based tests. As to the parametric version of the likelihood ratio test, the type 1 error rates are invariably greater than the nominal levels (see Tables 1, 2, 3, 4). That happens also when hypotheses (2) and (3) coincide, owing to the lack of multivariate normality and symmetry. On the other hand, as to the permutation solution of the likelihood ratio test, in second order selection similarity between actual and nominal type 1 error rates holds (at least approximately) when the availability vector a is equal to $K^{-1}\mathbf{1}$ and $X_U$ is a vector of identically distributed random variables, while in third order selection similarity holds when both vectors $X_A$ and $X_U$ are constituted by identically distributed random variables. Accordingly, power was investigated only in these situations and only for the CODA-based permutation test. For this purpose, in second order selection, the availability vector a was set equal to $K^{-1}\mathbf{1}$ while $X_U$ was generated from a Dirichlet distribution with parameter $\delta_U b$ with $b \neq K^{-1}\mathbf{1}$ and $\delta_U = 1, 10, 100$. As to third order selection, $X_A$ was generated from a Dirichlet distribution with parameter 100a while $X_U$ was once again generated from a Dirichlet distribution with parameter $\delta_U b$ with $b \neq K^{-1}\mathbf{1}$ and $\delta_U = 1, 10, 100$. In both cases, as b approaches $K^{-1}\mathbf{1}$, the powers of the CODA-based permutation test rightly approach the nominal type 1 error rates. As to the choice of b, it should be noticed that the variability parameter $\delta_U$ is likely to have a heavy impact on the power of the tests, i.e. in the presence of a small level of variability ($\delta_U = 100$) even a very small difference between a and b is likely to be detected. On the other hand, for greater variability levels the test may be unable to detect even larger differences. Thus, different choices of b are used as alternatives to $a = K^{-1}\mathbf{1}$ for different values of $\delta_U$, in such a way as to give a range of powers varying from 0 to 1.
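To see how $\delta_U$ acts as an inverse variability index in these alternatives, the following Python sketch draws use vectors from a Dirichlet distribution with parameter $\delta_U b$. It is an illustration only, not the simulation code used for the paper; the function names are hypothetical. It relies on the standard Dirichlet moments $E(X_{Uj}) = b_j$ and $\mathrm{Var}(X_{Uj}) = b_j(1 - b_j)/(\delta_U + 1)$ when the components of b sum to one.

```python
import numpy as np


def simulate_dirichlet_use(b, delta_u, n, rng):
    """Draw n use vectors from a Dirichlet distribution with parameter delta_u * b."""
    b = np.asarray(b, dtype=float)
    return rng.dirichlet(delta_u * b, size=n)


def dirichlet_sd(b, delta_u):
    """Componentwise standard deviation sqrt(b_j (1 - b_j) / (delta_u + 1))."""
    b = np.asarray(b, dtype=float)
    return np.sqrt(b * (1.0 - b) / (delta_u + 1.0))


rng = np.random.default_rng(1)
b = [0.25, 0.25, 0.20, 0.15, 0.15]  # one of the alternatives to a = (0.20, ..., 0.20)
for delta_u in (1, 10, 100):
    x_use = simulate_dirichlet_use(b, delta_u, n=15, rng=rng)
    # Empirical and theoretical spread shrink together as delta_u grows.
    print(delta_u, x_use.std(axis=0).round(3), dirichlet_sd(b, delta_u).round(3))
```

With $\delta_U = 100$ the spread around b is small, so a modest shift away from the even partition is already detectable, whereas with $\delta_U = 1$ much larger shifts are needed.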
Then, K = 5, 10 habitats and samples of n = 15, 30 radio-collared animals were considered, and for each b and for each value of $\delta_U$, 10,000 samples were generated from the corresponding alternative situation. For each sample, the likelihood ratio test statistic was computed and hypothesis $H_{Y0}$ was assessed at the nominal levels α = 0.05, 0.01 by means of the permutation test. Thus, the probability of rejecting $H_{Y0}$ (which in these cases coincided with $H_{X0}$) was empirically determined as the fraction of times $H_{Y0}$ was rejected. For K = 5, n = 15, 30 and for each combination of b and $\delta_U$, Tables 5 and 6 report the power of the CODA-based permutation test for type 1 error rates α = 0.05, 0.01. Results for K = 10 are quite similar to those of Tables 5 and 6 and are not reported for brevity. Tables of these results are available from the authors.

The simulation study performed for assessing the power of the CODA-based permutation test cannot be used to investigate the power of the combination of the sign tests, because hypothesis (7) does not hold when $a = K^{-1}\mathbf{1}$ and $X_U$ has a Dirichlet distribution with $E(X_U) = K^{-1}\mathbf{1}$, or when both $X_A$ and $X_U$ have Dirichlet distributions with expected vectors $K^{-1}\mathbf{1}$. Accordingly, the power of the combination of the sign tests was checked in situations analogous, even if not equal, to those considered for the CODA-based permutation test. For this purpose, in second order selection a was set equal to $K^{-1}\mathbf{1}$ and $X_U$ was generated as $X_U = b + U$ where $b \neq K^{-1}\mathbf{1}$ and U is a random vector in which the first K − 1 components were independent Beta random variables symmetrically distributed in the range (−w, w) with shape parameter β = 0.10, 0.25, 1, where w is a constant given by expression (10) and $U_K = -(U_1 + \cdots + U_{K-1})$. Similarly, in the third order selection, $X_A$ was generated from a Dirichlet distribution with parameter 100a and $X_U$ was generated as Z + U
Table 5 Power of the CODA-based permutation tests for assessing the hypothesis of random habitat use $H_{X0}$ in the case of Johnson’s second order selection, in terms of expected habitat use vector (b), variability index ($\delta_U$) and nominal type 1 error rates (α), for K = 5 habitats, 15 and 30 radio-collared animals, when the habitat availability vector is $a = [0.20, 0.20, 0.20, 0.20, 0.20]^T$

| b | $\delta_U$ | α = 0.01, n = 15 | α = 0.01, n = 30 | α = 0.05, n = 15 | α = 0.05, n = 30 |
|---|---|---|---|---|---|
| (0.22, 0.22, 0.20, 0.18, 0.18) | 100 | 0.29 | 0.86 | 0.60 | 0.97 |
| (0.23, 0.23, 0.20, 0.17, 0.17) | 100 | 0.73 | 1.00 | 0.94 | 1.00 |
| (0.24, 0.24, 0.20, 0.16, 0.16) | 100 | 0.96 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 100 | 1.00 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 10 | 0.21 | 0.73 | 0.50 | 0.91 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 10 | 0.49 | 0.98 | 0.80 | 1.00 |
| (0.35, 0.25, 0.15, 0.15, 0.10) | 10 | 0.86 | 1.00 | 0.98 | 1.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 10 | 0.98 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1 | 0.08 | 0.25 | 0.23 | 0.49 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 1 | 0.16 | 0.54 | 0.39 | 0.77 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 1 | 0.62 | 0.99 | 0.87 | 1.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 1 | 0.97 | 1.00 | 1.00 | 1.00 |
Table 6 Power of the CODA-based permutation tests for assessing the hypothesis of random habitat use $H_{X0}$ in the case of Johnson’s third order selection, in terms of expected habitat use vector (b), variability index ($\delta_U$) and nominal type 1 error rates (α), for K = 5 habitats, 15 and 30 radio-collared animals, when the expected habitat availability vector is $a = [0.20, 0.20, 0.20, 0.20, 0.20]^T$

| b | $\delta_U$ | α = 0.01, n = 15 | α = 0.01, n = 30 | α = 0.05, n = 15 | α = 0.05, n = 30 |
|---|---|---|---|---|---|
| (0.22, 0.22, 0.20, 0.18, 0.18) | 100 | 0.11 | 0.45 | 0.33 | 0.73 |
| (0.23, 0.23, 0.20, 0.17, 0.17) | 100 | 0.34 | 0.91 | 0.67 | 0.98 |
| (0.24, 0.24, 0.20, 0.16, 0.16) | 100 | 0.66 | 1.00 | 0.91 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 100 | 0.89 | 1.00 | 0.99 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 10 | 0.19 | 0.68 | 0.46 | 0.89 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 10 | 0.45 | 0.97 | 0.77 | 1.00 |
| (0.35, 0.25, 0.15, 0.15, 0.10) | 10 | 0.82 | 1.00 | 0.98 | 1.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 10 | 0.97 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1 | 0.08 | 0.24 | 0.23 | 0.48 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 1 | 0.16 | 0.53 | 0.40 | 0.77 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 1 | 0.61 | 0.99 | 0.87 | 1.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 1 | 0.97 | 1.00 | 1.00 | 1.00 |
where Z had a Dirichlet distribution with parameter 100b and U was a random vector of independent Beta random variables symmetrically distributed in the range (−W, W) with shape parameter β = 0.10, 0.25, 1, where W is a random variable given by expression (11). In both cases, as b approaches $K^{-1}\mathbf{1}$, the powers of the combination of the sign tests are invariably below the type 1 error rates. As to the choice of b,
Table 7 Power of the combination of the sign tests for assessing the hypothesis of random habitat use $H_0$ in the case of Johnson’s second order selection, in terms of expected habitat use vector (b), variability index (β) and nominal type 1 error rates (α), for K = 5 habitats, 15 and 30 radio-collared animals, when the habitat availability vector is $a = [0.20, 0.20, 0.20, 0.20, 0.20]^T$

| b | β | α = 0.01, n = 15 | α = 0.01, n = 30 | α = 0.05, n = 15 | α = 0.05, n = 30 |
|---|---|---|---|---|---|
| (0.22, 0.22, 0.20, 0.18, 0.18) | 1.00 | 0.08 | 0.29 | 0.29 | 0.53 |
| (0.23, 0.23, 0.20, 0.17, 0.17) | 1.00 | 0.27 | 0.77 | 0.62 | 0.93 |
| (0.24, 0.24, 0.20, 0.16, 0.16) | 1.00 | 0.60 | 0.99 | 0.90 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1.00 | 0.91 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.25 | 0.11 | 0.40 | 0.36 | 0.65 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 0.25 | 0.15 | 0.52 | 0.42 | 0.74 |
| (0.35, 0.25, 0.15, 0.15, 0.10) | 0.25 | 1.00 | 1.00 | 1.00 | 1.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.25 | 1.00 | 1.00 | 1.00 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.10 | 0.04 | 0.12 | 0.15 | 0.27 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 0.10 | 0.07 | 0.27 | 0.23 | 0.46 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.10 | 0.80 | 0.99 | 0.93 | 1.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.10 | 1.00 | 1.00 | 1.00 | 1.00 |
the same vectors adopted in the power study of the CODA-based permutation test were adopted as alternatives to $a = K^{-1}\mathbf{1}$. Then, K = 5, 10 habitats and samples of n = 15, 30 radio-collared animals were considered, and for each b and for each value of β, 10,000 samples were generated from the corresponding alternative situation. For each sample, the combination of the sign tests was performed and hypothesis $H_0$ was assessed at the nominal levels α = 0.05, 0.01, in such a way that the probability of rejecting $H_0$ was empirically determined as the fraction of times $H_0$ was rejected. For K = 5, n = 15, 30 and for each combination of b and β, Tables 7 and 8 report the power of the combination of the sign tests for type 1 error rates α = 0.05, 0.01. Results for K = 10 are quite similar to those of Tables 7 and 8 and are not reported for brevity. Tables of these results are available from the authors. It should be noticed that this simulation study cannot be used to investigate the power of the CODA-based permutation test. Indeed, even if hypotheses (2) and (7) are both true when $a = K^{-1}\mathbf{1}$ and $X_U = a + U$, or when $X_A$ has a Dirichlet distribution with expected vector $K^{-1}\mathbf{1}$ and $X_U = X_A + U$, hypothesis (3) does not hold in these cases owing to the log-ratio transformation of the data. Accordingly, the power comparison of the CODA-based permutation test versus the combination of the sign tests is attempted by comparing the results of Tables 5 and 7 and those of Tables 6 and 8, which are achieved under different but analogous situations. Results of these tables show that i) the likelihood ratio test seems more powerful than the combination of the sign tests for detecting differences between uses and availabilities under an even partition of the availabilities; ii) the power of both tests quickly approaches one as uses diverge from the even partition of availabilities and as n increases.
Table 8 Power of the combination of the sign tests for assessing the hypothesis of random habitat use $H_0$ in the case of Johnson’s third order selection, in terms of expected habitat use vector (b), variability index (β) and nominal type 1 error rates (α), for K = 5 habitats, 15 and 30 radio-collared animals, when the expected habitat availability vector is $a = [0.20, 0.20, 0.20, 0.20, 0.20]^T$

| b | β | α = 0.01, n = 15 | α = 0.01, n = 30 | α = 0.05, n = 15 | α = 0.05, n = 30 |
|---|---|---|---|---|---|
| (0.22, 0.22, 0.20, 0.18, 0.18) | 1.00 | 0.02 | 0.08 | 0.12 | 0.20 |
| (0.23, 0.23, 0.20, 0.17, 0.17) | 1.00 | 0.07 | 0.26 | 0.27 | 0.50 |
| (0.24, 0.24, 0.20, 0.16, 0.16) | 1.00 | 0.16 | 0.56 | 0.47 | 0.80 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 1.00 | 0.31 | 0.83 | 0.68 | 0.95 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.25 | 0.18 | 0.59 | 0.49 | 0.82 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 0.25 | 0.21 | 0.65 | 0.55 | 0.86 |
| (0.35, 0.25, 0.15, 0.15, 0.10) | 0.25 | 0.59 | 0.98 | 0.89 | 1.00 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.25 | 0.85 | 1.00 | 0.98 | 1.00 |
| (0.25, 0.25, 0.20, 0.15, 0.15) | 0.10 | 0.13 | 0.47 | 0.40 | 0.71 |
| (0.30, 0.25, 0.15, 0.15, 0.15) | 0.10 | 0.14 | 0.49 | 0.42 | 0.74 |
| (0.40, 0.25, 0.15, 0.10, 0.10) | 0.10 | 0.74 | 1.00 | 0.95 | 1.00 |
| (0.60, 0.10, 0.10, 0.10, 0.10) | 0.10 | 1.00 | 1.00 | 1.00 | 1.00 |
Result ii) empirically confirms the consistency of the combination of the sign tests, which can be theoretically proven from the well-known consistency of any partial sign test joined with Theorem 2 by Pesarin (1992) regarding the combination of consistent tests. In addition to consistency, the combination of the sign test has the merit to hold in very general situations besides the particular case of equal availabilities of habitats, while the CODA-based permutation test can be rigorously adopted only in this situation.

6 Ordering habitat by use

When the hypothesis of proportional habitat use is rejected, Aebischer et al. (1993) propose a next step for ranking the habitat types in order of use. Even if not explicitly mentioned, the ranking criterion adopted by the authors is based on the number of times, say $\tau_j$, in which $E[\ln(X_{Uj}/X_{Uh})]$ turns out to be greater than $E[\ln(X_{Aj}/X_{Ah})]$ for $h \neq j = 1, \ldots, K$. The $\tau_j$s are integers between 0 and K − 1 that should rank the habitats in order of what the authors call the increasing relative use, where 0 is the worst and K − 1 is the best. As these quantities are actually unknown, the ranking is based on their sample counterparts, say $r_j$, i.e. the number of times in which $\bar{y}_{U(j/h)} = \frac{1}{n}\sum_{i=1}^{n} \ln(x_{Uji}/x_{Uhi})$ turns out to be greater than $\bar{y}_{A(j/h)} = \frac{1}{n}\sum_{i=1}^{n} \ln(x_{Aji}/x_{Ahi})$. Unfortunately, the ranking procedure suffers from the same drawbacks pointed out for the CODA-based assessment of RHU. Indeed, owing to the lack of nice results about expectations of ratios and logarithms, inconsistent ranking may take place when comparing $E[\ln(X_{Uj}/X_{Uh})]$ versus $E[\ln(X_{Aj}/X_{Ah})]$ rather than $E(X_{Uj}/X_{Uh})$ versus $E(X_{Aj}/X_{Ah})$ or $E(X_{Uj})/E(X_{Uh})$ versus $E(X_{Aj})/E(X_{Ah})$.
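The sample ranking $r_j$ described above can be sketched in a few lines of Python. This is only an illustration of the counting rule as stated in this section, not the code of Aebischer et al. (1993); the function name is hypothetical and the data are made up. The sketch requires strictly positive proportions, since a zero makes the logarithm undefined, which is precisely the weakness of the log-ratio approach.

```python
import numpy as np


def log_ratio_ranks(x_use, x_avail):
    """r_j = number of habitats h != j whose mean log-ratio of use, ln(xU_j / xU_h),
    exceeds the corresponding mean log-ratio of availability, ln(xA_j / xA_h).

    x_use   : (n, K) array of used proportions, all strictly positive
    x_avail : (n, K) array, or a (K,) vector for second order selection
    """
    log_u = np.log(np.asarray(x_use, dtype=float))
    log_a = np.log(np.asarray(x_avail, dtype=float))
    # The difference of the two mean log-ratios for the pair (j, h) equals d[j] - d[h],
    # where d[j] is the mean over animals of ln(xU_j) - ln(xA_j).
    d = (log_u - log_a).mean(axis=0)
    diff = d[:, None] - d[None, :]
    return (diff > 0).sum(axis=1)


# Made-up example: two animals, three equally available habitats.
x_use = [[0.5, 0.3, 0.2],
         [0.6, 0.2, 0.2]]
a = [1 / 3, 1 / 3, 1 / 3]
print(log_ratio_ranks(x_use, a))  # ranks from 0 (worst) to K - 1 (best)
```

Because the pairwise comparison reduces to comparing the $d_j$s, the resulting ranks depend on means of logarithms only, which is why they need not agree with a ranking based on the untransformed proportions.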
Moreover, like the CODA-based assessment, the procedure suffers from the presence of zeros, which must be substituted by arbitrary small values or discarded. Once again a simple alternative solution can be found with the untransformed data, ranking habitats in accordance with the $\pi_j$s. If the RHU hypothesis $H_0$ is not rejected, no other assessment is performed, since none of the $\pi_j$s differs significantly from 0.5. On the other hand, if $\tilde{p} \le \alpha$ and $H_0$ is rejected, the p values of the partial hypotheses $H_{0j}$ are considered, in such a way that the whole set of K habitat types is partitioned into three disjoint sets: the set of habitat types for which $p_j \le \alpha$ and $f_j > 0.5$, which will be referred to as the set of preferred habitats or P-habitats; the set of habitat types for which $p_j \le \alpha$ and $f_j < 0.5$, which will be referred to as the set of avoided habitats or A-habitats; and the set for which $p_j > \alpha$, which will be referred to as the randomly used habitats or R-habitats. Practically speaking, the partition induces a sort of habitat ordering based on the $\pi_j$s, i.e. the P-habitats having $\pi_j$s greater than 0.5, the R-habitats having $\pi_j$s all equal to 0.5 and the A-habitats having $\pi_j$s smaller than 0.5. Since no ordering is necessary within the R-habitats, a further, less formal ordering is suitable only within P- and A-habitats, conditional on the partition achieved by the assessment of $H_0$ and without adjusting the significance level for multiple testing. The ordering can be performed by assessing the hypothesis $H_{jh} : \pi_j = \pi_h$ for each $h \neq j$ in the P- and A-sets by means of the test statistic $t_{jh} = |f_j - f_h|$. Once again, the p value corresponding to $t_{jh}$, say $p_{jh}$, can be determined by using the permutations of sample data already adopted to determine $\tilde{p}$, as the fraction of permutations giving rise to a test statistic greater than $t_{jh}$. If $p_{jh} \le \alpha$ and $f_j > f_h$, then habitat j has a higher rank than h among P- or A-habitats; the opposite holds if $f_j < f_h$.
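The partition into P-, A- and R-habitats can be sketched as follows in Python. The sketch is illustrative only: the function names and the counts are hypothetical, and the partition is meant to be applied after the overall hypothesis has been rejected ($\tilde{p} \le \alpha$). One assumption is made explicit in the code: the value of (8) is capped at 1, because for even $n_j$ and $n_j^+ = n_j/2$ the formula counts the central binomial term twice.

```python
from math import comb


def sign_test_p_value(n_plus, n_j):
    """Two-sided sign test p value of equation (8).

    Assumption: the result is capped at 1, since (8) exceeds 1 when n_j is even
    and n_plus equals n_j / 2.
    """
    t_j = max(n_plus, n_j - n_plus)
    tail = sum(comb(n_j, t) for t in range(t_j, n_j + 1))
    return min(1.0, 2.0 ** (-n_j + 1) * tail)


def classify_habitats(n_plus, n_avail, alpha):
    """Label each habitat 'P' (preferred), 'A' (avoided) or 'R' (randomly used)."""
    labels = []
    for n_p, n_j in zip(n_plus, n_avail):
        p_j = sign_test_p_value(n_p, n_j)
        f_j = n_p / n_j  # estimate of pi_j
        if p_j > alpha:
            labels.append("R")
        elif f_j > 0.5:
            labels.append("P")
        else:
            labels.append("A")
    return labels


# Hypothetical counts of positive differences for K = 5 habitats and n = 13 animals.
n_plus = [13, 7, 1, 6, 12]
print(classify_habitats(n_plus, [13] * 5, alpha=0.05))
```

Only the counts $n_j^+$ and $n_j$ enter the partition, so habitats that are unavailable to some animals are handled simply by letting $n_j$ vary across habitats.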

7 Some case studies

The procedure based on the combination of sign tests was adopted to assess habitat selection on the data sets from Aebischer et al. (1993, Appendix 1) related to thirteen radio-tagged Ring-necked Pheasants (Phasianus colchicus) and seventeen radio-tagged Gray Squirrels (Sciurus carolinensis), in such a way as to compare the results with those achieved by the CODA-based procedure. For both case studies, assessments were performed at type 1 error rate α = 0.05, thus rejecting random use only in the presence of strong empirical evidence. Computations were performed using code in Fortran 77 compiled with Fortran Power Station 4.0.

7.1 Ring-necked pheasants

Habitat types for pheasants were: scrub, broadleaved woodland, coniferous woodland, grassland, cropland. The comparison of PAHRs versus available area rejected the RHU hypothesis $H_0$ with an overall p value $\tilde{p} = 0.001$ determined on the basis of all the possible $2^{13} = 8,192$ permutations of sample data. Scrub was preferred while the remaining habitats were used at random (see Table 9). The CODA-based procedure rejected $H_{Y0}$ (permutation p value achieved by q = 999 permutations smaller than 0.001) without identifying the habitats responsible for the rejection, while habitat ordering gave rise to: scrub = broadleaf > conifer = grassland > cropland (Aebischer et al. 1993).
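For a sample as small as thirteen animals the permutation distribution can be enumerated completely. The following Python sketch shows one way to do this for the Tippett combination of sign tests; it is not the authors' Fortran code, the function names are hypothetical, and the example data are made up. Unavailable habitats are assumed to be coded as NaN, and ties ($d_{Xji} = 0$ for an available habitat) are assumed not to occur, in line with (5) and (6).

```python
from itertools import product
from math import comb

import numpy as np


def sign_test_p_value(n_plus, n_j):
    """Two-sided sign test p value of equation (8), capped at 1 (assumption)."""
    t_j = max(n_plus, n_j - n_plus)
    tail = sum(comb(n_j, t) for t in range(t_j, n_j + 1))
    return min(1.0, 2.0 ** (-n_j + 1) * tail)


def min_p_value(d):
    """Statistic (9): smallest sign test p value over the K habitats.

    d is an (n, K) array of differences; NaN marks an unavailable habitat.
    """
    p_values = []
    for col in d.T:
        col = col[~np.isnan(col)]
        p_values.append(sign_test_p_value(int((col > 0).sum()), col.size))
    return min(p_values)


def overall_p_value(d):
    """Exact permutation p value: fraction of the 2^n sign choices whose
    minimum p value does not exceed the observed one."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    p_obs = min_p_value(d)
    count = 0
    for signs in product((1.0, -1.0), repeat=n):
        # One sign per animal, applied to all K differences of that animal.
        flipped = d * np.array(signs)[:, None]
        if min_p_value(flipped) <= p_obs + 1e-12:
            count += 1
    return count / 2 ** n


# Made-up differences for n = 5 animals and K = 2 habitats.
d = [[0.1, -0.1]] * 5
print(overall_p_value(d))
```

For larger samples the loop over all $2^n$ sign vectors can be replaced by a random sample of q sign vectors, as described in Section 4.2.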
Table 9 Combination of the sign tests for the assessment of random habitat use from the sample of 13 radio-tagged pheasants in Lion Estate (County Kildare, Ireland)
[Table body not recoverable from the extracted text]
Values in parentheses represent the $f_j$s, diagonal values represent the p values for each individual hypothesis, non-diagonal values represent p values for paired comparisons among habitat types; significant results are marked in italics
The comparison of PATs versus PAHRs rejected the RHU hypothesis $H_0$ with an overall p value $\tilde{p} = 0.000$. Scrub and broadleaf were preferred, grassland was avoided and coniferous and cropland were used at random. The comparison of scrub versus broadleaf was not significant, so that the ordering was: scrub = broadleaves > coniferous = cropland > grassland (see Table 9). As to the CODA-based procedure, in order to avoid zeros the analysis was carried out on three habitat types always available for twelve individuals. The procedure rejected $H_{Y0}$ (permutation p value equal to 0.003) without identifying the habitats responsible for the rejection, while the habitat ordering gave rise to: scrub = broadleaf > grassland (Aebischer et al. 1993).

7.2 Gray squirrels

As to the squirrel study, habitat types were: young beech and spruce plantation, Thuja plantation, larch plantation, mature deciduous woodland, open ground. The comparison of PAHRs versus available area rejected the RHU hypothesis $H_0$ with an overall p value $\tilde{p} = 0.000$ determined on the basis of all the possible $2^{17} = 131,072$ permutations of sample data. Larch and mature were preferred, Thuja and open were avoided, and young was used at random. Since the comparisons of larch versus mature and Thuja versus open were not significant, the ordering was: larch = mature > young > Thuja = open (see Table 10). The CODA-based procedure rejected $H_{Y0}$ (permutation p value smaller than 0.001) without identifying the habitats responsible for the rejection, but giving the same ordering achieved from the procedure of Sect. 6 (Aebischer et al. 1993). As to the comparison of PATs versus PAHRs, Thuja plantation was available for two individuals only. Since a sign test performed on a sample of size 2 had poor statistical
Table 10 Combination of the sign tests for the assessment of random habitat use from the sample of 17 radio-tagged squirrels in Elton Estate (Northamptonshire, UK)
Values in parentheses represent the f_j's; diagonal values represent the p values for each individual hypothesis; off-diagonal values represent the p values for paired comparisons among habitat types; significant results are marked in italics
sense, the habitat was excluded from the analysis. The H0 hypothesis was rejected with an overall p value p̃ = 0.002. Mature was preferred, young and larch were avoided, and open was selected at random. The comparison between young and larch was not significant, so that the ordering was: mature > open > young = larch (see Table 10). Also in the CODA-based procedure, Thuja plantation was not considered, and a modification of the procedure was performed in order to handle the presence of zeros, which occurred in ten animals out of seventeen. The procedure rejected H0Y (permutation p value equal to 0.012) without identifying the habitats responsible for the rejection, while the habitat ordering gave rise to inconsistent ranking results (Aebischer et al. 1993).

8 Discussion

As opposed to the CODA-based procedure, the procedure based on the combination of sign tests assesses the RHU hypothesis without requiring unrealistic assumptions (such as multivariate normality or symmetry) about sample observations. The combination of sign tests is able to handle the presence of zeros in both availability and use data, without involving arbitrary reconstructions of sample data. Interestingly, the use of multiple testing makes it possible to reject the overall hypothesis of RHU at a pre-fixed significance level while also determining, at the same significance level, which habitat types are responsible for the rejection. Simulation studies show that the actual significance levels of this test are invariably near the nominal levels, with negligible discrepancies which are only due to the discrete nature of the sign test statistic, while the power of the test quickly approaches one as availabilities diverge from uses. On the other hand, in most
situations the CODA-based procedure shows actual rejection rates which turn out to be much greater than the nominal level. At the end of the proposed procedure, the set of habitat types is partitioned into preferred habitats, avoided habitats and randomly used habitats. Further orderings among preferred and avoided habitats are attempted, albeit in a less formal way.

Acknowledgments The authors thank Luca Pratelli for his helpful suggestions on the theoretical aspects of the work.
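The combination of sign tests summarised in the Discussion can be illustrated with a short sketch. It is not the authors' implementation. The per-habitat p values come from a two-sided exact sign test on the use-minus-availability differences. They are combined here with Fisher's function, minus twice the sum of the log p values, which is an assumption because the combining function is not restated in this section. The overall p value is obtained by exchanging use and availability within each animal in all 2^n possible ways, in the spirit of the 2^17 permutations mentioned for the squirrel data. All function names and the toy data are hypothetical, and habitats unavailable to some animals are not handled.

```python
import itertools
import math


def sign_test_p(diffs):
    """Two-sided exact sign test p value; zero differences are dropped."""
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def combined_statistic(use, avail):
    """Fisher combination (assumed here) of the per-habitat sign test p values."""
    n_habitats = len(use[0])
    stat = 0.0
    for j in range(n_habitats):
        diffs = [u[j] - a[j] for u, a in zip(use, avail)]
        stat += -2.0 * math.log(sign_test_p(diffs))
    return stat


def overall_p_value(use, avail):
    """Exact permutation p value over all 2^n within-animal exchanges."""
    observed = combined_statistic(use, avail)
    n = len(use)
    hits = 0
    for swap in itertools.product((False, True), repeat=n):
        # For each animal, optionally exchange its use and availability rows.
        perm_use = [a if s else u for u, a, s in zip(use, avail, swap)]
        perm_avail = [u if s else a for u, a, s in zip(use, avail, swap)]
        if combined_statistic(perm_use, perm_avail) >= observed - 1e-12:
            hits += 1
    return hits / 2 ** n


# Toy data (hypothetical): 5 animals, 3 habitats, each row sums to one.
use = [(0.6, 0.3, 0.1)] * 5
avail = [(0.4, 0.3, 0.3)] * 5
print(overall_p_value(use, avail))
```

Full enumeration is only practical for small samples; for larger n a random subset of exchanges would be the usual substitute.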
Appendix 1: Different expressions for the hypothesis of random habitat use

The RHU hypothesis (2) actually constitutes a multivariate hypothesis which can be rewritten as

$$H_0^X : \bigcap_{j=1}^{K} \left\{ E(X_{Uj} - X_{Aj}) = 0 \right\} \tag{12}$$

where $E(X_{Uj} - X_{Aj}) = 0$ is the univariate hypothesis that the expected use of habitat $j$ coincides with its expected (or constant) availability. The obvious sense of (12) is that $H_0^X$ is true if all the univariate hypotheses are true. In turn, once a reference habitat $h$ is chosen, (12) is equivalent to

$$H_0^X : \bigcap_{j \neq h = 1}^{K} \left\{ \frac{E(X_{Uj})}{E(X_{Uh})} = \frac{E(X_{Aj})}{E(X_{Ah})} \right\} \tag{13}$$
Indeed, if (2) is true, then for any habitat $j$ it follows from (12) that $E(X_{Uj}) = E(X_{Aj})$, from which $E(X_{Uj})/E(X_{Aj}) = 1$. Accordingly, for the reference habitat $h$ and for each $j \neq h$, it follows that $E(X_{Uj})/E(X_{Aj}) = E(X_{Uh})/E(X_{Ah})$, from which $E(X_{Uj})/E(X_{Uh}) = E(X_{Aj})/E(X_{Ah})$. As to the reverse, if (13) is true, then for the reference habitat $h$ and for each $j \neq h$ it holds that $E(X_{Uj})/E(X_{Uh}) = E(X_{Aj})/E(X_{Ah})$ or equivalently $E(X_{Uj})/E(X_{Aj}) = E(X_{Uh})/E(X_{Ah})$, i.e. for each $j = 1, \ldots, K$ it holds that $E(X_{Uj})/E(X_{Aj}) = c$ or equivalently $E(X_{Uj}) = c\,E(X_{Aj})$. But since $\sum_{j=1}^{K} E(X_{Uj}) = \sum_{j=1}^{K} E(X_{Aj}) = 1$, then $c = 1$, which obviously implies (2).

In a similar way, once a reference habitat $h$ is chosen, (3) constitutes a multivariate hypothesis which is equivalent to

$$H_0^Y : \bigcap_{j \neq h = 1}^{K} \left\{ E(Y_{Uj} - Y_{Aj}) = 0 \right\}$$

or, more explicitly, to

$$H_0^Y : \bigcap_{j \neq h = 1}^{K} \left\{ E\left[\ln\frac{X_{Uj}}{X_{Uh}}\right] = E\left[\ln\frac{X_{Aj}}{X_{Ah}}\right] \right\} \tag{14}$$
From (13) and (14), it is at once apparent that (3) is equivalent to (2) if

$$\ln\frac{E(X_{Uj})}{E(X_{Uh})} - \ln\frac{E(X_{Aj})}{E(X_{Ah})} = E\left[\ln\frac{X_{Uj}}{X_{Uh}}\right] - E\left[\ln\frac{X_{Aj}}{X_{Ah}}\right] = 0 \tag{15}$$
for each $j \neq h = 1, \ldots, K$. Since $E\{\ln(X)\}$ generally differs from $\ln E(X)$, relation (15) does not generally hold.

Appendix 2: Dirichlet distributions and log-ratio transforms

The Dirichlet distribution is probably the most familiar model adopted for positive random vectors $\mathbf{X} = [X_1, \ldots, X_K]^T$ subject to the constraint $\mathbf{1}^T\mathbf{X} = 1$. A $K$-variate random vector $\mathbf{X}$ is said to have a Dirichlet distribution with parameters $\delta > 0$ and $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_K]^T$ with $\theta_j > 0$ for each $j = 1, \ldots, K$ if the joint probability density function at $\mathbf{x} = [x_1, \ldots, x_K]^T$ with $\mathbf{1}^T\mathbf{x} = 1$ is given by

$$f(\mathbf{x}) = \frac{\Gamma(\delta\theta)}{\prod_{j=1}^{K}\Gamma(\delta\theta_j)} \prod_{j=1}^{K} x_j^{\delta\theta_j - 1}$$
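where $\theta = \mathbf{1}^T\boldsymbol{\theta}$. As is well known (e.g. Fang et al. 1990), each marginal variable $X_j$ has a beta distribution on [0,1] with shape parameters $\delta\theta_j$ and $\delta(\theta - \theta_j)$ in such a way that

$$E(X_j) = \frac{\theta_j}{\theta} \quad \text{and} \quad V(X_j) = \frac{\theta_j(\theta - \theta_j)}{\theta^2(\delta\theta + 1)}.$$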
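The two moment formulas can be checked numerically. The sketch below is an illustration only: it draws Dirichlet vectors by normalising independent gamma variates with shapes δθ_j (a standard construction that the paper does not describe) and compares the empirical mean and variance of one component with the closed-form expressions. The function names, the seed and the parameter values are hypothetical.

```python
import random


def dirichlet_sample(delta, theta, rng):
    """One Dirichlet draw with shapes delta * theta_j, via normalised gammas."""
    g = [rng.gammavariate(delta * t, 1.0) for t in theta]
    s = sum(g)
    return [v / s for v in g]


def theoretical_moments(delta, theta, j):
    """Closed-form mean and variance of the j-th component."""
    tot = sum(theta)
    mean = theta[j] / tot
    var = theta[j] * (tot - theta[j]) / (tot ** 2 * (delta * tot + 1.0))
    return mean, var


def empirical_moments(delta, theta, j, n_draws=20000, seed=1):
    """Sample mean and (unbiased) sample variance of the j-th component."""
    rng = random.Random(seed)
    xs = [dirichlet_sample(delta, theta, rng)[j] for _ in range(n_draws)]
    mean = sum(xs) / n_draws
    var = sum((x - mean) ** 2 for x in xs) / (n_draws - 1)
    return mean, var


theta = [0.5, 0.3, 0.2]  # hypothetical habitat portions
for delta in (2.0, 10.0, 50.0):
    print(delta, theoretical_moments(delta, theta, 0), empirical_moments(delta, theta, 0))
```

Running the loop for several values of δ shows the mean staying put while the variance shrinks as δ grows.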
Accordingly, marginal expectations do not depend on $\delta$ and marginal variances increase as $\delta$ decreases. In the framework of habitat selection analysis, $\delta$ obviously accounts for the variability of the portions of animal trajectories or home ranges within habitat types. However, when these quantities are estimated in the field by means of the animals' radio locations, $\delta$ also accounts for the number of radio locations adopted in the study, since marginal variances decrease as the $r_i$s increase and estimates become close to the real values.

If $\mathbf{X}$ has a Dirichlet distribution with parameters $\delta$ and $\boldsymbol{\theta}$, the log-ratio transform $\mathbf{Y} = \mathrm{lr}_h(\mathbf{X})$ is a random vector on $\mathbb{R}^{K-1}$ whose $j$-th marginal random variable $Y_j = \ln(X_j/X_h)$ has a generalized logistic distribution of type IV with expectation

$$E(Y_j) = \phi(\delta\theta_j) - \phi(\delta\theta_h) \tag{16}$$

where $\phi(x) = \partial \ln\Gamma(x)/\partial x$ denotes the digamma function (e.g. Johnson et al. 1995, p. 142; Fang et al. 1990, Problem 1.5).
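A quick simulation makes the consequence of (16) tangible: the expected log-ratio is not the log of the ratio of expectations, and the gap shrinks as δ grows. The sketch below is illustrative only and assumes Dirichlet draws obtained from independent gamma variates; since the normalising sum cancels in the ratio, the gammas are used directly. The function name, seed and parameter values are hypothetical.

```python
import math
import random


def mean_log_ratio(delta, theta, j, h, n_draws=20000, seed=7):
    """Monte Carlo estimate of E[ln(X_j / X_h)] for a Dirichlet vector X."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Gamma variates with shapes delta * theta_k; X_j / X_h = g_j / g_h.
        g = [rng.gammavariate(delta * t, 1.0) for t in theta]
        total += math.log(g[j] / g[h])
    return total / n_draws


theta = [0.5, 0.3, 0.2]  # hypothetical habitat portions
log_of_ratio = math.log(theta[0] / theta[2])
for delta in (5.0, 50.0, 500.0):
    estimate = mean_log_ratio(delta, theta, 0, 2)
    print(delta, estimate, estimate - log_of_ratio)
```

The printed difference is the quantity that (15) would need to vanish for the two hypotheses to coincide.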
In the case of Johnson's second order selection, denote by $\mathbf{a}$ the vector of portions of habitat types in the study area and suppose that $\mathbf{X}_U$ has a Dirichlet distribution with parameters $\delta_U$ and $\mathbf{a}$, in such a way that $H_0^X$ is true. Thus, in accordance with (16), the squared value of the unreliability measure of the CODA-based procedure turns out to be

$$\frac{1}{K}\sum_{h=1}^{K}\sum_{j \neq h}\left\{\phi(\delta_U a_j) - \phi(\delta_U a_h) - \ln(a_j/a_h)\right\}^2 \tag{17}$$

In a similar way, in the case of Johnson's third order selection, suppose that $\mathbf{X}_U$ and $\mathbf{X}_A$ have Dirichlet distributions with the same parameter $\mathbf{a}$ and variability parameters $\delta_U$ and $\delta_A$, respectively, in such a way that $H_0^X$ is true. From (16), the squared value of the unreliability measure is

$$\frac{1}{K}\sum_{h=1}^{K}\sum_{j \neq h}\left\{\left[\phi(\delta_U a_j) - \phi(\delta_U a_h)\right] - \left[\phi(\delta_A a_j) - \phi(\delta_A a_h)\right]\right\}^2 \tag{18}$$
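Expressions (17) and (18) are easy to evaluate once a digamma routine is available. The sketch below is a hedged illustration, not the authors' code: the digamma function is approximated by hand (upward recurrence followed by the standard asymptotic series) only to avoid third-party dependencies, the normalisation by K is taken as read from (17) and (18), and all function names and parameter values are hypothetical.

```python
import math


def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 * (1.0 / 252 - inv2 / 240)))
    return result + math.log(x) - 0.5 / x - series


def squared_unreliability_second_order(a, delta_u):
    """Squared measure in (17): use is Dirichlet, availability is constant."""
    k = len(a)
    total = 0.0
    for h in range(k):
        for j in range(k):
            if j != h:
                gap = digamma(delta_u * a[j]) - digamma(delta_u * a[h])
                total += (gap - math.log(a[j] / a[h])) ** 2
    return total / k


def squared_unreliability_third_order(a, delta_u, delta_a):
    """Squared measure in (18): use and availability are both Dirichlet."""
    k = len(a)
    total = 0.0
    for h in range(k):
        for j in range(k):
            if j != h:
                gap_u = digamma(delta_u * a[j]) - digamma(delta_u * a[h])
                gap_a = digamma(delta_a * a[j]) - digamma(delta_a * a[h])
                total += (gap_u - gap_a) ** 2
    return total / k


a = [0.5, 0.3, 0.2]  # hypothetical habitat portions
print(squared_unreliability_second_order(a, 5.0))
print(squared_unreliability_third_order(a, 5.0, 50.0))
```

Under this reading, the third order measure vanishes when the two variability parameters coincide, and the second order measure vanishes when all habitat portions are equal.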
Appendix 3: Generating dependent compositional data

It is worth noting that $\mathbf{X}_U$ and $\mathbf{X}_A$ arise from the choice of the same animal and as such they should be realistically presumed to be dependent random vectors. However, the general problem of constructing dependent random vectors $\mathbf{X}_1 = [X_{11}, \ldots, X_{1K}]^T$ and $\mathbf{X}_2 = [X_{21}, \ldots, X_{2K}]^T$ subject to the constraint $\mathbf{1}^T\mathbf{X}_1 = \mathbf{1}^T\mathbf{X}_2 = 1$ is difficult to solve in the framework of the Dirichlet model, since any couple of subvectors $\mathbf{X}_1$, $\mathbf{X}_2$ partitioning a vector $\mathbf{X}$ with a Dirichlet distribution turn out to be independent with marginal Dirichlet distributions (see Fang et al. 1990, Theorem 1.4). For this purpose, it is convenient to consider one vector, say $\mathbf{X}_1$, distributed as a Dirichlet random vector with parameters $\delta > 0$ and $\boldsymbol{\theta}$ in such a way that $\mathbf{1}^T\mathbf{X}_1 = 1$, and then to obtain $\mathbf{X}_2$ as $\mathbf{X}_1 + \mathbf{U}$, where $\mathbf{U}$ is a random vector in which $K - 1$ components, say $U_1, \ldots, U_{K-1}$, are random variables in the range $(-W, W)$ with

$$W = \min\left(X_{11}, \ldots, X_{1,K-1}, \frac{X_{1K}}{K-1}\right)$$

and $U_K = -(U_1 + \cdots + U_{K-1})$. Indeed, after straightforward algebra it can be proven that $0 < X_{2j} < 1$ for each $j = 1, \ldots, K$, while $\mathbf{1}^T\mathbf{X}_2 = 1$ by construction. Obviously $E(X_{2j}) = E(X_{1j}) + E(U_j)$, while $V(X_{2j}) = V(X_{1j}) + V(U_j)$, provided that $\mathbf{X}_1$ and $\mathbf{U}$ are independent. If $E(\mathbf{U}) = \mathbf{0}$, then $\mathbf{X}_1$ and $\mathbf{X}_2$ are dependent with the same mean vector. Moreover, if the $U_j$s are symmetrically distributed around 0, then $\Pr(X_{2j} > X_{1j}) = 0.5$ for each $j = 1, \ldots, K$. These last two features can be readily achieved if the $U_j$s are independent beta variables on $(-W, W)$ with shape parameters both equal to $\beta > 0$, in such a way that they turn out to be symmetric around 0, with variance
$$V(U_j) = \frac{1}{4(2\beta + 1)}$$
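The construction described in this appendix can be sketched in a few lines. The code below is an assumption-laden illustration rather than the authors' simulation code: the Dirichlet vector is drawn by normalising gamma variates, each of the first K − 1 perturbations is a symmetric beta variate rescaled from (0, 1) to (−W, W), and the last perturbation is set so that the perturbations sum to zero. The function name, seed and parameter values are hypothetical.

```python
import random


def dependent_pair(delta, theta, beta, rng):
    """Return (x1, x2): x1 is Dirichlet, x2 = x1 + u with u summing to zero."""
    k = len(theta)
    g = [rng.gammavariate(delta * t, 1.0) for t in theta]
    s = sum(g)
    x1 = [v / s for v in g]
    # Half-width W = min(X_11, ..., X_1,K-1, X_1K / (K - 1)).
    w = min(min(x1[:k - 1]), x1[k - 1] / (k - 1))
    # Symmetric beta(beta, beta) variates rescaled from (0, 1) to (-W, W).
    u = [w * (2.0 * rng.betavariate(beta, beta) - 1.0) for _ in range(k - 1)]
    u.append(-sum(u))  # U_K = -(U_1 + ... + U_{K-1})
    x2 = [a + b for a, b in zip(x1, u)]
    return x1, x2


rng = random.Random(42)
x1, x2 = dependent_pair(10.0, [0.5, 0.3, 0.2], 2.0, rng)
print(x1, sum(x1))
print(x2, sum(x2))
```

Repeating the draw many times and counting how often a component of the second vector exceeds the same component of the first gives an empirical check of the 0.5 probability stated above.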
Accordingly, the $U_j$s inflate the variances of the $X_{1j}$ by a term which increases as $\beta$ approaches 0. If $\mathbf{X}_1$ coincides with the vector of constants $\mathbf{a}$, then if $E(\mathbf{U}) = \mathbf{0}$ and the $U_j$s are symmetrically distributed around 0, $E(\mathbf{X}_2) = \mathbf{a}$, $\Pr(X_{2j} > a_j) = 0.5$ and $V(X_{2j}) = V(U_j)$ for each $j = 1, \ldots, K$. Obviously, in this case the $U_j$s vary on $(-w, w)$ with

$$w = \min\left(a_1, \ldots, a_{K-1}, \frac{a_K}{K-1}\right).$$

References

Aebischer NJ, Robertson PA, Kenward RE (1993) Compositional analysis of habitat use from animal radio-tracking data. Ecology 74:1315–1325
Aitchison J (1986) The statistical analysis of compositional data. Chapman and Hall, London
Aitchison J (1994) Principles of compositional data analysis. In: Anderson TW, Fang KT, Olkin J (eds) Multivariate analysis and its applications. Institute of Mathematical Statistics, Hayward, pp 73–81
Calenge C (2006) The package “adehabitat” for the R software: A tool for the analysis of space and habitat use by animal. Ecol Model 197:516–519
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate distributions. Chapman and Hall, London
Johnson DH (1980) The comparison of usage and availability measurements for evaluating resource preference. Ecology 61:65–71
Johnson NL, Kotz S, Balakrishnan N (1995) Continuous univariate distributions, vol 2. Wiley, New York
Johnson DS, Thomas DL, Ver Hoef TJ, Christ A (2008) A general framework for the analysis of animal resource selection from telemetry data. Biometrics 64:968–976
Kneib T, Knauer F, Küchenhoff H (2011) A general approach to the analysis of habitat selection. Environ Ecol Stat 18:1–25
Kooper N, Manseau M (2009) Generalized estimating equations and generalized linear mixed-effects models for modelling resource selection. J Appl Ecol 46:590–599
Manly BFJ, McDonald LL, Thomas DL, McDonald TL, Erickson WP (2002) Resource selection by animals. Kluwer, Dordrecht
Pesarin F (1992) A resampling procedure for nonparametric combination of several dependent tests. J Italian Stat Soc 1:87–101
Pesarin F (2001) Multivariate permutation tests: with applications in biostatistics. Wiley, New York
Randles RH, Wolfe DA (1979) Introduction to the theory of nonparametric statistics. Wiley, New York
Strickland MD, McDonald LL (2006) Introduction to the special section on resource selection. J Wildl Manag 70:321–323
Westfall PH, Young SS (1993) Resampling-based multiple testing. Wiley, New York
Worton BJ (1989) Kernel methods for estimating the utilization distribution in home-range studies. Ecology 70:164–168
Author Biographies

Lorenzo Fattorini is Professor of Statistics at the Department of Economics and Statistics of the University of Siena. His field of interest is sampling theory with applications to environmental surveys and ecological diversity analysis. He is Associate Editor of Environmetrics and Environmental and Ecological Statistics, and from 2006 to 2009 he was Coordinator of the Italian Working Group on Sample Survey Methodology.

Caterina Pisani is a researcher at the Department of Economics and Statistics of the University of Siena. She received her Ph.D. in Applied Statistics from the University of Florence in 2001. Her research interests include the application and the theoretical development of design-based and nonparametric statistical methods for environmental and ecological sciences.
Francesco Riga is employed at the Italian Institute for Environmental Protection and Research. He received his MD on ethology of primates and his PhD on ecology of red foxes in Italy, both at the University of Rome “La Sapienza”. He is currently involved in Ungulates and Lagomorph conservation and management. His research focuses on habitat suitability models, spatial ecology and habitat selection of mammals monitored by GPS radiotelemetry.

Marco Zaccaroni is a researcher at the Department of Biology of the University of Florence. He did his PhD in the management of game species in an intensive agro ecosystem of central Italy. His research interests are focused on habitat analysis and population dynamics of mammals and birds.