Methods in Ecology and Evolution 2013, 4, 1–8
doi: 10.1111/j.2041-210x.2012.00249.x
Species interactions: estimating per-individual interaction strength and covariates before simplifying data into per-species ecological networks
Konstans Wells1,2* and Robert B. O’Hara1
1 Biodiversity and Climate Research Centre (BiK-F), Senckenberganlage 25, D-60325 Frankfurt (Main), Germany; and 2 Institute of Experimental Ecology, University of Ulm, Albert-Einstein-Allee 11, D-89069 Ulm, Germany
Summary
1. Ecological network models based on aggregated data from species interactions are widely used to make inferences about species specialization, functionality and extinction risk. While an increasing number of network data sets are available and used in comparative studies, data quality and uncertainty have received little attention. Moreover, key individual-level information, such as the proportion of individuals not involved in interactions and the underlying processes driving interactions, is ignored by aggregated data analysis.
2. We suggest an individual-level hierarchical interaction model as a more flexible approach to considering uncertainty, sampling effort and the conditions under which interactions take place, and from which network attributes can be derived. We performed a simulation exercise to compare inference under different sample sizes, and to compare inference from aggregated data matrices with that from our individual-level model.
3. Formalizing the process of network formation in an individual-level model made clear that per-species interaction frequencies are not independent of sample size and population pools and also ignore important information given by the proportion of non-interacting individuals. Hierarchical linear models are a possible solution to infer community-level attributes of network formation and allow various kinds of comprehensive model extensions to capture variation of per-individual interactions in space and time that shape upper level organization.
4. Individual-level hierarchical models provide the link between individual behaviour and interactions under variable environmental conditions and can be summarized into networks in a conceptually neat way. Such models may not only help to account for various sources of variation but also conceptualize aspects overlooked in aggregated data. In particular, the quantification of per-individual interactions under different sampling scenarios emphasizes that interaction frequencies at the species level are not necessarily a surrogate of species abundance in the natural systems under investigation.
Key-words: ecological fallacy, ecological networks, biotic interactions, hierarchical models, random graphs, species specialization
Introduction
Plotting networks of multilateral relationships – such as animal–plant, host–parasite or mutualistic interactions – has been an instructive way of disentangling complex relationships between species: it describes the connections between species while leaving out all background information which is difficult to capture at a glance (Strogatz 2001). Quantitative descriptions of ecological network structure have led to new insights into aspects as diverse as community structures, extinction risks, the functional role of species, their complementarity in functions and the spread of agents such as seeds or pathogens through existing pathways (Williams & Martinez 2000; Bascompte, Jordano & Olesen 2006; Berlow et al. 2009; Ings et al. 2009). Descriptions of networks are based on adjacency matrices, in which all interactions between possible combinations of agents are recorded as presence/absence or as frequencies. Adjacency matrices and networks thus reduce the complexity in the data by summarizing only those agents in a system which are observed to interact. Interest focuses on network topology, that is, how connections between pairs of agents are arranged. As species are the basic focal unit for investigating the complex structure in natural communities and their ecosystem function, it is no surprise that the majority of ecological network perspectives and ideas are centred around per-species interactions and, in numerical terms, around interaction matrices at the species level. But interaction data are based on observations of individuals and are subject to an observation process with its various sources of bias and uncertainty (Boulinier et al. 1998; Dennis, Ponciano & Taper 2010).
*Correspondence author. E-mail: [email protected], [email protected]
© 2012 The Authors. Methods in Ecology and Evolution © 2012 British Ecological Society
Sampling effort, species detectability and observation error are widely acknowledged in models of species occupancy and abundances in population ecology and biodiversity studies (Gotelli & Colwell 2001; Golicher et al. 2006; MacKenzie et al. 2006), but most ecological systems studies that use readily available ecological network or species distribution data ignore these issues (Grimm et al. 2005; Rocchini et al. 2011). The naturally hierarchical structure in species interactions should be recalled here: inferences about whether species interact, or how often they do so, are made from the aggregation of interactions between individuals of the focal species, and whether these are observed depends on the sampling effort. Per-individual interactions may or may not take place according to their biology or the particular environmental conditions the individuals find themselves exposed to. The species-level perspective can thus be considered an upper level organization (Laska & Wootton 1998; Wootton & Emmerson 2005). Spatio-temporal variation in species abundance and/or environmental conditions is recognized as an important driver of network structure (Abrams 2001; Vázquez et al. 2007; Carnicer, Jordano & Melian 2009). But in practice networks are constructed by aggregating data over time or space, and this averages over variation in the environment, removing information and making it more difficult to find explanations for the remaining variation. Here, we move back from aggregated species-level data to individual-level data. Because the observed interaction frequencies are not only a result of the affinity of species towards each other, but also of the number of individuals exposed to each other, this can help us disentangle these two effects (Fig. 1). Definitions of interaction strength based on aggregated data that have been used in recent ecological studies cannot be used to distinguish between changes in single species (i.e. the number of individuals involved in an interaction) and per-individual interaction strength (attraction of consumer towards resource independent of population size) (Berlow et al. 2004; Bascompte, Jordano & Olesen 2006). Here, we develop a general individual-level model framed in a hierarchical Bayesian approach to demonstrate how various aspects considered in species occupancy and abundance models can be incorporated in ecological network analysis. In particular, we aim to show that besides stochastic effects due to overall sample size, heterogeneity in the number of individuals sampled per species and detection probability/sampling error may influence inference in ecological network studies, while these effects can be teased apart in an individual-level model.
Materials and methods
For simplicity, we use the terminology of a consumer-resource system, although the model outlined here applies to a large range of other species interactions (e.g. host–parasite, predator–prey or mutualistic interactions).
A PER-INDIVIDUAL INTERACTION MODEL
Assume we observe N resource individuals, with individual i being a member of species $r_i$ ($1 \le r_i \le R$), and M consumer individuals, with individual j being a member of species $c_j$ ($1 \le c_j \le C$). If no individual-level data are available for either resource or consumer species (e.g. if individual flowers are observed, with all pollinator visits being recorded without pollinators being distinguished), then we can lump all individuals of each species into one, and set N = R or M = C, and what follows simplifies (in the most simplified case, counts usually reduce to presence/absence data and the model outlined below reduces to a logistic regression model, Fig. 1).

Fig. 1. Illustration of a simplified scenario of species interactions of three consumer species (C1–C3) and three resource species (R1–R3). During field studies (‘Field scenario’), some individuals of resource species (ellipses) are typically directly counted and observed for interactions with consumers, for which population sizes and individual identity might be unknown (dashed lines). While R1 and R2 are characterized by the same actual connections and interaction frequencies, they differ in abundance. When aggregating interactions into adjacency matrices for network plotting and analysis (‘Network artwork’), information such as the number of observed resource individuals with zero encounters and the sampling intensity are lost, leading to misleading conclusions on interaction strength such as the similar representation of R1 and R2 in the network artwork.

For each resource individual i, we observe $X_{ij}$ interactions with consumer individual j. We can model these as occurring at a rate $\lambda_{ij}$, so that the total number of events observed over a time period t follows a Poisson distribution:

$$X_{ij} \sim \mathrm{Pois}(t\,\lambda_{ij}) \qquad \text{(eqn 1)}$$

where $\lambda_{ij}$ is the per-individual interaction strength, so $\lambda_{ij} = 0$ if the species never interact. We can model $\lambda_{ij}$ further, for example, by adding the effects of covariates or letting it vary between individuals. In particular, we can assume the interaction is affected multiplicatively by both species-specific components linked to $r_i$ and $c_j$ and their interaction as well as interaction-specific components linked to individuals i and j, so that

$$\log(\lambda(i,j)) = \beta(r_i, c_j) + \eta(i,j) \qquad \text{(eqn 2)}$$

where $\beta(r_i, c_j)$ is the average species-level effect and $\eta(i,j)$ is the individual-level effect. Some of the parameters can, of course, be modelled further as functions of various environmental or species traits (e.g. phylogeny or functional type). In particular, it is convenient to decompose the species-level and individual-level effects:

$$\beta(r_i, c_j) = \alpha + \rho(r_i) + \chi(c_j) + \tau(r_i, c_j) \qquad \text{(eqn 3a)}$$

$$\eta(i,j) = \iota(i) + \varphi(j) + \upsilon(i,j) \qquad \text{(eqn 3b)}$$

where $\rho(r_i)$ and $\chi(c_j)$ are the species-level effects on the interactions, for example, different activity rates (e.g. search efficiency or ‘hungriness’) in the different consumers would lead to variation in $\chi(c_j)$, and similarly different palatability or encounter rates of resources would make $\rho(r_i)$ vary. Thus, $\tau(r_i, c_j)$ is related to specialization: a large variance in $\tau(r_i, c_j)$ suggests extreme specialization, with some consumers preferring some resources much more than average. $\iota(i)$, $\varphi(j)$ and $\upsilon(i,j)$ have similar interpretations, but at the individual level, so (for example) variation in palatability between resource individuals can be modelled in $\iota(i)$. Constraints will usually have to be put on the parameters (e.g. that the terms sum to zero, so $\alpha$ is the overall average interaction rate), to avoid over-parameterization; here, we will make these constraints in the statistical analysis, but drop them in the mathematical development. If we only look at the species level, summing over individuals, we can define

$$\Lambda_{rc} = \sum_{i \in r,\, j \in c} \lambda(i,j) = n_r\, m_c\, \tilde{\lambda}(r,c) \qquad \text{(eqn 4)}$$

where $\tilde{\lambda}(r,c) = \exp(\beta(r,c))$ is the species-level mean interaction strength and is independent of sample size. $\Lambda_{rc}$ is the summary of interaction frequencies based on per-species interaction strength and the number of individuals sampled in the field and is what is commonly observed (e.g. in pollinator networks, when the visits of a pollinator species to a plant are recorded, but without individuals being marked). Equation 4 also illustrates the effects that abundances of the interactors have on the observed networks and makes it clear that changes in abundance can cause changes in the aggregated interaction network. This model has been developed in terms of counts, but we can convert counts to probabilities easily, for example

$$p_{rc} = \Lambda_{rc} \Big/ \sum_{s=1}^{R} \sum_{d=1}^{C} \Lambda_{sd} \qquad \text{(eqn 5)}$$

from which we can calculate any summary statistics, such as commonly used network statistics (e.g. Blüthgen & Menzel 2006). This conversion is the same as used in classical statistics to move between logit and log-linear models (e.g. McCullagh & Nelder 1989).

MODEL FITTING

As much of the work on ecological networks is empirical, we want to be able to fit the model outlined above to field data. The data are typically counts of events, for example, the number of visits of a frugivore to a plant, or the number of animals of a species eaten by a consumer. The Poisson assumption naturally leads to modelling the counts with a log-linear model (McCullagh & Nelder 1989). At the simplest level, if we only have species-level information, we can use the following model:

Count ~ Consumer + Resource + Consumer : Resource + offset(log(Time))

where the response is Count, Consumer and Resource are factors ($\rho(r_i)$ and $\chi(c_j)$ in eqn 3a), Consumer : Resource is the interaction ($\tau(r_i, c_j)$ in eqn 3a), and offset() denotes an offset, that is, a covariate in a model with the regression coefficient fixed to 1 (see O’Hara 2009 for more details about writing this model). The offset term (which is t, see eqn 1) can be deleted if observation time is constant over all observed combinations of individuals; however, the model then has to be interpreted as the rates per observation time interval. From this, we can extract estimates of $\rho(r_i)$, $\chi(c_j)$ and $\tau(r_i, c_j)$ and calculate statistics which summarize the network structure.

Two aspects of this experimental design should be noted: (1) the interaction is the same as the overdispersion (so specialization is confounded with any extra-Poisson variation) and (2) the model assumes that all species interact, but at varying rates. The first problem can only be overcome with more detailed data, for example replicate observations of several resource individuals (e.g. trees) can be used to model the overdispersion. More generally, the model can be developed to incorporate extra individual-level information, which should help to improve the species-level estimates that are usually the main focus. Mathematically this involves adding terms in eqn 3b into the model, and then possibly modelling them further.

Interactions may not be observed for two reasons: they may genuinely not take place, or they do occur but are not observed. It is therefore likely that zeros in the data are a mixture of true and false values (Farewell & Sprott 1988). This motivates the use of zero-inflated distributions, where the rate $\lambda_{ij}$ is modelled as being either 0 (or, equivalently, that $\tau(r_i, c_j)$ is $-\infty$) or an estimated positive value. This can be written by introducing an indicator, I(r, c), that resource r and consumer c interact, that is, if $I(r_i, c_j) = 1$, r interacts with c, and defining

$$\lambda_{ij} = I(r_i, c_j)\, e^{\beta(r_i, c_j) + \eta(i,j)} \qquad \text{(eqn 6)}$$

We can then, if necessary, also build a model for $I(r_i, c_j)$, using a logistic regression framework (e.g. Farewell & Sprott 1988). The complexity of the resulting model, and the desirability of estimating complex functions of the model, suggests using a Bayesian framework to fit the model. However, there is no necessity to do so and in practice simpler versions of the model are probably better fitted with classical techniques.

SIMULATION STUDY

We use simulated data to explore the effects of using our model outlined earlier, in particular to compare the behaviour of network statistics calculated from the model and from the raw data and to see how sample size affects them.
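Before turning to the simulation design, the dependence of aggregated counts on the number of sampled individuals (eqn 4) and their conversion to proportions (eqn 5) can be made concrete with a minimal sketch. The values below are hypothetical and not taken from this study: two resource species share identical per-individual interaction strengths but differ in sample size, and observation time is set to one. The names `lam_tilde`, `n_r` and `m_c` are illustrative choices.

```python
import numpy as np

# Hypothetical per-individual interaction strengths
# (rows: 2 resource species, columns: 2 consumer species).
# Both resource species have identical rates.
lam_tilde = np.array([[0.2, 0.05],
                      [0.2, 0.05]])

# Hypothetical numbers of sampled individuals per species.
n_r = np.array([5, 20])  # resource species 2 is sampled four times as heavily
m_c = np.array([3, 3])   # equal sampling of both consumer species

# eqn 4: expected aggregated interaction frequency per species pair.
Lambda = np.outer(n_r, m_c) * lam_tilde

# eqn 5: aggregated frequencies converted to proportions.
p = Lambda / Lambda.sum()

print(Lambda)
print(p.round(3))
```

Although both resource species have the same per-individual rates, the more heavily sampled one accounts for four times the aggregated frequency, which is the pattern sketched for R1 and R2 in Fig. 1.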
We assume a network of 12 resource species, r, and nine consumer species, c, (e.g. fruiting trees visited by mammal species) and all individuals are marked and individually surveyed for interaction. We generated an arbitrary matrix of species-level interaction probabilities of $0 \le \tilde{\lambda}_{TRUE}(r,c) \le 0.75$ with the idea that some generalist consumer species are more frequently visiting some resource species than others, while specialist consumers are only attracted to a small range of resource species (Table 1). These values do not necessarily match properties of real-world interactions and ecological networks but are suitable for illustrating the effects of uncertainty due to low interaction probability and different kinds of specialization. We investigated the effects of sample size and per-species interaction strength on possible outcomes of observed interaction frequencies (the generated data), by arranging per-species interaction probabilities between species in a cluster design with each of three consumer species (‘generalist’, ‘opportunist’ and ‘specialist’) and four resource species (‘generalist’, ‘opportunist’, ‘specialist’ and ‘erratic’) having the same combinations of $\tilde{\lambda}_{TRUE}(r,c)$ (see Table 1).

Arranging the combinations of sample sizes of resource and consumer species into N × M matrices for all species, we generated individual-level interaction frequencies by drawing them from a Poisson distribution with $\tilde{\lambda}_{TRUE}(r,c)$ of the respective species pair. We assumed three replicate ‘surveys’ for each individual of resource species. We explored the effect of overall sample sizes on network inference by using equal sample sizes for all N and M of 3, 5, 10, 15, 20 and 25 individuals for all resource and consumer species. To demonstrate the effect of unequal sample size on network inference, we chose four scenarios with either 5, 10, 15 or 20 individuals for the three ‘generalist’ consumer species combined with constant sample sizes of three individuals for the six remaining consumer species and five individuals of all resource species. For each of these ten sampling scenarios, we generated 50 sets of simulated data.

For illustration, we estimated the community-level index of specialization H2 based on Kullback–Leibler distances. This index was developed to account for heterogeneity in the number of observed interactions in plant–pollinator webs and is among the indices least affected by the number of interactions assembled into aggregated data, that is, sample size on the species level of aggregated data (Blüthgen & Menzel 2006). As our focus is not on particular indices, we expect this index to perform reasonably well in providing a summary measure of how the differences in sample size influence inference of network attributes from aggregated data. The interest in the individual-level hierarchical model, in turn, is in how far such a model is able to disentangle the different sources of variation and allows accurate parameter estimation and expression of uncertainty; in our simulation study, we thus asked whether $\tilde{\lambda}(r,c)$ and uncertainty in the number of unobserved interactions on the species level can be reasonably well estimated from the different sampling scenarios. For these models, we calculated the index H2 based on the posterior estimates of each interaction strength $\tilde{\lambda}(r,c)$ and 1000 Poisson random draws for each of the 10 000 model iterations. As a comparative measure of H2 from ‘true’ values of interaction strength, we also calculated H2 from interaction strengths $\tilde{\lambda}_{TRUE}(r,c)$ and 1000 Poisson random draws for each interaction, repeating this calculation 1000 times (i.e. estimating Poisson error).

The model was fitted in a Bayesian framework, with the following priors, chosen to be vague: for all variance terms U(0,10) and for all sample means N(0,1). When $I(r_i, c_j) = 0$, the parameters that influence $\lambda_{ij}$ are not affected by the likelihood, so they are only updated from their prior distributions. This can adversely affect mixing (as their sampled values are only occasionally close to the posterior for when $I(r_i, c_j) = 1$), so we used G(1,100) pseudopriors (Carlin & Chib 1995) to improve the mixing. We fitted the model with Markov Chain Monte Carlo (MCMC) based on the Gibbs sampler in OpenBUGS 3.1.1 (Lunn et al. 2009). Posteriors were gathered by running 10 000 iterations of two chains after discarding 50 000 iterations. Convergence and mixing were assessed visually. MCMC results are given as posterior mode with 95% and 50% highest posterior density credible intervals (CI). The model code is given in Appendix S1.
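The data-generating step described above can be sketched as follows. This is an illustrative reconstruction, not the code of Appendix S1: the use of NumPy, the random seed, the function name `simulate_counts` and the reading that one Poisson count is drawn per consumer individual, resource individual and survey are all assumptions. The rate matrix is typed in from Table 1.

```python
import numpy as np

# True interaction strengths from Table 1, one value per cluster:
# rows are consumer types, columns are resource types
# (generalist, opportunist, specialist, erratic).
consumer_blocks = np.array([
    [0.75, 0.20, 0.01, 0.05],  # generalist consumers (C1-C3)
    [0.05, 0.01, 0.00, 0.01],  # opportunist consumers (C4-C6)
    [0.00, 0.00, 0.20, 0.00],  # specialist consumers (C7-C9)
])
# Expand each cluster to three species: 9 consumers x 12 resources.
lam_true = np.repeat(np.repeat(consumer_blocks, 3, axis=0), 3, axis=1)


def simulate_counts(lam, n_per_resource, m_per_consumer, surveys=3, rng=None):
    """Draw individual-level Poisson counts and sum them per species pair."""
    rng = np.random.default_rng() if rng is None else rng
    n_cons, n_res = lam.shape
    agg = np.zeros((n_cons, n_res), dtype=int)
    for c in range(n_cons):
        for r in range(n_res):
            # One count per consumer individual, resource individual and survey.
            shape = (m_per_consumer[c], n_per_resource[r], surveys)
            agg[c, r] = rng.poisson(lam[c, r], size=shape).sum()
    return agg


rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
n = np.full(12, 3)              # three individuals per resource species
m = np.full(9, 3)               # three individuals per consumer species
counts = simulate_counts(lam_true, n, m, rng=rng)

true_links = lam_true > 0
missed = int(np.sum(true_links & (counts == 0)))
print(int(true_links.sum()), "true associations,", missed, "unobserved in this draw")
```

Repeating the draw with larger values in `n` and `m` shows how quickly the weakest associations start to appear in the aggregated matrix.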
Table 1. Artificial values for interaction strength $\tilde{\lambda}_{TRUE}(r,c)$ for a hypothetical community of 12 resource and nine consumer species. For data generation, we assumed equal sample size N for different resource species. For consumer species, sample size M was either assumed to be equal for all species or variable within or among clusters. We assumed a detection probability of 1. Note that combinations and values are not based on any natural system or empirical data set

| Consumer type | Detection probability | Consumer sample size | Species | R1 | R2 | R3 | R4 | R5 | R6 | R7 | R8 | R9 | R10 | R11 | R12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (resource type) | | | | Generalist | Generalist | Generalist | Opportunist | Opportunist | Opportunist | Specialist | Specialist | Specialist | Erratic | Erratic | Erratic |
| (resource sample size) | | | | N | N | N | N | N | N | N | N | N | N | N | N |
| Generalist | 1 | M/M (1) | C1 | 0.75 | 0.75 | 0.75 | 0.2 | 0.2 | 0.2 | 0.01 | 0.01 | 0.01 | 0.05 | 0.05 | 0.05 |
| Generalist | 1 | M/M (2) | C2 | 0.75 | 0.75 | 0.75 | 0.2 | 0.2 | 0.2 | 0.01 | 0.01 | 0.01 | 0.05 | 0.05 | 0.05 |
| Generalist | 1 | M/M (3) | C3 | 0.75 | 0.75 | 0.75 | 0.2 | 0.2 | 0.2 | 0.01 | 0.01 | 0.01 | 0.05 | 0.05 | 0.05 |
| Opportunist | 1 | M/M (4) | C4 | 0.05 | 0.05 | 0.05 | 0.01 | 0.01 | 0.01 | 0 | 0 | 0 | 0.01 | 0.01 | 0.01 |
| Opportunist | 1 | M/M (5) | C5 | 0.05 | 0.05 | 0.05 | 0.01 | 0.01 | 0.01 | 0 | 0 | 0 | 0.01 | 0.01 | 0.01 |
| Opportunist | 1 | M/M (6) | C6 | 0.05 | 0.05 | 0.05 | 0.01 | 0.01 | 0.01 | 0 | 0 | 0 | 0.01 | 0.01 | 0.01 |
| Specialist | 1 | M/M (7) | C7 | 0 | 0 | 0 | 0 | 0 | 0 | 0.2 | 0.2 | 0.2 | 0 | 0 | 0 |
| Specialist | 1 | M/M (8) | C8 | 0 | 0 | 0 | 0 | 0 | 0 | 0.2 | 0.2 | 0.2 | 0 | 0 | 0 |
| Specialist | 1 | M/M (9) | C9 | 0 | 0 | 0 | 0 | 0 | 0 | 0.2 | 0.2 | 0.2 | 0 | 0 | 0 |

Simulation results
The simulated interaction data included considerable uncertainty, as not all per-species interactions known to take place ($\tilde{\lambda}_{TRUE}(r,c) > 0$) were represented in the data: with a sample size of only three individuals for each species, 20–36 out of the total of 72 true interactions were absent in the 50 sets of simulated data; the recording of all interactions was nearly complete with a sample size of 15 individuals per species, with a single interaction being absent only twice in the 50 sets of simulated data. The specialization index H2 calculated from aggregated data with 15 individuals for all species and equal sample size revealed little differences from H2 calculated from the exhaustive sample, but it was overestimated when the sample size was smaller (Fig. 2).

The individual-level model performed reasonably well in estimating the parameters of interest, namely the interaction strength $\tilde{\lambda}(r,c)$ and uncertainty in interactions between species pairs with incomplete observations. Even with a small sample size of three individuals per species, posterior mode estimates of $\tilde{\lambda}(r,c)$ were close to the true values $\tilde{\lambda}_{TRUE}(r,c)$ from the data generation process (all Pearson r > 0.9, P < 0.01). Detailed model outputs are omitted but in most cases, credible intervals for all estimates of $\tilde{\lambda}(r,c)$ were symmetric around $\tilde{\lambda}_{TRUE}(r,c)$. For all models, estimates of $\tilde{\lambda}(r,c)$ were of the same magnitude as true values; mostly, differences of estimates for $\tilde{\lambda}(r,c)$ were < 0.05 from true values for all sampling scenarios. When calculating H2 indices from posterior estimates of $\tilde{\lambda}(r,c)$ from any sampling scenario with equal sample sizes, estimated values showed similar variation in magnitude to those calculated from aggregated data, showing that too small sample sizes also give poor estimates with the individual-level interaction model (Fig. 2). Nevertheless, the individual-level model captured the uncertainty in whether all-zero observations in interactions between species pairs were true absences of associations or due to lack of data (false zeros) with the aid of the species-level indicator function. Posterior estimates were around the true number of 72 species associations ($\tilde{\lambda}_{TRUE}(r,c) > 0$) for equal sample sizes of three or five individuals per species but the model overestimated the number of species associations, with up to nine additional (non-existing) species associations, for larger sample sizes (Fig. 3).

For unequal sample sizes of consumer species, the estimation of the specialization index H2 from aggregated data was poor and became worse as the sample became more uneven, even though increasing unevenness increased the sample size (Fig. 4). In contrast, estimates of the specialization index H2 from the individual-level model were not biased by heterogeneity in sample size, improving with larger sample size despite increasing unevenness (Fig. 4).

Fig. 2. Estimates of the network-level specialization index H2 from aggregated data (circles) and from posterior estimates of interaction strength $\tilde{\lambda}(r,c)$ from an individual-level model (squares). Sampling scenarios included sample sizes of equal number of individuals from different resource and consumer species (‘3’, ‘5’, ‘10’, ‘15’, ‘20’, ‘25’, respectively). Error bars represent one standard deviation given by 50 sets of simulated data for aggregated data and 95% credible intervals from posterior estimates of the individual-level model. Thick grey bars represent 50% credible intervals. The dashed line and grey square refer to the calculation of H2 from the data generating values of interaction strength and exhaustive sample size, including Poisson error bars. In network studies, a lower H2 is interpreted as higher network-level specialization. [Panels: Aggregated data; Interaction model. x-axis: Sampling scenario; y-axis: Network-level specialization index H2.]

Fig. 3. Posterior density distribution of the estimated number of pairwise associations of consumer and resource species for four different sample sizes (3, 5, 10 and 15 individuals per species, respectively). The true number of associations in the simulated data is 72 of 108 consumer-resource species pairs (black bars), whereas data comprise incomplete samples with 20–36, 8–17, 0–4, and 0–1 associations absent from the simulated data, respectively. [Panels: 3 Ind.; 5 Ind.; 10 Ind.; 15 Ind. x-axis: Number of estimated species associations; y-axis: Frequency.]

Fig. 4. Estimates of the network-level specialization index H2 from aggregated data (circles) and from posterior estimates of interaction strength $\tilde{\lambda}(r,c)$ from an individual-level model (squares). Sampling scenarios included heterogeneous numbers of consumer individuals (‘5/3’, ‘10/3’, ‘15/3’, ‘20/3’, respectively). Error bars represent one standard deviation given by 50 sets of simulated data for aggregated data and 95% credible intervals from posterior estimates of the individual-level model. Thick grey bars represent 50% credible intervals. The dashed line and grey square refer to the calculation of H2 from the data generating values of interaction strength and exhaustive sample size, including Poisson error. In network studies, a decrease in H2 is interpreted as higher network-level specialization. [Panels: Aggregated data; Interaction model. x-axis: Sampling scenario; y-axis: Network-level specialization index H2.]

Discussion
In this study, we developed an individual-level interaction model to better account for sampling bias and the proportion of individuals involved in interactions in ecological network studies. The results of our simulation study show that ecological network metrics calculated from aggregated data can be biased by sample size. In particular, if biases in sample size are heterogeneous among species, the number of species interactions aggregated into matrices for network metrics may be as much determined by sampling procedure as any biological parameter of interest. These findings have important implications, as they suggest that current methods of estimating interaction frequencies and interaction strengths from aggregated species-level data in ecological networks could be biased and this might even make the networks meaningless for comparative studies. We suggest that an individual-level hierarchical interaction model is a first step to ecological network analysis as it lets us differentiate the various components of the hierarchical organization of individual behaviour, species interactions and community-level ecological networks in a more consistent statistical and biological framework. We emphasize that per-individual interaction strength needs to be distinguished from per-species interaction strength. In fact, per-species interaction strength is inevitably linked to the abundance of the species in the network regardless of how per-individual interactions vary. Under any scenario of presence/absence or relative abundance, per-species interaction probabilities can be estimated independently of sample size if individual interaction probabilities are estimable. Notably, if estimates of local species abundances are available, realized interaction frequencies can then be expressed as a biologically meaningful parameter that does
not depend on sample size per se. Recognizing the importance of species abundances in network formation, we emphasize that future network studies would be more interpretable if it were made clear what kind of sampling effort or abundance estimates are used. While it has often been pointed out in population and community ecological studies that measures of abundance or species diversity need proper estimation, or at least an accurate consideration of the underlying sampling design (Colwell & Coddington 1994; Boulinier et al. 1998), to the best of our knowledge this has not been rigorously considered in network studies. Disentangling the role of local abundance and interaction strength is also helpful for estimating uncertainty in whether the lack of observed interactions at the species level is due to a low probability linked to either low local abundance or low per-individual interaction strength. As is typical for many environmental data sets, interaction data are likely to come with many zeros that may represent either true zeros or interactions that were missed (Farewell & Sprott 1988). While models that account for zero-inflation and mixtures can handle these zeros and use them as an additional data source (Golicher et al. 2006; Ver Hoef & Jansen 2007), aggregated data fail to include information about zero encounters. This not only ignores a valuable source for estimating uncertainty in the data, but zeros at the individual level might include important biological information such as the proportion of individuals involved in interactions. While per capita interaction strengths have been quantified in various studies (Paine 1992; Wootton 1997; Berlow et al. 2009), it is important to note that species abundances in aggregated data do not necessarily match true abundance if only a fraction of the individuals of any species is involved in interactions.
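To illustrate how zeros at the individual level carry information, consider a zero-inflated Poisson in which a species pair either interacts, with prior probability psi, at a per-individual-pair rate lam, or never interacts. If a number of individual pairs are each observed for time t and all counts are zero, Bayes' rule gives the probability that the pair nevertheless interacts. The sketch below is a deliberately simplified illustration with fixed, hypothetical values of psi and lam, not the hierarchical model fitted in this study; the function name is an illustrative choice.

```python
import math


def prob_interaction_given_all_zeros(psi, lam, n_pairs, t=1.0):
    """P(I = 1 | all counts are zero) under a zero-inflated Poisson model."""
    # Probability that a truly interacting pair yields only zero counts.
    p_zero_if_interacting = math.exp(-lam * t * n_pairs)
    numerator = psi * p_zero_if_interacting
    # A non-interacting pair (probability 1 - psi) always yields zeros.
    return numerator / (numerator + (1.0 - psi))


# Hypothetical values: prior probability 0.5, rate 0.01 per individual pair.
for n_pairs in (9, 25, 225):
    prob = prob_interaction_given_all_zeros(0.5, 0.01, n_pairs)
    print(n_pairs, round(prob, 3))
```

With few observed individual pairs an all-zero record says little about whether a weak interaction exists, whereas many zero encounters make a true absence increasingly plausible. Aggregated matrices cannot express this difference because the number of zero encounters is not retained.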
Uncertainty and noise in networks have been considered in molecular, epidemiological and social science networks and in graph theory (Nowicki & Snijders 2001; Han et al. 2005; Stumpf, Wiuf & May 2005; Annibale & Coolen 2011). An important point here is that most network studies from other disciplines concern the presence of connections between individuals, such as actors in social science or molecules, whereas ecological networks are analysed at the species level (Berlow et al. 2004; Ings et al. 2009), and studies of ecological networks have thus mostly ignored uncertainty in information about the environmental conditions at the individual level.
One source of uncertainty that we did not fully address is detection probability. The incomplete detection of true interactions might lead to false zeros in data; if detection probability varies across species, such heterogeneity in detection would bias network-level estimates as much as heterogeneity in sample size if not correctly accounted for. We would expect that in most interaction studies, where observations are directly focused on individuals rather than on plots or landscapes, detection is more complete. Nevertheless, while we can expect nearly complete detection of interactions in some studies, such as between easily recognizable animals and flowers or fruits, the detection of some interactions might be incomplete, for example, the visits of cryptic birds and mammals to large fruiting trees or records of parasites from large host species that are difficult to sample. Given the dynamic nature of most types of interactions, discerning factors that influence either the process or the observation may be challenging, as interactions cannot be observed as true replicated counts as in monitoring studies. Therefore, the assumption of replicated counts, as in N-mixture models that distinguish between observation and process models (Royle 2004), may not be reasonable for interaction studies.
In many real-world scenarios with many individuals and species, our most general model – allowing all individuals from one group to interact with all individuals from the other – might be difficult to fit, as the data will be sparse. Hence, we expect that most applications of the model will aggregate one or both of consumer and resource; for example, if individual resources are observed, they can be analysed at the individual level, while consumers may not be individually identifiable and thus may be aggregated to the species level. Note that separate abundance estimates can still be used to correct the total counts of interactions (Λrc) down to individual counts (rc), through eqn 4. This aggregation will also help to make the model more scalable if many more species are placed in the interaction matrix.
With the growing interest in understanding network topology and functionality under variable environmental conditions (Olesen et al. 2008; Blüthgen & Klein 2011; Burkle & Alarcon 2011), our approach of analysing individual-level data offers solutions by giving us a way to extract the effects of changes in abundance and also to include environmental covariates, species attributes and the various sources of error in describing processes and patterns in ecological networks. The flexibility of hierarchical models lets the model be extended in various ways.
For example, detailed analyses of how individual-level attributes may impact network formation, and of why some individuals attract few interactions while others from the same species are more frequently involved in interactions, can be considered (Bolnick et al. 2003). We would expect that linking hierarchical models to network studies may well open new avenues of research in the near future. For example, complementary consumer species may respond to environmental changes in different ways with regard to their abundance and presence at sites as well as their attraction towards resources. Quantifying processes that determine consumer population dynamics linked to local abundances during times of interactions may then explain whether environmental changes act on species abundance or on attraction towards particular resources. Whether or not abundances of consumer species are influenced by each other (Mutshinda, O’Hara & Woiwod 2009) and whether such dynamics impact interactions with resources is one example of the multilateral and hierarchical interplay of species interactions that can only be quantified with models that incorporate some of the complex structure we expect to be at work.
In practice, a first step to implementing our approach is to accurately collect and store individual-level data whenever possible. If visits of pollinators to flowers are recorded in the field, for example, counts of visits can be recorded along
with individual flower characters and environmental conditions for every flower individual, including flowers not visited by any pollinator. Likewise, parasitic loads, for example, can be recorded for every host individual. If individuals are identifiable (e.g. distinguishable larger mammals in the field or marked animals in a laboratory experiment), their interactions with resources can be quantified on an individual-level basis. If detection of interactions is incomplete and differs among species, collecting additional information to quantify detection probability is desirable. We suggest that network databases such as the Interaction Web Database (http://www.nceas.ucsb.edu/interactionweb/) should encourage researchers to submit individual-level data along with details of their sampling protocols. Scientists may then more critically question sample size and data comparability in comparative ecological network studies. Ideally, however, studies should take advantage of all kinds of available information and discern biological parameters from sampling effects through quantitative methods.
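As a rough illustration of the data recording suggested above, the following sketch stores one record per observed flower, including flowers that were never visited, and only afterwards aggregates to a species-level matrix. The class `FlowerRecord`, its field names, the covariates and all values are hypothetical and are not taken from the paper or from any database schema.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FlowerRecord:
    flower_id: str
    plant_species: str
    corolla_depth_mm: float  # hypothetical individual-level covariate
    temperature_c: float     # hypothetical environmental covariate
    # pollinator species -> number of visits; stays empty for unvisited flowers
    visits: dict = field(default_factory=dict)


# Hypothetical field data; f2 and f4 were observed but never visited.
records = [
    FlowerRecord("f1", "plant_X", 4.2, 18.5, {"bee_A": 3, "fly_B": 1}),
    FlowerRecord("f2", "plant_X", 3.9, 18.5),
    FlowerRecord("f3", "plant_Y", 7.1, 21.0, {"bee_A": 2}),
    FlowerRecord("f4", "plant_Y", 6.8, 21.0),
]


def aggregate(records):
    """Return species-level visit totals and the number of observed flowers per plant species."""
    matrix = defaultdict(int)
    n_observed = defaultdict(int)
    for rec in records:
        n_observed[rec.plant_species] += 1  # unvisited flowers still count as effort
        for pollinator, count in rec.visits.items():
            matrix[(rec.plant_species, pollinator)] += count
    return dict(matrix), dict(n_observed)


matrix, n_observed = aggregate(records)

# Visits per observed flower, using all flowers (including zeros) as the denominator.
visits_per_flower = {
    pair: total / n_observed[pair[0]] for pair, total in matrix.items()
}
print(matrix)
print(n_observed)
print(visits_per_flower)
```

Keeping the per-flower records means the aggregated matrix can always be rebuilt, whereas the number of unvisited flowers and the covariates cannot be recovered from the matrix alone.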
Acknowledgements
This study was made possible by all those who dedicated much of their time and expertise to the freely available software of the BUGS and R projects, and we appreciate the efforts of all core team members. We thank anonymous reviewers for constructive feedback on earlier drafts. The Centre of Scientific Computing (CSC) of the Goethe University in Frankfurt provided access to computation facilities. Funding was provided by the ‘Landesoffensive zur Entwicklung wissenschaftlich-ökonomischer Exzellenz’ (LOEWE) of the state of Hesse in Germany through the Biodiversity and Climate Research Centre (BiK-F) in Frankfurt a. Main.
References
Abrams, P.A. (2001) Describing and quantifying interspecific interactions: a commentary on recent approaches. Oikos, 94, 209–218.
Annibale, A. & Coolen, A.C.C. (2011) What you see is not what you get: how sampling affects macroscopic features of biological networks. Interface Focus, 1, 836–856.
Bascompte, J., Jordano, P. & Olesen, J.M. (2006) Asymmetric coevolutionary networks facilitate biodiversity maintenance. Science, 312, 431–433.
Berlow, E.L., Neutel, A.-M., Cohen, J.E., de Ruiter, P.C., Ebenman, B.O., Emmerson, M., Fox, J.W., Jansen, V.A.A., Jones, J.I., Kokkoris, G.D., Logofet, D.O., McKane, A.J., Montoya, J.M. & Petchey, O. (2004) Interaction strengths in food webs: issues and opportunities. Journal of Animal Ecology, 73, 585–598.
Berlow, E.L., Dunne, J.A., Martinez, N.D., Stark, P.B., Williams, R.J. & Brose, U. (2009) Simple prediction of interaction strengths in complex food webs. Proceedings of the National Academy of Sciences of the United States of America, 106, 187–191.
Blüthgen, N. & Klein, A.-M. (2011) Functional complementarity and specialisation: the role of biodiversity in plant–pollinator interactions. Basic and Applied Ecology, 12, 282–291.
Blüthgen, N. & Menzel, F. (2006) Measuring specialization in species interaction networks. BMC Ecology, 6, 9.
Bolnick, D.I., Svanbäck, R., Fordyce, J.A., Yang, L.H., Davis, J.M., Hulsey, C.D. & Forister, M.L. (2003) The ecology of individuals: incidence and implications of individual specialization. American Naturalist, 161, 1–28.
Boulinier, T., Nichols, J.D., Sauer, J.R., Hines, J.E. & Pollock, K.H. (1998) Estimating species richness: the importance of heterogeneity in species detectability. Ecology, 79, 1018–1028.
Burkle, L.A. & Alarcon, R. (2011) The future of plant-pollinator diversity: understanding interaction networks across time, space, and global change. American Journal of Botany, 98, 528–538.
Carlin, B. & Chib, S. (1995) Bayesian model choice via Markov Chain Monte Carlo methods. Journal of the Royal Statistical Society, Series B, 57, 473–484.
Carnicer, J., Jordano, P. & Melian, C.J. (2009) The temporal dynamics of resource use by frugivorous birds: a network approach. Ecology, 90, 1958–1970.
Colwell, R.K. & Coddington, J.A. (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society of London B Biological Sciences, 345, 101–118.
Dennis, B., Ponciano, J.M. & Taper, M.L. (2010) Replicated sampling increases efficiency in monitoring biological populations. Ecology, 91, 610–620.
Farewell, V.T. & Sprott, D.A. (1988) The use of a mixture model in the analysis of count data. Biometrics, 44, 1191–1194.
Golicher, D.J., O’Hara, R.B., Ruíz-Montoya, L. & Cayuela, L. (2006) Lifting a veil on diversity: a Bayesian approach to fitting relative-abundance models. Ecological Applications, 16, 202–212.
Gotelli, N.J. & Colwell, R.K. (2001) Quantifying biodiversity: procedures and pitfalls in the measurement and comparison of species richness. Ecology Letters, 4, 379–391.
Grimm, V., Revilla, E., Berger, U., Jeltsch, F., Mooij, W.M., Railsback, S.F., Thulke, H.-H., Weiner, J., Wiegand, T. & DeAngelis, D.L. (2005) Pattern-oriented modeling of agent-based complex systems: lessons from ecology. Science, 310, 987–991.
Han, J.D.J., Dupuy, D., Bertin, N., Cusick, M.E. & Vidal, M. (2005) Effect of sampling on topology predictions of protein-protein interaction networks. Nature Biotechnology, 23, 839–844.
Ings, T.C., Montoya, J.M., Bascompte, J., Blüthgen, N., Brown, L., Dormann, C.F., Edwards, F., Figueroa, D., Jacob, U., Jones, J.I., Lauridsen, R.B., Ledger, M.E., Lewis, H.M., Olesen, J.M., van Veen, F.J.F., Warren, P.H. & Woodward, G. (2009) Ecological networks – beyond food webs. Journal of Animal Ecology, 78, 253–269.
Laska, M.S. & Wootton, J.T. (1998) Theoretical concepts and empirical approaches to measuring interaction strength. Ecology, 79, 461–476.
Lunn, D., Spiegelhalter, D., Thomas, A. & Best, N. (2009) The BUGS project: evolution, critique and future directions. Statistics in Medicine, 28, 3049–3067.
MacKenzie, D.I., Nichols, J.D., Royle, J.A., Pollock, K.H., Bailey, L.L. & Hines, J.E. (2006) Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence. Elsevier, Amsterdam.
McCullagh, P. & Nelder, J.A. (1989) Generalized Linear Models. Chapman and Hall, London, New York.
Mutshinda, C.M., O’Hara, R.B. & Woiwod, I.P. (2009) What drives community dynamics? Proceedings of the Royal Society B: Biological Sciences, 276, 2923–2929.
Nowicki, K. & Snijders, T.A.B. (2001) Estimation and prediction for stochastic blockstructures. Journal of the American Statistical Association, 96, 1077–1087.
O’Hara, R.B. (2009) How to make models add up – a primer on GLMMS. Annales Zoologici Fennici, 46, 124–137.
Olesen, J.M., Bascompte, J., Elberling, H. & Jordano, P. (2008) Temporal dynamics in a pollination network. Ecology, 89, 1573–1582.
Paine, R.T. (1992) Food-web analysis through field measurement of per capita interaction strength. Nature, 355, 73–75.
Rocchini, D., Hortal, J., Lengyel, S., Lobo, J.M., Jimenez-Valverde, A., Ricotta, C., Bacaro, G. & Chiarucci, A. (2011) Accounting for uncertainty when mapping species distributions: the need for maps of ignorance. Progress in Physical Geography, 35, 211–226.
Royle, J.A. (2004) N-mixture models for estimating population size from spatially replicated counts. Biometrics, 60, 108–115.
Strogatz, S.H. (2001) Exploring complex networks. Nature, 410, 268–276.
Stumpf, M.P.H., Wiuf, C. & May, R.M. (2005) Subnets of scale-free networks are not scale-free: sampling properties of networks. Proceedings of the National Academy of Sciences, 102, 4221–4224.
Vázquez, D.P., Melian, C.J., Williams, N.M., Blüthgen, N., Krasnov, B.R. & Poulin, R. (2007) Species abundance and asymmetric interaction strength in ecological networks. Oikos, 116, 1120–1127.
Ver Hoef, J.M. & Jansen, J.K. (2007) Space-time zero-inflated count models of harbor seals. Environmetrics, 18, 697–712.
Williams, R.J. & Martinez, N.D. (2000) Simple rules yield complex food webs. Nature, 404, 180–183.
Wootton, J.T. (1997) Estimates and tests of per capita interaction strength: diet, abundance, and impact of intertidally foraging birds. Ecological Monographs, 67, 45–64.
Wootton, J.T. & Emmerson, M. (2005) Measurement of interaction strength in nature. Annual Review of Ecology, Evolution, and Systematics, 36, 419–444.
Received 3 June 2012; accepted 29 August 2012
Handling Editor: Robert Freckleton
Supporting Information
Additional Supporting Information may be found in the online version of this article.
Appendix S1. Model code for BUGS software (e.g. OpenBUGS as freely available at http://openbugs.info/w/) for an agent-based interaction model.
As a service to our authors and readers, this journal provides supporting information supplied by the authors. Such materials may be re-organized for online delivery, but are not copy-edited or typeset. Technical support issues arising from supporting information (other than missing files) should be addressed to the authors.