
Journal of Vegetation Science 25 (2014) 55–65

Measuring and interpreting trait-based selection versus meta-community effects during local community assembly

Bill Shipley

Keywords: CATS; Community assembly; Dispersal; Maximum entropy models; Neutral theory; Niche theory; Traits

Received 6 November 2012; Accepted 18 February 2013
Co-ordinating Editor: Jason Fridley

Shipley, B. ([email protected]): Département de biologie, Université de Sherbrooke, Sherbrooke (Qc), Canada J1K 2R1

Abstract Questions: (1) How can one quantify the relative importance of meta-community processes related to immigration, local trait-based habitat filtering, and demographic stochasticity using the Community Assembly by Trait Selection (CATS) model in a general context? (2) How can this generalization be used to detect different strengths and directions in trait selection at the meta-community and local community levels? Methods: I describe a decomposition of the deviance between observed and predicted relative abundances based on a maximum entropy model including a meta-community prior (CATS) and generalize a previous decomposition of relative abundance using this model; corrections to avoid negative explained proportions of deviance are presented. Simulations of community assembly are used to explore its properties and elucidate its interpretation. In particular, this method quantifies the proportion of the total deviance between observed and predicted relative abundances attributed to: (1) pure trait-based local selection, (2) dispersal mass effect from the meta-community, (3) joint contributions of (1) and (2) that cannot be separated; and (4) residual deviance due to demographic stochasticity. Results and Conclusions: The previous decomposition, while giving correct values in that particular data set, requires modification in order to avoid nonsensical negative values. When the modifications described in this paper are made, the decomposition provides correct values. Furthermore, positive or negative values of the joint composition inform us of the importance and direction of correlations between local trait-based selection and processes occurring in the larger meta-community.

Introduction

This paper describes a method of estimating, from observational data, the relative importance of three types of cause acting during the process of local community assembly: (1) measured trait differences that confer different adaptive advantages in the local environment; (2) processes occurring in the larger landscape that affect the differential influx of propagules of different species into the local community; and (3) unidentified causes beyond the former two. If all relevant traits have been measured, then such unidentified causes can be ascribed to demographic stochasticity within the local community. If some relevant traits are missing, or if the processes occurring in the larger landscape have not been completely captured in the data, then component (3) of the decomposition only provides an upper bound on the importance of demographic stochasticity.

The importance of these different classes of explanations for community assembly has been hotly debated by ecologists for a long time (Gleason 1926; Clements 1936; Grime 1979; Tilman 1982; Hubbell 2001). Empirical tests based on species abundance distributions have not resolved the debate because many different processes can generate statistically indistinguishable patterns (McGill et al. 2006; Nekola & Brown 2007). Experimental studies linking variation in plant traits to variation in probabilities of survival and reproduction in the field (Harpole & Tilman 2006; Poorter et al. 2008) can show the existence of local niche-based processes but cannot quantify the importance of these processes relative to other causes.

Journal of Vegetation Science Doi: 10.1111/jvs.12077 © 2013 International Association for Vegetation Science


Decomposing effects in community assembly


This lack of quantification is problematic. Ecologists know that the key assumption of neutral models (equality of fitness across species) is not strictly true, and so the real question revolves around the relative importance of niche-based and other factors in determining community assembly and how this relative importance might vary in time and space. Gilbert & Lechowicz (2004) and Cottenie (2005) used the Borcard et al. (1992) method of decomposing abundance patterns between local communities using spatial information of environmental gradients to estimate the relative importance of dispersal and local niche differences, but without incorporating explicit information on functional traits. Shipley et al. (2012), in studying community assembly in a tropical tree community, introduced a statistical decomposition of the correlation between observed and predicted relative abundances based on Shipley’s maximum entropy CATS model (Community Assembly by Trait Selection) that aims to quantify the relative importance of these different classes of causes using explicitly measured traits. However, the method described in Shipley et al. (2012), although appropriate for that specific data set, can yield incorrect answers under certain conditions if it is not appropriately modified. Furthermore, the general interpretation of the statistical results in terms of the underlying biological processes was incomplete in the original publication. The current paper therefore has two goals. First, I describe a modification of the original decomposition that is applicable beyond the specific conditions of the data in Shipley et al. (2012). Second, I apply the method to data generated by different simulated scenarios so that better ecological interpretations of the results can be obtained.

Methods

Conceptual description of CATS

The process of community assembly that is conceptualized by the CATS model (Shipley et al. 2006, 2012; Shipley 2009, 2010a) is an extension of Keddy’s (1992) notion of trait-based community assembly. The model considers vegetation existing at two spatial scales (local and landscape). A ‘local’ plant community consists of those plants found in an area that is sufficiently small in spatial scale such that there are no pronounced environmental gradients occurring within it. For herbaceous vegetation, this might be less than a few square metres while for trees this might be less than a hectare. The meta-community consists of the ensemble of local communities in the landscape that can potentially exchange propagules. Vegetation at this larger spatial scale will experience different environmental conditions. The list of the S species that occur in the meta-community, i.e. that can potentially disperse into the local community and which can survive the abiotic conditions (but not necessarily the biotic conditions) of the local community, is the species ‘pool’ for this local community. Thus a local community is nested within the meta-community. A species can potentially be rare (or absent) from a local community while common in the meta-community. Similarly, a species can be rare (but not absent) from the meta-community but common in the local community.

To distinguish between those trait-based processes within the local community that confer different adaptive advantages in the local environment from other processes occurring in the larger landscape that affect the differential influx of propagules of different species into the local community, we first assume a model in which the per capita probabilities associated with all demographic rates (dispersal, germination, survival, reproduction) in a given local community are equal across species. Therefore, any differences in functional traits between species in this local community are independent of relative abundances. Given this model, the expected proportion of propagules of each species immigrating from the meta-community to the local community is equal to the relative abundance of that species in the meta-community (‘dispersal mass effect’). Since all subsequent demographic rates between species in the local community are equal, the expected relative abundance of each species in the local community is also equal to the meta-community relative abundance, and subsequent deviations from this expected relative abundance in each local community are due solely to random demographic stochasticity. This assumption is similar to those of neutral models (Bell 2000; Hubbell 2001), except that neutrality is not required in the larger meta-community.

Now assume a model in which some functional traits do affect dispersal ability and subsequent probabilities of germination, survival, growth and reproduction in the local community. Given this assumption, species having traits that increase or decrease their dispersal ability will increase or decrease in relative abundance in a local community relative to this neutral expectation, because more or fewer propagules of such species will arrive in the local community than expected given the abundance of the species in the meta-community. Once propagules reach the local community, then those individuals having better-adapted traits for the local environment will have higher probabilities of survival, growth and reproduction. Species possessing such better-adapted individuals therefore increase in relative abundance relative to the neutral expectation, while species possessing poorly-adapted individuals decrease in relative abundance relative to the neutral expectation. The compositional structure of this local community is therefore determined both by the influx of immigrants from the meta-community and by trait-based selection from the local environment. If all individuals have the same probabilities of survival, growth and reproduction in the local community (i.e. if




there is no trait-based local selection), and if all individuals in the meta-community have the same probabilities of immigration to the local community, then the structure of the local community will be the same as the structure of the meta-community plus random variation due to demographic stochasticity. In addition to demographic stochasticity, the only cause of local relative abundance of any given species is its abundance in the meta-community (i.e. dispersal mass effects). The vector of relative abundances for each of the S species in the meta-community is therefore called the ‘meta-community prior’ distribution; in previous publications this was called the ‘neutral’ prior (Shipley 2010a; Sonnier et al. 2010; Shipley et al. 2011, 2012). At the other extreme, if the probabilities of immigration, survival, growth and reproduction in the local community are entirely determined by trait-based selection, then the structure of the meta-community will be irrelevant to the local structure. This is because each species has, by definition, a non-zero probability of immigrating to the local community and of subsequently surviving the local abiotic conditions (otherwise it would not be part of the species pool). Once a species is present in the local community, then its subsequent survival, growth and reproduction are entirely determined by the traits of its individuals. Finally, local communities that are affected both by dispersal mass effects via the meta-community and by local trait-based selection will be located along a continuum between these two boundaries. The processes generating patterns in a local community and in the meta-community are partially overlapping but they are not the same. Explicitly modelling such processes at both scales and linking the two via immigration is very difficult without making unrealistic assumptions about the relative importance of these processes a priori, which is self-defeating when the purpose is to empirically measure them. 
Instead, the meta-community pattern is measured, not modelled, and is treated as prior information. I then model the local community by assuming (as endpoints along a continuum) that this prior information is either irrelevant (because local trait-based selection is dominant) or is the only relevant information available (because local trait-based selection is absent).

Mathematical description of CATS

This conceptual model is translated into the quantitative CATS model based on the maximum entropy formalism of Jaynes (2003), as described in detail in Shipley (2010a). Here, I briefly outline its main points. The CATS model has three inputs: (i) a trait matrix, $T = \{t_{ij}\}$, of the $j = 1, \ldots, n$ chosen functional traits of each of the $i = 1, \ldots, S$ species known to occur in the species pool of the meta-community (these trait values can either be species averages or measured directly in the local community), (ii) a vector of $n$ community-weighted trait values, $\bar{t} = \{\bar{t}_1, \ldots, \bar{t}_n\}$, $\bar{t}_j = \sum_{i=1}^{S} o_i t_{ij}$, estimating the trait values of an average individual in the local community, and (iii) a prior probability distribution, $q = \{q_1, \ldots, q_S\}$, specifying the hypothesized contribution of the meta-community in determining the structure of the local community. Here, $o_i$ is the observed relative abundance of species $i$ in the local community.

Based only on these three inputs, the CATS model predicts the relative abundances, $p = \{p_1, \ldots, p_S\}$, in the local community of each of the S species in the species pool. This is done, following the maximum entropy formalism (Jaynes 2003), by choosing the unique vector of relative abundances that maximizes the relative entropy (Eq. 1) subject to the known constraints (Eq. 2) that are given by the community-weighted trait means and the normalization constraint; i.e. the predicted relative abundances must reproduce the observed community-weighted trait means, $\sum_{i=1}^{S} p_i t_{ij} = \bar{t}_j$. The solution is a generalized exponential distribution (Eq. 3) in which the $\lambda$ parameters measure the amount by which a unit increase in a trait value is associated with a proportional change in the relative abundance of the species in the local community when all other trait values are constant (Sonnier et al. 2011; Appendix 1). The values of $q_i$ are the values of the meta-community prior for species $i$. When $q_i = 1/S$ (a uniform prior) we are assuming that meta-community effects are absent, and when $q_i$ is the measured relative abundance of species $i$ in the meta-community then we are assuming that the meta-community effects are present. Note also that when all $\lambda_j$ are zero then there is no trait-based selection and $p_i = q_i$. Choosing the $\lambda$ values (Lagrange multipliers) in Eq. 3 that maximize the relative entropy (Eq. 1) subject to the constraints (Eq. 2) is formally equivalent to choosing the $\lambda$ values that maximize the likelihood of Eq. 3 given the observed relative abundances and a multinomial error distribution (Shipley et al. 2012; Appendix 1), and so it is a type of nonlinear regression.

$$RE = -\sum_{i=1}^{S} p_i \ln \frac{p_i}{q_i} \tag{1}$$

$$\bar{t}_j = \sum_{i=1}^{S} o_i t_{ij}; \qquad \sum_{i=1}^{S} p_i = 1 \tag{2}$$

$$p_i = \frac{q_i \, e^{\sum_{j=1}^{n} \lambda_j t_{ij}}}{\sum_{i=1}^{S} q_i \, e^{\sum_{j=1}^{n} \lambda_j t_{ij}}} \tag{3}$$
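As an illustration of the fitting step behind Eq. 3, the following is a minimal Python sketch for the single-trait case. It is not the routine used by the FD library; as a simplifying assumption it finds the Lagrange multiplier (called `lam` here) by bisection, relying on the fact that the predicted community-weighted trait mean increases monotonically with the multiplier. The function names, the search bounds and the example abundances are hypothetical choices made for illustration only.

```python
import math


def cats_predict(q, t, lam):
    """Eq. 3 for one trait: p_i is proportional to q_i * exp(lam * t_i)."""
    exponents = [lam * ti for ti in t]
    shift = max(exponents)  # subtract the largest exponent to avoid overflow
    weights = [qi * math.exp(e - shift) for qi, e in zip(q, exponents)]
    total = sum(weights)
    return [wi / total for wi in weights]


def community_weighted_mean(abund, t):
    """Abundance-weighted mean trait value."""
    return sum(ai * ti for ai, ti in zip(abund, t))


def fit_lambda(o, q, t, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for the multiplier that reproduces the observed trait mean.

    ASSUMPTION: the solution lies inside [lo, hi] (illustrative bounds).
    """
    target = community_weighted_mean(o, t)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if community_weighted_mean(cats_predict(q, t, mid), t) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: five species, trait t_i = i, uniform prior.
t = [1.0, 2.0, 3.0, 4.0, 5.0]
q = [0.2] * 5
o = [0.05, 0.10, 0.15, 0.30, 0.40]
lam = fit_lambda(o, q, t)
p = cats_predict(q, t, lam)
```

With several traits the same constraint-matching idea applies, but a multivariate optimizer would be needed in place of bisection.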

The classical model $R^2$, which is applicable to a multiple linear regression with a normal error structure, measures the proportional reduction in the model sum of squares (the model deviance given a normal error structure) due to the chosen regressors relative to an ‘intercept-only’ baseline model. This intercept-only baseline model is simply the mean value of the dependent variable: $y_i = \left(\sum y_i / N\right) + e_i$; $\bar{y} = \sum y_i / N$. In the context of Eq. 3 the proportion of the total deviance encoded in the observed relative abundances ($o_i$) that is accounted for by the model is measured by the Kullback–Leibler index ($R^2_{KL}$, Eq. 4), which is a generalization of the classic $R^2$ index for maximum likelihood estimation of a non-linear regression with a multinomial error structure (Cameron & Windmeijer 1997), which is formally equivalent to the maximum entropy solution in our case (Shipley et al. 2012, supplement). In the context of Eq. 3, the equivalent baseline model is $o_i = \sum o_i / S + e_i = 1/S + e_i = q_{0,i} + e_i$, where $q_0$ is the maximally uninformative uniform prior. This baseline model is the expectation when there is no contribution from either the meta-community or from trait effects.

The Kullback–Leibler index involves the ratio of two Kullback–Leibler divergences. The Kullback–Leibler divergence, $D_{KL}(o \| m) = \sum o_i \ln (o_i / m_i)$, measures the amount of information lost when approximating the observed distribution of relative abundances, $o$, by another distribution $m$ that has been obtained from some model; the larger the value of $D_{KL}(o \| m)$, the more poorly $m$ approximates $o$. Since $q_0$ (the uniform distribution) is the prior distribution that encodes only the maximally uninformative information, and which allocates abundances to one of S mutually exclusive and unordered states (i.e. species in the species pool), any model producing $p$ that predicts $o$ better than does $q_0$ will yield $R^2_{KL} > 0$; in this case the model is based on some correct information. A model producing $p$ that perfectly predicts $o$ will yield $R^2_{KL} = 1$; in this case the model is based not only on correct, but also complete information. If $R^2_{KL} < 0$ then this means that the model producing $p$ predicts $o$ even worse than one using the minimum amount of true information (i.e. $q_0$); in this case the model is actually based on false information. The inclusion of regressors (i.e. the species traits in our case) can only improve the model fit relative to the baseline model (non-significantly so if the predictors are actually independent of the response variable), but never decrease it. However, the inclusion of priors other than $q_0$ can decrease model fit relative to the baseline model if the information encoded in such priors is incorrect.

$$R^2_{KL} = 1 - \frac{\sum_{i=1}^{S} o_i \ln \dfrac{o_i}{p_i}}{\sum_{i=1}^{S} o_i \ln \dfrac{o_i}{q_{0,i}}} \tag{4}$$
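The behaviour of the Kullback–Leibler index of Eq. 4 at its reference points can be checked with a short Python sketch. The function names and the example abundances are hypothetical; the sketch assumes the usual convention that species with zero observed abundance contribute nothing to the divergence, and that the model assigns a positive value to every species that was observed.

```python
import math


def kl_divergence(o, m):
    """D_KL(o || m) = sum of o_i * ln(o_i / m_i); terms with o_i = 0 count as 0."""
    return sum(oi * math.log(oi / mi) for oi, mi in zip(o, m) if oi > 0)


def r2_kl(o, p):
    """Eq. 4: fit of p to o relative to the uniform baseline q_0 = 1/S.

    Undefined (division by zero) if o itself is exactly uniform.
    """
    S = len(o)
    q0 = [1.0 / S] * S
    return 1.0 - kl_divergence(o, p) / kl_divergence(o, q0)


# Hypothetical observed relative abundances for four species.
o_example = [0.5, 0.3, 0.15, 0.05]
perfect = r2_kl(o_example, o_example)         # prediction equals observation
baseline = r2_kl(o_example, [0.25] * 4)       # prediction equals uniform prior
misleading = r2_kl(o_example, o_example[::-1])  # abundances assigned in reverse order
```

Here `perfect` equals 1, `baseline` equals 0, and `misleading` is negative, matching the three cases described in the text.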

The decomposition of causes requires the fitting of data to the CATS model four times given different assumptions (Fig. 1); this can be done via the maxent and maxent.test functions of the FD library of R (R Foundation for Statistical Computing, Vienna, AT). The first model involves specifying a maximally uninformative (i.e. uniform) prior distribution, i.e. $q_i = 1/S$ for Eq. 3, and random permutation of the trait vectors among species. This forces the traits to be independent of the observed relative abundances due to the random permutations, while also ignoring any contribution from the meta-community. The distribution of values of the resulting $R^2_{KL}$ statistic is obtained from many independent runs of the permuted trait vectors, and its mean, $\bar{R}^2_{KL}(u)$, is an estimate of the average value of fit under this null hypothesis. This null distribution is used in the inferential test of significance of traits as described in Shipley (2010b). Since this estimate contains the minimum

Fig. 1. Graphical relationships between the four alternative model fits required for the decomposition. The four fits shown are: (1) $R^2_{KL}(u)$, uniform prior + randomly permuted traits (model bias); (2) $R^2_{KL}(n)$, neutral prior + randomly permuted traits; (3) $R^2_{KL}(u,t)$, uniform prior + observed traits; (4) $R^2_{KL}(n,t)$, neutral prior + observed traits. The differences linking them are: $\Delta R^2_{KL}(t|\varphi) = R^2_{KL}(u,t) - R^2_{KL}(u)$; $\Delta R^2_{KL}(n|\varphi) = R^2_{KL}(n) - R^2_{KL}(u)$; $\Delta R^2_{KL}(n|t) = R^2_{KL}(n,t) - R^2_{KL}(u,t)$; $\Delta R^2_{KL}(t|n) = R^2_{KL}(n,t) - R^2_{KL}(n)$.


possible information from the prior, and no information from traits, it measures the fit due solely to model bias in the same way that the expected value of the classic model $R^2$ under the null hypothesis is used to correct for model bias in a regression context. In the context of a classical multiple linear regression, the expected value of this model bias is known and is a function of the model degrees of freedom (df), and thus the number of predictor variables, relative to the residual df, and thus the total number of observations (Fisher 1925a). In the more general context of this paper, the analytic formula is not known and so it is estimated by permutation methods.

The second model again involves specifying a uniform prior but now uses the observed trait vectors. One then calculates the proportion of the total deviance explained by this model, $R^2_{KL}(u,t)$, which estimates the total explanatory value of the traits when ignoring any meta-community effects. Note that if the set of measured traits is truly independent of the observed relative abundances in the local community (i.e. if there is no local trait-based selection) then $R^2_{KL}(u,t)$ will come from the permutation distribution described above. If this is the case, then $R^2_{KL}(u,t)$ could be less than the expected value of this permutation distribution, which is estimated by $\bar{R}^2_{KL}(u)$, due only to chance sampling variation. The possibility that $R^2_{KL}(u,t) < \bar{R}^2_{KL}(u)$ was not considered in Shipley et al. (2012) because the permutation test in that publication ruled it out, but this possibility must be accommodated in a general method. We therefore modify the original definition to be $R^2_{KL}(u,t) = \max\left(R^2_{KL}(u,t), \bar{R}^2_{KL}(u)\right)$.

The third model involves specifying the meta-community prior but again randomly permuting the trait vectors between species, as in the first model, in order to measure the degree to which the meta-community abundance structure resembles the local abundance structure. This involves fitting a model to Eq. 3 in which $q$ is the measured vector of meta-community relative abundances, but the traits are forced to be independent of the observed relative abundances due to the random permutations. When fitted using many independent runs of the permuted trait vectors and averaging, one obtains $\bar{R}^2_{KL}(m)$. However, this step, as originally described in Shipley et al. (2012), can also result in negative values of $R^2_{KL}$ if not properly modified. Negative values of $R^2_{KL}$ occur when the traits are, in fact, associated with the relative abundances but the association runs in opposite directions in the local and meta-communities. By permuting the traits relative to the observed local relative abundances one is breaking any association between traits and local relative abundances. However, such permutations do not break any association between traits and meta-community relative abundances. Negative values of $R^2_{KL}$ therefore occur because certain trait values cause species to have higher than average relative abundances in the meta-community, but these same trait values result in lower than average relative abundances in the local community. To avoid this nonsensical result we therefore modify our definition to $\bar{R}^2_{KL}(m) = \max\left(\bar{R}^2_{KL}(m), \bar{R}^2_{KL}(u)\right)$.

The final model involves using both the meta-community prior and the observed trait vectors. This includes the contribution both from the meta-community prior and from the traits. Fitting this model yields $R^2_{KL}(m,t)$. Again, contributions from the meta-community are either irrelevant, given the traits, or else they improve the fit. Because of this, I define $R^2_{KL}(m,t) = \max\left(R^2_{KL}(u,t), R^2_{KL}(m,t)\right)$.

Given these four steps, we have two measures each of the contributions of the traits and dispersal mass effects via the meta-community. The increase in the explained deviance due to traits can be measured either by $\Delta R^2_{KL}(t|u) = R^2_{KL}(u,t) - \bar{R}^2_{KL}(u)$ or by $\Delta R^2_{KL}(t|m) = R^2_{KL}(m,t) - \bar{R}^2_{KL}(m)$. The first relation measures the increase in the explained deviance due to traits beyond that due solely to model bias, while the second relation measures the increase in explained deviance due to traits beyond that due to contributions (if any) made by the meta-community. This model bias is exactly equivalent to the model bias in the classic model $R^2$ statistic of linear regression, in which the expected value of the classic model $R^2$ is greater than zero even given independence between the dependent and predictor variables in any finite sample (Fisher 1925a,b). The increase in explained deviance due to dispersal mass effects via the meta-community can be measured by either $\Delta R^2_{KL}(m|u) = \bar{R}^2_{KL}(m) - \bar{R}^2_{KL}(u)$ or $\Delta R^2_{KL}(m|t) = R^2_{KL}(m,t) - R^2_{KL}(u,t)$. The first relation measures the increase in the explained deviance (if any) due to the meta-community beyond that due to model bias, while the second measures the increase in the explained deviance due to the meta-community, given the traits, relative to the explained deviance due only to the traits.

A decomposition of the total deviance of a model consists of expressing this total deviance as the sum of a series of deviances due to mutually exclusive sources, as is the case in an ANOVA. There are different ways of performing such a decomposition. Here I consider a decomposition in which neither trait-based selection nor meta-community processes are assumed primary a priori. Note that the unexplained deviance of the most complete model is $1 - R^2_{KL}(m,t)$. Since Shipley et al. (2012) did not provide proofs for the equations that follow, these are given in electronic Appendix S1.

Decomposition: $1 = \Delta R^2_{KL}(t|m) + \Delta R^2_{KL}(m|t) + \Delta R^2_{KL}(t+m) + \text{bias} + \text{unexplained}$.

This decomposition has five components. The first component, $\Delta R^2_{KL}(t|m)$, measures the added deviance due to local trait-based selection beyond that due to the meta-community dispersal mass effects (i.e.



the meta-community prior). The second component, $\Delta R^2_{KL}(m|t)$, measures the added deviance due to meta-community dispersal mass effects (i.e. the meta-community prior) beyond that related to the effects attributed to local trait-based selection. The third component, $\Delta R^2_{KL}(t+m)$, measures the joint contribution due to correlations between the two. This joint contribution can be equivalently expressed as $\Delta R^2_{KL}(t+m) = \Delta R^2_{KL}(m|u) - \Delta R^2_{KL}(m|t) = \Delta R^2_{KL}(t|u) - \Delta R^2_{KL}(t|m)$. The total deviance due to local trait-based selection is the added effect due to trait-based selection plus some unknown proportion of the joint effect. The total deviance due to meta-community effects is the added effect due to meta-community effects plus some unknown proportion of the joint effect. Because the amount of model bias, $\bar{R}^2_{KL}(u)$, will vary between studies depending on the number of species in the species pool and the number of traits used, it is preferable to standardize these values by dividing each by the biologically relevant deviance, $1 - \bar{R}^2_{KL}(u)$. Thus:

1. Pure trait effects: $\dfrac{\Delta R^2_{KL}(t|m)}{1 - \bar{R}^2_{KL}(u)}$ (5)

2. Pure meta-community effects: $\dfrac{\Delta R^2_{KL}(m|t)}{1 - \bar{R}^2_{KL}(u)}$ (6)

3. Joint effects: $\dfrac{\Delta R^2_{KL}(t+m)}{1 - \bar{R}^2_{KL}(u)}$ (7)

4. Unexplained deviance: $\dfrac{1 - R^2_{KL}(m,t)}{1 - \bar{R}^2_{KL}(u)}$ (8)
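The bookkeeping of Eqs 5–8, including the three max() corrections that prevent negative explained proportions, can be sketched in Python as follows. The function name, argument names and the numerical inputs are hypothetical and are not results from this paper; the inputs are assumed to be the two permutation means and the two fits with observed traits, obtained beforehand from four CATS fits.

```python
def decompose(r2_u, r2_ut, r2_m, r2_mt):
    """Standardized decomposition (Eqs 5-8).

    r2_u  : mean R2_KL, uniform prior + permuted traits (model bias)
    r2_ut : R2_KL, uniform prior + observed traits
    r2_m  : mean R2_KL, meta-community prior + permuted traits
    r2_mt : R2_KL, meta-community prior + observed traits
    """
    # Corrections described in the text: no fit may fall below its reference fit.
    r2_ut = max(r2_ut, r2_u)
    r2_m = max(r2_m, r2_u)
    r2_mt = max(r2_ut, r2_mt)

    relevant = 1.0 - r2_u  # biologically relevant deviance
    return {
        "pure_traits": (r2_mt - r2_m) / relevant,                  # Eq. 5
        "pure_meta": (r2_mt - r2_ut) / relevant,                   # Eq. 6
        "joint": ((r2_m - r2_u) - (r2_mt - r2_ut)) / relevant,     # Eq. 7
        "unexplained": (1.0 - r2_mt) / relevant,                   # Eq. 8
    }


# Hypothetical inputs: the meta-community prior alone fits worse than model bias,
# so its permutation mean is raised to r2_u before the decomposition.
parts = decompose(r2_u=0.1, r2_ut=0.5, r2_m=-0.2, r2_mt=0.7)
total = sum(parts.values())
```

The four standardized components always sum to 1 by construction, while the joint component may be negative, as it is for these illustrative inputs.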

$$q_i = \frac{e^{w x_i}}{\sum_{i=1}^{S=10} e^{w x_i}} \tag{9}$$

Simulations: The simulations generate distributions of relative abundance both in the meta-community and in the local community, and also specify the link between the two scales. Since all real relative abundance distributions are strongly uneven, with a few dominant species and many subdominants, this pattern is respected in the simulations. Furthermore, I keep these simulations as simple as possible in order to facilitate clarity in the underlying patterns. The relative abundances for each of ten species (the species pool) in the meta-community are generated using Eq. 9. In this equation the variable xi represents the value of some property (x) determining the relative abundance (qi) of species i in the larger meta-community, and w (set to 0.3 in these simulations) is a weight representing by how much a unit change in x would change relative abundance. Note that if w is zero then the meta-community prior reverts to a uniform distribution. Positive values of w mean that species having more of property x will have higher relative abundance, while negative values of w mean that species having more of property x will have


lower relative abundance. This property (x) represents some factor acting in the larger landscape and potentially (if w ≠ 0) causing different species to have different abundances in this larger landscape. For instance, it could be the same trait as modelled in the local community, it could be an unmeasured trait, it could represent the preference of each species by humans in the past, resulting in different relative abundances of these species in the landscape, or it could be any other cause generating differences in relative abundance in the meta-community, including purely neutral processes such as random speciation events (Hubbell 2001). Because x could potentially be a trait, the meta-community assembly is not necessarily neutral, which is why I qualify the neutral assumption as ‘local’ neutrality. Whatever the nature of this property, it could be independent of the causal factors determining relative abundance in the local community or it could be correlated with these local causal factors. In empirical studies the meta-community prior would come from the estimated relative abundances of each species measured at the meta-community level. For instance, if many local communities have been sampled, then the meta-community relative abundances would come from the pooled abundances of each species over all local communities.

The relative abundances ($o_i$) of each of ten species in the local community are generated using Eq. 10. For simplicity, I use only a single trait. This is the same as Eq. 3 plus a random value ($u_i$) representing demographic stochasticity, which is generated as a uniform random value between 1 and 10, and $a$ is the weight (set to 0.2 in these simulations) associated with $u$; larger values of $a$ result in more random variation in the local relative abundances around the value predicted by the CATS model. As previously stated, $t_i$ is the trait value of species $i$ (here, $t_i = i$) and $\lambda$ is a weight (the Lagrange multiplier of Eq. 3) measuring by how much a unit increase in the trait will change the proportional relative abundance of species in the local community. If $\lambda$ is zero, then there is no trait-based selection in the local community and observed relative abundances are entirely determined by the meta-community prior and the random component.

$$o_i = \frac{q_i \, e^{\lambda t_i} e^{a u_i}}{\sum_{i=1}^{S=10} q_i \, e^{\lambda t_i} e^{a u_i}} = \frac{e^{w x_i + \lambda t_i + a u_i}}{\sum_{i=1}^{S=10} e^{w x_i + \lambda t_i + a u_i}} \tag{10}$$
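A single simulated community following Eqs 9 and 10 might be generated as in the Python sketch below (the paper's own simulations were written in R). The function names are hypothetical. The way the landscape property x is drawn when it is independent of the trait is an assumption made here for illustration (a random permutation of the values 1 to S); the correlated case uses x = t + N(0,1), as in the paper's simulations.

```python
import math
import random


def normalise(weights):
    """Scale positive weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def simulate_community(lam=0.3, w=0.3, a=0.2, S=10, correlated=False, rng=None):
    """Return (t, x, q, o) for one simulated local community."""
    rng = rng or random.Random()
    t = [float(i) for i in range(1, S + 1)]            # trait values, t_i = i
    if correlated:
        x = [ti + rng.gauss(0.0, 1.0) for ti in t]     # x = t + N(0, 1)
    else:
        x = rng.sample(t, S)                           # ASSUMPTION: random permutation of 1..S
    q = normalise([math.exp(w * xi) for xi in x])      # Eq. 9: meta-community prior
    u = [rng.uniform(1.0, 10.0) for _ in range(S)]     # demographic stochasticity
    o = normalise([qi * math.exp(lam * ti + a * ui)    # Eq. 10: local abundances
                   for qi, ti, ui in zip(q, t, u)])
    return t, x, q, o


# One run with the default weights and a fixed seed for reproducibility.
t, x, q, o = simulate_community(rng=random.Random(1))
```

Setting `lam`, `w` and `a` to zero removes trait selection, meta-community structure and stochasticity in turn, which is a convenient way to check each term of Eq. 10 in isolation.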

The link between local and meta-community relative abundances is made in two ways. First, the local relative abundances are partly determined by the meta-community relative abundances (q), as specified by Eq. 10.




Second, one can introduce an indirect link between the meta-community and local relative abundances by allowing a correlation between the landscape property (x) and the trait values (t). To produce a strong level of correlation between x and t, I let x = t + N(0,1), where N(0,1) is a random value drawn from a normal distribution with a mean of zero and an SD of 1. The simulations were done using the R language. The $\lambda$ parameters (Lagrange multipliers) of Eq. 3 were estimated by entropy maximization using the Improved Iterative Scaling algorithm (Della Pietra et al. 1997) as implemented in the maxent function of the FD library in R. The values of $\bar{R}^2_{KL}(u)$ and $\bar{R}^2_{KL}(m)$ in each simulation run were estimated from 500 independent permutations using the maxent.test function of the FD library. For each case, I ran 50 independent simulations. Appendix S2 provides a worked example of a simulation run and Appendix S3 provides the R script for the simulation. Appendices S4 and S5 provide the R script for the maxent and maxent.test functions.
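The permutation estimate of the mean fit under the null hypothesis can be illustrated with the following self-contained Python sketch for a single trait. It does not reproduce the maxent.test function of the FD library: the bisection fit, the function names and the example data are assumptions made for illustration. Passing a uniform prior gives an estimate of the model bias; passing the meta-community relative abundances as the prior gives the corresponding permutation mean for the meta-community model.

```python
import math
import random


def predict(q, t, lam):
    """Eq. 3 for one trait, computed with a stabilising shift of the exponent."""
    e = [lam * ti for ti in t]
    m = max(e)
    w = [qi * math.exp(ei - m) for qi, ei in zip(q, e)]
    z = sum(w)
    return [wi / z for wi in w]


def fit(o, q, t, lo=-50.0, hi=50.0):
    """Bisection on the multiplier until the predicted trait mean matches the observed one."""
    target = sum(oi * ti for oi, ti in zip(o, t))
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        p = predict(q, t, mid)
        if sum(pi * ti for pi, ti in zip(p, t)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r2_kl(o, p):
    """Eq. 4 with a uniform baseline."""
    def kl(m):
        return sum(oi * math.log(oi / mi) for oi, mi in zip(o, m) if oi > 0)
    return 1.0 - kl(p) / kl([1.0 / len(o)] * len(o))


def mean_permuted_r2(o, q, t, n_perm=500, rng=None):
    """Average R2_KL over random permutations of the trait values among species."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(n_perm):
        tp = list(t)
        rng.shuffle(tp)  # breaks any trait/local-abundance association
        total += r2_kl(o, predict(q, tp, fit(o, q, tp)))
    return total / n_perm


# Hypothetical data: five species, trait t_i = i, uniform prior.
o = [0.05, 0.10, 0.15, 0.30, 0.40]
t = [1.0, 2.0, 3.0, 4.0, 5.0]
uniform = [0.2] * 5
bias = mean_permuted_r2(o, uniform, t, n_perm=50, rng=random.Random(0))
observed = r2_kl(o, predict(uniform, t, fit(o, uniform, t)))
```

Comparing `observed` with `bias` mirrors the first step of the decomposition, in which the fit with the observed traits is judged against the fit expected from model bias alone.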

Results

The first scenario (A) models the case in which there is no local trait-based selection (k = 0) and no correlation between the trait values and the property (x) generating the differences in relative abundance in the meta-community. Table 1 lists the decomposition of $R^2_{KL}$ after correcting for model bias. Figure 2a shows the result of a representative simulation. There is no correlation between the trait values and the relative abundances, either in the meta-community or in the local community. However, there is a positive correlation (mean r = 0.78) between the relative abundances at the two spatial levels. One per cent of the total deviance in the local relative abundances is attributed solely to the trait, 67% is attributed to meta-community effects and 2% is attributed jointly to trait/meta-community effects, while the remaining 30% is unexplained.

Table 1. Decomposition of the deviance in local relative abundances as a proportion of the total biologically relevant deviance.

Scenario   Contribution by traits,    Contribution by meta-community,   Joint contributions   Unexplained
           given meta-community       given traits
A          0.0128                     0.6678                             0.0219               0.2977
B          0.4348                     0.3244                             0.0296               0.2111
C          0.4256                     0.3354                             0.0518               0.1870
D          0.1527                     0.0500                             0.7194               0.0779
E          0.3165                     0.2385                            −0.2386               0.6869

The second scenario (B) adds local trait-based selection in which individuals with larger values of the trait are selectively advantaged (k = 0.3). There is still no correlation between trait values and the meta-community property (x). The decomposition (Table 1) now attributes 43% of the total deviance to trait selection, 32% to meta-community effects, and only 3% to joint trait/meta-community effects (Fig. 2b). Although the local and meta-community relative abundances are still positively correlated (mean r = 0.82), species with low trait values have lower local relative abundances than they do in the meta-community, while species with larger trait values have higher local relative abundances than they do in the meta-community. While the trait values are uncorrelated with meta-community relative abundances, they are moderately positively correlated with local relative abundances.

The third scenario (C) is the same as the second except that individuals with larger values of the trait are selectively disadvantaged (k = −0.3). We see the opposite trend when comparing Fig. 2c with Fig. 2b. The decomposition of the deviance is essentially the same as in the second scenario, with 43% being attributed to local trait selection, 34% to the meta-community and almost none (5%) to joint effects.

The fourth scenario (D) simulates the case in which there is positive trait selection in the local community (k = 0.3) and there is also a positive correlation (r = 0.95) between the trait and the meta-community property (x); i.e. larger trait values increase relative abundance both in the local community and in the meta-community. This is seen in Fig. 2d, where both the local and meta-community relative abundances are now positively correlated with the trait values, but where species with lower than average trait values have even lower local relative abundances than in the meta-community, while the opposite occurs in those species having larger than average trait values.
The decomposition (Table 1) attributes little (15%) of the total deviance only to local trait selection, and even less (5%) only to the meta-community, but a large percentage (72%) of the deviance can be attributed jointly to trait/meta-community effects. This occurs because the trait actually causes most of the variation in the meta-community property (x). Because selection is acting in the same direction in both the local and meta-communities, only 8% of the biologically relevant deviance is unexplained.

Fig. 2. Simulation results showing the relationship between relative abundance in the meta-community and in a local community under five different scenarios; note the logarithmic scale of abundances. (a) No local trait-based selection and independence of trait values and meta-community process. (b) Positive local trait-based selection and independence of trait values and meta-community process. (c) Negative local trait-based selection and independence of trait values and meta-community process. (d) Positive local trait-based selection and positive correlation between trait values and meta-community process. (e) Negative local trait-based selection and positive correlation between trait values and meta-community process. In each panel, relative abundance (y-axis) is plotted against trait value (x-axis), and the legend distinguishes the meta-community from the local community.

The final scenario (E) simulates the case in which there is negative trait selection in the local community (k = −0.3) but a positive correlation (r = 0.96) between the trait and the meta-community property (x), meaning that there is positive selection of individuals in the meta-community. The only difference between this scenario and the previous one is that selection at the local level acts in the opposite direction to that occurring in the meta-community. However, the result is quite different (Fig. 2e). Although there is still a positive correlation between the trait value and the meta-community relative abundances, there is none in the local community; also, the correlation between the relative abundances at the two spatial scales is zero. Nonetheless, negative trait selection is revealed by the fact that the local relative abundances of the species with the smallest trait values are higher than those in the meta-community, while the opposite pattern is found in the species with the largest trait values. The decomposition assigns 32% of the deviance to traits, 24% to the meta-community and −24% to the joint trait/meta-community effect. As a consequence, the total deviance assigned to traits is 32 − 24 = 8% (i.e. assuming that all of the joint effect is really due to traits), and 24 − 24 = 0% of the total deviance is assigned to the meta-community (i.e. assuming that all of the joint effect is really due to the meta-community).

Table 1 of the Supplementary Information lists the mean and SE of the estimated values of the various statistics for each scenario.
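The bookkeeping behind these percentages can be checked directly from Table 1. The Python sketch below re-enters the table values, with the joint contribution of scenario E entered as negative because the text reports it as −24%. It verifies that each row sums to approximately one, then computes the totals obtained when the joint component is assigned entirely to traits or entirely to the meta-community. The names used are illustrative only.

```python
# Table 1 values per scenario:
# (traits given meta-community, meta-community given traits, joint, unexplained)
TABLE_1 = {
    "A": (0.0128, 0.6678, 0.0219, 0.2977),
    "B": (0.4348, 0.3244, 0.0296, 0.2111),
    "C": (0.4256, 0.3354, 0.0518, 0.1870),
    "D": (0.1527, 0.0500, 0.7194, 0.0779),
    "E": (0.3165, 0.2385, -0.2386, 0.6869),  # joint is negative, as stated in the text
}


def summarise(row):
    """Return the row sum and the totals when the joint part is given to one side."""
    traits, meta, joint, unexplained = row
    return {
        "row_sum": traits + meta + joint + unexplained,
        "traits_total": traits + joint,  # all of the joint effect credited to traits
        "meta_total": meta + joint,      # all of the joint effect credited to the meta-community
    }


summary = {scenario: summarise(row) for scenario, row in TABLE_1.items()}

for scenario, s in summary.items():
    print(
        f"{scenario}: sum = {s['row_sum']:.4f}, "
        f"traits + joint = {s['traits_total']:.4f}, "
        f"meta + joint = {s['meta_total']:.4f}"
    )
```

For scenario E this reproduces the 8% and 0% totals quoted above, and it shows that the four components sum to one only when the joint term is taken as negative.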

Discussion

Shipley et al. (2012) studied the relative contributions of trait-based selection and immigration from the meta-community in a species-rich tropical forest in French Guiana. To do this, they proposed a decomposition of the total deviance in the local relative abundances similar to that presented in this paper. Indeed, the results for that particular data set are identical to those obtained using the modifications in this paper. However, other researchers (D. Xing, G. Sonnier, pers. comm.) have found nonsensical results when using the original method, such as negative values of the Kullback–Leibler index when measuring the fit obtained when using a meta-community prior and permuted traits ($\bar{R}^2_{KL}(m)$), and seemingly counter-intuitive results, such as negative joint contributions of the traits and the meta-community. The modifications presented here correct the nonsensical values of $\bar{R}^2_{KL}(m)$, and the simulation results provide guidance in biologically interpreting the joint contribution in the decomposition.

The meta-community prior estimates the expected contribution of the meta-community to the determination of local relative abundances through dispersal mass effects. One estimate of this effect is $\bar{R}^2_{KL}(m)$, which is obtained by including the actual meta-community relative abundances but randomly permuting the observed local trait values. The nonsensical negative value of $\bar{R}^2_{KL}(m)$ arises in the special case where there is selection of the trait both in the local community and in the meta-community but the direction of selection is opposite at the two levels. This was the case in simulation E; smaller values of the trait were selectively favoured in the local community, but the selective pressure favoured positive values of the meta-community property (x), which was positively correlated with the trait. By permuting the trait values relative to the local relative abundances, one removes any information about local trait-based filtering. However, because the actual relative abundances in the meta-community (i.e. the meta-community prior) are negatively correlated with the actual local relative abundances (because trait selection is acting in different directions at the two levels), the fitted values are now negatively correlated with the actual relative abundances. Thus, the fit is actually worse than that obtained when ignoring both trait selection and the meta-community relative abundances.
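The mechanism that produces a negative fit can be reproduced in a few lines. The Python sketch below is an illustration under stated assumptions, not the maxent or maxent.test code of the appendices:

- The fit statistic is assumed to take the form 1 − KL(observed, fitted) / KL(observed, uniform); Eq. 4 is not reproduced in this section.
- The maximum entropy fit for a single trait is solved by simple bisection rather than by Improved Iterative Scaling.
- The abundances are deterministic exponential functions of the trait, chosen so that the meta-community and local selection act in opposite directions.
- The final `max()` step is only one reading of the correction described in the text.

```python
import math
import random


def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def kl(observed, fitted):
    """Kullback-Leibler divergence: sum of o * ln(o / p)."""
    return sum(o * math.log(o / p) for o, p in zip(observed, fitted) if o > 0)


def maxent_fit(prior, traits, target, lo=-20.0, hi=20.0, iters=200):
    """Return p_i proportional to prior_i * exp(lam * trait_i) whose mean
    trait equals `target`. lam is found by bisection (not IIS)."""
    def predict(lam):
        logw = [math.log(q) + lam * t for q, t in zip(prior, traits)]
        m = max(logw)  # subtract the maximum for numerical stability
        return normalise([math.exp(v - m) for v in logw])

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mean = sum(p * t for p, t in zip(predict(mid), traits))
        if mean < target:  # the mean trait increases monotonically with lam
            lo = mid
        else:
            hi = mid
    return predict(0.5 * (lo + hi))


def mean_r2_permuted(observed, prior, traits, n_perm, rng):
    """Mean fit over random permutations of the trait values."""
    uniform = [1.0 / len(observed)] * len(observed)
    baseline = kl(observed, uniform)
    total = 0.0
    for _ in range(n_perm):
        shuffled = traits[:]
        rng.shuffle(shuffled)
        target = sum(o * t for o, t in zip(observed, shuffled))
        fitted = maxent_fit(prior, shuffled, target)
        total += 1.0 - kl(observed, fitted) / baseline
    return total / n_perm


rng = random.Random(1)
n_species = 20  # hypothetical community size
traits = [1.0 + 9.0 * i / (n_species - 1) for i in range(n_species)]
meta = normalise([math.exp(0.3 * t) for t in traits])    # large traits favoured
local = normalise([math.exp(-0.3 * t) for t in traits])  # small traits favoured
uniform = [1.0 / n_species] * n_species

r2_m = mean_r2_permuted(local, meta, traits, 200, rng)     # meta-community prior
r2_u = mean_r2_permuted(local, uniform, traits, 200, rng)  # uniform prior
r2_m_corrected = max(r2_m, r2_u)  # assumed form of the lower bound

print(f"R2(m) = {r2_m:.3f}, R2(u) = {r2_u:.3f}, corrected R2(m) = {r2_m_corrected:.3f}")
```

Because the prior ranks the species in the reverse order to the observed local abundances, the uncorrected value with the meta-community prior is negative, whereas the value with the uniform prior cannot fall below zero.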
Such a negative value is not a special property of the $R^2_{KL}$ statistic; the same result would occur if one used the classic $R^2$ because it too is defined relative to an 'intercept-only' model. This is why it is necessary to modify the original definition so that $\bar{R}^2_{KL}(m)$ cannot be less than the average value expected when one possesses the minimal possible amount of information about the prior distribution (i.e. the uniform prior) and when all traits are forced to be independent of the observed local relative abundances.

Using the modifications presented here, one obtains decompositions that are biologically meaningful. In particular, as the simulations show, the joint trait/meta-community component measures the degree to which the same processes (or processes that are correlated) act at both spatial scales. A positive joint contribution indicates that the meta-community process is positively correlated with local trait-based selection. This will occur, for example, if the same traits are selecting species in the same direction at both spatial scales. A negative joint contribution indicates that the meta-community process is negatively correlated with local trait-based selection.

It is easy to envisage how a positive joint contribution might occur. If the same traits are selecting species in the same way in many different local communities in the landscape, then the selectively advantaged species will be more abundant in the meta-community, and so they will contribute more immigrants to the focal local community. These same species will also be selectively favoured in the focal local community, thereby further increasing their relative abundance. Because the same process is occurring at both scales, it is not possible to disentangle the effects of the meta-community from those acting in the local community. This might occur, for instance, in a local community of understorey plants that is found in a larger forested landscape. The traits of understorey plants that are advantageous in the larger forested landscape (and that are therefore associated with higher relative abundance in that landscape and possessed by the most common immigrants into the local community) will also increase the probabilities of growth, survival and reproduction in the plants already found in the local understorey community.

It is also easy to envisage how a negative joint contribution might occur. This might occur, for instance, in a local community of understorey plants growing in a small woodlot which is found in a larger landscape of open fields. The values of the functional traits that are adaptively advantageous are negatively correlated in these two different environments (open fields, understorey).
The same trait values that would increase immigration mass flow of individuals from the open fields to the understorey, thereby increasing the understorey relative abundance of such species, would decrease the probabilities of growth, survival and reproduction of plants from such species when actually growing in the understorey of the woodlot, thereby decreasing their relative abundance.

A zero joint contribution might occur if the processes determining relative abundance in the meta-community are independent of the measured traits or if the environmental conditions in the different local communities, which together form the meta-community, are very heterogeneous.

It is also possible to detect certain aspects of a purely neutral signal. This would occur if the contributions of both traits and joint trait/meta-community effects were zero. However, such a signal simply excludes trait-based causes of relative abundance at both the local and meta-community levels, assuming that all relevant traits have been included and that the appropriate statistical tests of significance have been done (Shipley 2010b); it does not exclude other causes of meta-community abundances, such as past historical events, which are not truly neutral.



The final component of the decomposition (Eq. 8) represents all those causes of differences in relative abundance at the local level that cannot be attributed either to trait differences acting in the local community or to processes in the larger landscape that are reflected in the meta-community relative abundances (and that might also involve functional traits). If all important functional traits are included, then this remaining component would represent local demographic stochasticity. Of course, any errors in estimating the first two components would contribute to this last component and so it should more properly be interpreted as an upper bound on the actual contribution of demographic stochasticity.

In practice, one would normally have abundance data from many local communities within some larger landscape and either a single trait × species matrix or separate trait × species matrices measured in each local community. One could either perform the decomposition separately for each local community, thus obtaining information on how the relative importance of meta-community effects, trait effects, joint effects and unexplained effects varies over the landscape, or one could obtain predicted relative abundances (pi) from each local community given the four different models described in the Methods and combine them into single $R^2_{KL}$ values (Eq. 4) by summing over each local community. This latter approach would quantify the average importance over the entire landscape.

In conclusion, the original decomposition given in Shipley et al. (2012), while giving correct values in that particular data set, requires modification in order to avoid nonsensical negative values of $\bar{R}^2_{KL}(m)$ that could occur. When the modifications described in this paper are made, the decomposition provides correct values.
Furthermore, positive or negative values of the joint composition inform us of the importance and direction of correlation between local trait-based selection and processes occurring in the larger meta-community.
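The landscape-level pooling mentioned in the Discussion (summing over local communities before forming a single fit statistic) can be sketched as follows. This Python sketch assumes that the statistic has the ratio form 1 − KL(observed, predicted) / KL(observed, uniform), with both sums accumulated over communities; Eq. 4 is not reproduced in this section, so the exact form should be checked against it. The function name and the abundance values are hypothetical.

```python
import math


def kl(observed, predicted):
    """Kullback-Leibler divergence: sum of o * ln(o / p)."""
    return sum(o * math.log(o / p) for o, p in zip(observed, predicted) if o > 0)


def pooled_r2_kl(communities):
    """Pooled fit over several local communities.

    `communities` is a list of (observed, predicted) pairs of relative
    abundances; each pair may cover a different set of species.
    """
    model_deviance = 0.0
    baseline_deviance = 0.0
    for observed, predicted in communities:
        uniform = [1.0 / len(observed)] * len(observed)
        model_deviance += kl(observed, predicted)
        baseline_deviance += kl(observed, uniform)
    return 1.0 - model_deviance / baseline_deviance


# Made-up relative abundances for two local communities.
communities = [
    ([0.5, 0.3, 0.2], [0.45, 0.35, 0.20]),
    ([0.7, 0.2, 0.1], [0.60, 0.25, 0.15]),
]
pooled = pooled_r2_kl(communities)
print(f"pooled fit = {pooled:.3f}")
```

Pooling the deviances before taking the ratio weights each community by how far its abundances depart from uniformity, which differs from averaging the per-community statistics.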

Acknowledgements

This research was financially supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grant.

References

Bell, G. 2000. The distribution of abundance in neutral communities. American Naturalist 155: 606–617.
Borcard, D., Legendre, P. & Drapeau, P. 1992. Partialling out the spatial component of ecological variation. Ecology 73: 1045–1055.
Cameron, C.A. & Windmeijer, F.A.G. 1997. An R-squared measure of goodness of fit for some common nonlinear regression models. Journal of Econometrics 77: 329–342.
Clements, F.E. 1936. Nature and structure of the climax. Journal of Ecology 24: 252–284.
Cottenie, K. 2005. Integrating environmental and spatial processes in ecological community dynamics. Ecology Letters 8: 1175–1182.
Della Pietra, S., Della Pietra, V. & Lafferty, J. 1997. Inducing features of random fields. IEEE Transactions on Pattern Analysis and Machine Intelligence 19: 1–13.
Fisher, R.A. 1925a. The influence of rainfall on the yield of wheat at Rothamsted. Philosophical Transactions of the Royal Society of London B 213: 89–142.
Fisher, R.A. 1925b. Statistical methods for research workers. Oliver & Boyd, Edinburgh, UK.
Gilbert, B. & Lechowicz, M.J. 2004. Neutrality, niches, and dispersal in a temperate forest understory. Proceedings of the National Academy of Sciences of the United States of America 101: 7651–7656.
Gleason, H.A. 1926. The individualistic concept of the plant association. Bulletin of the Torrey Botanical Club 53: 7–26.
Grime, J.P. 1979. Plant strategies and vegetation processes. John Wiley & Sons, New York, NY, US.
Harpole, W.S. & Tilman, D. 2006. Non-neutral patterns of species abundance in grassland communities. Ecology Letters 9: 15–23.
Hubbell, S.P. 2001. The unified neutral theory of biodiversity and biogeography. Princeton University Press, Princeton, NJ, US.
Jaynes, E.T. 2003. Probability theory. The logic of science. Cambridge University Press, Cambridge, UK.
Keddy, P.A. 1992. Assembly and response rules: two goals for predictive community ecology. Journal of Vegetation Science 3: 157–164.
McGill, B.J., Maurer, B.A. & Weiser, M.D. 2006. Empirical evaluation of neutral theory. Ecology 87: 1411–1423.
Nekola, J.C. & Brown, J.H. 2007. The wealth of species: ecological communities, complex systems and the legacy of Frank Preston. Ecology Letters 10: 188–196.
Poorter, L., Wright, S.J., Paz, H., Ackerly, D.D., Condit, R., Ibarra-Manriquez, G., Harms, K.E., Licona, J.C., Martinez-Ramos, M., Mazer, S.J., Muller-Landau, H.C., Peña-Claros, M., Webb, C.O. & Wright, I.J. 2008. Are functional traits good predictors of demographic rates? Evidence from five neotropical forests. Ecology 89: 1908–1920.
Shipley, B. 2009. Limitations of entropy maximization in ecology: a reply to Haegeman and Loreau. Oikos 118: 152–159.
Shipley, B. 2010a. From plant traits to vegetation structure: chance and selection in the assembly of ecological communities. Cambridge University Press, Cambridge, UK.
Shipley, B. 2010b. Inferential permutation tests for maximum entropy models in ecology. Ecology 91: 2794–2805.
Shipley, B., Vile, D. & Garnier, É. 2006. From plant traits to plant communities: a statistical mechanistic approach to biodiversity. Science 314: 812–814.
Shipley, B., Laughlin, D.C., Sonnier, G. & Otfinowski, R. 2011. A strong test of the maximum entropy model of trait-based community assembly. Ecology 92: 507–517.
Shipley, B., Paine, C.E.T. & Baraloto, C. 2012. Quantifying the importance of local niche-based and stochastic processes to tropical tree community assembly. Ecology 93: 760–769.
Sonnier, G., Shipley, B. & Navas, M.L. 2010. Plant traits, species pools and the prediction of relative abundance in plant communities: a maximum entropy approach. Journal of Vegetation Science 21: 318–331.
Sonnier, G., Shipley, B., Fayolle, A. & Navas, M.L. 2011. Quantifying trait selection driving community assembly: a test in herbaceous plant communities under contrasted land use regimes. Oikos 121: 1103–1111.
Tilman, D. 1982. Resource competition and community structure. Princeton University Press, Princeton, NJ, US.

Supporting Information

Additional supporting information may be found in the online version of this article:

Appendix S1. Proof of the decompositions of the Kullback–Leibler index.
Appendix S2. Example of simulating community assembly and extracting results.
Appendix S3. The R script for the function ShipleyJVS2013 to perform the simulations.
Appendix S4. The R script for the function maxent2.
Appendix S5. The R script for the function maxent.test2.

