Journal of Marine Systems 60 (2006) 153 – 166 www.elsevier.com/locate/jmarsys
Gaining insight into food webs reconstructed by the inverse method
Julius K. Kones a,*, Karline Soetaert b, Dick van Oevelen b, John O. Owino a, Kenneth Mavuti c
a University of Nairobi, Department of Mathematics, P.O. Box 30197, 00100 Nairobi, Kenya
b Netherlands Institute of Ecology, PB 140, 4400 AC Yerseke, The Netherlands
c University of Nairobi, Department of Zoology, P.O. Box 30197, 00100 Nairobi, Kenya
Received 16 June 2005; received in revised form 6 December 2005; accepted 9 December 2005 Available online 7 February 2006
Abstract
The use of the inverse method to analyze flow patterns of organic components in ecological systems has had wide application in ecological modeling. Through this approach, an infinite number of food web flows describing the food web and satisfying biological constraints are generated, from which one (parsimonious) solution is drawn. Here we address two questions: (1) is there justification for the use of the parsimonious solution or is there a better alternative and (2) can we use the infinitely many solutions that describe the same food web to give more insight into the system? We reassess two published food webs, from the Gulf of Riga in the Baltic Sea and the Takapoto Atoll lagoon in the South Pacific. A finite number of random food web solutions is first generated using the Monte Carlo simulation technique. Using the Wilcoxon signed ranks test, we cannot find significant differences between the parsimonious solution and the average values of the finite random solutions generated. However, as the food web composed of the average flows has more attractive properties, the choice of the parsimonious solution to describe underdetermined food webs is challenged. We further demonstrate the use of the factor analysis technique to characterize flows that are closely related in the food web. Through this process sub-food webs are extracted within the plausible set of food webs, a property that can be utilized to gain insight into the sampling strategy for further constraining of the model.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Inverse method; Carbon flow; Food webs; Parsimonious solution; Mean solution; Factor analysis (FA); Principal components (PC)
1. Introduction A food web structure determines how primary production is channeled between compartments in a food chain. One major problem in ecological research is to quantify the exchange of mass or energy between all the food web components. Although our understanding of food webs would be substantially enhanced if we
* Corresponding author. E-mail address:
[email protected] (J.K. Kones). 0924-7963/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.jmarsys.2005.12.002
were able to estimate all these fluxes, this is very difficult or impossible to do experimentally or in the field. To overcome this data deficiency, various mathematical models that estimate the unmeasured quantities in a food web have been developed. The inverse method is such a mathematical technique that has been applied to estimate food web fluxes in undersampled environments. The structure of the model is such that it can be used to deduce flow networks that conserve mass, comply with any rate measurements performed, satisfy basic biological constraints, and are compatible with the observed structure of the food web
J.K. Kones et al. / Journal of Marine Systems 60 (2006) 153–166
(Vezina and Platt, 1988). Typically the problem is written as a set of linear equations:

$$Ax = b, \qquad Gx \geq h \tag{1}$$
where A contains the linear mass balance and data functions, x are the (unknown) flows, b are the rates of change and measurements, G is a matrix of the inequality coefficients and h is a vector of inequality bound constants. The equalities in Eq. (1) contain the mass balances for each constituent (i.e. sources = sinks) and any measurements that have been performed. The inequalities are used to: (i) ensure that the flows are positive and (ii) include the 'prior information' on the model parameters, expressed as 'uncertainties'. See Donali et al. (1999) for a worked-out example. In typical ecological applications, there are more unknown flows than equations; in practice the ratio is often about 4 : 1 (Vezina and Pahlow, 2003). This makes the model underdetermined, such that an infinite number of equally plausible food web solutions exist. Although ecologists are aware of this underdeterminacy, many past studies proceed to generate one solution and ignore the existence of other solutions. This one solution is normally selected by minimizing a goal function based on the principle of parsimony, i.e. the simplest food web is selected. The goal function could be the sum of squared flows, $\min \sum_n x_n^2$, in which case the least distance programming technique (Lawson and Hanson, 1995) is applied to solve the model; or it could be the sum of flows, $\min \sum_n |x_n|$, in which case the model is solved by linear programming techniques (Vanderbei, 1996). The validity of this minimization step is yet to be adequately justified. A few recent studies did question the minimization in the inverse method as well as the reliability of the solutions generated. Vezina and Pahlow (2003) investigated whether the steady state assumptions and flow minimization principle underlying inverse modeling introduce distortions into the reconstructions of ecosystem flows. They show that the accuracy of the inverse model depends on the structural and dynamic features of the ecosystem. Vezina et al. 
(2004) explored objective functions other than the minimum norm. They found that the inverse problems are too underdetermined to permit maximization of goal functions. Furthermore, simultaneously minimizing the squared flows and the squared differences between flows has a smoothing effect that makes the inversion flows as even as possible, and this smoothed norm is most robust in comparative analyses.
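As a minimal worked illustration of Eq. (1) and of the two goal functions, consider a hypothetical system with only two unknown flows. The numbers are invented for illustration and are not taken from either food web studied here.

```latex
\begin{align*}
\text{mass balance:}\quad   & x_1 + x_2 = 10 \\
\text{inequalities:}\quad   & x_1 \ge 6, \qquad x_2 \ge 0 \\
\text{feasible set:}\quad   & (x_1, x_2) = (t,\ 10 - t), \qquad 6 \le t \le 10 \\
\text{sum of squares:}\quad & t^2 + (10 - t)^2 \ \text{increases for } t > 5
    \;\Rightarrow\; \text{minimum at } t = 6:\ (x_1, x_2) = (6, 4),\ \textstyle\sum_n x_n^2 = 52 \\
\text{sum of flows:}\quad   & |x_1| + |x_2| = 10 \ \text{for every feasible } t
\end{align*}
```

In this toy case the quadratic goal function selects a solution that sits on the bound $x_1 = 6$, the sum-of-flows goal does not discriminate between feasible solutions at all, and averaging $t$ uniformly over $[6, 10]$ would give $(8, 2)$ instead.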
However, the question of interest is how we can comprehensively present and interpret the various model solutions that result from the underdeterminacy of the inverse method. This challenge is probably a reason why alternative solutions to a food web problem are generally not discussed in the scientific literature. In other words, can we use the infinite number of different solutions describing the food web of one system to enhance our understanding of food web functioning? Thus, we wish to assess whether there exist certain patterns in these food webs, i.e. whether certain groups of flows in the food web are correlated. This paper examines the food web solutions of two published planktonic food webs generated by the inverse linear model, using statistical techniques. First we test whether the mean of the various model solutions is an alternative to the parsimonious solution. We then explore the use of factor analysis to characterize the flows that are closely inter-related. This work intends not only to offer an alternative way of presenting the solutions of an inverse analysis of ecosystem flows, but also to provide a (complementary) method to quantitatively investigate the potential biases of the inverse solutions.

2. Materials and methods

2.1. The inverse method, constraints and parameters

The most widely used representation of food web structure and dynamics is the compartmental model (Platt et al., 1981). Budgets of food webs can be constructed from the bottom up, with prior estimates of the flows combined to form a network whose state of balance is left uncontrolled (Vezina and Platt, 1988). In contrast, the inverse method demands that the inferred flows are consistent with the observed flows and with certain constraint relationships and that mass balance is assured. 
Inverse methods can also be called backward modeling since they use observations to reconstruct a model of natural systems that are potentially observable but difficult to measure (Vezina et al., 2004). Inverse methods are well established in the physical sciences (Wunsch, 1996; Scales et al., 2001). The power of inverse analysis for ecological applications lies in the constraint relationships, in which properties that relate to the organisms' carbon or nutrient flows, and for which extensive historical or literature data are available, can be formulated. The importance of these constraints is that they allow a complete and realistic food web to be inferred from measured flows, organism abundances, and environmental properties.
The use of the inverse analysis method to estimate flows of underdetermined ecological systems was pioneered by Vezina and Platt (1988). They applied the inverse method to the English Channel and Celtic Sea systems, both of which had incomplete sets of in situ measurements. In that work, the least squares method was used to estimate carbon and nitrogen flows in a food web. The inverse technique has since been applied by various ecologists and modellers. For instance, Vezina (1989) applied it to data from the Warm Core Rings Program by Ducklow et al. (1989). Donali et al. (1999) used it to estimate carbon flows in the planktonic food web of the Gulf of Riga in the Baltic Sea. Other recent applications of the method to the estimation of ecosystem flows include Jackson and Eldridge (1992), Niquil et al. (1998, 1999), Vezina and Savenkoff (1999) and Leguerrier et al. (2003).
2.2. Brief description of the food webs

We considered two published cases.

2.2.1. The Gulf of Riga planktonic food web (Donali et al., 1999)

The Gulf of Riga is a highly eutrophic system in the Baltic Sea with high nutrient loadings from municipalities and the surrounding agricultural areas (Laznik et al., 1999). Between 1993 and 1995 data were gathered during 10-day sampling periods in spring, summer and autumn. Donali et al. (1999) used these data to reconstruct a carbon flux model of the planktonic food web using the inverse analysis method. The model comprised 7 functional compartments, namely picoautotrophs, non-picoautotrophs, heterotrophic nanoflagellates, zooplankton, bacteria, detritus including virus, and dissolved organic carbon (DOC). These compartments were connected with 26 flows. The inverse model equations consisted of the 7 mass balances that were complemented with 6 in situ measurements. These measurements comprised gross primary production, total community respiration, net bacterial production and sedimentation rates. Furthermore, 25 additional biological constraints that relate the different flows in the food web were imposed. In what follows we will only consider the food web model for autumn.

2.2.2. The Takapoto Atoll lagoon planktonic food web (Niquil et al., 1998)

The Takapoto Atoll lagoon is located in French Polynesia in the South Pacific. It covers about 81 km² and has a mean depth of 23 m. It is considered almost a closed system since it is connected to the main part of the ocean by a few shallow channels. The study of the functioning of the planktonic community took place between 1990 and 1994 (Niquil et al., 1998) and the model of the carbon flows was solved using the inverse analysis method. Results of this model were subsequently analysed by means of network analysis methods (Niquil et al., 1999). This model considers 7 compartments: phytoplankton, bacteria, protozoa, micro and mesozooplankton, detritus and dissolved organic carbon (DOC). The inverse model comprised 32 flows and was constrained by 7 mass balances, 8 equations and 26 inequalities. The 8 measurement equations consisted of gross primary production, bacterial production and of meso- and microzooplankton production, food uptake and respiration rates.

2.3. Monte Carlo procedure
For each of the food webs described above, multiple solutions were generated using the FEMME software (Soetaert et al., 2002) as follows. First the range spanning each of the flows was assessed, by sequentially minimizing and maximizing each flow using the linear programming method. Then the values of several flows were imposed on the solution and randomly varied, assuming a uniform probability density function, within these predefined ranges. As there exist strong (multi-)linear relationships between flows, it is not feasible to randomly vary all flows, as in this case the probability of finding a solution that fits the original equations becomes virtually zero. Thus only 9 and 12 flows were assigned a random value in the Gulf of Riga and Takapoto Atoll, respectively. The other flows were solved using quadratic minimization. The model was run 100 000 times (Gulf of Riga) or 25 000 times (Takapoto). Only those flow combinations that complied with the original equalities and inequalities were retained.

2.4. Factor analysis

The multiple solutions generated by the Monte Carlo procedure were interpreted by means of factor analysis (FA). The purpose of factor analysis is to reduce a large number of variables (in our case flows) to a smaller number of underlying factors that are being measured by the variables. It is used in two ways: first, as an exploratory tool to determine the dimensions (factors) that account for patterns of collinearity among the
variables, and second as a confirmatory tool to determine if a group of variables (flows), each with multiple measurements, exhibits a particular common behavior (Afifi et al., 2004). Some of the various applications of FA as a confirmatory tool are in Bartholomew and Knott (1999); the technique has also been used as an exploratory tool in linear structural models (Long, 1983), in the natural sciences (Reyment and Joreskog, 1996), and in the health sciences (Pett et al., 2003). In the current context, FA is applied to explore the inter-correlations of a set of randomly generated food web flows that satisfy given constraints. An insight into the nature of the inter-relationships among variables is reflected in the inter-correlation matrix. Variables which measure the same 'thing' are expected to correlate well with each other. All variables that do not correlate with any other variable are therefore excluded before the factor analysis is run; this includes all fully constrained, and hence invariant, variables. On the other hand, factor analysis is also affected by too high correlation (multicollinearity) (Chartterjee et al., 1999). Although mild multicollinearity is not a problem for FA, it is important to avoid extreme multicollinearity as well as singularity (variables that are perfectly correlated). Singularity causes a problem in FA because it becomes impossible to determine the unique contribution to the factor of the variables that are highly correlated. In practice, any variables that have a correlation value greater than 0.8 are excluded from FA (Chartterjee et al., 1999).

2.4.1. Testing for factorability of a correlation matrix

The inter-correlation matrix assumes that highly correlated variables measure the same factors. The factor analysis involves extracting latent factors from among the variables. 
The procedure involves computing factor loadings, which are the measures of correlation between the variable and the factor that has been extracted from the data. The sum of the squared factor loadings for a particular variable, the communality, measures the variance of the variable that is accounted for by all the factors. Before one proceeds to run the FA, it is important to evaluate whether the inter-correlation matrix is indeed factorable. An inter-correlation matrix is said to be totally non-factorable if the variables are totally non-collinear, meaning that we can extract as many factors as there are variables. Two commonly used tests for determining the factorability of an inter-correlation matrix are described here. Bartlett's test of sphericity is one such method (see Afifi et al., 2004 and references therein for other methods). The test
is based on the null hypothesis that the inter-correlation matrix comes from a population in which the variables are non-collinear (i.e. an identity matrix) and that the non-zero correlations in the sample matrix are due to sampling error. The Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy is another method for testing the factorability of an inter-correlation matrix. This method considers that if two variables share a common factor with other variables, their partial correlation will be small due to the unique variance they share (and KMO → 1). Conversely, if the partial correlation approaches 1 or −1, then KMO → 0 and the variables are not measuring a common factor.

2.4.2. Factor extraction

Various methods exist for extracting an initial solution from an inter-correlation matrix. These include principal component analysis (PCA), the maximum likelihood method, unweighted least squares method, generalized least squares method, the alpha method and the image factoring method. Of all these methods, PCA is the most widely used. The procedure of PCA is such that each variable is standardized to have a mean of zero and a standard deviation of 1.0 (Fisher and van Belle, 1993). Because of this transformation, the total variance to be explained is equal to the number of variables under investigation. Therefore, since each variable accounts for one unit of variance, a useful factor is one that accounts for more than one unit of variance, i.e. has an eigenvalue λ > 1.0. Otherwise the factor extracted explains no more variance than a single variable.

2.4.3. Factor rotation

Sometimes one or more variables may load about the same on more than one factor (a situation referred to as complex structure), making the interpretation of the factors ambiguous (Harris, 2001; Jolliffe, 2002). An ideal situation would be that each variable loads high on one factor and approximately zero on all other factors. 
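The two factorability checks of Section 2.4.1 can be sketched in a few lines. This is an illustrative standard-library sketch, not the SPSS procedure used in this study: the helper names (`invert`, `kmo`, `bartlett_sphericity`), the equicorrelated example matrix and the sample size of 100 are assumptions made for the example. The KMO is computed as the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations, and Bartlett's statistic as $-[(n-1) - (2p+5)/6]\,\ln|R|$ with $p(p-1)/2$ degrees of freedom.

```python
import math

def invert(matrix):
    """Gauss-Jordan inverse and determinant of a small square matrix."""
    n = len(matrix)
    a = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix."""
    inv, _ = invert(corr)
    r2 = a2 = 0.0
    for i in range(len(corr)):
        for j in range(len(corr)):
            if i != j:
                # partial correlation of i and j given all other variables
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                a2 += partial ** 2
    return r2 / (r2 + a2)

def bartlett_sphericity(corr, n_obs):
    """Bartlett's chi-square statistic and its degrees of freedom."""
    p = len(corr)
    _, det = invert(corr)
    chi2 = -((n_obs - 1) - (2 * p + 5) / 6.0) * math.log(det)
    return chi2, p * (p - 1) // 2

# Hypothetical correlation matrix of three flows (all pairwise r = 0.5).
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
print("KMO:", round(kmo(corr), 4))
print("Bartlett chi2, df:", bartlett_sphericity(corr, n_obs=100))
```

A perfectly correlated pair of flows makes the correlation matrix singular, in which case `invert` raises an error; this is one practical reason why such flows have to be screened out before the analysis.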
Factor rotation is a technique applied to clarify a factor pattern by computing new factors whose loadings are easier to interpret. Rotation helps to maximize the loading of each variable on one of the extracted factors while at the same time minimizing the loading on all the other factors (Gorsuch, 1983). Various techniques for factor rotation exist including, amongst others, varimax, quartimax, equimax and oblique rotation. Here we describe only how the varimax rotation method works, given that it is the most commonly used
method. The varimax procedure restricts the new axes to being orthogonal to each other. The orthogonality of the rotated factors ensures that they are uncorrelated with each other. Computationally, the varimax rotation is achieved by maximizing the sum of the variances of the squared factor loadings within each factor. These factor loadings are adjusted by dividing each of them by the square root of the communality (the variance accounted for) of the corresponding variable, a procedure referred to as Kaiser normalization (Harman, 1976). This has the effect of equalizing the impact of variables, so that variables with higher communalities do not dominate the final solution.

2.4.4. Goodness of fit of a factor solution

We can evaluate the goodness of fit of the FA by comparing the inter-correlation matrix reproduced by the factor solution with the original inter-correlation matrix. The factor score computed for each variable is used to compute the new inter-correlation matrix. For a good fit, most of the residuals between the observed and the reproduced bivariate correlations are small (< 0.05). All statistical analyses were performed using the SPSS software version 10 (SYSTAT Software Inc.).

3. Results and discussion

3.1. Monte Carlo results

The Monte Carlo technique for the Gulf of Riga food web and Takapoto food web generated 5633 and 1499 valid food webs (out of 100 000 and 25 000 runs), respectively. Thus 5.5% to 6% of the random flow combinations produced a food web that satisfied the food web equations and constraints. Fig. 1a and b represent the relationships between flows for the Gulf of Riga food web and the Takapoto food web (only the first 1000 food webs are represented, but the figures are similar when plotting all food webs). Ranges of the food web flows, the mean value (± standard deviation) and the parsimonious solution can be found in Table 1a (Gulf of Riga) and Table 1b (Takapoto). 
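The accept/reject logic of the Monte Carlo procedure (Section 2.3) can be sketched on a deliberately tiny, hypothetical compartment. The flow names, the uptake of 10 units, the flow ranges and the two inequalities are all invented for illustration, and the ranges are written down by hand rather than obtained by linear programming. Because only one flow remains after two are drawn at random, it is fixed by the mass balance alone, so the quadratic minimization step used for the real food webs is not needed here.

```python
import random

UPTAKE = 10.0                                        # hypothetical measured input
RANGES = {"resp": (2.0, 10.0), "graz": (0.0, 8.0)}   # assumed min-max ranges

def feasible(resp, graz, export):
    """Assumed inequalities: respiration >= 20% of uptake, 0 <= export <= grazing."""
    return resp >= 0.2 * UPTAKE and graz >= 0.0 and 0.0 <= export <= graz

def sample_food_webs(n_draws, seed=1):
    """Draw two flows uniformly within their ranges and keep the valid webs."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        resp = rng.uniform(*RANGES["resp"])
        graz = rng.uniform(*RANGES["graz"])
        export = UPTAKE - resp - graz     # mass balance closes the remaining flow
        if feasible(resp, graz, export):
            accepted.append((resp, graz, export))
    return accepted

webs = sample_food_webs(10_000)
mean_web = tuple(sum(col) / len(webs) for col in zip(*webs))
print(f"accepted {len(webs)} of 10000 draws")
print("mean solution (resp, graz, export):", [round(v, 2) for v in mean_web])
```

Since the constraints are linear, the set of valid food webs is convex, so the average of the accepted solutions is itself a valid food web in this sketch.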
Note that not all flow combinations within the min–max range give plausible food webs, and part of the area does not contain solutions (Fig. 1a,b). The mean solution norm ($\sum x^2$) for the different Gulf of Riga food webs was $1.74 \times 10^5$ (Table 1a) and varied between $1.41 \times 10^5$ and $2.11 \times 10^5$. The norm for the parsimonious food web, which is the minimum solution norm, was lower ($1.35 \times 10^5$) than the minimum generated through the Monte Carlo technique. Ranges were
larger for the Takapoto food web ($1.08$–$4.44 \times 10^6$; mean $2.27 \times 10^6$). The Gulf of Riga food web is found to be more constrained than the Takapoto food web. First of all, the domain of plausible food webs is more sparsely populated in the former (Fig. 1a) than in the latter (Fig. 1b). Second, the relationship of the standard deviation versus the mean when fitted with a power function in both cases (Fig. 2) shows that the power for the Takapoto (0.7) is more than double the power for the Riga food web (0.3), whilst the offset is similar, which means that for flows with similar magnitude, the standard deviation for the Takapoto food web is much higher than for the Riga food web.

3.2. Comparison of the mean and the parsimonious solution

We compared the parsimonious solution, estimated by minimizing the sum of squared flows, with the average values of the simulated random food web solutions generated by the Monte Carlo technique, hereafter denoted as the 'mean solution'. In both cases, the mean solution satisfied the imposed model constraints, just as the parsimonious solution does; thus it constitutes a valid food web. There were 26 and 32 values of the mean and parsimonious solution of the Gulf of Riga and Takapoto food web, respectively (as shown in Tables 1a and 1b), corresponding to the estimates of the flows. Each set of these values was tested for normality. The results showed that in neither system were they normally distributed. Therefore, comparison of the mean and parsimonious solutions was based on the non-parametric Wilcoxon signed ranks test. In both cases, it was concluded that they are not significantly different (P = 0.094 for Takapoto Atoll lagoon and P = 0.338 for Gulf of Riga). Is this result a justification of using the minimization method to obtain one (parsimonious) solution? 
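For readers who wish to retrace the comparison, the following is a minimal standard-library sketch of a two-sided Wilcoxon signed ranks test applied to the mean and parsimonious columns of Table 1a. The implementation choices are assumptions, since the exact SPSS settings are not given: zero differences are dropped, tied absolute differences receive average ranks, and the P value comes from a tie-corrected normal approximation without continuity correction. Differences are rounded to two decimals so that values tied in the table are also tied in floating point.

```python
import math

def wilcoxon_signed_rank(a, b, decimals=2):
    """Return (w_plus, w_minus, z, p) for paired samples a and b."""
    diffs = [round(x - y, decimals) for x, y in zip(a, b)]
    diffs = [d for d in diffs if d != 0]           # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1                 # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail
    return w_plus, w_minus, z, p

# Mean and parsimonious flows of the Gulf of Riga web, in the order of Table 1a.
riga_mean = [25.80, 59.98, 6.66, 8.09, 2.19, 10.96, 14.25, 8.14, 16.53, 20.53,
             6.36, 1.94, 0.58, 28.72, 0.47, 13.07, 279.78, 6.86, 2.58, 3.41,
             12.48, 11.34, 295.97, 0.10, 0.34, 0.78]
riga_pars = [31.32, 54.46, 17.23, 1.57, 4.12, 10.49, 29.95, 4.46, 2.72, 16.80,
             13.40, 0.00, 0.00, 30.20, 3.18, 3.96, 244.99, 9.44, 0.00, 0.00,
             12.34, 13.92, 261.18, 0.10, 0.34, 0.78]

w_plus, w_minus, z, p = wilcoxon_signed_rank(riga_mean, riga_pars)
print(f"W+ = {w_plus}, W- = {w_minus}, z = {z:.3f}, P = {p:.3f}")
```

With the Table 1a values this sketch gives W+ = 169.5, W− = 106.5 and a two-sided P of about 0.34, in line with the value reported above for the Gulf of Riga.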
The basis of generating the parsimonious solution stems from the assumption that the best solution is the one that minimizes the sum of squares of the flows (Niquil et al., 1999). Its choice, therefore, is not based on the functioning of the ecosystem, and yet no justification has been put forward to support the minimization step as opposed to any other method (e.g. maximization method). It is quite possible that another of the many other solutions, though not giving the minimum sum of (squared) flows, actually best describes the food web flows in a system if they were measured. The main justification of the minimization
method seems to be that it allows comparison between different systems. As could be expected, our results show that the parsimonious solution and the mean of the many random solutions do not differ significantly from each other, i.e. we could adopt either as the solution. This
result was true for both systems and for various sample sizes of the random solutions (results not shown). This similarity, however, holds only for the ensemble of all the system's flows and does not measure differences in flows at the compartmental level. Moreover, this conclusion is probably not universal, as this result might be violated
Fig. 1. a. Pair-wise scatter plot of 1000 food web solutions, generated by the Monte Carlo technique, for the Gulf of Riga food web. Symbols X1…X23 denote food web flows; their meaning can be found in Table 1a. Flows that are constant are not depicted. On the diagonal is the histogram of the distribution of each flow; their minimum and maximum values can be found in Table 1a. As an example, the uppermost scatter plot depicts the (linear) relationship between flow X1 and flow X2. b. Pair-wise scatter plot of 1000 food web solutions, generated by the Monte Carlo technique, for the Takapoto food web. Symbols X1…X28 denote food web flows; their meaning can be found in Table 1b. Flows that are constant are not depicted. On the diagonal is the histogram for each flow; their minimum and maximum values can be found in Table 1b.
if the model is less constrained. Indeed, it appears that the more the model is constrained, the smaller the differences between the mean and the parsimonious solution (e.g. we obtain a lower significance value for the Takapoto Atoll lagoon model compared to the Gulf of Riga model value). The convergence of the parsimonious and mean solution under different levels of constrainment could be best studied by a twin experiment approach using an idealized food web structure. A close examination of the parsimonious solution reveals that in many cases it takes on the boundary
(maximum or minimum) flow value (see Tables 1a and 1b). Ignoring the constant flows, 8 out of 23 flows are at the minimum level, whilst 6 attain their maximum values for the Riga food web; for the Takapoto web this is 7 and 1 out of 28 flows at the minimum and maximum level, respectively. This characteristic makes the estimation of the solution unrealistic in many cases as it may lead to underestimation or overestimation of the true system's flows. For instance, in the Takapoto Atoll lagoon model, the mean values (in mg C m⁻² day⁻¹) of four flows (detritus to bacteria,
Table 1a
The mean (± standard deviation), the parsimonious solution, and the minimum and maximum flow values (mg C m⁻³ day⁻¹) of the various flows for the Gulf of Riga planktonic food web

Flow (from → to)                     Mean (± SD)      Parsimonious  Minimum  Maximum  Symbol
CO2 → picoautotrophs                 25.80 (3.4)      31.32         19.16    31.32    X1
CO2 → larger autotrophs              59.98 (3.4)      54.46         54.46    66.62    X2
Picoautotrophs → CO2                 6.66 (3.7)       17.23         0.96     17.23    X3
Picoautotrophs → DOC                 8.09 (5.4)       1.57          0.96     31.32    X4
Picoautotrophs → nanoflagellates     2.19 (1.3)       4.12          0.09     4.12     X5
Picoautotrophs → zooplankton         10.96 (6.2)      10.49         0.00     30.19    X6
Larger autotrophs → CO2              14.25 (8.0)      29.95         2.72     36.64    X7
Larger autotrophs → detritus         8.14 (4.1)       4.46          0.00     59.43    X8
Larger autotrophs → DOC              16.53 (10.1)     2.72          2.72     62.76    X9
Larger autotrophs → zooplankton      20.53 (7.4)      16.80         0.00     59.43    X10
Nanoflagellates → CO2                6.36 (3.5)       13.40         0.75     13.40    X11
Nanoflagellates → DOC                1.94 (1.9)       0.00          0.00     7.59     X12
Nanoflagellates → zooplankton        0.58 (1.3)       0.00          0.00     7.59     X13
Zooplankton → CO2                    28.72 (9.3)      30.20         16.09    79.57    X14
Zooplankton → detritus               0.47 (1.1)       3.18          0.00     90.43    X15
Zooplankton → DOC                    13.07 (7.4)      3.96          2.20     54.42    X16
Bacteria → CO2                       279.78 (10.9)    244.99        244.99   314.64   X17
Bacteria → nanoflagellates           6.86 (2.4)       9.44          0.63     9.44     X18
Bacteria → sedimentation             2.58 (2.4)       0.00          0.00     8.81     X19
Detritus → DOC                       3.41 (2.1)       0.00          0.00     7.11     X20
Detritus → zooplankton               12.48 (2.4)      12.34         0.00     147.09   X21
Detritus → sedimentation             11.34 (10.9)     13.92         5.11     13.92    X22
DOC → bacteria                       295.97 (3.9)     261.18        261.18   330.83   X23
Picoautotrophs → sedimentation       0.10 (0.0)       0.10          0.10     0.10
Larger autotrophs → sedimentation    0.34 (0.0)       0.34          0.34     0.34
Zooplankton → sedimentation          0.78 (0.0)       0.78          0.78     0.78
Solution norm (Σx²) / 10⁵            1.74 (0.13)      1.35          1.41     2.11

Symbol is as in Fig. 1a.
detritus to DOC, detritus to protozoa and bacteria to DOC) were 125.0, 17.5, 14.2 and 205.6, respectively (as obtained by the Monte Carlo sampling). The parsimonious solutions were all zero for these flows, meaning that a significant proportion of the system throughput is underestimated by this approach. The same is true for the Gulf of Riga model, with the parsimonious solution consisting of zeroes in four flows: bacteria to sediment, detritus to DOC, nanoflagellates to DOC, and nanoflagellates to zooplankton, although the respective mean flow values (in mg C m⁻³ day⁻¹) were 2.58, 3.41, 1.94 and 0.58. In addition, the fact that the parsimonious solution does not have associated standard deviations for the estimated flows makes it less attractive than the mean estimates. It can be argued, in favor of the parsimonious solution, that, at least when quadratic minimization is used, it is unique for a given set of model constraints (Lawson and Hanson, 1995) and is relatively straightforward to calculate. However, the mean solution also converges to a constant value as the number of random solutions becomes sufficiently large. To this extent we are led to conclude that
the averaging procedure provides a more realistic estimate of the food web flows than the parsimonious solution. With respect to the infinitely many solutions, either they should be incorporated in the analysis of the dynamics of the system being modeled, so as to gain a better understanding of its functioning, or a justification has to be given for their exclusion. To some extent, part of this problem has already been addressed above, by interpreting the food web in terms of the mean and standard deviations of the infinitely many solutions. In this case, the mean is assumed to collapse the many possible ways in which the system can function into one simple value, and the level of uncertainty is expressed by the standard deviations. Another way of looking at this is by assessing whether, within the entire spectrum of the generated estimates of the food web flows, there exist particular patterns that certain groups of flows of the system exhibit. An insight into such patterns existing in a food web system may help in aggregating some of the system's functional components into larger compartments and representing the food web in a simpler way. This is the subject of
Table 1b
The mean (± standard deviation), the parsimonious solution, and the minimum and maximum flow values (mg C m⁻² day⁻¹) of the various flows for the Takapoto Atoll planktonic food web

Flow (from → to)                     Mean (± SD)      Parsimonious  Minimum  Maximum  Symbol
CO2 → phytoplankton                  1070.41 (144.5)  863.16        863.16   1640.00  X1
Phytoplankton → CO2                  158.83 (73.4)    244.76        43.16    492.00   X2
Phytoplankton → detritus             370.73 (134.1)   117.82        0.00     776.84   X3
Phytoplankton → DOC                  250.39 (144.5)   43.16         43.16    820.00   X4
Phytoplankton → mesozooplankton      101.65 (57.9)    168.97        0.00     227.00   X5
Phytoplankton → microzooplankton     152.05 (90.3)    207.30        0.00     342.00   X6
Phytoplankton → protozoa             36.74 (26.2)     81.16         0.00     114.03   X7
Protozoa → CO2                       22.38 (15.6)     46.55         1.34     87.49    X8
Protozoa → detritus                  29.07 (16.1)     0.00          0.00     56.57    X9
Protozoa → DOC                       8.46 (6.5)       15.36         0.44     547.58   X10
Protozoa → mesozooplankton           19.00 (18.6)     6.89          0.00     68.42    X11
Protozoa → microzooplankton          31.88 (19.4)     45.22         0.00     68.42    X12
Microzooplankton → detritus          34.52 (18.2)     56.94         0.00     64.00    X13
Microzooplankton → DOC               29.48 (18.2)     7.06          0.00     64.00    X14
Mesozooplankton → detritus           48.84 (24.4)     87.49         0.00     87.49    X15
Mesozooplankton → DOC                87.15 (24.4)     48.51         48.51    136.00   X16
Bacteria → CO2                       412.33 (121.3)   75.91         1.34     684.00   X17
Bacteria → detritus                  45.89 (20.9)     56.33         0.00     68.42    X18
Bacteria → DOC                       205.60 (137.2)   0.00          0.00     68.42    X19
Bacteria → protozoa                  30.11 (20.9)     19.67         0.00     76.00    X20
Detritus → bacteria                  125.04 (87.5)    51.02         0.00     547.58   X21
Detritus → DOC                       17.54 (31.7)     0.00          0.00     547.58   X22
Detritus → protozoa                  20.36 (14.3)     0.00          0.00     114.03   X23
Detritus → mesozooplankton           106.37 (62.1)    51.14         0.00     227.00   X24
Detritus → microzooplankton          158.06 (90.3)    89.48         0.00     342.00   X25
Detritus → sedimentation             107.85 (79.1)    126.94        0.00     540.15   X26
DOC → bacteria                       568.91 (165.3)   100.89        0.00     820.00   X27
DOC → protozoa                       29.72 (28.0)     13.20         0.00     114.03   X28
Microzooplankton → CO2               147.00 (0.0)     147.00        147.00   147.00
Mesozooplankton → grazing            74.50 (0.0)      74.50         74.50    74.50
Mesozooplankton → CO2                147.00 (0.0)     147.00        147.00   147.00
Microzooplankton → mesozooplankton   131.00 (0.0)     131.00        131.00   131.00
Solution norm (Σx²) / 10⁶            2.27 (0.6)       1.03          1.08     4.44

Symbol is as in Fig. 1b.
the next section, where we explore the food web flow patterns by the factor analysis method.

3.3. Factor analysis

3.3.1. Data screening

Based on the conditions for factorability of an intercorrelation matrix described earlier, the following flows were excluded from the analysis:

3.3.1.1. Gulf of Riga data. The fixed flows excluded are sedimentation of picoautotrophs, larger autotrophs and zooplankton (Table 1a). The flows excluded because of very high correlations with one or more other flows (see Fig. 1a) are: nanoflagellate grazing on bacteria, X18 (ρ = 0.8 with nanoflagellate respiration, X11); sedimentation of bacteria, X19 (ρ = 1 with nanoflagellate grazing on bacteria, X18); primary production of larger autotrophs, X2 (ρ = −1 with primary production of picoautotrophs, X1); sedimentation of detritus, X22 (ρ = −1 with bacterial sedimentation, X19); zooplankton grazing on detritus, X21 (ρ = 0.895 with zooplankton grazing on larger autotrophs, X10); and bacterial consumption of DOC, X23 (ρ = 1 with bacterial respiration, X17). These high correlations are attributed to the constraints imposed on the basis of field measurements. For instance, the total sedimentation from the bacteria and detritus compartments was constrained to equal a fixed value (13.92 mg C m⁻³ day⁻¹); an increase in one of these flows must therefore be offset by an equal decrease in the other, so the two flows are perfectly negatively correlated. Likewise, as the total gross primary production is constant and is contributed by the only two sources of primary productivity (picoautotrophs and larger autotrophs), the two primary producers are perfectly negatively correlated.
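The perfect negative correlation between two flows that share a fixed total can be checked with a short derivation. The notation is introduced here for illustration only: x_1 and x_2 stand for the two primary production flows, c for their fixed sum, and it is assumed that x_1 actually varies among the generated solutions (non-zero variance).

```latex
\begin{aligned}
x_1 + x_2 &= c \quad\Longrightarrow\quad x_2 = c - x_1,\\
\operatorname{Var}(x_2) &= \operatorname{Var}(x_1), \qquad
\operatorname{Cov}(x_1, x_2) = -\operatorname{Var}(x_1),\\
\rho(x_1, x_2) &= \frac{\operatorname{Cov}(x_1, x_2)}
                       {\sqrt{\operatorname{Var}(x_1)\,\operatorname{Var}(x_2)}} = -1 .
\end{aligned}
```

Such a pair carries only one independent piece of information, which is why one member of the pair is removed before the correlation matrix is factored.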
Fig. 2. Best fit power regression on the relationship between the mean value and the standard deviation of the flows of the food web of Riga (circles) and Takapoto (squares). Note that the units for the Riga food web are mg C m⁻³, and for the Takapoto food web mg C m⁻².
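A power regression of the kind shown in Fig. 2 is often fitted as a straight line in log-log space. The sketch below is not the authors' fitting procedure, which is not described here; the function name `fit_power_law` and the use of ordinary least squares on logarithms are assumptions. The means and standard deviations of flows X1 to X6 of Table 1b serve only as example input, so the resulting coefficients are not those of Fig. 2.

```python
import numpy as np

def fit_power_law(mean, sd):
    """Fit sd = a * mean**b by least squares on log-transformed values."""
    mean = np.asarray(mean, dtype=float)
    sd = np.asarray(sd, dtype=float)
    # Logarithms need strictly positive values, so fixed flows (SD = 0) are left out.
    ok = (mean > 0) & (sd > 0)
    slope, intercept = np.polyfit(np.log(mean[ok]), np.log(sd[ok]), deg=1)
    return np.exp(intercept), slope  # a, b

# Example input only: mean and SD of flows X1-X6 from Table 1b (Takapoto).
mean = [1070.41, 158.83, 370.73, 250.39, 101.65, 152.05]
sd = [144.5, 73.4, 134.1, 144.5, 57.9, 90.3]
a, b = fit_power_law(mean, sd)
print(f"SD ~ {a:.2f} * mean^{b:.2f}")

# Sanity check on exact synthetic data generated from sd = 2 * mean**0.5.
x = np.array([1.0, 4.0, 9.0, 16.0])
a_check, b_check = fit_power_law(x, 2.0 * np.sqrt(x))
```

Fitting in log space weights relative rather than absolute deviations, which suits flows that span several orders of magnitude.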
After the exclusion of three constant flows and six highly correlated flows, 17 flows remained for use in the analysis.
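The screening step can be sketched as follows, under stated assumptions. The function name `screen_flows`, the cut-off `corr_threshold=0.8` (the lowest correlation among the excluded pairs above) and the rule of keeping the first flow of each correlated pair are all hypothetical choices made for illustration; the text does not state a formal rule.

```python
import numpy as np

def screen_flows(samples, names, corr_threshold=0.8, tol=1e-12):
    """Return (kept, dropped_constant, dropped_correlated) flow names.

    samples: array of shape (n_solutions, n_flows), one sampled food web per row.
    corr_threshold: hypothetical cut-off for a "very high" correlation.
    """
    samples = np.asarray(samples, dtype=float)
    # Step 1: constant (fixed) flows have zero variance, so no correlation is defined.
    is_constant = samples.std(axis=0) <= tol
    dropped_constant = [n for n, c in zip(names, is_constant) if c]
    varying = np.flatnonzero(~is_constant)

    # Step 2: walk through the remaining flows and drop any flow that is highly
    # correlated (in absolute value) with a flow that has already been kept.
    corr = np.atleast_2d(np.corrcoef(samples[:, varying], rowvar=False))
    kept_pos, dropped_correlated = [], []
    for pos in range(len(varying)):
        if any(abs(corr[pos, k]) >= corr_threshold for k in kept_pos):
            dropped_correlated.append(names[varying[pos]])
        else:
            kept_pos.append(pos)
    kept = [names[varying[k]] for k in kept_pos]
    return kept, dropped_constant, dropped_correlated

# Toy example: a and b sum to a fixed total, c is independent, d is a fixed flow.
rng = np.random.default_rng(0)
a = rng.uniform(0.0, 10.0, size=500)
b = 10.0 - a
c = rng.uniform(0.0, 5.0, size=500)
d = np.full(500, 147.0)
kept, dropped_constant, dropped_correlated = screen_flows(
    np.column_stack([a, b, c, d]), ["a", "b", "c", "d"]
)
print(kept, dropped_constant, dropped_correlated)
```

In the toy example the fixed flow is removed first, and the second member of the fixed-sum pair is then removed because it carries no information beyond the first.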
3.3.1.2. Takapoto Atoll data. Four constant flows were excluded from the factor analysis, namely: grazing of mesozooplankton, micro- and mesozooplankton respiration, and grazing of mesozooplankton on microzooplankton (Table 1b). Four further flows were excluded because they were highly correlated with one or more other flows (Fig. 1b): gross primary production of phytoplankton, X1 (ρ = 1 with DOC release by phytoplankton, X4); microzooplankton flow to detritus, X13 (ρ = 1 with its own DOC release, X14); mesozooplankton flow to detritus, X15 (ρ = 1 with its own DOC release, X16); and protozoan consumption of bacteria, X20 (ρ = 1 with bacterial flow to detrital matter, X18). These high correlations are also attributed to the constraints imposed on the basis of field measurements. Thus, a total of eight flows were excluded and 24 flows remained for the analysis.

3.3.2. Testing goodness of fit of FA

The KMO value is 0.25 for both data sets, indicating that there is a 25% possibility of groups of flows explaining a similar 'thing'. Although this value appears to be low, in both cases Bartlett's test of sphericity shows a high level of significance (P < 0.01), indicating that some of the retained flows are highly related. This lends credence to the use of FA as a suitable method for analyzing the flows with similar patterns in a wide spectrum of probable food web solutions.

3.3.3. Factor extraction and rotation
3.3.3.1. The Gulf of Riga food web. Principal component analysis (PCA) was used as a first extraction method in the FA. By considering only components with eigenvalues greater than one, the PCA extracted eight principal components. These explain about 91% of the total variance. Of this, the first factor accounts for 20%, followed by factor two with 18.6%. Summary results of the subsequent factor rotation are shown in Table 2. We selected only flows with factor loadings of over 0.5 as being highly correlated, i.e. explaining the same thing. The diagrammatic representation of the sub-food webs described by each factor component is shown in Fig. 3. As mentioned earlier, six highly correlated flows were removed before the factor analysis. By factor analysis, we were able to pinpoint four other dualities, expressed as sub-food webs in the form of factor components. The first factor component represents carbon loss by the larger autotrophs to three main compartments: DOC, detritus and zooplankton. In this carbon pathway, the loss to DOC is negatively correlated with the loss to detritus and also with the grazing by zooplankton. In contrast, the flow from the autotrophs to detritus and the grazing of autotrophs by zooplankton are positively correlated. Thus, either carbon from the
Table 2. Total variance explained after the varimax rotation of the Gulf of Riga planktonic ecosystem

Factor      Flow description (from → to)                  Rotation sums of squared loadings
component                                                 Total    % of variance   Cum. %
1           X10: Larger autotrophs → zooplankton (a)      3.379    19.878          19.878
            X8: Larger autotrophs → detritus (a)
            X9: Larger autotrophs → DOC (b)
2           X15: Zooplankton → detritus (a, c)            2.895    17.031          36.909
            X14: Zooplankton → CO2 (a)
            X16: Zooplankton → DOC (b)
            X13: Nanoflagellates → zooplankton (c)
            X12: Nanoflagellates → DOC (d)
3           X4: Picoautotrophs → DOC (a)                  2.002    11.774          48.683
            X6: Picoautotrophs → zooplankton (b)
4           X11: Nanoflagellates → CO2 (a)                1.919    11.291          59.975
            X5: Picoautotrophs → nanoflagellates (a)
5           X7: Larger autotrophs → CO2 (a)               1.756    10.332          70.307
            X17: Bacteria → CO2 (b)
6           X3: Picoautotrophs → CO2                      1.236     7.268          77.575
7           X1: CO2 → picoautotrophs                      1.173     6.902          84.476
8           X20: Detritus → DOC                           1.066     6.270          90.746

The flow symbols Xa refer to the symbols used in Fig. 1a. Only factors with eigenvalues greater than one are shown. Flows with the same letter (in parentheses) within a component are positively correlated; flows with different letters are negatively correlated.
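The quantities reported in Table 2 (rotation sums of squared loadings, percentage of variance and cumulative percentage) can be reproduced in outline as follows. This is a minimal sketch on synthetic data, not the statistical package used for the paper: the function names `varimax` and `rotated_variance_table` are hypothetical, the varimax routine is the common SVD-based iteration, and the 0.5 loading cut-off mirrors the one quoted in the text.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of an (n_flows x n_factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # SVD-based update of the rotation matrix (standard varimax iteration).
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        if criterion > 0 and s.sum() < criterion * (1 + tol):
            break
        criterion = s.sum()
    return loadings @ rotation

def rotated_variance_table(samples):
    """PCA on the correlation matrix, eigenvalue > 1 rule, then varimax rotation."""
    corr = np.corrcoef(samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]               # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                            # retain eigenvalues greater than one
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    rotated = varimax(loadings)
    ssl = (rotated**2).sum(axis=0)                  # rotation sums of squared loadings
    order = np.argsort(ssl)[::-1]
    rotated, ssl = rotated[:, order], ssl[order]
    pct = 100.0 * ssl / corr.shape[0]               # % of total variance
    return rotated, ssl, pct, np.cumsum(pct), eigvals[keep]

# Synthetic "solutions": seven flows driven by two hidden sources of variation.
rng = np.random.default_rng(1)
f = rng.normal(size=(1000, 2))
signs = np.array([1, -1, 1, 1, 1, -1, 1])
samples = np.column_stack([f[:, 0]] * 4 + [f[:, 1]] * 3) * signs
samples = samples + 0.3 * rng.normal(size=samples.shape)

rotated, ssl, pct, cum, kept_eigvals = rotated_variance_table(samples)
for j in range(rotated.shape[1]):
    members = np.flatnonzero(np.abs(rotated[:, j]) > 0.5)
    print(f"factor {j + 1}: flows {members.tolist()}, {pct[j]:.1f}% of variance")
```

Because the rotation is orthogonal, it redistributes the retained variance among the factors without changing its total, so the final cumulative percentage is the same before and after rotation.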
autotrophs is channeled through DOC at the expense of the grazing and detrital losses, or vice versa. The second factor shows two paths: one comprises the loss of carbon by zooplankton to DOC, detritus and respiration, and the second is a loop describing the loss from nanoflagellates to zooplankton and DOC and the loss from zooplankton to DOC. In the first path, carbon loss to DOC by zooplankton is negatively correlated with respiration and with loss to detritus, whereas respiration and flow to detritus are positively correlated. In the second path, the nanoflagellates' loss to DOC is negatively correlated with the grazing by zooplankton on nanoflagellates and with the loss of zooplankton to detritus. There is, however, a positive correlation between the grazing of nanoflagellates by zooplankton and the loss of zooplankton to DOC. This may indicate that most of the nanoflagellate carbon makes its way to the DOC pool by being grazed by zooplankton. The third factor is similar to the first, but now centers around the small autotrophs. It represents the loss of carbon by picoautotrophs into two major compartments, to DOC and to grazing by zooplankton (there is no flow between picoautotrophs and detritus). The grazing of zooplankton on picoautotrophs is negatively correlated with the picoautotroph flow to DOC, i.e. the more zooplankton graze on picoautotrophs, the less carbon is lost to DOC, and vice versa. The fourth factor represents the carbon flow from picoautotrophs to nanoflagellates and the respiration of nanoflagellates. These flows are positively correlated, meaning that the more nanoflagellates graze on picoautotrophs, the more they (nanoflagellates) respire. The fifth factor relates bacterial respiration to the respiration of the larger autotrophs. The other factors do not distinctively identify groups of other related flows. The existence of these sub-food webs shows that the data used in the inverse analysis do not allow discrimination among them. However, it also means that simply measuring one of the flows will constrain the other flows of the sub-food web, as they are strongly correlated. Therefore, it gives insight into the sampling strategy that might further constrain the food web model.

Fig. 3. Diagrammatic representation of the flows in each factor component for the Gulf of Riga: P1 = picoautotrophs, P2 = larger autotrophs, Zoo = zooplankton, Det = Detritus, Nan = nanoflagellates, CO2 = respiration.

3.3.3.2. The Takapoto Atoll lagoon food web. Considering components with eigenvalues greater than one, the principal component analysis method extracted
eleven principal components. However, the sixth and tenth factors each comprised only a single flow. In essence there were thus nine factor components explaining about 77% of the total variance. Of this, factor 1 accounts for about 12%. The summary results of the factor rotation are shown in Table 3. The first factor represents the relationship between protozoa grazing on detritus and the dissolution of detritus into DOC. The positive relationship between the two indicates that detritus losses are almost equally proportioned between these two paths. The second factor links protozoa excretion, respiration and egestion. Protozoa excretion and respiration are significantly positively correlated, whilst they are both significantly negatively correlated with egestion. The third factor describes two flow paths. The positive relation between the flow from phytoplankton to detritus and the flow from detritus to microzooplankton suggests that most of the carbon that enters the detrital compartment from phytoplankton is grazed on by microzooplankton. This flow path (phytoplankton–detritus–microzooplankton) is negatively correlated with the microzooplankton
Table 3. Total variance explained after the varimax rotation of the Takapoto Atoll lagoon ecosystem

Factor      Flow (from → to)                              Rotation sums of squared loadings
component                                                 Total    % of variance   Cumulative %
1           X22: Detritus → DOC (a)                       2.893    12.055          12.055
            X23: Detritus → protozoa (a)
2           X9: Protozoa → detritus (a)                   2.496    10.401          22.455
            X10: Protozoa → DOC (b)
            X8: Protozoa → CO2 (a)
3           X25: Detritus → microzooplankton (a)          2.437    10.152          32.608
            X6: Phytoplankton → microzooplankton (b)
            X3: Phytoplankton → detritus (a)
4           X24: Detritus → mesozooplankton (a)           2.294     9.557          42.164
            X5: Phytoplankton → mesozooplankton (b)
5           X27: DOC → bacteria (a)                       2.174     9.059          51.223
            X17: Bacteria → CO2 (a)
            X4: Phytoplankton → DOC (a)
6           X2: Phytoplankton → CO2                       2.032     8.466          59.690
7           X26: Detritus → sedimentation (a)             2.007     8.361          68.051
            X21: Detritus → bacteria (b)
            X19: Bacteria → DOC (b)
8           X12: Protozoa → microzooplankton (a)          1.603     6.679          74.730
            X11: Protozoa → mesozooplankton (b)
9           X7: Phytoplankton → protozoa (a)              1.515     6.313          81.043
            X28: DOC → protozoa (a)
10          X18: Bacteria → detritus                      1.237     5.155          86.198
11          X16: Mesozooplankton → DOC (a*)               1.075     4.479          90.677
            X14: Microzooplankton → DOC (a*)

The flow symbols Xa refer to the symbols used in Fig. 1b. Only factors with eigenvalues greater than one are shown. Flows with the same letter (in parentheses) within a component are positively correlated; flows with different letters are negatively correlated. * indicates non-significant correlations at the α = 0.05 level.
grazing on phytoplankton. Indeed, microzooplankton gets its carbon from either of the two paths. The fourth factor links mesozooplankton grazing on detritus and on phytoplankton. The negative relationship demonstrates selective grazing on these two food sources. Note that, according to the estimated average flows, mesozooplankton feeds more on detritus than on phytoplankton, whilst the parsimonious result indicates the contrary. The fifth factor is a path that links phytoplankton excretion of DOC to the uptake of DOC by bacteria and the respiration of bacteria. All flows are positively correlated; hence the more phytoplankton excretes, the higher the uptake of DOC by bacteria and the more the bacteria respire. Factor seven relates the sedimentation of detritus, the use of detritus by bacteria and the excretion of bacteria. The relationships between the flows reveal a duality concerning the fate of detritus, which is either taken up by bacteria, followed by bacterial excretion, or sediments out of the system. The eighth factor shows a negative correlation between the grazing on protozoa by microzooplankton and by mesozooplankton, indicating competition between the two grazers, whilst the ninth component shows a negative correlation between protozoa grazing on phytoplankton and on DOC. Finally, in the eleventh factor the excretions of DOC by micro- and mesozooplankton are positively correlated, but this correlation is not significant. Since these are known to be independent flows, it is safe to conclude that the positive correlation is attributable to random error in the data.

4. Conclusion

The linear inverse method is still one of the best methods for estimating unmeasured flows of a food web in undersampled environments. However, the structural form of the inverse method often results in underdetermined sets of equations whose solutions are not unique.
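The non-uniqueness can be made concrete with a toy system. Everything in the sketch below is an assumption made for illustration: the matrix `A` and vector `b` are hypothetical mass balances, the minimum-norm solution is taken from the pseudo-inverse of the equalities alone, and the crude rejection sampler is not the Monte Carlo scheme used in this study.

```python
import numpy as np

# Hypothetical toy web: three unknown flows x1..x3 and two measured sums.
A = np.array([[1.0, 1.0, 0.0],    # x1 + x2 = 10
              [0.0, 1.0, 1.0]])   # x2 + x3 = 6
b = np.array([10.0, 6.0])

# Minimum-norm solution of A x = b via the pseudo-inverse (non-negative here).
x_min = np.linalg.pinv(A) @ b

# Null space of A: directions that can be added without violating A x = b.
_, s, vt = np.linalg.svd(A)
rank = int((s > 1e-10).sum())
null_basis = vt[rank:].T                      # shape (3, 1) for this system

# Crude rejection sampler: draw null-space coefficients uniformly and keep
# only the solutions in which every flow is non-negative.
rng = np.random.default_rng(0)
coeffs = rng.uniform(-15.0, 15.0, size=(20000, null_basis.shape[1]))
candidates = x_min + coeffs @ null_basis.T
solutions = candidates[(candidates >= 0.0).all(axis=1)]

x_mean = solutions.mean(axis=0)
x_sd = solutions.std(axis=0, ddof=1)
print("minimum norm:", np.round(x_min, 2))
print("mean:        ", np.round(x_mean, 2))
print("SD:          ", np.round(x_sd, 2))
```

Every accepted row satisfies the same equations, so the data alone cannot single one out. Because the constraints are linear, the average of the accepted solutions satisfies them as well, while the spread of the rows gives a standard deviation for each flow.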
The underdeterminacy of the model equations has led many modelers to resort to the principle of parsimony which assumes that the best solution is the simplest. Here we have challenged this view, suggesting that the mean of the infinite number of solutions may provide a better alternative. The parsimonious solution has the advantage that it is straightforward to calculate, and it assures comparability between different systems. However, as we show here, it underestimates or even overestimates the values of some flows of a food web. Some flows are plainly zeroed, i.e. they do not exist in the parsimonious solution, even
where in reality it is known that they do exist. This underestimation raises critical questions about the reliability of the conclusions derived from such solutions. Although the true values of the flows are not known, the parsimonious food web is a deterministic solution that is as close as possible to a web with all flows zeroed. Using the (minimization) criterion only to single out a particular solution of the null space reduces the power of the inverse method. If there are reasons to assume that one solution is better than another, then this solution should enter as explicit constraints in the inversion process, and the null space should be treated "honestly" as null space about which we cannot say anything. It is in this view that we recommend that, where there are no good reasons to justify the use of the parsimony principle during the inverse analysis, good practice is to also report the average (and standard deviations) of a sufficiently large number of randomly generated solutions of the food web. We have demonstrated that, for the two cases considered, the average value also satisfies all the constraints that are imposed on the model. The averaging method produces more realistic estimates of the flows than the parsimonious solution, i.e. it does not generate null flows. Moreover, in contrast to the parsimonious solution, which is a single value, the randomly generated solutions can be used to estimate uncertainties (standard deviations) of the flows. In addition, we argue that a food web analysis should discuss the realm of the infinite number of solutions. The method of FA has been used to identify groups of flows in different food web solutions that are highly correlated. The correlated flows are grouped into factors. For simple inverse models, factor analysis may not be necessary, as its main purpose is to reduce the dimensionality of the food web and/or identify sets of related flows in a multivariate space.
However, for more complex models, the importance of this grouping is to prompt explanations as to why certain patterns of flows are inter-related. In this way, researchers may be guided on the direction that research should take in order to explain the obtained scenario. The results of FA may also be used to obtain better constrained food webs. The piecemeal analysis of the food web in terms of the factors in turn aids in giving more insight into otherwise complex food webs.

Acknowledgements

This work was supported by the VLIR-IUC-UON research grant to J. Kones. We would like to thank Ann Vanreusel for critically reading the manuscript. This is
publication 3751 of the Netherlands Institute of Ecology (NIOO-KNAW). Temel Oguz and one anonymous reviewer are thanked for many useful suggestions and additions.