Ecological Modelling 189 (2005) 305–314
Artificial neural network application for multi-ecosystem carbon flux simulation
Assefa M. Melesse a,∗, Rodney S. Hanley b
a Department of Environmental Studies, Florida International University, 11200 SW 8th Street, Miami, FL 33199, USA
b Department of Earth System Science and Policy and Earth Systems Science Institute, Box 9007, University of North Dakota, Grand Forks, North Dakota 58202-9007, USA
Received 11 June 2004; received in revised form 9 February 2005; accepted 29 March 2005
Available online 9 June 2005
Abstract
The need for carbon dioxide (CO2) flux estimates covering larger areas, together with the limitations of the point-based eddy covariance technique in meeting this requirement, necessitates modeling CO2 flux from other micrometeorological variables. The non-linearity of the relationship between CO2 flux and other micrometeorological flux parameters, such as energy fluxes, limits the ability of carbon flux models to estimate the flux dynamics accurately. Black box models such as the artificial neural network (ANN) provide a mathematically flexible structure to identify the complex non-linear relationship between inputs and outputs without attempting to explain the nature of the phenomena. A multilayer perceptron ANN technique with an error back propagation algorithm was applied to a CO2 flux simulation study on three different ecosystems (forest, grassland and wheat). Energy fluxes (net radiation, latent heat, sensible heat and soil heat flux) and temperature (air and soil) were used to train the ANN and predict the flux of CO2. Diurnal hourly flux data from 15 days of observations were divided into training and testing sets. Results of the CO2 flux simulation show that the technique can successfully predict the observed values, with R2 values between 0.75 and 0.94. Predictions for the forest and wheat field sites show higher promise than those for the grassland site. The technique is reliable and efficient, and is highly relevant for estimating regional or global CO2 fluxes from point measurements and for understanding the spatiotemporal budget of the CO2 fluxes. © 2005 Elsevier B.V. All rights reserved.
Keywords: Artificial neural network; CO2; AmeriFlux; Energy flux
1. Introduction
∗ Corresponding author. Tel.: +1 305 348 6518; fax: +1 305 348 6137. E-mail address:
[email protected] (A.M. Melesse).
0304-3800/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2005.03.014
Knowledge of the amount of carbon dioxide (CO2 ) flux into and out of the atmosphere is important for understanding carbon sinks and sources. Monitoring, mapping and modeling of carbon fluxes in different
A.M. Melesse, R.S. Hanley / Ecological Modelling 189 (2005) 305–314
terrestrial ecosystems are essential for understanding the contribution of these ecosystems to the global carbon budget. This information will be useful for decision making regarding various carbon-related climate change mitigation strategies. It is also imperative that climate change policies and future CO2 flux reduction strategies benefit from this information.

The net carbon exchange of terrestrial ecosystems is the result of a balance between uptake (photosynthesis) and losses (respiration). This balance has diurnal, seasonal, and annual variability. Land-use changes (afforestation, reforestation, and deforestation) affect the net carbon exchange and this balance by determining the rate of soil organic carbon storage and decomposition (Valentini, 2003). AmeriFlux and similar carbon flux monitoring networks were established to address these issues.

1.1. AmeriFlux

The AmeriFlux network of flux towers was established to quantify variation in carbon dioxide and water vapor exchange between terrestrial ecosystems and the atmosphere, and to understand the underlying mechanisms responsible for observed fluxes and carbon pools. Similar regional networks (CarboEurope, AsiaFlux, OzFlux, and Fluxnet Canada) participate in synthesis activities across larger geographic areas (Baldocchi et al., 2001; Law et al., 2002). The specific objectives of AmeriFlux¹ are to quantify the magnitude of net annual CO2 exchange in major ecosystem/biome types (natural and managed), determine the response of CO2 fluxes to changes in environmental factors and climate, provide information on processes controlling CO2 flux and net ecosystem productivity, and provide site-specific calibration and verification data for process-based CO2 flux models. The AmeriFlux network uses eddy covariance flux measurement techniques.

¹ http://public.ornl.gov/ameriflux/about-objective.shtml.

1.2. Eddy covariance techniques

The eddy covariance technique determines the exchange rate of CO2 across the interface between the atmosphere and a plant canopy by measuring the covariance between fluctuations in vertical wind velocity and CO2 mixing ratio. The method is most accurate when the atmospheric conditions (wind, temperature, humidity, CO2) are steady, the underlying vegetation is homogeneous, and the site lies on flat terrain for an extended distance upwind.

Recently, the eddy covariance technique has emerged as an alternative way to assess ecosystem carbon exchange (Running et al., 1999; Canadell et al., 2000; Geider et al., 2001). The most important reasons are: (1) it is a scale-appropriate method, because it allows ecosystem scientists to assess the net CO2 exchange of a whole ecosystem; (2) it produces a direct measure of net CO2 exchange across the canopy–atmosphere interface, using micrometeorological theory to interpret measurements of the covariance between vertical wind velocity and scalar concentration fluctuations (Baldocchi et al., 1988; Verma, 1990; Desjardins, 1991; Lenschow, 1995); (3) the area sampled with this technique, called the flux footprint, has longitudinal dimensions ranging between 100 m and a few kilometers (Schmid, 1994); and (4) the technique is capable of measuring ecosystem CO2 exchange across a spectrum of time scales, ranging from hours to years (Baldocchi et al., 2001; Wofsy et al., 1993).

Eddy covariance CO2 flux data represent point measurements with a footprint ranging from a few meters to a kilometer, depending on the tower height. Understanding CO2 flux at other locations requires either observations at a large number of points or a modeling approach. Resource and practical limitations on monitoring large areas justify a modeling approach to understanding carbon fluxes.
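The covariance calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the processing chain used at the AmeriFlux sites: the function name `eddy_flux`, the sample values, and the sign convention (upward flux positive) are assumptions made for the example, and corrections applied in practice are omitted.

```python
def eddy_flux(w, c):
    """Covariance of vertical wind velocity w and scalar concentration c.

    Both series must cover the same averaging period, sample by sample.
    """
    if len(w) != len(c) or not w:
        raise ValueError("w and c must be non-empty and of equal length")
    w_mean = sum(w) / len(w)
    c_mean = sum(c) / len(c)
    # Fluctuation = instantaneous value minus the mean over the period
    products = [(wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c)]
    return sum(products) / len(products)

# Hypothetical samples: updrafts (w > 0) carry CO2-depleted air
w = [0.2, -0.1, 0.3, -0.4, 0.0]       # vertical wind velocity, m/s
c = [14.9, 15.1, 14.8, 15.3, 15.0]    # CO2 concentration, mmol/m^3
flux = eddy_flux(w, c)                # negative under the assumed convention
```

With upward flux taken as positive, a negative result corresponds to net CO2 transport toward the canopy during the averaging period.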
Because CO2 flux is correlated with the energy fluxes and with environmental variables such as temperature (air, soil, and surface) and soil moisture content, machine learning techniques can be applied to CO2 flux simulation. Artificial neural networks (ANNs) are among the pattern recognition techniques capable of modeling non-linear processes. The application of ANN techniques to ecosystem carbon estimation is one of the new areas of data-driven modeling that exploits the relationships among micrometeorological variables. A properly designed input–output neural network for carbon flux simulation can learn from the data and provide a reasonable estimate of carbon flux. Van Wijk and Bouten (1999), Van Wijk et al. (2002) and Papale and Valentini (2003) have demonstrated the potential of ANNs for studying carbon flux dynamics in European forest ecosystems.

1.3. Artificial neural networks

Neural networks use machine learning based on the concept of self-adjustment of internal control parameters. An artificial neural network is a non-parametric attempt to model the human brain. Artificial neural networks are flexible mathematical structures that are capable of identifying complex non-linear relationships between input and output data sets. The main differences between the various types of ANNs are the arrangement of neurodes (network architecture) and the many ways to determine the weights and functions for inputs and neurodes (training) (Caudill and Butler, 1992). The multilayer perceptron (MLP) neural network is designed to model non-linear phenomena well. A feed forward MLP network consists of an input layer and an output layer, each with some number of neurons, and one or more hidden layers between them, each also with some number of neurons. The artificial neuron in a typical ANN architecture (Fig. 1) receives a set of inputs or signals (x) with weights (w), calculates a weighted sum of them (z) using the summation function and then uses some activation
Fig. 1. A typical multilayer perceptron ANN architecture.
function (f) to produce an output (y), where

$$z = \sum_{i=1}^{n} x_i w_i \tag{1}$$
The connections between the input layer and the middle or hidden layer contain weights, which are usually determined through training the system. The hidden layer sums the weighted inputs and uses the transfer function to create an output value. The transfer function (local memory) is a relationship between the internal activation level of the neuron (called the activation function) and the outputs. A typical transfer function is the sigmoid function (Eq. (2)), which varies from 0 to 1 for a range of inputs (Caudill and Butler, 1992). A function f(x) is a sigmoid function if it is bounded and its value always increases as x increases (Smith, 1993). A number of different functions have these characteristics and thus qualify as sigmoid functions. The sigmoid logistic non-linear function, shown in Fig. 2, is

$$f(x) = \frac{1}{1 + e^{-x}} \tag{2}$$
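Eqs. (1) and (2) together define the output of a single neuron. The sketch below evaluates them in Python; the function names `sigmoid` and `neuron_output` and the input and weight values are hypothetical, chosen only to illustrate the calculation for six scaled inputs.

```python
import math

def sigmoid(x):
    """Logistic transfer function of Eq. (2); output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights):
    """Weighted sum of Eq. (1) followed by the sigmoid transfer function."""
    z = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(z)

# Hypothetical scaled inputs (e.g. six normalized flux and temperature values)
x = [0.5, 0.1, 0.3, 0.05, 0.6, 0.4]
# Hypothetical weights; in practice these are found by training
w = [0.2, -0.4, 0.1, 0.3, -0.1, 0.25]
y = neuron_output(x, w)
```

Because the sigmoid is bounded, inputs and targets are usually scaled to a comparable range before training; the scaling used here is an assumption of the example.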
Fig. 2. Graphical description of the sigmoid logistic transfer function.
In time series prediction, supervised training is used, in which the ANN is trained to minimize the difference between the network output and the target (observed) values. Training is therefore the process of weight adjustment that seeks a desirable outcome with least squares residuals. The most common training algorithm used in the ANN literature is back propagation (BP).

1.4. BP algorithm

Back propagation, or backprop, was developed and popularized by Rumelhart et al. (1986) and is the most widely implemented of all neural network paradigms. It is based on a multi-layered feed forward topology with supervised learning. Back propagation uses a type of gradient descent method, following the slope of the error surface downwards toward its minimum. If the subscripts i (i = 1, 2, ..., I) and j (j = 1, 2, ..., J) identify a particular neuron, k (k = 1, 2, ..., K) identifies the layer and the superscript n (n = 1, 2, ..., N) denotes the step of adjustment, the change in weights is given by Eq. (3). To start the process, small random numbers are set as weights and then Eqs. (3)–(7) are used to adjust them:

$$\Delta w^{n+1}_{ij,k} = \eta\,\delta_{j,k}\,y_{i,k-1} \tag{3}$$

For output layer k, the value of δ for neuron j in layer k can be computed as

$$\delta_{j,k} = y_{j,k}\,(1 - y_{j,k})\,(y^{T}_{j,k} - y_{j,k}) \tag{4}$$

For hidden layer k:

$$\delta_{j,k} = y_{j,k}\,(1 - y_{j,k}) \sum_{i=1}^{k+1} \left(w_{i,j,k}\,\delta_{i,k+1}\right) \tag{5}$$

where $\Delta w^{n+1}_{ij,k}$ and $w^{n+1}_{ij,k}$ are the change in weight and the weight value between neurons i and j in layer k at step n + 1, respectively, $\eta$ is the learning parameter (usually between 0.01 and 1.0), $y_{j,k}$ is the output of neuron j in layer k, and $y^{T}_{j,k}$ is the target value for $y_{j,k}$.

A high learning parameter corresponds to rapid learning, which may push the training toward a local minimum or cause oscillation. In turn, when small learning rates are applied, the time to reach a global minimum increases considerably (Khan et al., 1993). The momentum factor can help in choosing the learning rate. The role of the momentum term is to smooth out the weight changes, which helps to protect network learning from oscillation (Anderson et al., 1993). Inclusion of a momentum constant helps achieve a faster adjustment in the BP algorithm. It adds an autoregressive feature to the algorithm, so that it now has a "memory" of previous adjustments. The algorithm is therefore modified slightly by adding a new term that is proportional to the weight adjustment of the previous step:

$$\Delta w^{n+1}_{ij,k} = \eta\,\delta_{j,k}\,y_{i,k-1} + \alpha\,(\Delta w^{n}_{ij,k}) \tag{6}$$

where $\alpha$ is a momentum constant. The new values of the weights are computed as

$$w^{n+1}_{ij,k} = w^{n}_{ij,k} + \Delta w^{n+1}_{ij,k} \tag{7}$$
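The update rules of Eqs. (3)–(7) can be sketched for a small network as follows. This is a hedged illustration under several assumptions not taken from the study: one hidden layer, a single output neuron (so the sum in Eq. (5) has one term), no bias terms, and hypothetical names (`train_step`, `w_hid`, `w_out`), data and parameter values.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(x, target, w_hid, w_out, dw_hid, dw_out, eta=0.5, alpha=0.5):
    """One weight adjustment; returns the squared error before the update."""
    # Forward pass: hidden outputs, then the single network output
    h = [sigmoid(sum(xi * w for xi, w in zip(x, row))) for row in w_hid]
    y = sigmoid(sum(hj * w for hj, w in zip(h, w_out)))
    # Eq. (4): delta of the output neuron
    d_out = y * (1.0 - y) * (target - y)
    # Eq. (5): hidden deltas, computed with the weights before they change
    d_hid = [hj * (1.0 - hj) * w_out[j] * d_out for j, hj in enumerate(h)]
    # Eqs. (6) and (7) for the hidden-to-output weights
    for j, hj in enumerate(h):
        dw_out[j] = eta * d_out * hj + alpha * dw_out[j]
        w_out[j] += dw_out[j]
    # Eqs. (6) and (7) for the input-to-hidden weights
    for j, row in enumerate(w_hid):
        for i, xi in enumerate(x):
            dw_hid[j][i] = eta * d_hid[j] * xi + alpha * dw_hid[j][i]
            row[i] += dw_hid[j][i]
    return (target - y) ** 2

random.seed(0)
n_in, n_hid = 2, 3
# Small random starting weights, as described in the text
w_hid = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hid)]
w_out = [random.uniform(-0.1, 0.1) for _ in range(n_hid)]
# Previous weight changes start at zero (no momentum on the first step)
dw_hid = [[0.0] * n_in for _ in range(n_hid)]
dw_out = [0.0] * n_hid
# Repeatedly present one hypothetical scaled pattern and record the error
errors = [train_step([0.8, 0.3], 0.9, w_hid, w_out, dw_hid, dw_out)
          for _ in range(300)]
```

Setting `alpha` to zero recovers the plain update of Eq. (3). A real application would cycle through many training patterns and monitor the error on a separate testing set rather than on a single pattern.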
Van Wijk and Bouten (1999) used ANNs in a top–down approach to model CO2 and water fluxes from six different coniferous forests in Europe. The study used global radiation, air temperature and vapor pressure deficit to predict carbon flux. The results indicate that independent predictions of forest ecosystem fluxes were as satisfactory as those of empirical models, and that both the water and carbon fluxes can be modeled without detailed physiological and site-specific information. In another study, Van Wijk et al. (2002) used similar input parameters and compared four data-driven models for carbon dioxide estimation in European forests. Papale and Valentini (2003) also used eddy covariance flux data to train an ANN and provide weekly carbon fluxes at 1 km × 1 km resolution for European forests at continental scale. They concluded that the ANN approach can be useful for gap filling and carbon flux spatialization. In the above studies, partitioned energy fluxes were not used in the neural network design. The effects of land cover and of the partitioning of net radiation into soil heat flux, sensible heat and latent heat on the carbon flux were not the focus of study. In ecosystems where soil respiration is greater and vegetative cover is minimal, soil temperature, soil heat flux and sensible heat are useful in correlating energy fluxes to carbon dynamics. The studies also applied ANNs only to forest ecosystems; application of the technique to others, such as grassland and cropland ecosystems, was not considered.
A neural network capable of correlating energy fluxes to carbon flux has the potential to be used for spatial mapping of the latter using data from remote sensing. Using data from the visible, near-infrared and thermal infrared bands of remote sensing satellites and the surface energy budget method, spatial maps of energy fluxes (net radiation, sensible heat, latent heat and soil heat flux) can be produced at different spatiotemporal scales. These can be easily coupled with the ANN technique to provide a spatial map of carbon flux. In this study, the application of ANN to three different ecosystems (forest, grassland and cropland) using partitioned energy fluxes and air and soil temperature as input variables to predict carbon flux is presented.

Fig. 3. The study sites on the map of USA.
1.5. Objectives

The overall objective of this study is to evaluate the performance of ANN modeling of CO2 flux from energy flux and temperature (air and soil). The specific objectives of this modeling exercise are to (1) assess the applicability of ANN-based CO2 flux simulation for various ecosystems, (2) identify micrometeorological variables that correlate closely with carbon flux, and (3) evaluate the performance of the prediction and compare the modeled flux to observed values.

2. Methods

2.1. Site descriptions

Three different ecosystems (forest, grassland, and cropland) at different locations (Morgan-Monroe, Indiana; Fort Peck, Montana; Ponca City, Oklahoma) were used in this study (Fig. 3). The sites are part of the AmeriFlux project, where CO2, water vapor, energy fluxes and other biospheric flux and micrometeorological data are monitored continuously. Flux and micrometeorological data from these sites were used to train the neural network and predict CO2 flux. The site description, environmental variables monitored and instrumentation for each site are given in Table 1.

2.2. Morgan-Monroe (mixed forest)

275 m above mean sea level, amsl), is an extensively managed forest with a total area of 95.3 km2. The site has low topography with 0.23% slope. The area has a variety of forest, with more than 75% of the total made up of sugar maple (Acer saccharum), tulip poplar (Liriodendron tulipifera), sassafras (Sassafras albidum), white oak (Quercus alba), and black oak (Quercus nigra), based on the basal areas of the species (Ehman et al., 2002). Schmid and Lloyd (1999) and Schmid et al. (2000) provide details of data processing and other measurements conducted at the site.

2.3. Fort Peck (grassland)

The station, monitored by the Oak Ridge National Laboratory (ORNL), is located on the Fort Peck Indian Reservation (48°18.473′N, 105°6.032′W) in Montana. The site has a temperate climate with relatively flat terrain at an elevation of 634 m amsl. The flux tower2, having a height of 3 m, was installed in November 1999 on grassland and monitors micrometeorology on a continuous basis.

2.4. Ponca City (winter wheat)

The wheat field, covering an area of 25 ha, is located about 16 km north of Ponca City, Oklahoma (36°45′N, 97°5′W) at an elevation of 310 m amsl. Established in 1996, the site has a flat topography (