TECHNICAL REPORTS: SURFACE WATER QUALITY

Predicting Water Quality in Unmonitored Watersheds Using Artificial Neural Networks

Latif Kalin* and Sabahattin Isik, Auburn University
Jon E. Schoonover, Southern Illinois University
B. Graeme Lockaby, Auburn University

Land use and land cover (LULC) play a central role in fate and transport of water quality (WQ) parameters in watersheds. Developing relationships between LULC and WQ parameters is essential for evaluating the quality of water resources. In this paper, we present an artificial neural network (ANN)–based methodology to predict WQ parameters in watersheds with no prior WQ data. The model relies on LULC percentages, temperature, and stream discharge as inputs. The approach is applied to 18 watersheds in west Georgia, United States, having a LULC gradient and varying in size from 2.96 to 26.59 km². Out of 18 watersheds, 12 were used for training, 3 for validation, and 3 for testing the ANN model. The WQ parameters tested are total dissolved solids (TDS), total suspended solids (TSS), chlorine (Cl), nitrate (NO3), sulfate (SO4), sodium (Na), potassium (K), total phosphorus (TP), and dissolved organic carbon (DOC). Model performances are evaluated on the basis of a performance rating system whereby performances are categorized as unsatisfactory, satisfactory, good, or very good. Overall, the ANN models developed using the training data performed quite well in the independent test watersheds. Based on the rating system, TDS, Cl, NO3, SO4, Na, K, and DOC had a performance of at least “good” in all three test watersheds. The average performance for TSS and TP in the three test watersheds was “good.” Overall, the model performed better in the pastoral and forested watersheds, with an average rating of “very good.” The average model performance at the urban watershed was “good.” This study showed that if WQ and LULC data are available from multiple watersheds in an area with relatively similar physiographic properties, then one can successfully predict the impact of LULC changes on WQ in any nearby watershed.

Copyright © 2010 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved. No part of this periodical may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. J. Environ. Qual. 39:1429–1440 (2010) doi:10.2134/jeq2009.0441 Published online 11 May 2010. Received 4 Nov. 2009. *Corresponding author ([email protected]). © ASA, CSSA, SSSA 5585 Guilford Rd., Madison, WI 53711 USA

Land use and land cover (LULC) play a crucial role in driving hydrological processes in watersheds (Schoonover et al., 2006). They affect water quality (WQ) by altering sediment, chemical loads, and watershed hydrology. Due to land use practices and rapid land use changes, nonpoint-source pollution loading has become a serious threat to WQ in streams (Basnyat et al., 2000). Many studies have shown that agricultural land use adversely impacts stream WQ by increasing nutrient levels, such as nitrogen and phosphorus, and sediment loadings (Hill, 1981; Arnheimer and Liden, 2000; Ahearn et al., 2005). Urban areas have similar negative impacts on WQ (Osborne and Wiley, 1988; Arnold and Gibbons, 1996; Basnyat et al., 1999; Sliva and Williams, 2001; Schoonover et al., 2006). Some researchers attributed these impacts to point sources such as wastewater effluents. A study conducted in southern Ontario, for example, found no correlation between urban land use and stream water phosphorus levels originating from nonpoint sources once the contribution from wastewater discharges was removed (Hill, 1981).

Ahearn et al. (2005) studied the impact of LULC on sediment and nitrate loadings in both dry and normal years in the waterways of the Cosumnes River watershed in California. They found that geographic variables have the greatest control on WQ in the Cosumnes watershed and that population density does not have a strong influence on stream nitrate loading until a wastewater treatment plant is built within the basin. However, agriculture had a significant influence on both total suspended sediment and nitrate loading. Basnyat et al. (1999) examined a methodology to assess the relationships between multiple land use activities and nitrate–sediment concentrations in streams in south Alabama. Their results indicate that forests act as a sink or an active transformation zone, and as the proportion of forest increases (or agricultural land decreases), nitrate levels decrease. They identified residential–urban–built-up areas as the strongest contributors of nitrate. Sliva and Williams (2001) found that urban land use had the greatest influence on river WQ within three local southern Ontario watersheds.

L. Kalin, S. Isik, and B.G. Lockaby, School of Forestry and Wildlife Sciences, Auburn Univ., 602 Duncan Dr., Auburn, AL 36849-5126; J.E. Schoonover, Dep. of Forestry, Southern Illinois Univ., Carbondale, IL 62901-4411. Assigned to Associate Editor Ying Ouyang. Abbreviations: AIC, Akaike’s information criterion; ANN, artificial neural network; BIC, Bayesian information criterion; DOC, dissolved organic carbon; EV, evergreen; IS, impervious surfaces; LULC, land use and land cover; MI, mixed forest; MLR, multiple linear regression; NMSE, normalized mean square error; PA, pasture; TDS, total dissolved solids; TP, total phosphorus; TSS, total suspended solids; UG, urban grass; WQ, water quality.


The effects of LULC on water quality and quantity can be explored through various techniques ranging from regression-based methods, such as linear and multilinear regression, to watershed models. Linear regression is an important tool for the statistical analysis of water resources data (Helsel and Hirsch, 2002). Multiple linear regression (MLR) is the extension of simple linear regression to the case of multiple explanatory variables. The MLR relates one dependent variable y to k independent variables or predictors xi (i = 1, …, k). The result is an equation that can be used for estimating y as a linear combination of the predictors xi. The main weakness of MLR models is that transformations include a priori assumptions about the type and consistency of the relation between two parameters that may not be met completely (Brey et al., 1996). Many researchers (e.g., Basnyat et al., 1999; Ahearn et al., 2005; Schoonover and Lockaby, 2006; Schoonover et al., 2007) have used regression analysis to study the LULC and WQ linkages. Watershed models also are used in estimating the effects of LULC on water quality and quantity. Even though, at least in theory, some watershed models can be relied on in the absence of measured WQ data, in practice even the physically based watershed models are often calibrated or fine-tuned (Fohrer et al., 2001; Di Luzio et al., 2002).

If high-quality datasets of sufficient duration exist, then artificial neural networks (ANNs) could be effectively used in predicting the effects of LULC on WQ. Artificial neural networks are parametric models that are generally considered lumped (Dawson and Wilby, 2001). Neither a detailed understanding of a watershed’s physical characteristics nor extensive data preprocessing is required for ANNs. Artificial neural networks provide a novel and appealing solution to the problem of relating input and output variables in complex systems (Dawson and Wilby, 2001). The main advantage of using ANNs for prediction purposes is that there are no a priori assumptions about the relations between the independent and dependent variables. However, the relations learned by an ANN are hidden in its neural architecture and cannot be expressed in traditional mathematical terms (Brey et al., 1996). A neural network is more of a “black box” that delivers results without an explanation of how the results were derived. Thus, it is difficult or impossible to explain how decisions were made based on the output of the network.

The use of ANNs in predicting WQ parameters is not new (Maier and Dandy, 2000; Chau et al., 2002; Muttil and Chau, 2006; Anctil et al., 2009; Amiri and Nakane, 2009; Dogan et al., 2009; Singh et al., 2009). Singh et al. (2009), for instance, constructed an ANN-based WQ model for the Gomti River (India) and demonstrated its application to predict WQ parameters. They used 11 WQ parameters as inputs to forecast dissolved oxygen and biochemical oxygen demand. Similarly, Dogan et al. (2009) investigated the abilities of an ANN model to improve the accuracy of biochemical oxygen demand estimation in the Melen River (Turkey). Both studies relied on other measured WQ parameters to predict the WQ parameters of interest. Anctil et al. (2009) applied ANNs to simulate daily nitrate and suspended sediment fluxes from a small agricultural catchment. They used hydroclimatic variables, such as streamflow, rainfall, and soil moisture index, and historical mean nitrate and suspended sediment values to drive their ANN model. All of the aforementioned ANN-based studies were geared toward predicting WQ parameters using input data such as rainfall, streamflow, temperature, soil moisture index, and some other WQ parameters. To the best of our knowledge, few studies exist that incorporated the effect of LULC into ANNs to predict WQ. Amiri and Nakane (2009) attempted to incorporate LULC percentages into an ANN model, while Ha and Stenstrom (2003) used land use types as their target data. Amiri and Nakane (2009) developed ANN and MLR approaches to predict monthly average total nitrogen concentrations in the Chugoku district of Japan by using LULC percentages and human population density in 21 river basins as inputs. They compared the performance of an ANN-based model to that of the MLR modeling approach and found better estimation with the ANN.

The main objective of this paper is to develop an ANN-based approach to examine the relationship between LULC and various WQ parameters and use it to predict WQ in nearby ungauged and/or unmonitored watersheds with similar characteristics. Similar to Amiri and Nakane (2009), we used LULC percentages as one of the key model drivers, alongside temperature and streamflow. A key difference between this study and Amiri and Nakane’s study (2009) is that while our study relies entirely on measured data, they generated most of their data through Monte Carlo simulations. We applied the ANN model to 18 watersheds in the Piedmont physiographic region of western Georgia. The WQ parameters used in the study were total dissolved solids (TDS), total suspended solids (TSS), chlorine (Cl), nitrate (NO3), sulfate (SO4), sodium (Na), potassium (K), total phosphorus (TP), and dissolved organic carbon (DOC). The input variables (i.e., independent variables) were LULC percentages, temperature, and streamflow. We limited the number of input parameters to the ANN model because we want a model that can be used in predicting WQ parameters in watersheds with no prior WQ measurements. First, we explain the methodology used in developing the ANN model, which is followed by a description of the study area and data. Next, the application of the ANN model to the study area is followed by a discussion of results.

Materials and Methods

Artificial Neural Networks

An ANN is a machine (tool) designed to model the manner in which the human brain performs a particular task or function of interest. To achieve good performance, neural networks use a massive interconnection of simple computing cells referred to as neurons or processing units. Artificial neural networks are capable of mapping input–output relationships for natural complex problems and were developed to model the brain’s interconnected system of neurons so that computers could be used to imitate the brain’s ability to sort patterns and learn from trial and error, thus observing relationships in data (Haykin, 1999). Artificial neural networks can be categorized on the basis of the direction of information flow and processing. In a feedforward network, the nodes are generally arranged in layers, starting from a first input layer and ending at the final output layer. Information passes from the input to the output side.

Journal of Environmental Quality • Volume 39 • July–August 2010
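The layer-by-layer flow of information in a feedforward network can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The log-sigmoid transfer function is the one the paper reports using for its hidden and output layers, but the function names, the layer sizes (7 inputs, 2 hidden neurons, 1 output), and all weight and bias values are hypothetical and chosen only for illustration.

```python
import math


def logsig(x):
    """Log-sigmoid transfer function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then log-sigmoid."""
    return [
        logsig(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the signal through each (weights, biases) layer, input to output."""
    signal = inputs
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


# Hypothetical weights: 7 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_layer = ([[0.1] * 7, [-0.2] * 7], [0.0, 0.1])
output_layer = ([[0.5, -0.5]], [0.0])

x = [0.5] * 7  # seven already-normalized inputs (illustrative values)
y = forward(x, [hidden_layer, output_layer])
```

Training would adjust the weights and biases to reduce an error function. That step is omitted here.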

A synaptic weight is assigned to each link to represent the relative connection strength of two nodes at both ends in predicting the input–output relationship (ASCE Task Committee, 2000). Artificial neural networks are highly data intensive for training the network. The primary goal of training is to minimize a predefined error function by searching for a set of connection strengths and threshold values so that the ANN can produce outputs that are equal or close to target values (ASCE Task Committee, 2000). One of the commonly used error functions is the mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(S_i - O_i\right)^2 \qquad [1]$$

where $S_i$ is the ANN output (simulated) and $O_i$ is the target (observation).

Since ANNs are the “black-box” class of models, they do not require detailed knowledge of the internal functions of a system to recognize relationships between inputs and outputs (Ha and Stenstrom, 2003). Feed-forward neural networks with back propagation are successfully applied to hydrological and environmental problems. In this study, three-layer feed-forward neural networks with Levenberg–Marquardt back-propagation learning were constructed for the relationship between LULC percentages and WQ parameters.

The proposed feed-forward neural network has three main layers: input, hidden, and output layers. The hidden layer also has multiple sublayers. The number of sublayers in the hidden layers varies with WQ parameters. The architecture of the neural network is shown in Fig. 1. The percentages of five dominant LULC types (impervious surface [IS]; evergreen forest [EV]; mixed forest [MI]; pasture [PA]; and urban grass [UG]), temperature effect (Teff), and stream discharge (Q) constitute the neurons of the input layer. The WQ parameters (TDS, TSS, Cl, NO3, SO4, Na, K, TP, and DOC loadings) are the output parameters.

Fig. 1. General architecture of artificial neural network (ANN) model. IS, impervious surfaces; EV, evergreen; MI, mixed forest; PA, pasture; UG, urban grass; Teff, temperature effect; Q, streamflow discharge; TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.

The size of a hidden layer is one of the most important considerations when solving actual problems using multilayer feed-forward networks. No unified theory exists for determining such an optimal ANN architecture (ASCE Task Committee, 2000). The exact analysis of the issue is rather difficult because of the complexity of the network mapping and due to the nondeterministic nature of many successfully completed training procedures (Zurada, 1992). Determination of the optimum number of layers is usually a matter of experimentation. A trial-and-error approach is the most commonly used method to find the number of hidden neurons and layers. In this study, the number of hidden layers and hidden neurons were searched from 1 to 2 and from 1 to 10, respectively. The commercial software MATLAB (The MathWorks, Inc., Natick, MA) was used in developing the ANN models.

Model Selection

Normalized mean square error (NMSE), Akaike’s information criterion (AIC), and Bayesian information criterion (BIC) are used as selection criteria in determining optimal input and hidden neurons. We define a revised form of MSE in this study due to the nature of the problem. In this application, we need an error measure that combines information from multiple watersheds. Because we have multiple watersheds with varying size and varying number of measurements, MSE is not a suitable measure. Thus, a NMSE was used for this purpose and is given by

$$\mathrm{NMSE} = \sum_{j=1}^{m}\frac{1}{n_j\,\bar{O}_j^{2}}\sum_{i=1}^{n_j}\left(S_{j,i} - O_{j,i}\right)^2 \qquad [2a]$$

or

$$\mathrm{NMSE} = \frac{1}{n_1\,\bar{O}_1^{2}}\sum_{i=1}^{n_1}\left(S_{1,i} - O_{1,i}\right)^2 + \cdots + \frac{1}{n_m\,\bar{O}_m^{2}}\sum_{i=1}^{n_m}\left(S_{m,i} - O_{m,i}\right)^2 \qquad [2b]$$

where $m$ is the total number of watersheds; $n_j$ is the total number of data in watershed $j$; $O_{j,i}$ and $S_{j,i}$ are the $i$th observed and simulated values in watershed $j$, respectively; and $\bar{O}_j$ is the average of observed values in watershed $j$. There are two reasons for the use of NMSE for a given WQ parameter: to minimize the effect of sample number and to minimize the effect of large and small observations from watersheds. Note that we combined data from several watersheds in training the ANN model. The number of observed data from each watershed is not the same. If we simply use MSE, then watersheds having more observed data will be given more weight. Further, watersheds having high observed values (e.g., due to differences in their size) will also carry higher weights in the simple MSE formula.

The AIC and the BIC are commonly used in the literature to find optimal ANN architectures (Qi and Zhang, 2001; Ren and Zhao, 2002; Zhao et al., 2008). Information-based criteria such as AIC and BIC penalize large models that often tend to overfit (Qi and Zhang, 2001). Various forms of AIC and BIC are used in the literature. We used the one proposed by Qi and Zhang (2001):

$$\mathrm{AIC} = \log\left(\hat{\sigma}^{2}_{\mathrm{MLE}}\right) + \frac{2m}{n} \quad \text{if } \frac{n}{m+1} \ge 40 \qquad [3a]$$

$$\mathrm{AIC} = \log\left(\hat{\sigma}^{2}_{\mathrm{MLE}}\right) + \frac{2m}{n-m-1} \quad \text{if } \frac{n}{m+1} < 40 \qquad [3b]$$

where $n$ is the number of data and $m$ is the number of parameters in the model. The term $\hat{\sigma}^{2}_{\mathrm{MLE}}$ denotes the maximum likelihood estimate of variance of the residual term or simply the MSE between the observed and simulated data. Qi and Zhang (2001) give BIC as

$$\mathrm{BIC} = \log\left(\hat{\sigma}^{2}_{\mathrm{MLE}}\right) + \frac{m\log n}{n} \qquad [4]$$

Kalin et al.: Predicting Water Quality in Unmonitored Watersheds
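As a rough illustration of the model selection criteria in Eq. [2a], [3a], [3b], and [4], the following Python sketch computes NMSE across several watersheds together with AIC and BIC. It is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the logarithm is assumed to be the natural logarithm, and the boundary case n/(m + 1) = 40 is assumed to fall under Eq. [3a].

```python
import math


def nmse(observed, simulated):
    """Eq. [2a]: each argument is a list of per-watershed lists of values."""
    total = 0.0
    for obs, sim in zip(observed, simulated):
        n_j = len(obs)
        mean_obs = sum(obs) / n_j
        squared_error = sum((s - o) ** 2 for s, o in zip(sim, obs))
        # Dividing by n_j and by the squared mean removes the influence of
        # sample count and of watershed magnitude, respectively.
        total += squared_error / (n_j * mean_obs ** 2)
    return total


def aic(mse, n, m):
    """Eq. [3a]/[3b]; natural log assumed, n = data count, m = parameters."""
    if n / (m + 1) >= 40:  # assumed boundary handling
        return math.log(mse) + 2 * m / n
    return math.log(mse) + 2 * m / (n - m - 1)


def bic(mse, n, m):
    """Eq. [4]; natural log assumed."""
    return math.log(mse) + m * math.log(n) / n
```

For example, two watersheds whose values differ by a factor of 10 but whose simulations are off by the same relative amount contribute equally: `nmse([[1, 1], [10, 10]], [[2, 2], [20, 20]])` gives 1 + 1 = 2.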

Performance Measures

The performance of the model was measured with the coefficient of determination ($R^2$), Nash–Sutcliffe efficiency ($E_{\mathrm{NASH}}$), and bias ratio ($R_{\mathrm{BIAS}}$). The coefficient of determination is a measure of linear correlation between two quantities and is given by

$$R^2 = \frac{\left(n\sum O_i S_i - \sum O_i \sum S_i\right)^2}{\left[n\sum O_i^2 - \left(\sum O_i\right)^2\right]\left[n\sum S_i^2 - \left(\sum S_i\right)^2\right]} \qquad [5]$$

where $O$ and $S$ represent observed data and model outputs and $n$ is the number of data points. The Nash–Sutcliffe efficiency statistic ($E_{\mathrm{NASH}}$) is commonly used to assess the predictive power of hydrological models (Nash and Sutcliffe, 1970). It is defined as

$$E_{\mathrm{NASH}} = 1 - \frac{\sum\left(O_i - S_i\right)^2}{\sum\left(O_i - \bar{O}\right)^2} \qquad [6]$$

where $\bar{O}$ is the mean of the observed data. The efficiency statistic $E_{\mathrm{NASH}}$ theoretically varies from –∞ to 1, with 1 corresponding to a perfect model. It is a measure of how the plot of observed versus simulated data deviates from a 1:1 line (i.e., perfect model). The bias ratio in percentage is expressed as

$$R_{\mathrm{BIAS}} = 100 \times \frac{\sum\left(S_i - O_i\right)}{\sum O_i} \qquad [7]$$

The bias ratio measures the degree to which the forecast is under- or overpredicted. A negative bias ratio indicates underprediction, whereas a positive bias ratio reflects overprediction (Salas et al., 2000).
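The three performance measures in Eq. [5] to [7] can be sketched in a few lines of Python. The `rating` helper applies the thresholds of the performance rating system that the paper introduces in its Results and Discussion section. Two points in it are assumptions rather than statements from the paper: the bias thresholds (0.25, 0.5, 0.7) are read as fractions, so a percentage from Eq. [7] is divided by 100, and the overall rating is taken as the worse of the two individual ratings. All function names are hypothetical.

```python
def r_squared(obs, sim):
    """Eq. [5]: squared linear correlation between observed and simulated."""
    n = len(obs)
    sum_o, sum_s = sum(obs), sum(sim)
    numerator = (n * sum(o * s for o, s in zip(obs, sim)) - sum_o * sum_s) ** 2
    denominator = (n * sum(o * o for o in obs) - sum_o ** 2) * (
        n * sum(s * s for s in sim) - sum_s ** 2
    )
    return numerator / denominator


def nash_sutcliffe(obs, sim):
    """Eq. [6]: 1 for a perfect model, unbounded below."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def bias_ratio(obs, sim):
    """Eq. [7]: percent bias; positive means overprediction."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def rating(e_nash, r_bias_percent):
    """Assumed combination rule: the worse of the two metric ratings."""
    labels = ["unsatisfactory", "satisfactory", "good", "very good"]
    bias = abs(r_bias_percent) / 100.0  # assumed: thresholds are fractions
    e_level = 3 if e_nash >= 0.7 else 2 if e_nash >= 0.5 else 1 if e_nash >= 0.3 else 0
    b_level = 3 if bias <= 0.25 else 2 if bias <= 0.5 else 1 if bias <= 0.7 else 0
    return labels[min(e_level, b_level)]
```

Under these assumptions, a watershed with an efficiency of 0.95 and a 26% overestimate would be rated "good", because the bias falls just outside the "very good" band.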

Study Area and Data

We applied the outlined ANN model to 18 small watersheds in western Georgia, near the city of Columbus (Fig. 2). These watersheds present a gradient of LULC. The southeastern United States has experienced rapid urban development. Consequently, Georgia’s streams have experienced hydrologic alterations and WQ degradation from extensive development and from other land use activities such as livestock grazing and silviculture (Schoonover, 2005). Grab samples were collected from May 2002 to January 2006 and analyzed for concentration and yields of TDS, TSS, Cl, NO3, SO4, Na, K, TP, and DOC at each watershed (Table 1). Details on sampling strategies and chemical analysis are given in Schoonover (2005). Watersheds ranged in size from 296 to 2659 ha and were subbasins of the Middle Chattahoochee Watershed within the Piedmont physiographic province. Dominant LULC within the study area were classified as mixed hardwood forest, evergreen forest, urban, developing, and pastoral.

One-meter aerial photographs were taken during leaf-off in March 2003 to facilitate LULC classification. The first effort in the 1-m image analyses was to generate an impervious surface (IS) percentage for each watershed. Impervious surface is a widely accepted and reliable indicator of urbanization due to its impacts on natural resources, particularly water resources (Arnold and Gibbons, 1996). The remaining land cover classes were then digitized using both unsupervised and supervised classification methods. The overall accuracy was 91% (Schoonover and Lockaby, 2006). The image processing methods used in this assessment are described in detail by Lockaby et al. (2005).

Fig. 2. Watersheds used in this study: BLN, Blanton Creek; BR, Brookstone Branch; BU, Lindsey/Cooper Creek; CB, Clines Branch; FR, Flat Rock Creek; FS, Wildcat Creek; HC, House Creek; MU, Mulberry Creek; RB, Roaring Branch; SB, Standing Boy Creek; SC, Sand Creek.

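Table 1. Land use/land cover (LULC) classes, land use percentages, and watershed areas.

| Basin no. | Basin ID† | LULC class‡ | Number of data | Purpose | Area (ha) | IS (%)§ | EV (%) | MI (%) | PA (%) | UG (%) | Other (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CB | F | 49 | Training | 897 | 1.5 | 48.3 | 33 | 11.8 | 0.1 | 5.3 |
| 2 | HC | F | 50 | Training | 665 | 1.3 | 47.9 | 26.7 | 18 | 0.3 | 5.8 |
| 3 | MU2 | F | 52 | Training | 606 | 2.6 | 42.4 | 25 | 14.4 | 1.2 | 14.4 |
| 4 | MU3 | F | 46 | Training | 1044 | 1.9 | 41.5 | 37.1 | 13 | 0.8 | 5.7 |
| 5 | SB1 | F | 52 | Training | 2009 | 1.8 | 38.6 | 35 | 18.8 | 0.6 | 5.2 |
| 6 | SB2 | F | 52 | Training | 634 | 3.4 | 37.3 | 35.4 | 16.3 | 1.5 | 6.1 |
| 7 | SB4 | F | 54 | Training | 2659 | 3.3 | 41.1 | 22.7 | 25.5 | 2.2 | 5.2 |
| 8 | FR | P | 15 | Training | 2396 | 13 | 31 | 7 | 35.6 | 4.9 | 8.5 |
| 9 | HC2 | P | 37 | Training | 1395 | 1.6 | 30.5 | 22.2 | 44.5 | 0.6 | 0.6 |
| 10 | MU1 | P | 53 | Training | 1178 | 3.7 | 29.3 | 24.3 | 35 | 2.8 | 5 |
| 11 | BR | U | 15 | Training | 471 | 23.0 | 29 | 14 | 10.9 | 16.1 | 7 |
| 12 | RB | U | 54 | Training | 367 | 30.3 | 28.4 | 11.1 | 10.9 | 16.9 | 2.4 |
| 13 | BLN | F | 39 | Validation | 364 | 1.2 | 48.1 | 28.3 | 18.4 | 0.2 | 3.8 |
| 14 | FS3 | P | 36 | Validation | 296 | 2.6 | 32 | 29.9 | 33.1 | 0.5 | 1.9 |
| 15 | BU2 | U | 50 | Validation | 2469 | 24.9 | 30.5 | 15.9 | 7.6 | 18 | 3.1 |
| 16 | SC | F | 53 | Testing | 896 | 1.2 | 44.8 | 28.8 | 20.3 | 0.2 | 4.7 |
| 17 | FS2 | P | 36 | Testing | 1449 | 2.7 | 30.7 | 28.2 | 35.2 | 0.8 | 2.4 |
| 18 | BU1 | U | 54 | Testing | 2548 | 41.9 | 20.9 | 12.3 | 5.5 | 17.6 | 1.8 |

† See Fig. 2 caption for full names.
‡ F, forest; P, pasture; U, urban.
§ IS, impervious surfaces; EV, evergreen; MI, mixed forest; PA, pasture; UG, urban grass.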

The rates of most reactions in natural waters increase with temperature (Chapra, 1997). Therefore, we included temperature as one of the input variables through the use of the Arrhenius equation (Chapra, 1997):

$$T_{\mathrm{eff}} = \frac{K(T_w)}{K(20^{\circ})} = \theta^{\,T_w - 20^{\circ}} \qquad [8]$$

where $T_w$ is ambient water temperature (°C) and $\theta$ is a dimensionless parameter typically within the range 1.0 to 1.1 but assumed to be 1.05 in this study; $T_w$ is computed from average daily air temperature $T_{av}$ as given in Neitsch et al. (2005):

$$T_w = 5.0 + 0.75\,T_{av} \qquad [9]$$
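A minimal Python sketch of Eq. [8] and [9] follows. It evaluates only the ratio K(Tw)/K(20) with θ = 1.05 as stated above. The function names are hypothetical, and any further scaling the authors may have applied to Teff before using it as a network input is not reproduced here.

```python
def water_temperature(t_air_c):
    """Eq. [9]: water temperature (deg C) from average daily air temperature."""
    return 5.0 + 0.75 * t_air_c


def temperature_effect(t_air_c, theta=1.05):
    """Eq. [8]: Arrhenius-type ratio K(Tw)/K(20) = theta ** (Tw - 20)."""
    t_w = water_temperature(t_air_c)
    return theta ** (t_w - 20.0)
```

With this form, an air temperature of 20 °C gives a water temperature of 20 °C and a ratio of exactly 1. Warmer days give values above 1 and cooler days values below 1.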

$T_{av}$ values were obtained from a nearby National Climatic Data Center (NCDC) station in the city of Columbus (COOPID:092159; 32°31′ N; 84°56′ W).

Areas and LULC percentages of the 18 study watersheds are given in Table 1. Percentage of IS ranges from 1.2 to 41.9%. Forest occupies a major fraction of each watershed. The range for percentage of EV is 20.9 to 48.3%. Percentage of MI varies from 7.0 to 37.1%. Percentage of land in PA was quite variable, with a range of 5.5 to 44.5%. Urban grass percentage was usually small, with a range of 0.1 to 18%. Other LULC types constitute only minor fractions of the watersheds and therefore were not included in the analyses.

The total number of data points for each WQ parameter was 801, ranging from 15 to 54 among watersheds. Out of 18 watersheds, 12, which contained 66% of the total data, were used for training the ANN model; 3 watersheds were used for validation, and the remaining 3 for testing purposes (Table 1). The validation and testing watersheds contained about 16 and 18% of the total data set, respectively. Each set of validation and testing data consisted of 1 forested, 1 pastoral, and 1 urban watershed, while training data consisted of 7 forested, 3 pastoral, and 2 urban watersheds. Land use–based classifications of the watersheds were based on Schoonover (2005). Nutrient yields (kg ha⁻¹ d⁻¹) were calculated and used in the ANN network for each parameter.

Summary statistics such as arithmetic mean, minimum, maximum, median, standard deviation, and coefficient of variation of training, testing, and validation data are given for each WQ parameter in Table 2. Total suspended solids shows the largest variation among all parameters, as evidenced by its large coefficient of variation values in training, validation, and testing data, which were 8.76, 4.85, and 7.44, respectively. Natural logarithms of WQ parameters were used in the network to avoid zero outputs since we have very low target values. Before the training of the network, all data were normalized within the range 0.1 to 0.9 as follows:

$$z_i = 0.1 + 0.8\,\frac{x_i - x_{\min}}{x_{\max} - x_{\min}} \qquad [10]$$

where $z_i$ is the normalized value of $x_i$, which is the log-transformed observed value of a certain parameter, and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values in the database for this parameter, respectively. The observed data and model output values are transformed back to their original domains before evaluating model performances.
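The log transform, the scaling of Eq. [10], and the back-transformation can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names. It assumes strictly positive values, since the natural logarithm of zero is undefined, and at least two distinct values so that the range is nonzero.

```python
import math


def normalize(values):
    """Log-transform, then scale into [0.1, 0.9] following Eq. [10]."""
    logs = [math.log(v) for v in values]  # assumes every value is > 0
    x_min, x_max = min(logs), max(logs)
    scaled = [0.1 + 0.8 * (x - x_min) / (x_max - x_min) for x in logs]
    return scaled, x_min, x_max


def denormalize(z, x_min, x_max):
    """Invert Eq. [10], then undo the log transform."""
    x = x_min + (z - 0.1) * (x_max - x_min) / 0.8
    return math.exp(x)


# Illustrative yields spanning two orders of magnitude (not data from the paper).
scaled, lo, hi = normalize([0.001, 0.01, 0.1])
restored = [denormalize(z, lo, hi) for z in scaled]
```

Because the scaling acts on log-transformed values, inputs that are evenly spaced on a logarithmic scale end up evenly spaced between 0.1 and 0.9.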

Results and Discussion

The LULC percentages of IS, EV, MI, PA, and UG, temperature effect (Teff), and streamflow discharge (Q) were used as inputs to the ANN network. We experimented with various combinations of these input parameters to identify the optimal input layer.

Table 2. Summary statistics of input data (effective temperature, flow discharge, and water quality) used for training, validation, and testing the artificial neural network (ANN) model.† Q is in L s⁻¹ ha⁻¹; TDS through DOC are in kg ha⁻¹ d⁻¹.

| Data set | Statistic | Teff | Q | TDS | TSS | Cl | NO3 | SO4 | Na | K | TP | DOC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Training | Min. | 0.005 | 0.0001 | 0.0003 | 0 | 0.00002 | 0 | 0.00002 | 0.0002 | 0 | 0 | 0.00002 |
| Training | Max. | 0.014 | 9.336 | 20.073 | 183.57 | 3.740 | 1.039 | 2.743 | 2.278 | 1.961 | 0.127 | 6.081 |
| Training | Mean | 0.009 | 0.231 | 0.687 | 1.29 | 0.085 | 0.019 | 0.079 | 0.092 | 0.043 | 0.003 | 0.140 |
| Training | Median | 0.009 | 0.069 | 0.230 | 0.02 | 0.025 | 0.003 | 0.016 | 0.033 | 0.013 | 0.0005 | 0.026 |
| Training | SD | 0.003 | 0.633 | 1.785 | 11.28 | 0.279 | 0.069 | 0.234 | 0.206 | 0.119 | 0.010 | 0.428 |
| Training | CV | 0.275 | 2.747 | 2.597 | 8.76 | 3.276 | 3.686 | 2.955 | 2.236 | 2.774 | 3.257 | 3.053 |
| Validation | Min. | 0.005 | 0.001 | 0.007 | 0 | 0.001 | 0 | 0.001 | 0.001 | 0.001 | 0 | 0.001 |
| Validation | Max. | 0.014 | 1.505 | 4.147 | 21.06 | 0.563 | 0.276 | 0.606 | 0.639 | 0.440 | 0.045 | 0.987 |
| Validation | Mean | 0.009 | 0.179 | 0.496 | 0.46 | 0.065 | 0.029 | 0.060 | 0.054 | 0.041 | 0.002 | 0.066 |
| Validation | Median | 0.009 | 0.107 | 0.238 | 0.03 | 0.030 | 0.011 | 0.016 | 0.029 | 0.020 | 0.001 | 0.024 |
| Validation | SD | 0.003 | 0.243 | 0.750 | 2.22 | 0.099 | 0.049 | 0.118 | 0.084 | 0.067 | 0.005 | 0.140 |
| Validation | CV | 0.274 | 1.355 | 1.513 | 4.85 | 1.529 | 1.675 | 1.987 | 1.545 | 1.638 | 2.591 | 2.118 |
| Testing | Min. | 0.005 | 0.004 | 0.019 | 0 | 0.004 | 0.00005 | 0.001 | 0.004 | 0.003 | 0 | 0.001 |
| Testing | Max. | 0.014 | 5.611 | 10.423 | 172.28 | 0.815 | 0.695 | 1.175 | 0.709 | 1.016 | 0.257 | 3.147 |
| Testing | Mean | 0.009 | 0.251 | 0.585 | 2.23 | 0.078 | 0.045 | 0.071 | 0.061 | 0.048 | 0.004 | 0.093 |
| Testing | Median | 0.009 | 0.082 | 0.216 | 0.03 | 0.031 | 0.014 | 0.017 | 0.029 | 0.017 | 0.001 | 0.022 |
| Testing | SD | 0.003 | 0.611 | 1.209 | 16.61 | 0.137 | 0.095 | 0.153 | 0.102 | 0.106 | 0.022 | 0.292 |
| Testing | CV | 0.275 | 2.437 | 2.066 | 7.44 | 1.760 | 2.126 | 2.163 | 1.665 | 2.212 | 5.234 | 3.150 |

† Teff, temperature effect; Q, streamflow discharge; TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.

We tried the combinations LULC, Q, LULC + Q, LULC + Teff, Teff + Q, and LULC + Teff + Q. The AIC, BIC, and NMSE error criteria were used in determining optimal input layers. Results are given in Table 3. Mostly, all three error measures consistently picked the same combination. At least two criteria picked the same input layer in all WQ parameters. The LULC + Teff + Q combination for all WQ parameters was determined to be generating better results than other combinations. The WQ parameters TDS, TSS, Cl, NO3, SO4, Na, K, TP, and DOC were the dependent variables in the proposed ANN models. We developed a separate ANN model for each WQ parameter. Table 3 also provides useful information on parameter sensitivities. Normalized mean square error can be used as a sensitivity measure. If the model is insensitive to a parameter, then adjusting that parameter would not improve the model performance (low NMSE in this case). Only calibration of sensitive parameters could yield improved model performances. From Table 3, it is evident that the ANN model is more sensitive to Q for TDS, Cl, Na, K, and TP and to LULC for TSS, SO4, NO3, and DOC. In this study, the number of hidden neurons was searched from 1 to 10, and the number of hidden layers was searched from 1 to 2. We limited the size of hidden layers to 10 nodes in each hidden layer as networks over 10 nodes did not result in better performance based on NMSE, AIC, and BIC. For all WQ parameters, model performance peaked before reaching 10 nodes and steadily decreased after that. The highest number for optimum number of nodes was 7, which was obtained for DOC. Table 4 presents the optimum number of hidden neurons for each WQ parameter. As an example, for TSS there were two neurons in each of the two hidden layers with a total of four neurons. The optimum number of hidden layers was 1434

1 for NO3, SO4, Na, and DOC and 2 for TDS, TSS, Cl, K, and TP. The optimal number of neurons in these hidden layers varied from 1 to 7 (Table 4). A trial-and-error procedure was used to determine the learning rate and momentum parameter. Their values were 0.01 and 0.5, respectively. The log-sigmoid transfer function is adopted for both hidden and output layers. The network training stops as soon as any of these conditions occur: (i) model performance in validation dataset decreases in 10 successive iterations; (ii) the maximum number of epochs, which is predetermined at 1000, is reached. The R2 and NMSE for the training and validation data sets are given in Table 5. The training dataset was only used for training the ANN model to identify the ANN model parameters (i.e., weights and biases); it was not used to measure the performance of the models. Indeed, the independent validation dataset (see Table 1) is used in selecting the best models. This was also done to prevent the overtraining of the model (Srivastava et al., 2006). Except for TSS, all WQ parameters have R2 values at or above 0.7 in the validation dataset. The R2 for TSS is 0.49. However, one should note that it is difficult to make real comparisons between model performances for different WQ parameters based on R2. As stated earlier, R2 is merely an indication of the degree of linear correlation between two datasets. Normalized mean square error is a better metric for interparameter comparisons. It is in a sense similar to bias or mass balance error. The parameter TSS has higher NMSE values than all other WQ parameters, about 0.1; TDS, K, Na, and Cl all have very low NMSE values. Table 6 presents the R2, ENASH, and RBIAS model performance criteria at the three test watersheds for each WQ parameter. Simulated and observed values of each WQ parameter are shown Journal of Environmental Quality • Volume 39 • July–August 2010

Table 3. The best performances for input layers of water quality parameters.

| Parameter† | Input layer‡ | NMSE§ | AIC§ | BIC§ |
|---|---|---|---|---|
| TDS | LULC + Teff + Q | 0.0026 | −3.8 | −3.7 |
| TDS | LULC + Q | 0.0029 | −3.6 | −3.6 |
| TDS | Q | 0.0084 | −2.7 | −2.7 |
| TDS | Q + Teff | 0.0132 | −2.7 | −2.6 |
| TDS | LULC | 0.0284 | −0.7 | −0.6 |
| TDS | LULC + Teff | 0.0272 | −0.6 | −0.5 |
| TSS | LULC + Teff + Q | 0.0804 | 1.0 | 1.2 |
| TSS | LULC + Q | 0.0947 | 1.1 | 1.1 |
| TSS | Q | 0.1578 | 1.1 | 1.2 |
| TSS | Q + Teff | 0.1036 | 1.2 | 1.3 |
| TSS | LULC | 0.1530 | 1.7 | 1.7 |
| TSS | LULC + Teff | 0.1549 | 1.7 | 1.7 |
| Cl | LULC + Teff + Q | 0.0064 | −6.4 | −6.2 |
| Cl | LULC + Q | 0.0068 | −6.4 | −6.3 |
| Cl | Q | 0.0237 | −5.8 | −5.8 |
| Cl | Q + Teff | 0.0247 | −5.8 | −5.7 |
| Cl | LULC | 0.0276 | −4.7 | −4.7 |
| Cl | LULC + Teff | 0.0260 | −4.6 | −4.6 |
| NO3 | LULC + Teff + Q | 0.0106 | −7.6 | −7.5 |
| NO3 | LULC + Q | 0.0204 | −7.4 | −7.3 |
| NO3 | Q | 0.2997 | −7.0 | −7.0 |
| NO3 | Q + Teff | 0.0561 | −6.3 | −6.2 |
| NO3 | LULC + Teff | 0.0399 | −6.1 | −6.0 |
| NO3 | LULC | 0.0388 | −6.1 | −6.0 |
| SO4 | LULC + Teff + Q | 0.0256 | −6.4 | −6.4 |
| SO4 | LULC + Q | 0.0462 | −6.3 | −6.2 |
| SO4 | Q | 0.1697 | −5.4 | −5.3 |
| SO4 | Q + Teff | 0.2076 | −5.3 | −5.2 |
| SO4 | LULC | 0.0468 | −4.4 | −4.4 |
| SO4 | LULC + Teff | 0.0388 | −4.3 | −4.3 |
| Na | LULC + Teff + Q | 0.0069 | −6.5 | −6.4 |
| Na | Q + Teff | 0.0126 | −6.1 | −6.0 |
| Na | LULC + Q | 0.0077 | −6.1 | −6.0 |
| Na | Q | 0.0101 | −5.9 | −5.9 |
| Na | LULC | 0.0249 | −5.0 | −5.0 |
| Na | LULC + Teff | 0.0236 | −5.0 | −4.9 |
| K | LULC + Teff + Q | 0.0041 | −7.2 | −7.1 |
| K | LULC + Q | 0.0039 | −7.1 | −7.1 |
| K | Q + Teff | 0.0053 | −6.8 | −6.7 |
| K | Q | 0.0069 | −6.7 | −6.7 |
| K | LULC + Teff | 0.0286 | −5.5 | −5.4 |
| K | LULC | 0.0290 | −5.4 | −5.4 |
| TP | LULC + Teff + Q | 0.0399 | −11.6 | −11.5 |
| TP | Q + Teff | 0.0446 | −11.4 | −11.3 |
| TP | LULC + Q | 0.0515 | −11.1 | −11.0 |
| TP | Q | 0.0554 | −11.0 | −11.0 |
| TP | LULC | 0.0880 | −10.3 | −10.3 |
| TP | LULC + Teff | 0.0760 | −10.3 | −10.2 |
| DOC | LULC + Teff + Q | 0.0262 | −5.8 | −5.8 |
| DOC | LULC + Q | 0.0347 | −5.8 | −5.6 |
| DOC | Q | 0.0634 | −5.6 | −5.6 |
| DOC | Q + Teff | 0.1189 | −5.5 | −5.4 |
| DOC | LULC | 0.0308 | −3.9 | −3.9 |
| DOC | LULC + Teff | 0.0315 | −3.9 | −3.8 |

† TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.
‡ LULC, land use and land cover; Teff, temperature effect; Q, streamflow discharge.
§ NMSE, normalized mean square error; AIC, Akaike’s information criterion; BIC, Bayesian information criterion.
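The two stopping conditions described for network training (validation performance worsening in 10 successive iterations, or 1000 epochs reached) can be sketched as a small, model-agnostic loop. This is only an illustrative sketch, not the authors' implementation: the function name, the callback arguments, and the synthetic error curve are hypothetical, and "performance decreases" is assumed here to mean that the validation error is higher than in the previous iteration (some toolboxes compare against the best error so far instead).

```python
from typing import Callable, Tuple


def train_with_early_stopping(
    train_one_epoch: Callable[[], None],
    validation_error: Callable[[], float],
    patience: int = 10,      # 10 successive worsening iterations (from the text)
    max_epochs: int = 1000,  # predetermined epoch cap (from the text)
) -> Tuple[int, str]:
    """Train until one of the two stopping conditions is met.

    Returns (epochs_run, reason), where reason is "validation" or "max_epochs".
    """
    previous = float("inf")
    worse_streak = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        current = validation_error()
        # Assumption: a "decrease in performance" is a validation error
        # higher than the one observed in the previous iteration.
        worse_streak = worse_streak + 1 if current > previous else 0
        previous = current
        if worse_streak >= patience:
            return epoch, "validation"
    return max_epochs, "max_epochs"


# Hypothetical validation-error curve: improves for 50 epochs, then worsens.
errors = iter([1.0 / e for e in range(1, 51)]
              + [0.02 + 0.001 * k for k in range(1, 951)])
epochs_run, reason = train_with_early_stopping(lambda: None, lambda: next(errors))
```

With this made-up curve the loop halts on the validation condition well before the epoch cap; a curve that never worsens would instead run to the cap.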

Kalin et al.: Predicting Water Quality in Unmonitored Watersheds


on scatterplots for each of the three test watersheds in Fig. 3. Overall, the ANN model performs quite well, and exceptionally well for some WQ parameters, regardless of the watershed. There are no established criteria in the literature for judging model performance as good or bad based on any of these three metrics. Moriasi et al. (2007) proposed performance ratings based on recommended statistics, including ENASH and RBIAS, for watershed modeling at a monthly time scale. Our time scale is much finer (instantaneous), and models are known to perform better at coarser scales. Taking Moriasi et al. (2007) as the base and relaxing some of the constraints, we developed the following performance ratings for evaluating the ANN model performance:

Very Good: ENASH ≥ 0.7; |RBIAS| ≤ 0.25

Good: 0.5 ≤ ENASH < 0.7; 0.25 < |RBIAS| ≤ 0.5

Satisfactory: 0.3 ≤ ENASH < 0.5; 0.5 < |RBIAS| ≤ 0.7

Unsatisfactory: ENASH < 0.3; |RBIAS| > 0.7

Total Dissolved Solids

Both R2 and ENASH values were quite high in all three watersheds. The lowest ENASH was at the pastoral watershed FS2, with a value of 0.95. The model also overestimated TDS in this watershed by 26%. Based on the criteria we set, the developed ANN model performance can be considered “very good” to “good,” with an average rating of “very good.” Overall, the ANN model developed for TDS had one of the best performances compared with other WQ parameters.

Total Suspended Solids

Although observed TSS data contained more baseflow data than storm data, the developed ANN model performed strikingly well at all three watersheds. Based on our rating system, the model performance is “very good/good” at both the forested watershed SC and the pastoral watershed FS2. The urban watershed BU1 received a “satisfactory” rating. The overall rating based on the average rating from the three watersheds was “good.”

Chloride

Model performance for Cl varied from “good” to “very good.” It produced the best results at the pastoral watershed, which was surprising. We expected better model performance at the urban watershed, as Cl is often found in potable water and on roads during winter months as deicing material. Although chlorine is added to water at the water treatment plants, it is sometimes added to irrigation water also. Some of the areas classified as pasture in the study watersheds could potentially be agricultural. For instance, it is almost impossible to distinguish between hay and soybeans from aerial photos or remote sensing, unless there is ground-truthing.

Table 4. Number of neurons in input, hidden, and output layers for each water quality parameter.

| Parameter† | Neurons, 1st hidden layer | Neurons, 2nd hidden layer | Best NMSE‡ | Best AIC‡ | Best BIC‡ |
|---|---|---|---|---|---|
| TDS | 3 | 3 | 0.0026 | −3.9 | −3.7 |
| TSS | 2 | 2 | 0.0943 | 1.1 | 1.2 |
| Cl | 2 | 1 | 0.0069 | −6.2 | −6.1 |
| NO3 | 5 | – | 0.0120 | −7.6 | −7.4 |
| SO4 | 4 | – | 0.0110 | −6.1 | −6.0 |
| Na | 6 | – | 0.0089 | −5.9 | −5.7 |
| K | 4 | 1 | 0.0048 | −6.9 | −6.8 |
| TP | 3 | 3 | 0.0533 | −11.0 | −10.8 |
| DOC | 7 | – | 0.0139 | −6.6 | −6.5 |

† TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.
‡ NMSE, normalized mean square error; AIC, Akaike’s information criterion; BIC, Bayesian information criterion.

Table 5. R2 and normalized mean square error (NMSE) values obtained for each water quality parameter during training and validation of the artificial neural network (ANN) models.

| Parameter† | Training R2 | Training NMSE‡ | Validation R2 | Validation NMSE‡ |
|---|---|---|---|---|
| TDS | 0.93 | 0.0074 | 0.97 | 0.0030 |
| TSS | 0.56 | 0.0290 | 0.49 | 0.1010 |
| Cl | 0.74 | 0.0286 | 0.81 | 0.0062 |
| NO3 | 0.85 | 0.0181 | 0.85 | 0.0131 |
| SO4 | 0.87 | 0.0430 | 0.73 | 0.0170 |
| Na | 0.92 | 0.0058 | 0.79 | 0.0051 |
| K | 0.89 | 0.0089 | 0.85 | 0.0038 |
| TP | 0.62 | 0.0350 | 0.71 | 0.0427 |
| DOC | 0.96 | 0.0084 | 0.93 | 0.0131 |

† TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.
‡ NMSE, normalized mean square error.

Nitrate

The developed model predicted NO3 levels quite well in each watershed based on ENASH values. Bias ratios were higher than expected given the ENASH values. Nitrate was overpredicted in the urban watershed but underpredicted in the forested and pastoral watersheds. Model performances were “good/very good” at all three watersheds.

Sulfate

The developed ANN model performs quite well at each watershed for SO4 with model performance varying from “good” to “very good.” There were no distinct differences in model performances between watersheds. Sources of sulfate could be atmospheric or from groundwater. Sulfates also occur naturally in minerals and in some rock formations and thus may be present due to weathering processes.

Sodium and Potassium

The ANN models developed for Na and K both worked exceptionally well, with performance ratings of “very good” at the forested and pastoral watersheds for both WQ parameters. The performance at the urban watershed was also “very good” for K, whereas for Na it was rated “very good/good.”


Table 6. Performance statistics (R2, ENASH, and RBIAS) of the developed artificial neural network (ANN) models at each testing watershed for the selected water quality parameters.

| Parameter§ | WSD† | R2 | ENASH† | RBIAS† (%) | Performance‡ |
|---|---|---|---|---|---|
| TDS | SC | 0.97 | 0.97 | 0.8 | VG |
| TDS | FS2 | 0.99 | 0.95 | 26.2 | VG/G |
| TDS | BU1 | 0.99 | 0.99 | −5.2 | VG |
| TSS | SC | 0.69 | 0.66 | −11.4 | VG/G |
| TSS | FS2 | 0.80 | 0.76 | −28.8 | VG/G |
| TSS | BU1 | 0.42 | 0.31 | −54.9 | S |
| Cl | SC | 0.61 | 0.61 | −12.7 | VG/G |
| Cl | FS2 | 0.96 | 0.94 | 15.9 | VG |
| Cl | BU1 | 0.81 | 0.78 | −24.0 | VG |
| NO3 | SC | 0.84 | 0.79 | −34.5 | VG/G |
| NO3 | FS2 | 0.91 | 0.84 | −41.3 | VG/G |
| NO3 | BU1 | 0.86 | 0.77 | 28.9 | VG/G |
| SO4 | SC | 0.90 | 0.87 | 25.4 | VG/G |
| SO4 | FS2 | 0.98 | 0.98 | 7.2 | VG |
| SO4 | BU1 | 0.83 | 0.79 | −13.7 | VG |
| Na | SC | 0.92 | 0.89 | 16.9 | VG |
| Na | FS2 | 0.98 | 0.98 | 9.4 | VG |
| Na | BU1 | 0.95 | 0.92 | 29.5 | VG/G |
| K | SC | 0.97 | 0.96 | −6.2 | VG |
| K | FS2 | 0.97 | 0.97 | 3.4 | VG |
| K | BU1 | 0.98 | 0.91 | −20.4 | VG |
| TP | SC | 0.60 | 0.54 | −16.2 | VG/G |
| TP | FS2 | 0.71 | 0.58 | −38.0 | G |
| TP | BU1 | 0.99 | 0.52 | −58.9 | G/S |
| DOC | SC | 0.75 | 0.75 | 14.6 | VG |
| DOC | FS2 | 0.91 | 0.88 | 24.7 | VG |
| DOC | BU1 | 0.98 | 0.95 | 18.9 | VG |

† WSD, watersheds; ENASH, Nash–Sutcliffe efficiency; RBIAS, bias ratio.
‡ VG, very good; G, good; S, satisfactory.
§ TDS, total dissolved solids; TSS, total suspended solids; TP, total phosphorus; DOC, dissolved organic carbon.
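The performance rating system (Very Good: ENASH ≥ 0.7, |RBIAS| ≤ 0.25; Good: 0.5 ≤ ENASH < 0.7, 0.25 < |RBIAS| ≤ 0.5; Satisfactory: 0.3 ≤ ENASH < 0.5, 0.5 < |RBIAS| ≤ 0.7; Unsatisfactory otherwise) can be expressed as a short function. The sketch below is illustrative only. The function names are hypothetical, and the way the two ratings are combined into labels such as "VG/G" (better rating first, a single label when both agree) is an assumption inferred from the entries in Table 6, not a rule stated in the text. RBIAS is passed in percent, as tabulated.

```python
ORDER = ["VG", "G", "S", "U"]  # best to worst; "U" = unsatisfactory


def rate_enash(enash: float) -> str:
    """Rating from the Nash-Sutcliffe efficiency thresholds."""
    if enash >= 0.7:
        return "VG"
    if enash >= 0.5:
        return "G"
    if enash >= 0.3:
        return "S"
    return "U"


def rate_rbias(rbias_percent: float) -> str:
    """Rating from the absolute bias ratio (input in percent)."""
    a = abs(rbias_percent) / 100.0
    if a <= 0.25:
        return "VG"
    if a <= 0.5:
        return "G"
    if a <= 0.7:
        return "S"
    return "U"


def combined_rating(enash: float, rbias_percent: float) -> str:
    """Assumed combination rule: report both ratings, better one first."""
    r1, r2 = rate_enash(enash), rate_rbias(rbias_percent)
    if r1 == r2:
        return r1
    better, worse = sorted((r1, r2), key=ORDER.index)
    return f"{better}/{worse}"


# A few rows from Table 6: (ENASH, RBIAS in percent)
tds_fs2 = combined_rating(0.95, 26.2)
tss_bu1 = combined_rating(0.31, -54.9)
tp_bu1 = combined_rating(0.52, -58.9)
```

Under this assumed combination rule, the three example rows reproduce the tabulated labels for TDS at FS2, TSS at BU1, and TP at BU1.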

Total Phosphorus

All three watersheds had quite similar ENASH values, varying between 0.52 and 0.58. The model underpredicted TP loadings at each watershed. The largest underprediction was at the urban watershed BU1, and the smallest, at 16%, was at the forested watershed. Model performance varied from “very good/good” to “good/satisfactory.” The extremely high R2 value of 0.99 at the urban watershed BU1 indicates a systematic over- or underprediction by the model; RBIAS shows that the model underpredicted observed TP loadings there by almost 60%. This is fortunate since systematic errors are easier to fix. Systematic errors are related to model structure and could stem from ignoring some of the processes or from the use of some redundant variables.
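The point that a very high R2 can coexist with a large bias can be illustrated with a small worked example. The sketch below uses made-up loadings (not data from this study) and simulated values that are a fixed 40% of the observed ones. R2 is computed as the squared linear correlation and ENASH with the standard Nash–Sutcliffe formula; the bias ratio formula used here, (Σ simulated − Σ observed) / Σ observed, is an assumption, since the definition is given in an earlier part of the paper.

```python
def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mo = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    return 1.0 - sse / sum((o - mo) ** 2 for o in obs)


def bias_ratio(obs, sim):
    """Assumed definition: relative difference of total simulated vs. observed."""
    return (sum(sim) - sum(obs)) / sum(obs)


# Hypothetical, strongly skewed loadings (mostly baseflow, one storm sample).
observed = [1.0, 1.0, 2.0, 2.0, 3.0, 50.0]
simulated = [0.4 * o for o in observed]  # systematic 60% underprediction

r2 = r_squared(observed, simulated)
nse = nash_sutcliffe(observed, simulated)
rbias = bias_ratio(observed, simulated)
```

For these made-up numbers R2 equals 1 (to floating-point precision) because the simulated values are an exact linear function of the observed ones, while the bias ratio is −0.6 and ENASH is only about 0.53. A purely multiplicative error of this kind is what makes such a bias systematic.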

Dissolved Organic Carbon

The ANN model performance for DOC was “very good” in all three watersheds. It overpredicted DOC loadings in all three watersheds by 15 to 25%. Model performance in the urban watershed was superior to model performance in the other two watersheds.

Fig. 3. Scatter plots of ANN-generated and measured loadings for the water quality parameters total dissolved solids (TDS), total suspended solids (TSS), total phosphorus (TP), and dissolved organic carbon (DOC). The abbreviations on the upper left corner of each figure refer to watershed names: SC, Sand Creek; FS2, Wildcat Creek; BU1, Lindsey Creek.

Summary and Conclusions

We presented a methodology based on artificial neural networks to predict water quality parameters in unmonitored basins. The model relied on LULC percentages, temperature, and flow discharge as inputs. The developed model made use of WQ and flow data from nearby watersheds with similar physical characteristics. The only measurements required at the watershed where WQ parameters are needed are flow and temperature. The model was applied to several watersheds in west Georgia varying in size and LULC. The WQ parameters used in this application were TDS, TSS, Cl, NO3, SO4, Na, K, TP, and DOC. Out of the total 18 watersheds, 12 were used in training model parameters, 3 in model validation, and 3 for testing. The validation and testing sets each consisted of 1 forested, 1 pastoral, and 1 urban watershed, while the training dataset consisted of 7 forested, 3 pastoral, and 2 urban watersheds. The model developed using the training dataset successfully predicted the WQ parameters in the independent testing watersheds.

To better compare interparameter and interwatershed model performances, we developed a qualitative performance rating system. According to this rating system, model performances were categorized as unsatisfactory, satisfactory, good, or very good. The statistical measures Nash–Sutcliffe efficiency (ENASH) and bias ratio (RBIAS) were used in determining the performance ratings. Based on this rating system, TDS, Cl, NO3, SO4, Na, K, and DOC had a performance of at least “good” in all three test watersheds. The average performance for TP in the three test watersheds was “good,” with the lowest being “good/satisfactory.” Total suspended solids had the lowest average performance among all WQ parameters. It had a performance of “satisfactory” at the urban watershed, whereas the forested and pastoral watersheds had performance ratings of “good/very good.”

The average ENASH over all WQ parameters was higher at the pastoral watershed, with a value of 0.88, than at the forested and urban watersheds. The average ENASH values over all WQ parameters for the urban and forested watersheds were 0.77 and 0.78, respectively. In addition to having the smallest average ENASH value, the urban watershed also had a larger variation in ENASH values compared with the forested and pastoral watersheds, implying larger uncertainties associated with urban watersheds. Based on RBIAS values, however, the ANN model worked much better in the forested watershed. Averages of the absolute values of RBIAS were 15.4, 21.7, and 28.3% for the forested, pastoral, and urban watersheds, respectively. The standard deviation of absolute RBIAS values was also lower at the forested watershed: 9.4%, compared with 12.7 and 16.9% at the pastoral and urban watersheds, respectively. If we had applied the rating system to the combined performances of different WQ parameters, the forested and pastoral watersheds would have received a “very good” rating, and the urban watershed a “good/very good” rating.

Results from this study indicate that if WQ and LULC data are available from multiple watersheds in an area with relatively similar physiographic properties, then one can successfully predict the impact of LULC changes on WQ in any nearby watershed, provided that streamflow data are available or can be estimated. In this study, we did not attempt to predict flow discharge, which is one of the limitations of the study. Since all the WQ data were “snapshots” in time, taken at irregular time intervals, flow discharge data were needed at the times of those WQ measurements. It is extremely difficult to predict instantaneous flows. Because the study watersheds are quite small, rainfall data with high temporal resolution (in addition to soil characteristics and topographic and morphologic parameters) are needed to develop an ANN model for prediction of flow discharges. Note that not only WQ parameters vary with LULC; flow would also change as a function of LULC. This complicates the problem if one wants to explore the impacts of various LULC change scenarios on WQ, because existing flow data cannot be used with those LULC scenarios.
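The watershed-level averages quoted in the summary can be recomputed from the ENASH and RBIAS columns of Table 6. The sketch below is only a cross-check under stated assumptions: the dictionary layout and variable names are hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed because it matches the reported values more closely than the sample standard deviation would.

```python
from statistics import mean, pstdev

# (ENASH, RBIAS in %) transcribed from Table 6, in the order
# TDS, TSS, Cl, NO3, SO4, Na, K, TP, DOC.
TABLE6 = {
    "SC": [(0.97, 0.8), (0.66, -11.4), (0.61, -12.7), (0.79, -34.5),
           (0.87, 25.4), (0.89, 16.9), (0.96, -6.2), (0.54, -16.2),
           (0.75, 14.6)],   # forested
    "FS2": [(0.95, 26.2), (0.76, -28.8), (0.94, 15.9), (0.84, -41.3),
            (0.98, 7.2), (0.98, 9.4), (0.97, 3.4), (0.58, -38.0),
            (0.88, 24.7)],  # pastoral
    "BU1": [(0.99, -5.2), (0.31, -54.9), (0.78, -24.0), (0.77, 28.9),
            (0.79, -13.7), (0.92, 29.5), (0.91, -20.4), (0.52, -58.9),
            (0.95, 18.9)],  # urban
}

summary = {}
for watershed, rows in TABLE6.items():
    enash = [e for e, _ in rows]
    abs_rbias = [abs(r) for _, r in rows]
    summary[watershed] = (
        round(mean(enash), 2),        # average ENASH
        round(mean(abs_rbias), 1),    # average |RBIAS|, %
        round(pstdev(abs_rbias), 1),  # population std. dev. of |RBIAS|, %
    )
```

Each entry of `summary` holds the average ENASH, the average |RBIAS|, and the standard deviation of |RBIAS| for one test watershed, which can be compared directly with the figures in the text.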

References

Ahearn, D.S., R.W. Sheibley, R.A. Dahlgren, M. Anderson, J. Johnson, and K.W. Tate. 2005. Land use and land cover influence on water quality in the last free-flowing river draining the western Sierra Nevada, California. J. Hydrol. 313:234–247.
Amiri, B.J., and K. Nakane. 2009. Comparative prediction of stream water total nitrogen from land cover using artificial neural network and multiple linear regression approaches. Pol. J. Environ. Stud. 18:151–160.
Anctil, F.O., M. Filion, and J. Tournebize. 2009. A neural network experiment on the simulation of daily nitrate-nitrogen and suspended sediment fluxes from a small agricultural catchment. Ecol. Modell. 220:879–887.
Arnheimer, B., and R. Liden. 2000. Nitrogen and phosphorus concentrations for agricultural catchments; influence of spatial and temporal variables. J. Hydrol. 227:140–159.
Arnold, C.L., Jr., and G.C. Gibbons. 1996. Impervious surface coverage: The emergence of a key environmental indicator. J. Am. Plann. Assoc. 62:243–259.
ASCE Task Committee. 2000. Artificial neural network in hydrology: I. Preliminary concepts. J. Hydrol. Eng. 5:115–123.
Basnyat, P., L.D. Teeter, K.M. Flynn, and B.G. Lockaby. 1999. Relationships between landscape characteristics and nonpoint source pollution inputs to coastal estuaries. Environ. Manage. 23:539–549.
Basnyat, P., L.D. Teeter, B.G. Lockaby, and K.M. Flynn. 2000. Land use characteristics and water quality: A methodology for valuing of forested buffers. Environ. Manage. 26:153–161.
Brey, T., A. Jarre-Teichmann, and O. Borlich. 1996. Artificial neural network versus multiple linear regression: Predicting P/B ratios from empirical data. Mar. Ecol. Prog. Ser. 140:251–256.
Chapra, S.C. 1997. Surface water-quality modeling. McGraw-Hill, New York.
Chau, K.W., C. Chuntian, and C.W. Li. 2002. Knowledge management system on flow and water quality modeling. Expert Syst. Appl. 22:321–330.
Dawson, C.W., and R.L. Wilby. 2001. Hydrologic modeling using artificial neural networks. Prog. Phys. Geogr. 25:80–108.
Di Luzio, L., R. Srinivasan, and J.G. Arnold. 2002. Integration of watershed tools and SWAT model into BASINS. J. Am. Water Resour. Assoc. 38:1127–1141.
Dogan, E., B. Sengorur, and R. Koklu. 2009. Modeling biochemical oxygen demand of the Melen River in Turkey using an artificial neural network technique. J. Environ. Manage. 90:1229–1235.
Fohrer, N., S. Haverkamp, K. Eckhardt, and H.G. Frede. 2001. Hydrologic response to land use changes on the catchment scale. Phys. Chem. Earth B 26:577–582.
Ha, H., and M.K. Stenstrom. 2003. Identification of land use with water quality data in stormwater using a neural network. Water Res. 37:4222–4230.
Haykin, S. 1999. Neural networks: A comprehensive foundation. Prentice Hall, New Jersey.
Helsel, D.R., and R.M. Hirsch. 2002. Statistical methods in water resources. Chapter A3. In Techniques of water-resources investigations of the United States Geological Survey. Book 4. Available at http://water.usgs.gov/pubs/twri/twri4a3 (verified 22 Apr. 2010). U.S. Geological Survey, Denver, CO.
Hill, A.R. 1981. Stream phosphorus exports from watersheds with contrasting land uses in southern Ontario. Water Resour. Bull. 17:627–634.
Lockaby, B.G., D. Zhang, J. McDaniels, H. Tian, and S. Pan. 2005. Interdisciplinary research at the urban-rural interface: The WestGA project. Urban Ecosyst. 8:7–21.
Maier, H.R., and G.C. Dandy. 2000. Neural network for the prediction and forecasting of water resources variables: A review of modeling issues and applications. Environ. Model. Softw. 15:101–124.
Moriasi, D.N., J.G. Arnold, M.W. Van Liew, R.L. Bingner, R.D. Harmel, and T.L. Veith. 2007. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE 50:885–900.
Muttil, N., and K.W. Chau. 2006. Neural network and genetic programming for modelling coastal algal blooms. Int. J. Environ. Pollut. 28:223–238.
Nash, J.E., and J.V. Sutcliffe. 1970. River flow forecasting through conceptual models: Part I. A discussion of principles. J. Hydrol. 10:282–290.
Neitsch, S.L., J.G. Arnold, J.R. Kiniry, J.R. Williams, and K.W. King. 2005. Soil and water assessment tool theoretical documentation. Version 2005. Temple, TX.
Osborne, L.L., and M.J. Wiley. 1988. Empirical relationships between landuse cover and stream water-quality in an agricultural watershed. J. Environ. Manage. 26:9–27.
Qi, M., and G.P. Zhang. 2001. An investigation of model selection criteria for neural network time series forecasting. Eur. J. Oper. Res. 132:666–680.
Ren, L., and M.Z. Zhao. 2002. An optimal neural network and concrete strength modeling. Adv. Eng. Software 33:117–130.
Salas, J.D., M. Markus, and A.S. Tokar. 2000. Streamflow forecasting based on artificial neural networks. p. 23–51. In R.S. Govindaraju and A.R. Rao (ed.) Artificial neural networks in hydrology. Kluwer Academic, Dordrecht, the Netherlands.
Schoonover, J.S. 2005. Hydrology, water quality, and channel morphology across an urban-rural land use gradient in the Georgia piedmont, USA. Ph.D. diss. Auburn Univ., Auburn, AL.
Schoonover, J.S., and B.G. Lockaby. 2006. Land cover impacts on stream nutrients and fecal coliform in the lower piedmont of west Georgia. J. Hydrol. 331:371–382.
Schoonover, J.S., B.G. Lockaby, and B. Helms. 2006. Effects of watershed land use on perennial streams of west Georgia, USA: 1. Influence of urban development on stream hydrology. J. Environ. Qual. 35:2123–2131.
Schoonover, J.S., B.G. Lockaby, and J.N. Shaw. 2007. Channel morphology and sediment origin in streams draining the Georgia piedmont. J. Hydrol. 342:110–123.
Singh, K.P., A. Basant, A. Malik, and G. Jain. 2009. Artificial neural network modeling of the river water quality: A case study. Ecol. Modell. 220:888–895.
Sliva, L., and D.D. Williams. 2001. Buffer zone versus whole catchment approaches to studying land use impact on river water quality. Water Res. 35:3462–3472.
Srivastava, P., J.N. McNair, and T.E. Johnson. 2006. Comparison of process-based and artificial neural network approaches for streamflow modeling in an agricultural watershed. J. Am. Water Resour. Assoc. 42:545–563.
Zhao, Z., Y. Zhang, and H. Liao. 2008. Design of ensemble neural network using the Akaike information criterion. Eng. Appl. Artif. Intell. 21:1182–1188.
Zurada, J.M. 1992. Introduction to artificial neural systems. PWS Publishing, Boston, MA.

