Daily Pan Evaporation Modeling in a Hot and Dry ...

Daily Pan Evaporation Modeling in a Hot and Dry Climate J. Piri1; S. Amin2; A. Moghaddamnia3; A. Keshavarz4; D. Han5; and R. Remesan6 Abstract: Evaporation plays a key role in water resources management in arid and semiarid climatic regions. This is the first time that an artificial neural network 共ANN兲 model is applied to estimate evaporation in a hot and dry region 共BWh climate by the Köppen classification兲. It has been found that ANN works very well at the study site and, further, an integrated ANN and autoregressive with exogeneous inputs can have an improved performance over the traditional ANN. Both models significantly outperformed the two empirical methods. It has been demonstrated that the important weather factors to be included in the model inputs are wind speed, saturation vapor pressure deficit, and relative humidity. This result is different from all those reported in the literature and is interestingly linked with a 1936 study by Anderson, who emphasized the importance of saturation vapor pressure deficit. As evaporation is a nonlinear dynamic process, the selection of suitable input weather variables has been a complicated and time-consuming task for modelers. In this study, a new data analysis tool called the gamma test has been used to identify the best combination of model inputs prior to model construction and evaluation. DOI: 10.1061/共ASCE兲HE.1943-5584.0000056 CE Database subject headings: Evaporation; Neural networks; Droughts; Water resources; Hydrologic models.

Introduction Evaporation has wide implications amongst hydrological processes and plays a key role in water resources management in arid and semiarid climatic regions. The most common and important factors affecting evaporation are solar radiation, air and soil temperature, relative humidity, vapor pressure deficit, atmospheric pressure, and wind speed. In one of the earliest published papers, Dalton 共1802兲 pointed out that evaporation was proportional to the difference between vapor pressure of the air at the water surface and that of the overlying air, although apparently he never expressed this relationship in mathematical terms. Anderson 共1936兲 concluded that the vapor pressure deficit was a much more sensitive indicator of the water vapor conditions of the atmosphere and underwent greater variations for temperature changes than did the relative humidity. Evaporation losses should be considered in the design of various water resources and irrigation systems. In areas with little rainfall, evaporation losses can represent a significant part of the 1

Graduate of Irrigation Engineering, MSc, Dept. of Water Engineering, Faculty of Agriculture, Univ. of Shiraz, Shiraz, Iran. E-mail: [email protected] 2 Professor of Irrigation, Dept. of Water Engineering, Faculty of Agriculture, Univ. of Shiraz, Shiraz, Iran. 3 Assistant Professor of Hydrology, Dept. of Watershed and Range Management, Faculty of Natural Resources, Univ. of Zabol, Zabol, Iran. 4 Assistant Professor of Computer Science, Faculty of Engineering, Univ. of Shiraz, Shiraz, Iran. 5 Reader in Civil and Environmental Engineering, Dept. of Civil Engineering, Univ. of Bristol, Bristol, U.K. 6 Ph.D. Student of Water Resources, Dept. of Civil Engineering, Univ. of Bristol, Bristol, U.K. Note. This manuscript was submitted on May 12, 2008; approved on November 27, 2008; published online on February 12, 2009. Discussion period open until January 1, 2010; separate discussions must be submitted for individual papers. This paper is part of the Journal of Hydrologic Engineering, Vol. 14, No. 8, August 1, 2009. ©ASCE, ISSN 1084-0699/ 2009/8-803–811/$25.00.

water budget for a lake or reservoir, and may contribute significantly to the lowering of water surface elevation 共McCuen 1998兲. Therefore, accurate estimation of evaporation loss from the water body is of primary importance for monitoring and allocating water resources, at farm scales as well as at regional scales. Owing to its convenience and cost effectiveness, an evaporation pan is one of the most widely used instruments for the measurement of evaporation, but its performance is affected by instrumental limits and practical issues such as measurement errors and maintenance, which can reduce the accuracy of evaporation measurements. It is also difficult to use the pan by telemetry techniques, hence the human labor cost is high. Alternatively, mathematic models could be used to estimate evaporation from related weather variables. To date, many researchers have developed models for estimating free water evaporation all over the world 共Rohwer 1931; Penman 1948; Christiansen 1968; Linacre 1994; Cahoon et al. 1991; Van Zyl et al. 1989兲. Mosner and Aulenbach 共2003兲 compared four empirical methods of evaporation estimation, including the Priestly–Taylor, Penman, DeBruin– Keijman, and Papa–dakis equations for Lake Seminole, southwestern Georgia, and northwestern Florida, from April 2000 to September 2001. It has been found that the average monthly lake evaporation estimates derived from the empirical equations were as much as 16% in error. Therefore, there is room for improvement in the conventional evaporation models. In recent years, other methods have been explored by many researchers, such as the mass transfer methods 共Yu and Knapp 1985; Ikebuchi et al. 1988; Laird and Kristovich 2002; Hostetler and Bartlein 1990; Blodgett et al. 1997兲 and eddy correlation techniques 共Winter 1981; Blanken et al. 2000兲. Despite the large amount of literature published, most of the reported methods are too demanding for observed meteorological data and prone to errors if locally calibrated parameters are not available. In addition, evaporation is an incidental, nonlinear, complex, and unsteady process, so it is difficult to derive an accurate formula to represent all the physical processes involved. As a result, there is a new trend in using data mining techniques such as fuzzy logic, artificial neural networks 共ANN兲, and ANFIS to estimate evaporation. This followed a large

JOURNAL OF HYDROLOGIC ENGINEERING © ASCE / AUGUST 2009 / 803

Downloaded 15 Jul 2009 to 137.222.207.15. Redistribution subject to ASCE license or copyright; see http://pubs.asce.org/copyright

number of studies in which some hydrological processes were simulated by nonlinear models based on ANN, support vector machines, fuzzy logical system, polynomial function, local linear regression, Bayesian networks, decision trees, etc. 共Govindaraju 2000; Maier and Dandy 2000; Han et al. 2002; Keskin et al. 2004; Bray and Han 2004; Kisi 2005; Han et al. 2007a兲. In evaporation estimation, some typical studies reported so far use ANN in modeling daily soil evaporation 共Han and Felker 1997兲, daily evapotranspiration 共Kumar et al. 2002兲, daily pan evaporation 共Sudheer et al. 2002; Terzi and Keskin 2005; Kisi 2006; Keskin and Terzi 2006; Terzi et al. 2006兲, and hourly pan evaporation 共Tan et al. 2007兲. Using temperature data alone, Sudheer et al. 共2002兲 found that a properly trained ANN model could reasonably estimate the evaporation values at their study area in a temperate region. From these reports, it is clear that ANN models are superior to the conventional regression models, as ANN do not require any predetermination of regression forms. This advantage becomes more promising when an engineering problem is too complex to be represented by regression equations 共Tan et al. 2007兲. In comparison with a wider application of ANN in other fields 关such as flood forecasting, Hsu et al. 共1995兲, Fernando and Jayawardena 共1988兲, Tokar and Johnson 共1999兲, Elshorbagy et al. 共2000兲, Clair and Ehrman 共1998兲, Imrie et al. 共2000兲, Liong et al. 共2001兲, Han et al. 共2007b兲兴, the modeling experience of ANN in evaporation estimation is still quite limited and there is a need to study and report trials of this technique in different climate regions so that some generalization of this method could be achieved. The objectives of this study are to evaluate ANN models for evaporation estimation in a hot and dry climate 共BWh by the Köppen classification兲; to improve the ANN model by incorporating an autoregressive external input 共ARX兲 component; to identify suitable input weather variables based on the new gamma test 共GT兲; and to compare ANN models with some conventional empirical methods. In this study, the evaporation pan data have been used for model training and testing. The idea is to develop an accurate evaporation pan model so that the missing evaporation pan data in the past can be repaired by the model. Also it could be used to extend evaporation records 共into the past or the future兲 from other weather measurements. As a result, it has a useful application in assessing climate change impact on future evaporation variations.

Empirical Methods of Estimating Evaporation Evaporation pans are commonly used to estimate evaporation from lakes and reservoirs. However, there are many problems with them. Many factors can introduce errors in pan evaporation measurement, such as debris in water, animal activity in and around the pan, pan size, materials employed to construct the pan, exposure of the pan, strong winds, and measurement of water depth in the pan 共Farnsworth et al. 1982; Rosenberg et al. 1983兲. In practice, empirical formulas have been derived to estimate evaporation on the basis of field measurements of evaporation pans and reservoir/lake water balances. Those formulas are linked with various weather factors impacting on evaporation. As the complexity of the empirical methods increases, data requirements to drive the equations often make the empirical methods hard to apply for field applications. A large number of empirical methods for estimating evaporation based on different meteorological inputs have been suggested during the past decades 共Linacre 1977; Warnaka and Pochop 1988; Alizadeh 2004兲. However, many of them are not applicable in this study due to their limitations in

Table 1. Evaporation Formulas for Lakes and Reservoirs References Warnaka and Pochop 1988 Alizadeh 2004

Equation

Formula name

E = f共T , Tdew , Latitude兲 E = 0.03U共es − ea兲

Linacre Marciano

data availability. As a result, only two relevant empirical methods are used in the case study 共i.e., they are compatible with the available weather measurements兲, as listed in Table 1. In Table 1, for each formula, E = evaporation rate 共mm/day兲; es = saturation vapor pressure 共millimeters of Hg兲; ea = actual vapor pressure 共millimeters of Hg兲; U = average wind velocity 共km/h兲 at a height of 2 m above the lake or surrounding land areas; and T and Tdew = air temperature and the dew point, respectively.

ANN and NNARX Models The theory of ANN started in the early 1940s when the first computational representation of a neuron was developed by McCulloch and Pitts 共1943兲. Later, Rosenblatt proposed the idea of perceptrons 共1962兲 in which single layer feed forward networks of McCulloch–Pitts neurons could carry out various computational tasks with the help of weights and a training algorithm. The pioneering work was followed and expanded by many researchers 共Fine 1999; Haykin 1999兲. Basically, ANN are one of the artificial intelligence 共AI兲 techniques that mimic the behavior of human brains. ANN attempt to describe a nonlinear relationship between the input and output of a complex system using historic data. Similar to a human neural system, an ANN is an information processing structure that consists of a number of input units and output units connected in a systematic fashion. Between the input and output units, there may be one or more hidden layers, each consisting of a number of units called neurons, nodes, or cells. The connections between units lying on different layers are assigned varying weights. Input signals 共or data sets兲 are fed in from the input layer, and they follow all possible connection paths to reach the next layer. Along each connection link, the signal goes through a transformation, eventually reaching the output layer. Through this mechanism an ANN learns to identify patterns in the data set and predict variables. A detailed overview of ANN concepts and their applications in hydrology can be found in two excellent articles by the ASCE Committee of Artificial Neural Networks in Hydrology 共ASCE 2000a, b兲. Nowadays, as there are many books on ANN and a multitude of materials on the Internet about ANN implementations, this paper only provides a brief description of ANN used and readers can get further information from other sources 关such as Smith 共1993兲 and the complete issue of Journal of Hydrologic Engineering, Vol. 5, Issue 2, 2000兴. There are different learning algorithms and the adaptive algorithm used in this study is the “Momentum LevenbergMarquardt” algorithm that was adopted on the basis of the “generalized delta rule” 共Rumelhart et al. 1986兲. In this scheme, the adaptive learning rates were used for increasing the convergence velocity throughout all the ANN simulations. The sigmoid and linear curves are used as activation functions for the hidden and output nodes, respectively. In recent years, a combination of ANN and ARX has gained some popularity in the control field. Traditionally, the ARX model in Fig. 1共a兲 has been widely used in control theory for modeling

804 / JOURNAL OF HYDROLOGIC ENGINEERING © ASCE / AUGUST 2009


x(t)

Study Area and Data Set

x(t)

x(t-nb)

x(t-nb) ∑ ∑

y(t-1)

y(t-nb)

∧

y (t / θ )

y(t-1)

Neural Network

∧

y (t / θ )

y(t-nb) (a)

(b)

Fig. 1. 共a兲 ARX block; 共b兲 NNARX model block

various control processes 共Ljung 1999兲. Its simple structure is basically linear and can be described as A共q兲y共t兲 = B共q兲u共t − nk兲 + e共t兲

共1兲

where y共t兲 = output 共 evaporation兲 and u共t兲 = vector with all the inputs, such as wind speed, air temperature, and saturation vapor pressure deficit; e共t兲 = white noise; A共q兲 and B共q兲 = polynomials in terms of time shift operator, of the order na and nb, respectively; and nk = time delay. NNARX is to extend an ARX model with a neural network. The NNARX architecture is depicted in Fig. 1共b兲. This structure is almost the same as the structure shown in Fig. 1共a兲 but instead of a simple summation block, a neural network structure is used. NNARX models have recently been explored and studied by researchers in other fields with some successful results 共Keshavarz and Roopaei 2006兲. The estimated value of y共t兲 shown by yˆ 共t 兩 ␪兲 is expressed as yˆ 共t兩␪兲 = f共u共t − nk兲, . . . ,u共t − nb − nk + 1兲y共t − 1兲, . . . ,y共t − na兲兲共2兲 where f 共 · 兲 = nonlinear mapping by ANN, and ␪ = model parameters. For comparison purposes, both ANN and NNARX were explored in this study. The software used in this study is the Neural Network Based System Identification Toolbox developed with MATLAB 7.1.

The study area is the Sistan Plain located in the Southeast of Iran, one of the driest regions in the country and famous for its “120 day wind” 共bād-e sad-o-bist-roz兲, a highly persistent dust storm in the summer which blows from the north to south with velocities of nearly 20 knots. Hirmand River, originated from Afghanistan, is bifurcated into two branches when it reaches the Iranian border, namely Parian and Sistan. The Hirmand River flows through the Sistan Plain and discharges into the natural swamp of Hamun-e-Hirmand 共Fig. 2兲. As can be seen in Fig. 2, the Sistan Plain is essentially an inland delta with its major watercourses leading to a series of lakes. The Sistan Delta has a very hot and dry climate and its climate is classified as BWh according to the Koppen climate classification 共i.e., main climate B: arid; precipitation W: desert; temperature h: hot兲共Kottek et al. 2006兲. Strong winds in the region are quite unique and are an important contributing factor for the high evaporation. The daily weather variables are measured by an automated weather station at Chahnime of Zabol 共latitude 61° 40⬘ – 61° 49⬘ W, longitude 30° 45⬘ – 30° 50⬘ N兲 operated by the IR Sistan Balochastan Regional Water 共IR SBRW, http://www.sbrw.ir兲. The data set consisted of 11 years 共1995–2006兲 of daily records of air temperature 共T兲, wind speed 共W兲, saturation vapor pressure deficit 共Ed兲, relative humidity 共RH兲, and pan evaporation 共E兲. The daily statistical parameters of the weather data are given in Table 2 and their distributions are presented in Figs. 3共a and b兲. It is interesting to note that except for relative humidity, distributions of all other variables are quite different than the Gaussian distribution. In Table 2, the Xmean, Sx, Cov, Csx, Xmax, and Xmin denote the mean, standard deviation, coefficient of variation, skewness, maximum and minimum of the weather factors, respectively. The last column of Table 2 represents the correlation vector between the potential input variables for models 共T, Ed, W, and RH兲 and the output variable 共E兲. The correlation results indicate that the influential order of the weather variables is: temperature, saturation vapor pressure deficit, wind speed, and relative humid-

Fig. 2. Sistan Region and the Chahnime reservoirs JOURNAL OF HYDROLOGIC ENGINEERING © ASCE / AUGUST 2009 / 805


Table 2. Correlation Matrix for the Input-Output Variables of ANN and NNARX Models Station Chahnime

Data set

Unit

Xmean

Sx

Cov 共Sx / Xmean兲

Csx

Xmin

Xmax

Correlation with E

T W Ed RH E

°C km/day kPa % mm

23.2 262.5 2.93 41.05 12.94

9.69 185.97 1.41 16.00 9.19

0.417 0.708 0.481 0.389 0.710

−0.20 0.77 0.50 0.48 0.63

0 5 0.08 10.33 0

41 915 6.53 91.90 39.44

0.872 0.735 0.852 0.103 1

ity. However, due to a potential problem in cross correlation and nonlinear effects, the linear correlation between evaporation and its related weather variables is not readily suitable for input variable selection in the ANN model development. In this study, a new methodology called the gamma test 共GT兲 is explored.

Model Input Selection Using the Gamma Test The selection of model input variables is a complex issue, even for linear multivariable regression analysis. For nonlinear models such as an ANN, the input selection is much more complicated because the traditional covariance and redundancy analysis derived from linear systems are not readily suited to nonlinear relationships. In this study, a new tool, the GT, from the Machine

Learning field 共Končar 1997; Agalbjörn et al. 1997兲 is explored. There is very limited work in using the GT in hydrology 关examples include water level and flow modeling in the River Thames by Durrant 共2001兲 and daily solar radiation prediction by Remesan et al. 共2008兲兴. The GT estimates the minimum mean square error 共MSE兲 that can be achieved when modeling the unseen data with any continuous nonlinear models. The GT was first reported by Končar 共1997兲 and Agalbjörn et al. 共1997兲, and later enhanced and discussed in detail by many researchers 共Chuzhanova et al. 1998; De Oliveira 1999; Tsui 1999; Tsui et al. 2002; Durrant 2001; Jones et al. 2002兲. Only a brief introduction on the GT is given here and interested readers should consult the aforementioned papers for further details. The basic idea is quite distinct from the earlier attempts with nonlinear analysis. Suppose one has a set of data observations of the form 兵共xi,y i兲,1 艋 i 艋 M其共3兲 where the input vectors xi Rm = vectors confined to some closed bounded set C Rm and, without loss of generality, the corresponding outputs y i R are scalars. The vectors x contain predicatively useful factors influencing the output y. The only assumption made is that the underlying relationship of the system is of the following form y = f共x1, . . . ,xm兲 + r

共4兲

where f = smooth function and r = random variable that represents noise. Without loss of generality it can be assumed that the mean of the r’s distribution is zero 共as any constant bias can be subsumed into the unknown function f兲 and that the variance of the noise Var共r兲 is bounded. The domain of a possible model is now restricted to the class of smooth functions which have bounded first partial derivatives. The gamma statistic ⌫ is an estimate of the model’s output variance that cannot be accounted for by a smooth data model. The GT is based on N关i , k兴, which are the kth 共1 艋 k 艋 p兲 nearest neighbors xN关i,k兴共1 艋 k 艋 p兲 for each vector xi共1 艋 i 艋 M兲. Specifically, the GT is derived from the delta function of the input vectors M

1 兩xN共i,k兲 − xi兩2 ␦ M 共k兲 = M i=1

兺

共1 艋 k 艋 p兲

共5兲

where 兩…兩 denotes Euclidean distance, and the corresponding gamma function of the output values M

1 ␥ M 共k兲 = 兩y N共i,k兲 − y i兩2 2M i=1

兺

Fig. 3. 共a兲 Distribution of the weather variables; 共b兲 distribution of evaporation pan values

共1 艋 k 艋 p兲

共6兲

where y N共i,k兲 = corresponding y value for the kth nearest neighbor of xi in Eq. 共5兲. In order to compute ⌫, a least-squares regression line is constructed for the p points 共␦M 共k兲 , ␥ M 共k兲兲



Table 3. Gamma Test Results on the Evaporation Estimation Data Set Different combinations Parameters W, T, RH, Ed T, RH, Ed W, RH, Ed W, T, Ed W, T, RH Gamma 共⌫兲 Gradient 共A兲 Standard error Vratio Near neighbors M Mask

0.02184

0.05295

0.02160

0.02175

0.02207

0.05946

−0.30649

0.27377

0.09502

0.06902

0.00049

0.00145

0.00058

0.00025

0.00046

0.08736 10

0.21182 10

0.08641 10

0.08703 10

0.08830 10

4,019 1,111

4,019 111

4,019 1,011

4,019 1,101

4,019 1,110

␥ = A␦ + ⌫

共7兲

The intercept on the vertical axis 共␦ = 0兲 is the ⌫ value, as can be shown ␥ M 共k兲 → Var共r兲 in probability as ␦ M 共k兲 → 0

共8兲

Calculating the regression line gradient can also provide helpful information on the complexity of the system under investigation. A formal mathematical justification of the method can be found in Evans and Jones 共2002兲. The graphical output of this regression line 关Eq.共7兲兴 provides very useful information. First, it is remarkable that the vertical intercept ⌫ of the y 共or gamma兲 axis offers an estimate of the best MSE achievable utilizing a modeling technique for unknown smooth functions of continuous variables 共Evans and Jones 2002兲. Second, the gradient offers an indication of the model’s complexity 共a steeper gradient indicates a model of greater complexity兲. The GT is a nonparametric method and the results apply regardless of the particular techniques used to subsequently build a model of f. One can standardize the result by considering another term Vratio, which returns a scale invariant noise estimate between 0 and 1. Vratio is defined as Vratio =

⌫ ␴2共y兲

共9兲

where ␴2共y兲 = variance of output y, which allows a judgment to be formed independent of the output range as to how well the output can be modeled by a smooth function. A Vratio close to 0 indicates that there is a high degree of predictability of the given output y. The reliability of the ⌫ statistic can be determined by running a series of the GT for a definite number of unique data points 共M兲, to establish the size of the data set required to produce a stable asymptote. The GT result would avoid the overfitting of a model beyond the stage where the MSE on the training data is smaller than Var共r兲 and help one to decide the required data length to build a meaningful model. The GT analysis can be performed using winGamma software implementation 共Durrant 2001兲. In this study, different combinations of input data were explored to assess their influence on the evaporation modeling. There were 2m − 1 meaningful combinations of inputs 共15 in this study兲. The best one can be determined by observing the gamma value, which indicates a measure of the best MSE attainable using any modeling methods for unseen input data. Thus, the writers performed the GTs in different dimensions varying the number of inputs to the model and four representative combinations 共including the best one兲 are tabulated in Table 3. In Table 3, the mini-

Fig. 4. Gamma values 共⌫兲 and standard error for the best input combination 共W, Ed, RH兲

mum value of ⌫ was observed when the 共W, RH, Ed兲 data set, was used i.e., daily wind speed 共W兲, relative humidity 共RH兲, and daily saturation vapour pressure deficit 共Ed兲. The gradient 共A兲 is considered as an indicator of model complexity. Vratio is the measure of predictability of given outputs using available inputs. An input data set with low values of MSE, gradient, and Vratio is considered as the best scenario for the modeling. From Table 3 one can deduce that the combination of daily wind speed, daily relative humidity, and daily saturation vapor pressure deficit can make a good model comparable with other options. The relative importance of inputs is W ⬎ Ed⬎ RH⬎ T. The significance of the daily mean temperature data was relatively small when compared with other weather variables as the elimination of this input made small variation in the gamma statistic value. This is different in comparison with the linear correlation result. The quantity of sufficient input data to predict the desirable output was analyzed using the GT with various data lengths as shown in Fig. 4. Fig 4 illustrates that a training data length of 2,413 is sufficient for the gamma statistics to become stable and low 共gamma 0.02116 and SE 0.00103兲, therefore the training data length has been set as 2,413 and the rest of the data is used for model testing.

Modeling Results Traditionally, it is a huge task to choose the best input combination for nonlinear modeling. The result from the aforementioned GT enables modelers to focus on the model calibration and testing with a prefixed input selection. In this study, the modeling performance is validated by two main criteria: root-mean-square error 共RMSE兲 and coefficient of determination 共R2兲共Karunanithi et al. 1994兲. The validation results for all the models are presented in Table 4 and Fig. 5. It is quite encouraging that the performance of NNARX is better than ANN on both RMSE and R2. All the emTable 4. Statistic Analysis for the Model Type Empirical AI

Models

RMSE 共%兲

R2

Linacre Marciano ANN NNARX

61.5 45.8 18.5 15.7

0.23 0.54 0.93 0.95



Model (mm/day) 200

400

600 800 Day

1000

Evaporation (mm)

0

200

400

600 800 Day

1000

1200

0

200

400

600

800

1000

1200

0

1400

0

200

400

600

800

1000

1200

Model (mm/day)

Evaporation (mm)

Observed Marciano

10

15 20 25 30 Observed E (mm/day)

35

40

5

10


35

40


35

40

40 35 30 25 20 15 10 5 0 0

Day 40 35 30 25 20 15 10 5 0

5

40 35 30 25 20 15 10 5 0

1400

Observed Lincare

40 35 30 25 20 15 10 5 0

0

1400

Observed NNARX

40 35 30 25 20 15 10 5 0

Evaporation (mm)

1200

Model (mm/day)

0

40 35 30 25 20 15 10 5 0

Model (mm/day)

Evaporation (mm)

Observed ANN

40 35 30 25 20 15 10 5 0

1400

10

40 35 30 25 20 15 10 5 0 0

Day

5

5

10


35

40

Fig. 5. Modeled and observed evaporation for the testing data

pirical models produced much poorer results than the two AI models 共ANN and NNARX兲. Among the empirical formulas, the models with the inputs of wind and vapor pressures performed much better than the ones with temperature and dew point, indicating the importance of wind in this area. It can be found that ANN and NNARX models have much smaller biases and variances in their residuals. The Marciano equation has a better performance between the two empirical equations. The Linacre equation is not sensitive to the weather variables at the study site so it is not a suitable model for this region. It should be noted that the reason for the vertical bands of the measured data is because the evaporation values were recorded to the nearest graduation marks, therefore they were not continuous.

Discussion and Conclusions Since its application in evaporation estimation from 1997, ANNs have been studied by many researchers in different climate regions. The modeling experiences have been reported from the United States 共Han and Felker 1997; Kumar et al. 2002兲, India

共Sudeer et al. 2002兲, Turkey 共Terzi and Kesk 2005; Keskin and Terzi 2006; Terzi et al. 2006; Kisi 2006兲, and Singapore 共Tan et al. 2007兲. This paper is the first attempt to apply this model in a Middle East region. It is confirmed that ANNs and NNARX worked well for a hot and dry place such as Iran. The improved performance by integrating ANNs and ARX is not only novel 共the first application of such a model in evaporation research field兲 but is also quite mind broadening as it indicates that there is still room for improving the existing ANN model despite it being a universal nonlinear regression tool. However, it should be pointed out that the increased accuracy may not justify the more complex NNARX model structure. Parsimony is a very important feature in mathematical modeling 共i.e., Occam’s razor “simpler theories are, other things being equal, generally better than more complex ones”兲. NNARX is just slightly better than ANN and further study is still needed to assess the trade-off between model complexity and accuracy of ANN and NNARX for evaporation estimation. Another interesting finding in this study is the unique combination of the contributing variables. It has been demonstrated that the important weather factors to be included in the model inputs are wind speed, saturation vapor pressure deficit, and relative humidity. This result is different from all those reported in the lit-



erature. In the United States, Han and Felker 共1997兲 used three weather input variables to estimate the evaporation from the soil: relative air humidity, air temperature, and wind speed. Kumar et al. 共2002兲 selected six input variables: minimum and maximum temperature, minimum and maximum relative humidity, wind speed, and solar radiation. In India with its hot and humid climate, Sudheer et al. found that their model worked best with six inputs: minimum and maximum temperature, minimum and maximum relative humidity, sunshine hours, and wind speed 共albeit the improvement of using the minimum and maximum temperature over the mean temperature was quite small兲. In Turkey, there are mixed results from the published papers. Terzi and Kesk 共2005兲 investigated the evaporation at Lake Egirdir and found the important weather factors were 共in the order of their importance兲: air temperature, solar radiation, and air pressure. It is surprising to note that they found the influences of relative humidity and wind were negligible but the air pressure was included. One year later, the authors reported that only two weather input variables 共air temperature and solar radiation兲 were good enough and the air pressure was dropped. A subsequent paper by the authors 共Terzi et al. 2006兲 reported that three contributing weather factors should be considered in their model: solar radiation, air temperature, and relative humidity 共wind and air pressure were ignored兲. The capricious choice of the weather variables may indicate that their model input selection schemes were not very stable. In Singapore, Tan et al. 共2007兲 found the important variables were: solar radiation, relative humidity, air temperature, and surface wind speed. It is clear from the published literature that although data mining methods such as ANN have their own model structures and estimation schemes, the input variables are still fundamentally based on the physical processes involved in evaporation, i.e., the supply of energy to provide the latent heat of vaporization and the ability to transport the vapor away from the evaporation surface. The energy can be represented by variables such as solar radiation, sunshine hours, or air temperature and the vapor transport is represented by wind and relative humidity. Almost all the modeling procedures published so far followed this basic idea except for the study in Turkey, which may explain why their modeling results were inconsistent. However, it is interesting that all authors have ignored an important paper published by Anderson 共1936兲 in which he argued that saturation vapor pressure deficit has a much more significant link with evaporation than relative humidity. None of the published ANN models used saturation vapor pressure deficit and the present paper is the first study to incorporate this weather factor in the model. It has been clearly demonstrated that saturation vapor pressure deficit has a much higher impact on evaporation than relative humidity. This phenomenon might also be true in other climate regions and it is hoped that this paper will stimulate more researchers to explore this in their future evaporation studies. It should be noted that saturation vapor pressure deficit parameter is calculated based on air temperature and relative humidity parameters and is not measured. Therefore, there is no new information content from saturation vapor pressure deficit parameter in comparison with air temperature and relative humidity. However, the prominent role played by saturation vapor pressure deficit in the evaporation estimation in the case study is quite interesting. The application of the GT in this paper is another important contribution to evaporation estimation methodologies. As evaporation process is highly nonlinear and different field measurements are available at different parts of the world, selection of suitable input weather variables is not an easy task for modelers. Many hydrological modeling processes involve uncertainties

共Han et al. 2000, 2002; Han and Bray 2006; Wang et al. 2006; Hammond and Han 2006, 2008兲. The uncertainty linked with input variables has troubled hydrologists for a long time. The GT provides a very convenient tool to help modelers choose the best input combination before calibrating and testing models, therefore it reduces the input variable selection uncertainty. For this purpose, the GT could greatly reduce the time and effort for other researchers before constructing ANN, NNARX, and support vector machines models. It should be noted that there are several other schemes available for variable selections in nonlinear modeling 共such as information entropy, cross validation, etc.兲. The GT is easier to use than other schemes but it may not be the best scheme for all the nonlinear problems. Further studies are needed to explore and compare different variable selection schemes for evaporation estimations. Overall, the objectives of this study were to evaluate ANN model for evaporation estimation under a hot and dry climate, to improve ANN model by incorporating ARX component, to investigate the suitable input data for ANN/NNARX models using the GT, and to compare the ANN models with some conventional empirical methods. It is quite clear that all the planned objectives have been achieved with some interesting findings about the weather variable selections. It should be pointed out that although the methodology described here is mainly used for repairing and extending the measured evaporation records, there is a potential for using the scheme to find out the important contributing variables in developing empirical formulas that could be used for ungauged catchments via regionalization approach.

References Agalbjörn, S., Končar, N., and Jones, A. J. 共1997兲. “A note on the gamma test.” Neural Comput. Appl., 5共3兲, 131–133. Alizadeh, A. 共2004兲. Principles of applied hydrology, Imam Reza Univ. 共in Persian兲. Anderson, D. B. 共1936兲. “Relative humidity or vapor pressure deficit.” Ecology, 17共2兲, 277–282. ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 共Rao Govindaraju兲. 共2000a兲. “Artificial neural networks in hydrology. II: Hydrologic applications. J. Hydrol. Eng., 5共2兲, 124– 137. ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 共Rao Govindaraju兲. 共2000b兲. “Artificial neural networks in hydrology. II: Preliminary concepts.” J. Hydrol. Eng., 5共2兲, 115–123. Blanken, P. D., et al. 共2000兲. “Eddy covariance measurements of evaporation from Great Slave Lake, Northwest Territories, Canada.” Water Resour. Res., 36共4兲, 1069–1077. Blodgett, T. A., Lenters, J. D., and Isacks, B. L. 共1997兲. “Constraints on the origin of paleolake expansions in the Central Andes.” Earth Interact., 1. Bray, M., and Han, D. 共2004兲. “Identification of support vector machines for runoff modeling.” J. Hydroinform., 6共4兲, 265–280. Cahoon, J. E., Costello, T. A., and Ferguson, J. A. 共1991兲. “Estimating pan evaporation using limited meteorological observations.” Agric. Forest Meteorol., 55, 181–190. Christiansen, J. E. 共1968兲. “Pan evaporation and evapotranspiration from climatic data.” J. Irrig. and Drain. Div., 94, 243–265. Chuzhanova, N. A., Jones, A. J., and Margetts, S. 共1998兲. “Feature selection for genetic sequence classification.” Bioinformatics, 14共2兲, 139– 143. Clair, T. A., and Ehrman, J. M. 共1998兲. “Using neural networks to assess the influence of changing seasonal climates in modifying discharge, dissolved organic carbon, and nitrogen export in eastern Canadian rivers.” Water Resour. Res., 34共3兲, 447–455.



Dalton, J. 共1802兲. “Experiment essays on the constitution of mixed gases; on the force of steam of vapor from waters and other liquids in different temperatures, both in a torricellian vacuum and in air; on evaporation; and on the expansion of gases by heat.” Mem. Manch. Lit. Philos. Soc., 5, 535–602. De Oliveira, A. G. 共1999兲. “Synchronisation of chaos and applications to secure communications.” Ph.D. thesis, Dept. of Computing, Imperial College of Science, Technology and Medicine, Univ. of London, London. Durrant, P. J. 共2001兲. “winGamma: A non-linear data analysis and modelling tool with applications to flood prediction.” Ph.D. thesis, Dept. of Computer Science, Cardiff Univ., Wales, U.K. Elshorbagy, A., Simonovic, S. P., and Panu, U. S. 共2000兲. “Performance evaluation of artificial neural networks for runoff prediction.” J. Hydrol. Eng., 5共4兲, 424–427. Evans, D., and Jones, A. J. 共2002兲. “A proof of the gamma test.” Proc. R. Soc. London, Ser. A, 458共2027兲, 2759–2799. Fernando, D. A., and Jayawardena, A. W. 共1988兲. “Runoff forecasting using RBF network with OLS algorithm.” J. Hydrol. Eng., 3共3兲, 203– 209. Fine, T. L. 共1999兲. Feedforward neural network methodology (Statistics for engineering and information science), Springer, New York. Govindaraju, R. S. 共2000兲. “Artificial neural networks in hydrology. I: Preliminary concepts.” J. Hydrol. Eng., 5共2兲, 115–123. Hammond, M., and Han, D. 共2006兲. “Issues of using digital maps for catchment delineation.” Proc., ICE, Water Management, 159共1兲, 45– 51. Hammond, M., and Han, D. 共2008兲. “Quantization analysis of weather radar data with synthetic rainfall.” Stochastic Environ. Res. Risk Assess., 22共3兲, 1436–3259. Han, D., and Bray, M. 共2006兲. “Automated Thiessen polygon generation.” Water Resour. Res., 42. Han, D., Chan, L., and Zhu, N. 共2007a兲. “Flood forecasting using support vector machines.” J. Hydroinform., 9共4兲, 267–276. Han, D., Cluckie, I. D., Griffith, R. J., and Austin, G. 共2000兲. “Rainfall measurements for urban catchments using weather radars.” J. Urban Technol., 7共1兲, 85–102. Han, D., Cluckie, I. D., Karbassioun, D., Lawry, J., and Krauskopf, B. 共2002兲. “River flow modelling using fuzzy decision trees.” Water Resour. Manage., 16共6兲, 431–445. Han, D., Kwong, T., and Li, S. 共2007b兲. “Uncertainties in real-time flood forecasting with neural networks.” Hydrolog. Process., 21, 223–228. Han, H., and Felker, P. 共1997兲. “Estimation of daily soil water evaporation using an artificial neural network.” J. Arid Environ., 37, 251– 260. Haykin, S. 共1999兲. Neural networks—A comprehensive foundation, Prentice-Hall, Englewood Cliffs, N.J. Hostetler, S. W., and Bartlein, P. J. 共1990兲. “Simulation of lake evaporation with application to modeling lake level variations of Harney– Malheur Lake, Oregon.” Water Resour. Res., 26共10兲, 2603–2612. Hsu, K., Gupta, H. V., and Sorooshian, S. 共1995兲. “Artificial neural network modeling of the rainfall-runoff process.” Water Resour. Res., 31共10兲, 159–169. Ikebuchi, S., Seki, M., and Ohtoh, A. 共1988兲. “Evaporation from Lake Biwa.” J. Hydrol., 102, 427–444. Imrie, C. E., Durucan, S., and Korre, A. 共2000兲. “River flow prediction using neural networks: Generalization beyond the calibration range.” J. Hydrol. Eng., 233共3–4兲, 138–154. Jones, A. J., Tsui, A., and De Oliveira, A. G. 共2002兲. “Neural models of arbitrary chaotic systems: Construction and the role of time delayed feedback in control and synchronization.” Complexity International, 9. Karunanithi, N., Grenney, W. J., Whitley, D., and Bovee, K. 共1994兲. “Neural networks for river flow prediction.” J. Comput. Civ. Eng., 8共2兲, 201–220. Keshavarz, E., and Roopaei, M. 共2006兲. “Intelligent structures in economical forecasting.” Proc., Int. Conf. on Advanced Technologies in Telecommunications and Control Engineering (ATTCE).

Keskin, M. E., and Terzi, O. 共2006兲. “Artificial neural network models of daily pan evaporation.” J. Hydrol. Eng., 11共1兲, 65–70. Keskin, M. E., Terzi, O., and Taylan, D. 共2004兲. “Fuzzy logic model approaches to daily pan evaporation estimation in western Turkey.” Hydrol. Sci. J., 49共6兲, 1001–1010. Kisi, O. 共2005兲. “Discussion of ‘Fuzzy logic model approaches to daily pan evaporation estimation in western Turkey.’” Hydrol. Sci. J., 50共4兲. Kisi, O. 共2006兲. “Daily pan evaporation modeling using a neuro-fuzzy computing technique.” J. Hydrol., 329. Končar, N. 共1997兲. “Optimisation methodologies for direct inverse neurocontrol.” Ph.D. thesis, Dept. of Computing, Imperial College of Science, Technology and Medicine, Univ. of London, London. Kottek, M., Grieser, J., Beck, C., Rudolf, B., and Rubel, F. 共2006兲. “World map of the Köppen-Geiger climate classification updated.” Meteorol. Z., 15, 259–263. Kumar, M., Raghuwanshi, N. S., Singh, R., Wallender, W. W., and Pruitt, W. O. 共2002兲. “Estimating evapotranspiration using artificial neural network.” J. Irrig. Drain. Eng., 128共4兲, 224–233. Laird, N. F., and Kristovich, D. A. R. 共2002兲. “Variations of sensible and latent heat fluxes from a great lakes buoy and associated synoptic weather patterns.” J. Hydrometeor., 3共1兲, 3–12. Linacre, E. T. 共1977兲. “A simple formula for estimating evaporation rates in various climates, using temperature data alone.” Agric. Forest Meteorol., 18, 409–424. Linacre, E. T. 共1994兲. “Estimating U.S. Class A pan evaporation from few climate data.” Water Int., 19, 5–14. Liong, S. Y., Khu, S. T., and Chan, W. T. 共2001兲. “Derivation of pareto front with genetic algorithm and neural network.” J. Hydrol. Eng., 6共1兲, 52–61. Ljung, L. 共1999兲. System identification. Theory for the user, Prentice-Hall PTR, Englewood Cliffs, N.J. Maier, H. R., and Dandy, G. C. 共2000兲. “Neural networks for the prediction and forecasting of water resources variables: A review of modeling issues and applications.” Environ. Modell. Software, 15, 101–124. McCuen, R. H. 共1998兲. Hydrologic analysis and design, Prentice-Hall, Englewood Cliffs, N.J. McCulloch, W. S., and Pitts, W. 共1943兲. “A logical calculus of the ideas immanent in nervous activity.” Bull. Math. Biophys., 5, 115–133. Mosner, M. S., and Aulenbach, B. T. 共2003兲. “Comparison of methods used to estimate lake evaporation for a water budget of Lake Seminole, southwestern Georgia and northwestern Florida.” Proc., 2003 Georgia Water Resources Conf. Penman, H. L. 共1948兲. “Natural evapotranspiration from open water, bare soil and grass.” Proc. R. Soc. London, Ser. A, 193, 120–145. Remesan, R., Shamim, M. A., and Han, D. 共2008兲. “Model data selection using gamma test for daily solar radiation estimation.” Hydrol. Process., in press. Rohwer, C. 共1931兲. “Evaporation from free water surfaces.” U.S. Dept. of Agriculture Technical Bulletin No. 271, Washington, D.C. Rumelhart, D. E., Hinton, G. E., and Williams, R. J. 共1986兲. Learning internal representations by back propagation, D. E. Rumelhart, J. L. McClelland, and PDP Research Group, eds., MIT, Cambridge, Mass., 316–382. Smith, M. 共1993兲. Neural networks for statistical modeling, Van Nostrand Reinhold, New York. Sudheer, K. P., Gosain, A. K., Rangan, D. M., and Saheb, S. M. 共2002兲. “Modelling evaporation using an artificial neural network algorithm.” Hydrolog. Process., 16, 3189–3202. Tan, S. B. K., Shuy, E. B., and Chua, L. H. C. 共2007兲. “Modelling hourly and daily open-water evaporation rates in areas with an equatorial climate.” Hydrolog. Process., 21共486–499兲. Terzi, O., and Keskin, M. E. 共2005兲. “Modelling of daily pan evaporation.” J. Appl. Sci., 5共2兲, 368–372. Terzi, O., Keskin, M. E., and Taylan, E. D. 共2006兲. “Estimating evaporation using ANFIS.” J. Irrig. Drain. Eng., 132共5兲, 503–507. Tokar, A. S., and Johnson, P.A. 共1999兲. “Rainfall-runoff modeling using artificial neural networks.” J. Hydrol. Eng., 4共3兲, 232–239. Tsui, A. P. M. 共1999兲. “Smooth data modelling and stimulus-response via



stabilisation of neural chaos.” Ph.D. thesis, Dept. of Computing, Imperial College of Science, Technology and Medicine, Univ. of London, London. Tsui, A. P. M., Jones, A. J., and De Oliveira, A. G. 共2002兲. “The construction of smooth models using irregular embeddings determined by a gamma test analysis.” Neural Comput. Appl., 10共4兲, 318–329. Van Zyl, W. H., De Jager, J. M., and Maree, C. J. 共1989兲. “The relationship between daylight evaporation from short vegetation and the USWB Class A pan.” Agric. Forest Meteorol., 46, 107–118.

Wang, Y. C., Han, D., Yu, P. S., and Cluckie, I. D. 共2006兲. “Comparative modeling of two catchments in Taiwan and England.” Hydrolog. Process., 4335–4349. Warnaka, K., and Pochop, L. 共1988兲. “Analyses of equations for free water evaporation estimates.” Water Resour. Res., 24共7兲, 979–984. Winter, T. C. 共1981兲. “Uncertainties in estimating the water balance of lakes.” Water Resour. Bull., 17共1兲, 82–115. Yu, Y. S., and Knapp, H. V. 共1985兲. “Weekly, monthly, and annual evaporations for Elk City Lake.” J. Hydrol., 80, 93–110.