Environmental Science and Pollution Research https://doi.org/10.1007/s11356-018-3650-2

RESEARCH ARTICLE

Modeling daily water temperature for rivers: comparison between adaptive neuro-fuzzy inference systems and artificial neural networks models

Senlin Zhu 1 & Salim Heddam 2 & Emmanuel Karlo Nyarko 3 & Marijana Hadzima-Nyarko 4 & Sebastiano Piccolroaz 5,6 & Shiqiang Wu 1

Received: 5 July 2018 / Accepted: 31 October 2018
© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Abstract

River water temperature, which theoretically depends on multiple factors, is a key control of many physical and bio-chemical processes in river systems. Here, four different machine learning models, namely multilayer perceptron neural network models (MLPNN), adaptive neuro-fuzzy inference systems (ANFIS) with the fuzzy c-means clustering algorithm (ANFIS_FC), ANFIS with the grid partition method (ANFIS_GP), and ANFIS with the subtractive clustering method (ANFIS_SC), were implemented to simulate daily river water temperature, using air temperature (Ta), river flow discharge (Q), and the components of the Gregorian calendar (CGC) as predictors. The proposed models were tested in various river systems characterized by different hydrological conditions. Results showed that including all three inputs as predictors (Ta, Q, and the CGC) yielded the best accuracy among all the developed models. In particular, model performance improved considerably compared to the case where only Ta is used as predictor, which is the typical approach of most previous machine learning applications. Additionally, it was found that Q played a relevant role mainly in snow-fed rivers and in regulated rivers with higher-altitude hydropower reservoirs, while it improved model performance to a lesser extent in lowland rivers. In the validation phase, the MLPNN model generally performed best, although at some river stations ANFIS_FC and ANFIS_GP were slightly more accurate. Overall, the results indicated that the machine learning models developed in this study can be effectively used for river water temperature simulation.

Keywords River water temperature · Air temperature · River flow discharge · Gregorian calendar · MLPNN · ANFIS · Hydrological regime

Responsible editor: Marcus Schulz

* Senlin Zhu [email protected]

1 State Key Laboratory of Hydrology-Water resources and Hydraulic Engineering, Nanjing Hydraulic Research Institute, Nanjing 210029, China

2 Faculty of Science, Agronomy Department, Hydraulics Division, University 20 Août 1955, Route El Hadaik, BP 26 Skikda, Algeria

3 Faculty of Electrical Engineering, Computer Science and Information Technology Osijek, University J.J. Strossmayer in Osijek, Kneza Trpimira 2b, 31000 Osijek, Croatia

4 Faculty of Civil Engineering Osijek, University J.J. Strossmayer in Osijek, Osijek, Croatia

5 Institute for Marine and Atmospheric Research, Department of Physics, Utrecht University, Princetonplein 5, 3584 CC Utrecht, The Netherlands

6 Service for Torrent Control, Autonomous Province of Trento, via Trener 3, I-38121 Trento, Italy

Introduction

Water temperature is a key factor for the health of aquatic ecosystems. It controls several physical and bio-chemical processes in rivers, such as reoxygenation and the temperature-dependent metabolism of aquatic plants and animals (Rajwakuligiewicz et al. 2015; Sandersfeld et al. 2017). All aquatic organisms tolerate a characteristic range of water temperature, and abrupt variations in river thermal conditions may seriously impact their metabolism, behavior, and survival (Caissie et al. 2007; Carolli et al. 2011; Verbrugge et al. 2012; Casas-Mulet et al. 2016; Westhoff and Rosenberger 2016). For example, previous studies have shown that river fish are more likely to migrate when the maximum water temperature exceeds their optimal temperature range (Eaton et al. 1995; Howell et al. 2010), and, in general, fish mortality increases with the number of days on which river water temperature exceeds their thermal tolerance (Phelps et al. 2010). Hence, the reliable prediction of river water temperature is particularly relevant because of its significant ecological role in aquatic systems (Grbić et al. 2013).

Besides meteorological forcing and climate, several other factors control thermal dynamics in rivers, such as the hydrological regime, riverbed conditions, fluvial topography, and anthropic activities (Caissie 2006; Hester and Doyle 2011; Kelleher et al. 2012; Lisi et al. 2015; Piccolroaz et al. 2016; Cai et al. 2018). When the aim is to accurately predict river water temperature, it is necessary to understand in detail how all these factors combine in affecting river thermal processes (Hadzima-Nyarko et al. 2014), considering also the impact of extreme climatic events such as heat waves (Piccolroaz et al. 2018).

A large number of models for river water temperature prediction have been proposed and successfully applied in the past decades. They can be divided into two main categories: physically based deterministic models and statistical models (Benyahya et al. 2007; Cole et al. 2014). Deterministic models simulate the spatio-temporal variations of water temperature in rivers based on a mathematical description of the heat fluxes exchanged between the river and the surrounding system (Hebert et al. 2011). Well-known examples of this type of model are the widely used HEC-RAS model (Jensen and Lowney 2004) and the MIKE 11 model (Wang 2013). Deterministic models are generally complex and require a large number of inputs, such as fluvial topography, the whole set of meteorological variables, and the hydrological conditions, and are therefore frequently impractical when observed data are scarce. This fostered the development of statistical models, which have been widely used in river water temperature prediction because of their simplicity and minimal data requirements. Linear (Morrill et al. 2005; Krider et al. 2013) and non-linear regression models (Mohseni et al. 1998; Vliet et al. 2012), stochastic regression models (Ahmadi-Nedushan et al. 2007; Rabi et al. 2015), and hybrid statistical-physical based models (Gallice et al. 2015; Toffolon and Piccolroaz 2015; Piccolroaz et al. 2016) have been implemented successfully to model river water temperature.

In addition, machine learning models have been increasingly used in river water temperature prediction in recent years, with artificial neural network (ANN) models being the most widely used approaches (Karaçor et al. 2007; Sahoo et al. 2009; Hadzima-Nyarko et al. 2014; DeWeber and Wagner 2014; Piotrowski et al. 2014, 2015; Rabi et al. 2015; Temizyurek and Dadasercelik 2018; Zhu et al. 2018). For example, Piotrowski et al. (2015) compared different types of ANN models for short-term river water temperature prediction. Their results showed that simple and popular multi-layer perceptron neural networks are in most cases not outperformed by more complex and advanced models, and that ANN models are useful tools for river water temperature modeling.

Adaptive neuro-fuzzy inference systems (ANFIS) are widely used in various fields, such as daily evaporation estimation (Shiri et al. 2011; Sanikhani et al. 2012; Kisi and Zounemat-Kermani 2014), municipal water consumption modeling (Yurdusev and Firat 2009), river flow forecasting (He et al. 2014), water quality simulation (Heddam 2014), and estimation of oxidation parameters for food (Karaman et al. 2012). For example, Karaman et al. (2012) compared ANFIS and ANN models for estimating the oxidation parameters of sunflower oil and found that the ANFIS model outperformed the ANN model. To date, few studies have addressed the use of ANFIS models for river water temperature modeling, and the intercomparison between ANFIS and ANN for water temperature simulation is limited.

Statistical models typically use air temperature as the only independent variable, considering it as a proxy for the net heat exchanges with the atmosphere (Stefan and Preud’homme 1993; Mohseni and Stefan 1999; Webb et al. 2003; Caissie 2006; Hadzima-Nyarko et al. 2014; Rabi et al. 2015; Zhu et al. 2018). However, some researchers have also considered other factors, such as river flow discharge (Webb et al. 2003; Ahmadi-Nedushan et al. 2007; Arismendi et al. 2014; Toffolon and Piccolroaz 2015; Piccolroaz et al. 2016; Laanaya et al. 2017), solar radiation (Sahoo et al. 2009), riparian shade (Johnson et al. 2014), and landform attributes and forested land cover (DeWeber and Wagner 2014). Air temperature and river flow discharge are generally the most readily available variables for modeling temperatures in rivers, and they have been shown to have the greatest impact on water temperature (Webb et al. 2003; Vliet et al. 2011).

However, the role of flow discharge in machine learning models has seldom been investigated, and most previous studies employed only atmospheric factors as predictors (Sahoo et al. 2009; Hadzima-Nyarko et al. 2014; Rabi et al. 2015; Zhu et al. 2018; Temizyurek and Dadasercelik 2018). The motivation for this research is therefore to implement ANN and ANFIS models to simulate water temperature for various river systems characterized by different hydrological conditions, with air temperature (Ta) and river flow discharge (Q) as predictors. According to Heddam (2016b) and Heddam and Kisi (2017), the inclusion of the components of the Gregorian calendar (CGC) significantly improved the performance of machine learning models in the case of water quality modeling. Thus, the components of the CGC (year, month, and day) were also considered in the ANN and ANFIS models. The aim is to contribute to river water temperature modeling, deepening our understanding of river thermal behavior and providing references for water resource management.


Materials and methods

Study area

In this study, we tested the performance of the proposed ANN and ANFIS models on one river in Croatia and three rivers in Switzerland, which are characterized by different hydrological conditions. The cases are briefly introduced below and summarized in Table 1, where the periods of data availability are also listed.

(i) The Drava River is located in southern Central Europe and is one of the largest tributaries of the Danube River. Its hydrological parameters are regularly monitored at the Botovo, Terezino Polje, Donji Miholjac, and Osijek stations by the Meteorological and Hydrological Service of Croatia. Two stations (Botovo and Donji Miholjac) were considered here. The Botovo station is located about 20 km downstream of the Dubrava Reservoir (Bonacci and Oskoruš 2008), which was constructed mainly for water supply, irrigation, and hydropower generation. The reservoir has a total volume of 93.5 × 10^6 m³ and a maximum water level of about 149.60 m. This case (case 1) is representative of a large lowland river downstream of a reservoir. The Donji Miholjac station is located in a city, close to a large fish pond. It is far downstream of the Dubrava Reservoir (120 km), and this case (case 2) is representative of a large lowland river far from regulation.

(ii) The Mentue River flows through the Swiss plateau at low altitude. It drains a sparsely inhabited area that is predominantly devoted to agriculture and unaffected by strong anthropogenic thermal alterations. This case (case 3) was selected as representative of the typical condition of a small natural lowland river.

(iii) The Rhône River is approximately 813 km in length and is one of the major rivers in Europe. The Rhône River at the Sion hydrological station lies at the bottom of a populated Alpine valley. Since the beginning of the twentieth century, with the construction of a large high-head hydropower storage system, its hydrological regime has been dramatically altered, and it is now affected by strong hydropeaking and thermo-peaking (Meile et al. 2011). This case (case 4) is taken as representative of a river strongly regulated by high-elevation hydropower reservoirs.

(iv) The Dischmabach River is approximately 15 km in length. It is located at high altitude in a steep glacial valley (Dischma), which is uninhabited and mainly used for mountain pastures. Due to the significant influence of snowmelt in the warm season, this case (case 5) is considered representative of a river with a clear nivo-glacial regime.

Table 1 Characteristics of the studied river and meteorological stations

| Case | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| River name | Drava | Drava | Mentue | Rhône | Dischmabach |
| River station name | Botovo | Donji Miholjac | Yvonand | Sion | Davos |
| River station elevation (m a.s.l.) | 121.55 | 88.57 | 449 | 484 | 1,668 |
| Catchment area (km²) | 31,038 | 37,142 | 105 | 3,373 | 43.3 |
| Calibration period | 1991–2008 | 1993–2008 | 2002–2009 | 1984–2003 | 2004–2009 |
| Validation period | 2009–2016 | 2009–2016 | 2010–2012 | 2004–2013 | 2010–2012 |
| Meteorological station name | Koprivnica | Donji Miholjac | Mathod | Sion | Davos |
| Meteorological station elevation (m a.s.l.) | 141 | 97 | 437 | 482 | 1,594 |
| Distance from river station (km) | 11.8 | 1.0 | 12.7 | 2.14 | 4.9 |

Figure 1 presents the time series of the annual averaged air temperature (Ta), water temperature (Tw), and river flow discharge (Q) for the five stations. The seasonal dynamics of Ta, Tw, and Q are presented in Fig. 2 through the climatological year, which is defined by averaging, for each day of the year, all measurements available over the observation period for that same day. This figure exemplifies the different hydrological and thermal regimes of the considered river stations and indicates the different effects of Q in altering Tw dynamics. The response of Tw to changes in Ta is almost linear for the Mentue River (Fig. 2c), suggesting an overall negligible effect of Q due to low water fluxes throughout the year and the absence of a strong seasonal pattern. Similar conditions can be observed for the Drava River (Fig. 2a, b), although the higher values and more marked seasonal pattern of Q result in a damped response of Tw to changes in Ta in spring and summer, when river flow discharge is higher. This effect is clearer for the Botovo station (Fig. 2a), likely due to the upstream Dubrava Reservoir, whose larger thermal inertia in comparison to that of the Drava River determines a time-lagged response of Tw to Ta, similar to what is observed downstream of large dams (see, e.g., Cai et al. 2018) and to thermal dynamics in deep lakes (Toffolon et al. 2014; Piccolroaz et al. 2015).

A similar effect, although much stronger, is evident in the Rhône River at the Sion station (Fig. 2d), where releases of cold water from high-altitude hydropower reservoirs cause a clear flattening of the seasonal pattern of Tw, especially in summer when hydropower production is significant (Meier and Wüest 2004; Piccolroaz et al. 2016). Finally, in the Dischmabach River (Fig. 2e), Tw is generally low throughout the year due to the high altitude and the contribution of snowmelt, which, similarly to the cold hydropower releases discussed above, cools Tw during the warm season. The nivo-glacial regime of this river is well described by the annual cycle of flow discharge, with no flow in winter and rapid variations of the flow discharge starting in spring due to snowmelt (Toffolon and Piccolroaz 2015). The basic statistical characteristics of Tw, Ta, and Q for the five cases are summarized in Table 2 through the average, minimum, maximum, standard deviation, and coefficient of variation of the daily data sets.

Fig. 1 Time series plot of annual averaged air temperature (Ta), water temperature (Tw), and river flow discharge (Q) for the five case studies. a Drava at Botovo. b Drava at Donji Miholjac. c Mentue. d Rhône. e Dischmabach (axes: year; temperature in °C; discharge in m³/s)

Fig. 2 Climatological (reference) years for the five case studies. a Drava at Botovo. b Drava at Donji Miholjac. c Mentue. d Rhône. e Dischmabach. Ta air temperature, Tw water temperature, Q river flow discharge (axes: day of year; temperature in °C; discharge in m³/s)

Table 2 Daily statistical parameters of the used data sets for all stations

| River station | Item | Unit | Xmean | Xmax | Xmin | Sx | Cv |
|---|---|---|---|---|---|---|---|
| Botovo | Ta | °C | 11.39 | 30.10 | −15.00 | 8.46 | 0.74 |
| Botovo | Tw | °C | 11.43 | 25.20 | 0.00 | 6.22 | 0.54 |
| Botovo | Q | m³/s | 478.09 | 2345.00 | 103.00 | 231.64 | 0.48 |
| Donji Miholjac | Ta | °C | 11.76 | 30.40 | −14.80 | 8.68 | 0.74 |
| Donji Miholjac | Tw | °C | 11.96 | 27.20 | 0.00 | 7.17 | 0.60 |
| Donji Miholjac | Q | m³/s | 515.23 | 2166.00 | 175.00 | 225.44 | 0.44 |
| Mentue | Ta | °C | 9.97 | 25.55 | −10.14 | 7.38 | 0.74 |
| Mentue | Tw | °C | 9.72 | 21.70 | 0.00 | 5.80 | 0.60 |
| Mentue | Q | m³/s | 1.40 | 36.29 | 0.08 | 2.07 | 1.48 |
| Rhône | Ta | °C | 10.33 | 27.60 | −15.00 | 7.88 | 0.76 |
| Rhône | Tw | °C | 7.00 | 12.16 | 0.31 | 2.17 | 0.31 |
| Rhône | Q | m³/s | 103.71 | 757.30 | 17.09 | 78.42 | 0.76 |
| Dischmabach | Ta | °C | 3.78 | 20.12 | −22.18 | 7.72 | 2.04 |
| Dischmabach | Tw | °C | 4.29 | 11.02 | 0.17 | 2.88 | 0.67 |
| Dischmabach | Q | m³/s | 1.68 | 13.14 | 0.24 | 1.57 | 0.93 |

Xmean mean, Xmax maximum, Xmin minimum, Sx standard deviation, Cv coefficient of variation, Tw water temperature (°C), Ta air temperature (°C), and Q river flow discharge (m³/s)

Multilayer perceptron neural network

The artificial neural network (ANN) is the best-known approach for building robust nonlinear models (Haykin 1999). ANN models were inspired by the functioning of the human brain, in which an undetermined number of biological neurons are fully interconnected and play a key role in the reliable transmission of information. ANN models are organized in several parallel layers, each containing several neurons. There are three kinds of layers: input, hidden, and output. The input layer performs no calculation and only contains the input variables. The hidden layer is the most important part of the ANN and contains several neurons, whose number is generally determined by trial and error. The neurons in the hidden layer perform both linear and nonlinear operations: in the first stage, each hidden neuron receives the input variables multiplied by the corresponding link weights and sums them; in the second stage, the result is passed to the next layer through a nonlinear activation function, generally the sigmoid. The output layer has only one neuron, which corresponds to the dependent variable (water temperature). In the present investigation, we used the widely adopted multilayer perceptron neural network (MLPNN) (Rumelhart et al. 1986), which has a standard structure of one input, one hidden, and one output layer and has been reported in the literature to be a universal approximator (Hornik et al. 1989; Hornik 1991). An MLPNN whose hidden layer contains n neurons and whose output layer contains a single neuron is expressed as follows (Fig. 3):

$$Y = f_2\left[\sum_{j=1}^{n} w_{jk}\, f_1\left(\sum_{i=1}^{m} x_i w_{ij} + \delta_j\right) + \delta_0\right] \tag{1}$$

where xi is the input variable (i = 1, ..., m, with m the number of inputs), wij is the weight between the input i and the hidden neuron j, δj is the bias of the hidden neuron j, f1 is the sigmoid activation function, wjk is the weight of the connection between neuron j in the hidden layer and the single neuron k in the output layer, δ0 is the bias of the output neuron k, and f2 is a linear activation function for the neuron in the output layer. The weights (wij, wjk) and biases (δj, δ0) are the only free parameters that can be adjusted once the structure of the neural network has been defined (number of layers, number of neurons in each layer, activation function for each layer). Modifying these parameters changes the output values of the designed network. The weights and biases are adjusted to minimize the model error (the difference between the modeled and observed values), which is referred to as the training or calibration process of the network. The root-mean-square error (RMSE) and the mean-squared error (MSE) are often used to define the network error.

Fig. 3 Multilayer perceptron neural network (MLPNN) architecture. Ta air temperature, Q river flow discharge, CGC the components of the Gregorian calendar, Tw water temperature

In the present application of the MLPNN model, besides air temperature (Ta) and river flow discharge (Q), the components of the Gregorian calendar (year number YY, month number MM from 1 to 12, and day number DD from 1 to 31) were also used as input variables. Data normalization is an important step in modeling with ANN models. It removes the dimensional differences in the data and improves the prediction ability of the ANN models. In this study, all the variables were normalized to have zero mean and unit variance using the Z-score method (Olden et al. 2004; Heddam 2016a):

$$x^{n}_{i,k} = \frac{x_{i,k} - m_k}{Sd_k} \tag{2}$$

where xni,k is the normalized value of the variable k (input or output) for each sample i, xi,k is the original value of the variable k, and mk and Sdk are the mean value and standard deviation of the variable k. The script of the MLPNN model was implemented in MATLAB. In this study, the MLPNN model has one hidden layer with a sigmoidal activation function and one output layer with a linear activation function. The number of neurons in the hidden layer varies from one station to another and from one model version to another; in most cases, it was between 10 and 13.
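To make Eqs. (1) and (2) concrete, the following is a minimal NumPy sketch of the Z-score normalization followed by one forward pass of a single-hidden-layer MLPNN. It is not the authors' MATLAB script: the function names, the toy input values, the randomly drawn (untrained) weights, and the choice of the population standard deviation (`ddof=0`) are all assumptions made for illustration.

```python
import numpy as np


def zscore(x, ddof=0):
    """Eq. (2): subtract each column's mean and divide by its standard deviation."""
    m = x.mean(axis=0)
    sd = x.std(axis=0, ddof=ddof)  # ddof=0 is an assumption; the paper does not specify
    return (x - m) / sd, m, sd


def sigmoid(a):
    """Logistic sigmoid, used here as the hidden-layer activation f1."""
    return 1.0 / (1.0 + np.exp(-a))


def mlpnn_forward(x, w_ij, delta_j, w_jk, delta_0):
    """Eq. (1): sigmoid hidden layer (f1), then one linear output neuron (f2)."""
    hidden = sigmoid(x @ w_ij + delta_j)  # shape: (samples, hidden neurons)
    return hidden @ w_jk + delta_0        # shape: (samples,)


# Hypothetical toy inputs; columns are Ta (°C), Q (m³/s), YY, MM, DD.
raw_inputs = np.array([
    [12.0, 450.0, 2005.0, 6.0, 15.0],
    [3.5, 300.0, 2005.0, 12.0, 1.0],
    [21.0, 700.0, 2006.0, 7.0, 20.0],
    [8.0, 520.0, 2006.0, 3.0, 9.0],
])
x_norm, mean_k, sd_k = zscore(raw_inputs)

# Random, untrained parameters for a hidden layer of 10 neurons.
rng = np.random.default_rng(0)
n_inputs, n_hidden = raw_inputs.shape[1], 10
w_ij = rng.normal(size=(n_inputs, n_hidden))
delta_j = rng.normal(size=n_hidden)
w_jk = rng.normal(size=n_hidden)
delta_0 = 0.0

tw_norm = mlpnn_forward(x_norm, w_ij, delta_j, w_jk, delta_0)
print(tw_norm.shape)  # one normalized Tw estimate per sample
```

In a calibrated model the weights and biases would come from training, and the normalized output would be mapped back to °C by inverting Eq. (2) with the mean and standard deviation of the observed Tw.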

Fig. 4 Structure of the ANFIS model developed for predicting river water temperature (Tw)
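The layered structure in Fig. 4, which the ANFIS section formalizes in Eqs. (3) to (8), can be sketched in a few lines of NumPy. This is only an illustrative forward pass for two inputs and two rules with Gaussian membership functions; it is not the MATLAB genfis-based implementation used in the study, and the function names and all parameter values below are hypothetical and untrained. The helper `grid_partition_rules` simply evaluates the grid-partition rule count α^β.

```python
import numpy as np


def gaussian_mf(x, c, sigma):
    """Eq. (3): Gaussian membership function with centre c and width sigma."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise_a, premise_b, consequent):
    """Forward pass through the five layers for inputs x and y."""
    # Layer 1 (Eq. 4): membership grades of x in A_i and of y in B_i.
    mu_a = np.array([gaussian_mf(x, c, s) for c, s in premise_a])
    mu_b = np.array([gaussian_mf(y, c, s) for c, s in premise_b])
    # Layer 2 (Eq. 5): firing strength of each rule.
    w = mu_a * mu_b
    # Layer 3 (Eq. 6): normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4 (Eq. 7): first-order consequents f_i = p_i*x + q_i*y + r_i.
    f = np.array([p * x + q * y + r for p, q, r in consequent])
    # Layer 5 (Eq. 8): weighted sum of the rule outputs.
    return float(np.sum(w_bar * f))


def grid_partition_rules(alpha, beta):
    """Number of rules with alpha fuzzy subsets for each of beta inputs."""
    return alpha ** beta


# Hypothetical premise (c, sigma) and consequent (p, q, r) parameters.
premise_a = [(5.0, 4.0), (20.0, 4.0)]        # MFs for x (e.g., Ta)
premise_b = [(100.0, 80.0), (600.0, 80.0)]   # MFs for y (e.g., Q)
consequent = [(0.5, 0.0, 2.0), (0.7, -0.001, 4.0)]

tw_hat = anfis_forward(12.5, 350.0, premise_a, premise_b, consequent)
print(tw_hat)
print(grid_partition_rules(3, 5))
```

With three fuzzy subsets per input and the five inputs of model version 3 (Ta, Q, YY, MM, DD), the grid partition would already generate 243 rules, which illustrates why the rule count grows quickly with the number of inputs.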


ANFIS

The ANFIS is a machine learning model that combines two paradigms: the ANN and fuzzy logic (FL) (Jang 1993). Its architecture comprises several layers, similar to the ANN, and performs fuzzification using the concept of the fuzzy inference system (FIS) from the FL paradigm (Jang et al. 1996). In contrast to the ANN model, whose parameters (weights and biases) are stored in the connections between the layers from input to output, the ANFIS model has two kinds of parameters, linear and nonlinear (Jang 1993), which are adjusted during the training process. The FIS is a set of logical rules of the form "if A and B, then C". Similar to any regression model, the FIS transforms the input variables, in this case using membership functions (MFs); the rule base is then composed from the results of the MFs (Jang et al. 1996). The complete structure of ANFIS consists of five layers in addition to the input layer (Fig. 4): (i) the fuzzification layer, (ii) the base rules layer, (iii) the normalized layer, (iv) the defuzzification layer, and (v) the output layer (Jang 1993). The parameters of the ANFIS model are stored in two layers: (i) the fuzzification layer contains the nonlinear parameters of the MFs, also called the premise parameters, which are updated during training using the back-propagation (BP) algorithm (backward step), and (ii) the defuzzification layer contains the linear parameters, also called the consequent parameters, which are updated during training using the least squares (LS) method (forward step) (Jang 1993). Consequently, each IF-THEN fuzzy rule contains both linear and nonlinear parameters. The MFs play an important role in ANFIS models: a good choice of MF type among the several available, together with well-optimized parameters, leads to a model with high accuracy. For example, a Gaussian MF has two adjustable parameters:

$$Y = e^{-\frac{(x - c)^2}{2\sigma^2}} \tag{3}$$

where c and σ are the parameters of the MF (the premise parameters) that must be updated during the training process, x is an input variable, and Y is the response of the MF. For more details about this method, the reader can refer to Jang (1993) and Jang et al. (1996). From the input to the output, the model can be expressed as follows.

Layer 1: the fuzzification layer

$$O_i^1 = \mu_{A_i}(x),\ i = 1, 2; \quad \text{and} \quad O_i^1 = \mu_{B_{i-2}}(y),\ i = 3, 4 \tag{4}$$

where Ai (or Bi−2) is the linguistic label, μAi(x) and μBi−2(y) are the fuzzy membership functions, and x and y are the input variables.

Layer 2: the base rule layer

$$O_i^2 = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y),\ i = 1, 2 \tag{5}$$

where wi is the firing strength of a rule.

Layer 3: the normalized firing strengths

$$O_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2},\ i = 1, 2 \tag{6}$$

Layer 4: the defuzzification layer

$$O_i^4 = \bar{w}_i f_i = \bar{w}_i\,(p_i x + q_i y + r_i),\ i = 1, 2 \tag{7}$$

where w̄i is the output of layer 3 and pi, qi, and ri are the consequent parameters.

Layer 5: the output of the ANFIS model

$$O_i^5 = \sum_{i=1}^{2} \bar{w}_i f_i = \frac{\sum_{i=1}^{2} w_i f_i}{w_1 + w_2} \tag{8}$$

The most important step in developing an ANFIS model is the creation of the fuzzy rule base. The number of fuzzy rules for any ANFIS model is directly related to the identification method used for partitioning the input space. Three identification methods are mainly used for ANFIS models: (i) the grid partition method (GP), (ii) subtractive clustering (SC), and (iii) fuzzy c-means clustering (FC). Consequently, three models were developed using three different MATLAB functions: (i) ANFIS_GP using the genfis1 function for the grid partition method, (ii) ANFIS_SC using genfis2 for the SC method, and (iii) ANFIS_FC using genfis3 for the FC method (Jang 1993; Jang et al. 1996). The difference between the three models is directly related to the number of fuzzy rules generated by the partitioning method. With the grid partition method, the number of fuzzy rules is calculated as follows: if β is the number of input variables and α is the number of fuzzy subsets for each input, the number of fuzzy rules is α^β (Wei et al. 2007). The grid partition method is not suitable if the number of inputs exceeds 6 (Jang 1993). With the SC method, the number of fuzzy rules is related to the cluster center radii: the number of fuzzy rules is equal to the number of clusters (Cakmakci 2007). Unlike the first two methods, with the FC method the number of fuzzy rules is equal to the number of clusters, which is fixed by the user.

Model evaluation

In this study, model performance was evaluated using the following statistical indices: the coefficient of correlation (R), the Willmott index of agreement (d), the RMSE, and the mean absolute error (MAE).

$$R = \frac{\frac{1}{N}\sum_{i=1}^{N}(O_i - O_m)(P_i - P_m)}{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(O_i - O_m)^2}\,\sqrt{\frac{1}{N}\sum_{i=1}^{N}(P_i - P_m)^2}} \tag{9}$$

$$d = 1 - \frac{\sum_{i=1}^{N}(P_i - O_i)^2}{\sum_{i=1}^{N}\left(|P_i - O_m| + |O_i - O_m|\right)^2} \tag{10}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(O_i - P_i)^2} \tag{11}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}|O_i - P_i| \tag{12}$$

where N is the number of data points, Oi is the observed value, and Pi is the corresponding predicted value of water temperature at time i. Om and Pm are the average values of Oi and Pi. The Willmott index of agreement is a standardized measure of the degree of model prediction error and varies between 0 and 1: a value of 1 indicates perfect agreement, while 0 indicates no match at all.
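The four indices in Eqs. (9) to (12) translate directly into NumPy. The sketch below is an illustration only: the function name `evaluate` and the toy observed and predicted series are hypothetical and are not data from this study.

```python
import numpy as np


def evaluate(obs, pred):
    """Return R, d, RMSE, and MAE as defined in Eqs. (9)-(12)."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    o_m, p_m = obs.mean(), pred.mean()

    # Eq. (9): coefficient of correlation.
    r = np.mean((obs - o_m) * (pred - p_m)) / (
        np.sqrt(np.mean((obs - o_m) ** 2)) * np.sqrt(np.mean((pred - p_m) ** 2))
    )
    # Eq. (10): Willmott index of agreement.
    d = 1.0 - np.sum((pred - obs) ** 2) / np.sum(
        (np.abs(pred - o_m) + np.abs(obs - o_m)) ** 2
    )
    # Eq. (11): root-mean-square error.
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    # Eq. (12): mean absolute error.
    mae = np.mean(np.abs(obs - pred))
    return {"R": r, "d": d, "RMSE": rmse, "MAE": mae}


# Hypothetical daily Tw values (°C): predictions biased by a constant +1 °C.
observed = [4.0, 8.0, 12.0, 16.0]
predicted = [5.0, 9.0, 13.0, 17.0]
scores = evaluate(observed, predicted)
print(scores)
```

In this toy case the constant bias leaves R at 1 while RMSE and MAE both equal 1 °C, which shows why the correlation coefficient alone does not reveal systematic error and is reported together with d, RMSE, and MAE.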

Results and discussion Daily water temperature (Tw) at the five reference stations described in the BStudy area^ section was predicted by using the MLPNN, ANFIS_GP, ANFIS_SC, and ANFIS_FC models. The four models were developed according to the following three different versions: (i) version 1 with only one input variable (Ta), (ii) version 2 with two inputs variable Table 3 Performances of different models in modeling water temperature (Tw; °C) for Botovo station

Model version

(Ta and Q), and (iii) version 3 with three inputs (Ta, Q, and the CGC). While for each machine learning model the three versions were based on the same architecture (the MLPNN model or the ANFIS models), we stress that they represent three substantially different river water temperature models due to the different combinations of variables used as predictors. The different machine learning models were compared based on the four statistical indices: RMSE, MAE, R, and d. According to the results obtained (Tables 3, 4, 5, 6, and 7), the following conclusions can be drawn. The comparative results between the three versions of the MLPNN, ANFIS_FC, ANFIS_GP, and ANFIS_SC, models revealed that the version 3 (i.e., the one with three inputs as predictors: Ta, Q, and the CGC) yielded the best accuracy among all the developed models, and outperform all the other versions in term of higher R and d and lower RMSE and MAE values, at the five stations. This result indicates that the use of CGC is complementary (not redundant) to the use of Ta and Q, as it provides additional information on the seasonality of the river thermal dynamics compared to that encapsulated in the Ta and Q time series. The comparison between the improvement in river water temperature prediction obtained using model version 3 and version 2 relative to model version 1 indicated that the inclusion of river flow discharge alone (model version 2) provided a lower gain than including CGC for case studies 1, 2, and 3, although its relative importance increases for cases 4 and 5, both in calibration and in validation. This confirms previous findings, which suggested that the role of flow discharge on river water temperature modeling is more relevant for high-altitude snowmelt-fed rivers and highly regulated river stations with hydropower plants at higher altitudes (Piccolroaz et al. 2016; Sohrabi et al. 2017). 
Figure 5 shows the monthly distribution of RMSE for each version of the MLPNN model and each river station considered in the analysis. In this case, the RMSE values for each

Model version  Training (calibration)              Validation
               R      d      RMSE (°C)  MAE (°C)   R      d      RMSE (°C)  MAE (°C)
MLPNN3         0.981  0.990  1.227      0.956      0.977  0.988  1.307      1.040
MLPNN2         0.933  0.964  2.267      1.725      0.926  0.961  2.287      1.763
MLPNN1         0.926  0.961  2.366      1.838      0.922  0.958  2.350      1.836
ANFIS_GP3      0.982  0.991  1.193      0.923      0.972  0.986  1.435      1.127
ANFIS_GP2      0.932  0.964  2.276      1.735      0.926  0.961  2.287      1.761
ANFIS_GP1      0.926  0.961  2.365      1.837      0.922  0.958  2.348      1.833
ANFIS_SC3      0.972  0.985  1.483      1.171      0.972  0.986  1.429      1.128
ANFIS_SC2      0.932  0.964  2.283      1.744      0.927  0.961  2.283      1.755
ANFIS_SC1      0.926  0.960  2.368      1.840      0.922  0.958  2.350      1.835
ANFIS_FC3      0.979  0.989  1.281      1.006      0.974  0.987  1.368      1.076
ANFIS_FC2      0.933  0.964  2.263      1.726      0.926  0.961  2.291      1.763
ANFIS_FC1      0.926  0.961  2.366      1.837      0.922  0.958  2.349      1.835

Table 4 Performances of different models in modeling water temperature (Tw; °C) for Donji Miholjac station

Model version  Training (calibration)              Validation
               R      d      RMSE (°C)  MAE (°C)   R      d      RMSE (°C)  MAE (°C)
MLPNN3         0.984  0.992  1.290      1.014      0.971  0.986  1.690      1.337
MLPNN2         0.938  0.967  2.487      1.909      0.932  0.964  2.578      1.999
MLPNN1         0.934  0.965  2.579      2.012      0.928  0.962  2.645      2.063
ANFIS_GP3      0.984  0.992  1.284      1.002      0.969  0.984  1.770      1.352
ANFIS_GP2      0.938  0.967  2.487      1.910      0.933  0.965  2.551      1.976
ANFIS_GP1      0.934  0.965  2.576      2.009      0.928  0.962  2.645      2.063
ANFIS_SC3      0.976  0.988  1.562      1.245      0.968  0.983  1.789      1.416
ANFIS_SC2      0.937  0.967  2.513      1.934      0.932  0.964  2.577      2.001
ANFIS_SC1      0.933  0.964  2.585      2.014      0.928  0.962  2.642      2.060
ANFIS_FC3      0.981  0.990  1.388      1.098      0.972  0.986  1.685      1.328
ANFIS_FC2      0.939  0.968  2.481      1.905      0.933  0.965  2.556      1.976
ANFIS_FC1      0.934  0.965  2.575      2.009      0.928  0.962  2.647      2.066

month were calculated from the daily Tw values for that same month of the year. For MLPNN1 (version 1, with only Ta as input variable), local maxima in RMSE are clearly visible at all river stations and correlate strongly with the seasonal pattern of Q summarized in Fig. 2. In particular, RMSE clearly increases during the seasons of higher river flow discharge, except for case 4 (the Rhône River), where the maxima occur just before and after this period. As discussed above, including Q in the model (MLPNN2) slightly improves model performance by decreasing RMSE, but does not remove these local maxima in RMSE during the wet seasons. However, when the CGC is also added to the model (MLPNN3), besides a marked general improvement of the prediction capability, RMSE flattens (particularly for the Dischmabach) to an almost uniform distribution throughout the year. This confirms that, in the proposed model, Q alone may not be sufficiently informative and that adding the CGC contributes significantly to capturing the seasonal pattern. This is consistent with the findings of Piccolroaz et al. (2016), who suggested that, in their hybrid air2stream model, the value of including flow discharge lay less in representing the thermal inertia of the river through water depth than in accounting for the effect of upstream and lateral contributing water and heat fluxes. In that model, besides the dependence on Ta and Q, a sinusoidal term with annual periodicity was included to mimic the effect of lateral inflows and heat fluxes, which is analogous to the inclusion of the CGC in model MLPNN3. Results at Botovo station are reported in Table 3. A clear increase in model performance can be observed from version

Table 5 Performances of different models in modeling water temperature (Tw; °C) for Yvonand station

Model version  Training (calibration)              Validation
               R      d      RMSE (°C)  MAE (°C)   R      d      RMSE (°C)  MAE (°C)
MLPNN3         0.991  0.995  0.790      0.610      0.988  0.994  0.913      0.685
MLPNN2         0.979  0.989  1.165      0.884      0.976  0.987  1.287      0.970
MLPNN1         0.978  0.989  1.210      0.927      0.975  0.987  1.307      0.982
ANFIS_GP3      0.984  0.992  1.024      0.790      0.980  0.990  1.166      0.878
ANFIS_GP2      0.978  0.989  1.196      0.917      0.976  0.987  1.292      0.968
ANFIS_GP1      0.978  0.989  1.207      0.925      0.975  0.987  1.314      0.987
ANFIS_SC3      0.987  0.993  0.937      0.731      0.984  0.992  1.054      0.790
ANFIS_SC2      0.978  0.989  1.209      0.926      0.975  0.987  1.303      0.985
ANFIS_SC1      0.977  0.988  1.217      0.932      0.975  0.987  1.308      0.989
ANFIS_FC3      0.986  0.993  0.962      0.753      0.983  0.991  1.080      0.820
ANFIS_FC2      0.980  0.990  1.157      0.885      0.976  0.987  1.286      0.965
ANFIS_FC1      0.978  0.989  1.211      0.927      0.975  0.987  1.309      0.985

Table 6 Performances of different models in modeling water temperature (Tw; °C) for Sion station

Model version  Training (calibration)              Validation
               R      d      RMSE (°C)  MAE (°C)   R      d      RMSE (°C)  MAE (°C)
MLPNN3         0.966  0.982  0.547      0.422      0.944  0.971  0.763      0.530
MLPNN2         0.944  0.970  0.695      0.535      0.939  0.969  0.790      0.582
MLPNN1         0.935  0.966  0.744      0.572      0.927  0.962  0.864      0.652
ANFIS_GP3      0.963  0.981  0.568      0.437      0.949  0.974  0.723      0.508
ANFIS_GP2      0.943  0.970  0.697      0.536      0.938  0.968  0.795      0.582
ANFIS_GP1      0.935  0.966  0.743      0.572      0.926  0.962  0.865      0.653
ANFIS_SC3      0.952  0.975  0.642      0.490      0.939  0.968  0.793      0.571
ANFIS_SC2      0.942  0.969  0.705      0.542      0.937  0.968  0.801      0.585
ANFIS_SC1      0.935  0.965  0.748      0.576      0.926  0.962  0.866      0.652
ANFIS_FC3      0.959  0.979  0.598      0.462      0.941  0.970  0.779      0.548
ANFIS_FC2      0.945  0.971  0.690      0.531      0.939  0.968  0.794      0.584
ANFIS_FC1      0.935  0.966  0.744      0.572      0.927  0.962  0.864      0.652

1 to version 3, as the R and d values increased gradually and the RMSE and MAE values decreased significantly. During the validation phase, the predicted Tw values correlated well with the observed Tw values for most of the proposed models. Using only Ta as input variable, all four models provided essentially the same accuracy. Including Q as an input variable (version 2) decreased the RMSE and MAE values, but the improvement was small (Table 3), and the four models again differed only marginally. Including the CGC significantly improved the accuracy of the models. In the validation period, the largest improvement was achieved by MLPNN3 (RMSE = 1.307 °C, MAE = 1.040 °C), followed by ANFIS_FC3 (RMSE = 1.368 °C, MAE = 1.076 °C), and then by ANFIS_SC3 (RMSE = 1.429 °C, MAE = 1.128 °C) and ANFIS_GP3 (RMSE = 1.435 °C, MAE = 1.127 °C). MLPNN3 decreased the RMSE and MAE of MLPNN1 by 44.38% and 43.36%, respectively. ANFIS_FC3 gave the second-largest improvement, decreasing the RMSE and MAE of ANFIS_FC1 by 41.76% and 41.36%, respectively. Finally, ANFIS_GP3 and ANFIS_SC3 achieved a similar improvement of about 39%. Overall, MLPNN3 is more accurate than the three ANFIS models. The scatterplots and comparison between observed and predicted Tw for the best models (validation phase) at Botovo station are presented in Fig. 6. Table 4 provides the detailed training and validation results for the four models at Donji Miholjac station. The daily Tw calculated using only Ta is in good agreement with the observed values, and the MLPNN1, ANFIS_GP1, ANFIS_SC1, and ANFIS_FC1 models have the same accuracy. The validation results show no significant

Table 7 Performances of different models in modeling water temperature (Tw; °C) for Davos station

Model version  Training (calibration)              Validation
               R      d      RMSE (°C)  MAE (°C)   R      d      RMSE (°C)  MAE (°C)
MLPNN3         0.990  0.995  0.398      0.313      0.987  0.994  0.457      0.366
MLPNN2         0.966  0.983  0.743      0.559      0.964  0.981  0.759      0.571
MLPNN1         0.951  0.974  0.894      0.677      0.950  0.973  0.896      0.691
ANFIS_GP3      0.992  0.996  0.370      0.289      0.976  0.988  0.639      0.472
ANFIS_GP2      0.966  0.983  0.742      0.560      0.965  0.982  0.750      0.562
ANFIS_GP1      0.951  0.974  0.890      0.673      0.950  0.973  0.898      0.690
ANFIS_SC3      0.980  0.990  0.578      0.458      0.976  0.987  0.631      0.504
ANFIS_SC2      0.962  0.980  0.783      0.590      0.963  0.981  0.775      0.588
ANFIS_SC1      0.950  0.974  0.900      0.685      0.951  0.974  0.886      0.684
ANFIS_FC3      0.986  0.993  0.473      0.376      0.983  0.991  0.531      0.421
ANFIS_FC2      0.967  0.983  0.735      0.555      0.964  0.981  0.768      0.585
ANFIS_FC1      0.951  0.974  0.893      0.676      0.950  0.973  0.895      0.689

Fig. 5 Monthly distribution of RMSE for each version of the MLPNN model and each river station. a Drava at Botovo. b Drava at Donji Miholjac. c Mentue. d Rhône. e Dischmabach
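The monthly RMSE shown in Fig. 5 groups daily errors by calendar month, pooling the same month across all years. A minimal sketch of that grouping is given below; the function name `monthly_rmse` and the synthetic two-year series are assumptions made for illustration, not the authors' code or data.

```python
import math
from collections import defaultdict
from datetime import date, timedelta


def monthly_rmse(dates, obs, sim):
    """Return {month number: RMSE} using all daily values that fall in that month."""
    squared_errors = defaultdict(list)
    for day, o, s in zip(dates, obs, sim):
        squared_errors[day.month].append((s - o) ** 2)
    return {
        month: math.sqrt(sum(values) / len(values))
        for month, values in sorted(squared_errors.items())
    }


# Synthetic example: two non-leap years of daily data with a constant
# error of 1.0 °C in June and 0.5 °C in every other month.
start = date(2001, 1, 1)
dates = [start + timedelta(days=i) for i in range(730)]
obs = [10.0] * len(dates)
sim = [o + (1.0 if day.month == 6 else 0.5) for day, o in zip(dates, obs)]

result = monthly_rmse(dates, obs, sim)
print(result)
```

In this synthetic case the June entry stands out as a local maximum, which is the kind of seasonal peak the text describes for the version 1 models.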

difference among the four models. For version 2, which uses Ta and Q as predictors, the accuracy of the models was only slightly better than for version 1, indicating that the contribution of Q to the estimation of Tw is marginal. However, when the CGC was added to Ta and Q (version 3), the performance of the models improved significantly. ANFIS_FC3 was the most accurate model, with the lowest RMSE and MAE values (RMSE = 1.685 °C, MAE = 1.328 °C) and high R and d values (R = 0.972, d = 0.986), slightly better than MLPNN3 (RMSE = 1.690 °C, MAE = 1.337 °C), ANFIS_GP3 (RMSE = 1.770 °C, MAE = 1.352 °C), and ANFIS_SC3 (RMSE = 1.789 °C, MAE = 1.416 °C). The scatterplots and comparison between observed and predicted Tw for the best models at Donji Miholjac station (validation phase) are shown in Fig. 7. The performances of the models for the Yvonand station on the Mentue River are summarized in Table 5. In the validation phase, the R values ranged between 0.975 and 0.988, 0.975 and 0.980, 0.975 and 0.984, and 0.975 and 0.983 for the MLPNN, ANFIS_GP, ANFIS_SC, and ANFIS_FC models, respectively. The RMSE and MAE values of the version 3 models were much smaller than those of the

Fig. 6 Scatterplots and comparison between observed and predicted river water temperature (Tw) for the best models at Botovo station. a MLPNN3. b ANFIS_FC3. c ANFIS_GP3. d ANFIS_SC3
[Figure: each panel pairs a time series of observed and predicted Tw (°C) against time (days) with a scatterplot of predicted versus observed Tw (°C). Linear fits printed on the scatterplots: y = 0.9614x + 0.4761 (R² = 0.9537); y = 0.9591x + 0.5018 (R² = 0.9494); y = 0.957x + 0.5908 (R² = 0.9446); y = 0.946x + 0.6428 (R² = 0.9446)]
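The scatterplots of predicted versus observed Tw are annotated with a fitted line of the form y = ax + b and its R². A hedged sketch of how such a line could be obtained by ordinary least squares follows; the helper `fit_line` and the sample values are hypothetical, and the fitting procedure actually used for the figures is not stated in the text. For a simple linear fit, R² equals the square of the Pearson correlation coefficient R.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, with R²."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # square of Pearson's R
    return slope, intercept, r_squared


# Hypothetical observed (x) and predicted (y) water temperatures in °C.
observed = [4.0, 8.0, 12.0, 16.0]
predicted = [5.0, 7.0, 13.0, 15.0]

slope, intercept, r_squared = fit_line(observed, predicted)
print(f"y = {slope:.4f}x + {intercept:.4f}, R2 = {r_squared:.4f}")
```

A slope close to 1 and an intercept close to 0 indicate that predictions track observations without systematic scaling or offset, which is how the fitted lines in the scatterplots can be read.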

versions 1 and 2. Switching from version 1 to version 2 by including Q as an input variable did not significantly improve the performance of the models, especially for the ANFIS_SC model, for which the RMSE and MAE improved by less than 0.5%. By contrast, the version 3 models showed good accuracy and a significant improvement. MLPNN3 achieved the largest reduction, decreasing the RMSE and MAE of MLPNN1 by 30.15% and 30.24%, respectively, compared to 11.26% and 11.24% for ANFIS_GP3, 19.42% and 20.12% for ANFIS_SC3, and 17.49% and 16.75% for ANFIS_FC3. The scatterplots and comparison between observed and predicted Tw for the best models at Yvonand station on the Mentue River (validation phase) are reported in Fig. 8. The performances of the models for the Sion station on the Rhône River are summarized in Table 6. As a preliminary analysis, we investigated the performance of the four models using only Ta as input variable. The statistical indices clearly show

Fig. 7 Scatterplots and comparison between observed and predicted river water temperature (Tw) for the best models at Donji Miholjac station. a MLPNN3. b ANFIS_FC3. c ANFIS_GP3. d ANFIS_SC3
[Figure: each panel pairs a time series of observed and predicted Tw (°C) against time (days) with a scatterplot of predicted versus observed Tw (°C). Linear fits printed on the scatterplots: y = 0.946x + 0.7048 (R² = 0.9366); y = 0.9559x + 0.6401 (R² = 0.9384); y = 0.9637x + 0.4407 (R² = 0.944); y = 0.9638x + 0.4798 (R² = 0.9438)]

that there was no significant difference in the prediction of Tw based on Ta alone, and the models provided similar accuracy with little variation. The RMSE of MLPNN1, ANFIS_GP1, ANFIS_SC1, and ANFIS_FC1 remained relatively low (0.864 °C to 0.866 °C), with MAE between 0.652 °C and 0.653 °C and R and d of about 0.927 and 0.962, respectively. The results for version 2 show that MLPNN2 performed slightly better than the other models. For instance, R and d are equal to 0.939 and 0.969 for MLPNN2, 0.938 and 0.968 for ANFIS_GP2 and ANFIS_SC2, and 0.939 and 0.968 for ANFIS_FC2. Including Q among the input variables improved model accuracy, in particular by decreasing the RMSE and MAE values (Table 6); however, Q provided a smaller improvement in Tw prediction than the inclusion of the CGC (version 3). MLPNN3 decreased the RMSE and MAE of MLPNN1 by 11.67% and 18.71%, against 8.56% and 10.74% achieved by MLPNN2. Similarly, ANFIS_GP3 decreased the RMSE and MAE of ANFIS_GP1 by 16.42% and 22.21%,

Fig. 8 Scatterplots and comparison between observed and predicted river water temperature (Tw) for the best models at Yvonand station. a MLPNN3. b ANFIS_FC3. c ANFIS_GP3. d ANFIS_SC3
[Figure: each panel pairs a time series of observed and predicted Tw (°C) against time (days) with a scatterplot of predicted versus observed Tw (°C). Linear fits printed on the scatterplots: y = 0.955x + 0.4992 (R² = 0.9764); y = 0.9553x + 0.4873 (R² = 0.9664); y = 0.956x + 0.467 (R² = 0.9606); y = 0.9542x + 0.4901 (R² = 0.968)]

against 8.09% and 10.87% achieved by ANFIS_GP2. Likewise, ANFIS_SC3 decreased the RMSE and MAE of ANFIS_SC1 by 8.43% and 12.42%, against 7.51% and 10.28% achieved by ANFIS_SC2. Finally, ANFIS_FC3 decreased the RMSE and MAE of ANFIS_FC1 by 9.84% and 15.95%, against 8.10% and 10.43% achieved by ANFIS_FC2. The scatterplots and comparison between observed and predicted Tw for the best models at Sion station (validation phase) are presented in Fig. 9.
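The percentage reductions quoted for Sion station follow from the validation errors in Table 6 as 100 × (version 1 error − version 2 or 3 error) / version 1 error. The sketch below, in which the function name `percent_reduction` and the dictionary layout are assumptions, reproduces the ANFIS_GP figures from the tabulated values.

```python
def percent_reduction(baseline, improved):
    """Relative decrease of an error metric with respect to a baseline, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Validation errors (°C) of the ANFIS_GP models at Sion station, from Table 6.
anfis_gp1 = {"RMSE": 0.865, "MAE": 0.653}  # version 1: Ta only
anfis_gp2 = {"RMSE": 0.795, "MAE": 0.582}  # version 2: Ta and Q
anfis_gp3 = {"RMSE": 0.723, "MAE": 0.508}  # version 3: Ta, Q, and the CGC

gain_v2 = {k: round(percent_reduction(anfis_gp1[k], anfis_gp2[k]), 2) for k in anfis_gp1}
gain_v3 = {k: round(percent_reduction(anfis_gp1[k], anfis_gp3[k]), 2) for k in anfis_gp1}

print("version 2 vs version 1:", gain_v2)
print("version 3 vs version 1:", gain_v3)
```

The computed reductions are 8.09% and 10.87% for version 2 and 16.42% and 22.21% for version 3, matching the values reported for ANFIS_GP. Because the tables list errors rounded to three decimals, percentages recomputed this way for other models may differ from the quoted ones in the second decimal place.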

Results for the Davos station on the Dischmabach River are shown in Table 7. The four models developed using only Ta as input variable showed good agreement with the observed Tw, with high values of R and d (R ≈ 0.950, d ≈ 0.973) and low errors (RMSE ≈ 0.896 °C, MAE ≈ 0.690 °C). Additionally, ANFIS_SC1 performed slightly better than the other models. Contrary to the results obtained at the four previous stations (Botovo, Donji

Fig. 9 Scatterplots and comparison between observed and predicted river water temperature (Tw) for the best models at Sion station. a MLPNN3. b ANFIS_FC3. c ANFIS_GP3. d ANFIS_SC3
[Figure: each panel pairs a time series of observed and predicted Tw (°C) against time (days) with a scatterplot of predicted versus observed Tw (°C). Linear fits printed on the scatterplots: y = 0.9249x + 0.5362 (R² = 0.8905); y = 0.9117x + 0.6276 (R² = 0.8854); y = 0.9234x + 0.5598 (R² = 0.901); y = 0.9055x + 0.6749 (R² = 0.8809)]

Miholjac, Yvonand, and Sion), including Q as an input variable (version 2) significantly improved the performance of the models. MLPNN2 decreased the RMSE and MAE of MLPNN1 by 15.30% and 17.40%, and ANFIS_GP2 decreased the RMSE and MAE of ANFIS_GP1 by about 16.50% and 18.55%, respectively. The improvement provided by ANFIS_SC2 was smaller than that of MLPNN2 and ANFIS_GP2, with reductions in RMSE and MAE of 12.53% and 14.04%, respectively. Finally, ANFIS_FC2 decreased the RMSE and MAE of ANFIS_FC1 by about 14.19% and 15.09%, respectively. Among the version 3 models, MLPNN3 outperformed the other three. The scatterplots and comparison between observed and predicted Tw for the best models at Davos station (validation phase) are shown in Fig. 10.
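The station-by-station comparison of the version 3 models can be condensed by selecting, for each station, the model with the lowest validation RMSE. The sketch below does this with values copied from Tables 3, 4, 5, 6, and 7; the dictionary layout and variable names are illustrative assumptions, and ranking by RMSE alone is a simplification of the four-index comparison used in the text.

```python
# Validation RMSE (°C) of the version 3 models, copied from Tables 3 to 7.
validation_rmse_v3 = {
    "Botovo": {"MLPNN3": 1.307, "ANFIS_GP3": 1.435, "ANFIS_SC3": 1.429, "ANFIS_FC3": 1.368},
    "Donji Miholjac": {"MLPNN3": 1.690, "ANFIS_GP3": 1.770, "ANFIS_SC3": 1.789, "ANFIS_FC3": 1.685},
    "Yvonand": {"MLPNN3": 0.913, "ANFIS_GP3": 1.166, "ANFIS_SC3": 1.054, "ANFIS_FC3": 1.080},
    "Sion": {"MLPNN3": 0.763, "ANFIS_GP3": 0.723, "ANFIS_SC3": 0.793, "ANFIS_FC3": 0.779},
    "Davos": {"MLPNN3": 0.457, "ANFIS_GP3": 0.639, "ANFIS_SC3": 0.631, "ANFIS_FC3": 0.531},
}

# For each station, pick the model name whose RMSE is smallest.
best_model = {
    station: min(scores, key=scores.get)
    for station, scores in validation_rmse_v3.items()
}

for station, model in best_model.items():
    print(f"{station}: {model} ({validation_rmse_v3[station][model]:.3f} °C)")
```

On these values MLPNN3 has the lowest validation RMSE at Botovo, Yvonand, and Davos, ANFIS_FC3 at Donji Miholjac, and ANFIS_GP3 at Sion.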

Fig. 10 Scatterplots and comparison between observed and predicted river water temperature (Tw) for the best models at Davos station. a MLPNN3. b ANFIS_FC3. c ANFIS_GP3. d ANFIS_SC3
[Figure: each panel pairs a time series of observed and predicted Tw (°C) against time (days) with a scatterplot of predicted versus observed Tw (°C). Linear fits printed on the scatterplots: y = 0.9767x + 0.1158 (R² = 0.9745); y = 0.9903x + 0.0353 (R² = 0.9662); y = 0.9994x - 0.0271 (R² = 0.9526); y = 0.9379x + 0.3349 (R² = 0.9522)]

Conclusions

Water temperature is an important indicator of the overall health of rivers, and its accurate prediction is one of the most important issues for river management. In this study, MLPNN and three ANFIS models (ANFIS_FC, ANFIS_GP, and ANFIS_SC) were developed to model daily water temperature for rivers. The proposed models were tested at five river stations characterized by different hydrological conditions. In the validation phase, the MLPNN models performed best at the Botovo, Yvonand, and Davos stations, with lower RMSE and MAE and higher R and d values. At Donji Miholjac station, ANFIS_FC3 was more accurate, while ANFIS_GP3 was the best model at Sion station. Including the three inputs as predictors (Ta, Q, and the CGC) yielded the best modeling accuracy among all the developed models, a significant improvement over the case in which only Ta is used as predictor, as is typical of most previous machine learning applications. The CGC is complementary to Ta and Q and provides additional relevant information on the seasonality of the river thermal dynamics, possibly mimicking the effect of lateral and upstream water and heat inputs. In addition, the results indicated that flow discharge becomes more important for river water temperature modeling at highly regulated river stations with hydropower plants and in snowmelt-fed rivers at higher altitudes. Overall, modeling performance indicated that the machine learning models developed in this study can be effectively used for river water temperature prediction.

Acknowledgements We acknowledge the Swiss Federal Office of the Environment (FOEN), the Swiss Meteorological Institute (MeteoSchweiz), and the Croatian Meteorological and Hydrological Service for providing the water temperature, air temperature, and river flow discharge data used in this study. We thank an anonymous reviewer for the useful comments and suggestions, which helped to improve the quality of the study.

Funding This work was jointly funded by the National Key R&D Program of China (2018YFC0407203, 2016YFC0401506) and the research project from Nanjing Hydraulic Research Institute (Y118009).

References

Ahmadi-Nedushan B, St-Hilaire A, Ouarda TBMJ, Bilodeau L, Robichaud É, Thiémonge N, Bobée B (2007) Predicting river water temperatures using stochastic models: case study of the Moisie river (Quebec, Canada). Hydrol Process 21:21–34
Arismendi I, Safeeq M, Dunham JB, Johnson SL (2014) Can air temperature be used to project influences of climate change on stream temperature? Environ Res Lett 9:084015
Benyahya L, Caissie D, St-Hilaire A, Ouarda TBMJ, Bobée B (2007) A review of statistical water temperature models. Can Water Resour J 32:179–192
Bonacci O, Oskoruš D (2008) The influence of three Croatian hydroelectric power plants operation on the river Drava hydrological and sediment regime. XXIVth Conference of the Danubian Countries on the Hydrological Forecasting & Hydrological Bases of Water Management
Cai H, Piccolroaz S, Huang J, Liu Z, Liu F, Toffolon M (2018) Quantifying the impact of the Three Gorges Dam on the thermal dynamics of the Yangtze River. Environ Res Lett 13:054016
Caissie D (2006) The thermal regime of rivers: a review. Freshw Biol 51:1389–1406
Caissie D, Satish MG, El-Jabi N (2007) Predicting water temperatures using a deterministic model: application on Miramichi River catchments (New Brunswick, Canada). J Hydrol 336:303–315
Cakmakci M (2007) Adaptive neuro-fuzzy modeling of anaerobic digestion of primary sedimentation sludge. Bioprocess Biosyst Eng 30:349–357
Carolli M, Bruno MC, Siviglia A, Maiolini B (2011) Responses of benthic invertebrates to abrupt changes of temperature in flume simulations. River Res Appl 28:678–691
Casas-Mulet R, Saltveit SJ, Alfredsen KT (2016) Hydrological and thermal effects of hydropeaking on early life stages of salmonids: a modelling approach for implementing mitigation strategies. Sci Total Environ 573:1660–1672
Cole JC, Maloney KO, Schmid M, McKenna JE (2014) Developing and testing temperature models for regulated systems: a case study on the upper Delaware River. J Hydrol 519:588–598
Deweber JT, Wagner T (2014) A regional neural network ensemble for predicting mean daily river water temperature. J Hydrol 517:187–200
Eaton JG, Mccormick JH, Stefan HG, Hondzo M (1995) Extreme value analysis of a fish/temperature field database. Ecol Eng 4:289–305
Gallice A, Schaefli B, Lehning M, Parlange MB, Huwald H (2015) Stream temperature prediction in ungauged basins: review of recent approaches and description of a new physics-derived statistical model. Hydrol Earth Syst Sci 19:3727–3753
Grbić R, Kurtagić D, Slišković D (2013) Stream water temperature prediction based on Gaussian process regression. Expert Syst Appl 40:7407–7414
Hadzima-Nyarko M, Rabi A, Šperac M (2014) Implementation of artificial neural networks in modeling the water-air temperature relationship of the river Drava. Water Resour Manag 28:1379–1394
Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall, Upper Saddle River
He Z, Wen X, Liu H, Du J (2014) A comparative study of artificial neural network, adaptive neuro fuzzy inference system and support vector machine for forecasting river flow in the semiarid mountain region. J Hydrol 509:379–386
Hebert C, Caissie D, Satish MG, El-Jabi N (2011) Study of stream temperature dynamics and corresponding heat fluxes within Miramichi River catchments (New Brunswick, Canada). Hydrol Process 25:2439–2455
Heddam S (2014) Modeling hourly dissolved oxygen concentration (DO) using two different adaptive neuro-fuzzy inference systems (ANFIS): a comparative study. Environ Monit Assess 186:597–619
Heddam S (2016a) Multilayer perceptron neural network based approach for modelling phycocyanin pigment concentrations: case study from lower Charles River buoy, USA. Environ Sci Pollut Res 23:17210–17225
Heddam S (2016b) New modelling strategy based on radial basis function neural network (RBFNN) for predicting dissolved oxygen concentration using the components of the Gregorian calendar as inputs: case study of Clackamas River, Oregon, USA. Model Earth Syst Environ 2:1–5
Heddam S, Kisi O (2017) Extreme learning machines: a new approach for modeling dissolved oxygen (DO) concentration with and without water quality variables as predictors. Environ Sci Pollut Res 24:16702–16724
Hester ET, Doyle MW (2011) Human impacts to river temperature and their effects on biological processes: a quantitative synthesis. J Am Water Resour Assoc 47:571–587
Hornik K (1991) Approximation capabilities of multilayer feedforward networks. Neural Netw 4:251–257
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2:359–366
Howell PJ, Dunham JB, Sankovich PM (2010) Relationships between water temperatures and upstream migration, cold water refuge use, and spawning of adult bull trout from the Lostine River, Oregon, USA. Ecol Freshw Fish 19:96–106
Jang JSR (1993) ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans Syst Man Cybern 23:665–685
Jang JSR, Sun CT, Mizutani E (1996) Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence. Prentice Hall, Upper Saddle River, pp 73–90
Jensen MR, Lowney CL (2004) Temperature modeling with HEC-RAS. World Water and Environmental Resources Congress

Johnson MF, Wilby RL, Toone JA (2014) Inferring air–water temperature relationships from river and catchment properties. Hydrol Process 28:2912–2928
Karaçor AG, Sivri N, Uçan ON (2007) Maximum stream temperature estimation of Degirmendere River using artificial neural network. J Sci Ind Res 66:363–366
Karaman S, Ozturk I, Yalcin H, Kayacier A, Sagdic O (2012) Comparison of adaptive neuro-fuzzy inference system and artificial neural networks for estimation of oxidation parameters of sunflower oil added with some natural byproduct extracts. J Sci Food Agric 92:49–58
Kelleher C, Wagener T, Gooseff M, McGlynn B, McGuire K, Marshall L (2012) Investigating controls on the thermal sensitivity of Pennsylvania streams. Hydrol Process 26:771–785
Kisi O, Zounemat-Kermani M (2014) Comparison of two different adaptive neuro-fuzzy inference systems in modelling daily reference evapotranspiration. Water Resour Manag 28:2655–2675
Krider LA, Magner JA, Perry J, Vondracek B, Ferrington LC (2013) Air-water temperature relationships in the trout streams of southeastern Minnesota’s carbonate-sandstone landscape. J Am Water Resour Assoc 49:896–907
Laanaya F, St-Hilaire A, Gloaguen E (2017) Water temperature modelling: comparison between the generalized additive model, logistic, residuals regression and linear regression models. Hydrol Sci J 62:1078–1093
Lisi PJ, Schindler DE, Cline TJ, Scheuerell MD, Walsh PB (2015) Watershed geomorphology and snowmelt control stream thermal sensitivity to air temperature. Geophys Res Lett 42:3380–3388
Meier W, Wüest A (2004) Wie verändert die hydroelektrische Nutzung die Wassertemperatur der Rhone? Wasser Energie Luft 96:305–309 www.rhone-thur.eawag.ch/wel_rhone.pdf
Meile T, Boillat JL, Schleiss AJ (2011) Hydropeaking indicators for characterization of the upper-Rhone River in Switzerland. Aquat Sci 73:171–182
Mohseni O, Stefan HG (1999) Stream temperature/air temperature relationship: a physical interpretation. J Hydrol 218:128–141
Mohseni O, Stefan HG, Erickson TR (1998) A non-linear regression model for weekly stream temperatures. Water Resour Res 34:2685–2692
Morrill JC, Bales RC, Conklin MH (2005) Estimating stream temperature from air temperature: implications for future water quality. J Environ Eng 131:139–146
Olden JD, Joy MK, Death RG (2004) An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecol Model 178:389–397
Phelps QE, Tripp SJ, Hintz WD, Garvey JE, Herzog DP, Ostendorf DE, Ridings JW, Crites JW, Hrabik RA (2010) Water temperature and river stage influence mortality and abundance of naturally occurring Mississippi River Scaphirhynchus sturgeon. N Am J Fish Manag 30:767–775
Piccolroaz S, Toffolon M, Majone B (2015) The role of stratification on lakes’ thermal response: the case of Lake Superior. Water Resour Res 51:7878–7894
Piccolroaz S, Calamita E, Majone B, Gallice A, Siviglia A, Toffolon M (2016) Prediction of river water temperature: a comparison between a new family of hybrid models and statistical approaches. Hydrol Process 30:3901–3917
Piccolroaz S, Toffolon M, Robinson CT, Siviglia A (2018) Exploring and quantifying river thermal response to heatwaves. Water 10:1098
Piotrowski AP, Osuch M, Napiorkowski MJ, Rowinski PM, Napiorkowski JJ (2014) Comparing large number of metaheuristics for artificial neural networks training to predict water temperature in a natural river. Comput Geosci 64:136–151
Piotrowski AP, Napiorkowski MJ, Napiorkowski JJ, Osuch M (2015) Comparing various artificial neural network types for water temperature prediction in rivers. J Hydrol 529:302–315

Rabi A, Hadzima-Nyarko M, Sperac M (2015) Modelling river temperature from air temperature in the river Drava (Croatia). Hydrol Sci J 60:1490–1507
Rajwakuligiewicz A, Bialik RJ, Rowiński PM (2015) Dissolved oxygen and water temperature dynamics in lowland rivers over various timescales. J Hydrol Hydromech 63:353–363
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning internal representations by error propagation. MIT Press, Massachusetts
Sahoo GB, Schladow SG, Reuter JE (2009) Forecasting stream water temperature using regression analysis, artificial neural network, and chaotic non-linear dynamic models. J Hydrol 378:325–342
Sandersfeld T, Mark FC, Knust R (2017) Temperature-dependent metabolism in Antarctic fish: do habitat temperature conditions affect thermal tolerance ranges? Polar Biol 40:1–9
Sanikhani H, Kisi O, Nikpour MR, Dinpashoh Y (2012) Estimation of daily pan evaporation using two different adaptive neuro-fuzzy computing techniques. Water Resour Manag 26:4347–4365
Shiri J, Dierickx W, Baba PA, Neamati S, Ghorbani MA (2011) Estimating daily pan evaporation from climatic data of the state of Illinois, USA using adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN). Hydrol Res 42:491–502
Sohrabi MM, Benjankar R, Tonina D, Wenger SJ, Isaak DJ (2017) Estimation of daily stream water temperatures with a Bayesian regression approach. Hydrol Process 31:1719–1733
Stefan HG, Preud’homme EB (1993) Stream temperature estimation from air temperature. J Am Water Resour Assoc 29:27–45
Temizyurek M, Dadasercelik F (2018) Modelling the effects of meteorological parameters on water temperature using artificial neural networks. Water Sci Technol 77:1724–1733
Toffolon M, Piccolroaz S (2015) A hybrid model for river water temperature as a function of air temperature and discharge. Environ Res Lett 10:114011
Toffolon M, Piccolroaz S, Majone B, Soja AM, Peeters F, Schmid M, Wüest A (2014) Prediction of surface temperature in lakes with different morphology using air temperature. Limnol Oceanogr 59:2185–2202
Verbrugge LNH, Schipper AM, Huijbregts MAJ, Velde GVD, Leuven RSEW (2012) Sensitivity of native and non-native mollusc species to changing river water temperature and salinity. Biol Invasions 14:1187–1199
Vliet MTHV, Ludwig F, Zwolsman JJG, Weedon GP, Kabat P (2011) Global river temperatures and sensitivity to atmospheric warming and changes in river flow. Water Resour Res 47:247–255
Vliet MTHV, Yearsley JR, Franssen WHP, Ludwig F, Haddeland I, Lettenmaier DP, Kabat P (2012) Coupled daily streamflow and water temperature modeling in large river basins. Hydrol Earth Syst Sci 16:4303–4321
Wang Q (2013) Prediction of water temperature as affected by a preconstructed reservoir project based on MIKE11. Acta Hydrochim Hydrobiol 41:1039–1043
Webb BW, Clack PD, Walling DE (2003) Water-air temperature relationships in a Devon river system and the role of flow. Hydrol Process 17:3069–3084
Wei M, Bai B, Sung AH, Liu Q, Wang J, Cather ME (2007) Predicting injection profiles using ANFIS. Inf Sci 177:4445–4461
Westhoff JT, Rosenberger AE (2016) A global review of freshwater crayfish temperature tolerance, preference, and optimal growth. Rev Fish Biol Fish 26:329–349
Yurdusev MA, Firat M (2009) Adaptive neuro fuzzy inference system approach for municipal water consumption modeling: an application to Izmir, Turkey. J Hydrol 365:225–234
Zhu S, Nyarko EK, Nyarko MH (2018) Modelling daily water temperature from air temperature for the Missouri River. PeerJ 6:e4894
