
Available online at www.sciencedirect.com

ScienceDirect

www.elsevier.com/locate/procedia

Procedia Computer Science 120 (2017) 75–82

9th International Conference on Theory and Application of Soft Computing, Computing with Words and Perception, ICSCCW 2017, 24-25 August 2017, Budapest, Hungary

River water modelling prediction using multi-linear regression, artificial neural network, and adaptive neuro-fuzzy inference system techniques

S.I. Abba a, Sinan Jasim Hadi a, Jazuli Abdullahi a,*

a Near East University, Engineering Faculty, Near East Boulevard 99138, Nicosia, North Cyprus, Mersin 10 Turkey.

Abstract

In this study, multi-linear regression (MLR), artificial neural network (ANN), and adaptive neuro-fuzzy inference system (ANFIS) techniques were developed to predict the dissolved oxygen (DO) concentration downstream of Agra city. The models used monthly data on DO, pH, biological oxygen demand (BOD), and water temperature (WT) at three locations: Agra upstream, middle stream, and downstream. Three input combinations were tested. The first used 11 input parameters from all three locations, excluding DO at the downstream, which is the target. The second used 7 inputs from the middle and downstream locations, again excluding DO at the target location. The third considered only the downstream location. Performance was evaluated using the determination coefficient (DC) and root mean square error (RMSE). The results showed that both ANN and ANFIS can be applied to model DO concentration at Agra city, that the ANN model is slightly better than ANFIS, and that ANN is considerably superior to MLR.

© 2018 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the 9th International Conference on Theory and application of Soft Computing, Computing with Words and Perception.

Keywords: Multilinear regression; artificial neural network; adaptive neuro fuzzy inference; dissolved oxygen.

* Corresponding author. Tel.: +905338570781. E-mail address: [email protected]
1877-0509 © 2018 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2017.11.212

1. Introduction

Rivers in their initial state are free from impurities and are considered the cleanest water resource in the world, but rapid growth in industrialization and in human and urban development has caused them to lose their sustainability. For sustainable development to be pursued, it is essential to assess the quality of river water (Farhad et al. 2013; Abba et al. 2015). Dissolved oxygen (DO) is the amount of oxygen present in dissolved form; it is one of the best indicators of the quality and health of an ecosystem. The required concentration and range of DO vary between national and international standards, but DO can range from 0 to 18 parts per million (ppm). If the DO is too low, aquatic animals in the receiving environment are likely to die (Shaghaghian, 2010; Jain, 2014; Kisi and Murat, 2012).

Linear models for determining water quality have been used in several studies that consider different characteristics of water. However, the dynamic nature of the system makes the traditional methods unsuitable for coping with the interactions and processes taking place in the stream body (Lewis, 2005; Kisi and Murat, 2012). Suitable methodologies are therefore vital for solving the complex problems and nonlinear processes involved in any water body. Soft computing tools such as artificial neural networks, genetic algorithms, and fuzzy theory have been found to be more accurate in simulating and forecasting nonlinear interactions, although both classical models and soft computing have their own advantages and limitations (Shaghaghian, 2010; Jang and Sun, 2001).

The prediction and determination of DO has been studied by many researchers. For example, Sirilak et al. used ANN to estimate the DO of a river (Areerachakul, 2011). Elshafie et al. (2007) determined the DO in a river using fuzzy logic. Ahour and Sadeghian (2013) simulated the DO concentration using ANN and ANFIS methods.
Soyupak et al. (2003) applied MLP to predict the DO concentration. Kisi and Murat (2012) used RBNN and MLP to model the DO. Different types of classical model have been used to determine the water quality of the River Yamuna, but because of the complexity of the system and other drawbacks the results were generally poor. Soft computing techniques have proved to be effective and flexible ways of determining the water quality of a river, and in recent years they have outperformed experimental models (Sarkar and Pandey, 2015).

In this paper, MLR, ANN, and ANFIS were applied and compared for modelling the DO in the Yamuna River downstream of Agra, considering the upstream, middle stream, and downstream locations. Agra is one of the most important cities in terms of environmental pollution. The downstream location reflects the impact of wastewater discharge from Agra city. Agra uses Yamuna water extensively for domestic purposes and irrigation and contributes about 9% of the pollution level in the river.

1.1 Study Area and Data Collection

The River Yamuna is the main tributary of the Ganga River, with a length of 1,376 km. About 57 million people in north India depend on it. It comprises about 42% of the Ganga basin area in Indian territory, and its total catchment area is 366,223 km². Its annual discharge is almost 10,000 m³/s, and it supplies approximately 70% of the drinking water in Delhi. The river leaves Delhi polluted because, with rapid urbanization, there are not enough water treatment plants to handle the volume of water. The water therefore reaches Agra polluted. The Yamuna is the main source of municipal water in Agra, which likewise has too few treatment plants to treat the polluted water, so consumers in Delhi and Agra take in high amounts of harmful impurities and toxins (CSE 2008).

Routine monitoring and evaluation of the entire river is carried out by the Central Pollution Control Board (CPCB) under the National River Conservation Program (NRCR) and National Water Quality Monitoring Program (NWQMP) (Pollution et al., 2006). Fig. 1 shows the Yamuna River basin in India and the location of Agra city.


Fig. 1. Location of Yamuna River along Agra city

In this research, monthly water quality data for the Yamuna River were obtained from the Central Pollution Control Board (CPCB) for the years 1999 to 2005. Data processing and mining were carried out to remove noise and unrecorded entries. The data consist of four parameters, namely dissolved oxygen (DO), biological oxygen demand (BOD), pH, and water temperature (WT), measured at each of the three stations; with the downstream DO set aside as the target, this gives eleven input parameters, all of which were used in the modelling. Table 1 shows the statistical analysis of each parameter. The MLR, ANN, and ANFIS techniques were developed to predict the DO concentration. The results of the three techniques were evaluated and compared using Eqs. (1) and (2), and the best model was selected according to the root mean square error (RMSE) and determination coefficient (DC) statistics.

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (O_i - P_i)^2}{N}} \tag{1}$$

$$\mathrm{DC} = 1 - \frac{\sum_{i=1}^{N} (O_i - P_i)^2}{\sum_{i=1}^{N} (O_i - \bar{O})^2} \tag{2}$$

where $O_i$, $P_i$, and $\bar{O}$ are the observed values, predicted values, and mean of the observed values, respectively, and $N$ is the number of observations (Nourani et al., 2013).

Table 1. Statistical analysis of each input variable

| Station | Parameter | Minimum | Maximum | Range | Median | Mean | Variance | Coef. var. | Correlation |
|---|---|---|---|---|---|---|---|---|---|
| Upper Stream | DO | 0 | 22.8 | 22.8 | 6.1 | 7.3 | 22.1 | 0.6 | -0.043 |
| Upper Stream | pH | 7.2 | 9.2 | 2 | 8 | 8.1 | 0.3 | 0.1 | -0.156 |
| Upper Stream | BOD | 0.4 | 63 | 62.6 | 11 | 14.4 | 113.8 | 0.7 | -0.09 |
| Upper Stream | WT | 12.5 | 37 | 24.5 | 28 | 26.1 | 36.7 | 0.2 | -0.037 |
| Mid Stream | DO | 0 | 14.8 | 14.8 | 4.2 | 4.9 | 10 | 0.6 | 0.95 |
| Mid Stream | pH | 6.7 | 8.9 | 2.2 | 7.8 | 7.9 | 0.2 | 0.1 | 0.322 |
| Mid Stream | BOD | 4 | 46 | 42 | 15 | 18.1 | 96.4 | 0.5 | -0.158 |
| Mid Stream | WT | 12.5 | 37 | 24.5 | 29 | 27.1 | 28.7 | 0.2 | 0.076 |
| Down Stream | DO | 0 | 14.6 | 14.6 | 4 | 4.5 | 9.2 | 0.7 | 1 |
| Down Stream | pH | 6.9 | 8.9 | 2 | 7.8 | 7.8 | 0.2 | 0.1 | 0.336 |
| Down Stream | BOD | 4 | 46 | 42 | 16.5 | 19.7 | 114 | 0.5 | -0.183 |
| Down Stream | WT | 12.5 | 37 | 24.5 | 29 | 27.1 | 29 | 0.2 | 0.083 |
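The two evaluation criteria, RMSE and DC, can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample DO values are hypothetical and are not taken from the CPCB data set.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_error / n)


def determination_coefficient(observed, predicted):
    """Determination coefficient: 1 - (error sum of squares / total sum of squares)."""
    mean_observed = sum(observed) / len(observed)
    error_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - error_ss / total_ss


# Hypothetical DO values in ppm, for illustration only.
observed_do = [4.0, 5.5, 3.2, 6.1, 4.8]
predicted_do = [4.2, 5.1, 3.6, 5.8, 4.9]
print(rmse(observed_do, predicted_do))
print(determination_coefficient(observed_do, predicted_do))
```

A perfect prediction gives RMSE = 0 and DC = 1, while a model that always predicts the observed mean gives DC = 0, which is why a DC close to 1 together with a small RMSE is read as a good fit.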



2. Experimental Method

2.1 Multi-linear Regression (MLR)

Multiple linear regression models the relationship between two or more explanatory variables and a response variable by fitting a linear equation to the observed data. It helps to determine the level of variation between the variables. The MLR line of Y (dependent variable) on X (independent variables) is defined in Eq. (3) (Parmar and Bhardwaj, 2015; Chen and Liu, 2015).

$$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_i x_i \tag{3}$$

where $x_i$ is the value of the $i$th predictor, $b_0$ is the regression constant, and $b_i$ is the coefficient of the $i$th predictor.
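As a rough illustration of how the regression constant and coefficients can be estimated, the sketch below fits them by ordinary least squares through the normal equations. The paper does not state which fitting procedure or software was used, so the solver, the function names, and the sample (pH, BOD) values are all assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(rows, targets):
    """Return [b0, b1, ..., bk] that minimise the squared prediction error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the constant b0
    k = len(design[0])
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_mlr(coefficients, row):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one row of predictors."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Hypothetical (pH, BOD) predictors; the targets are generated from a known
# linear rule so the recovered coefficients can be checked by eye.
rows = [(7.8, 15.0), (8.1, 20.0), (7.5, 10.0), (8.4, 30.0), (7.9, 18.0)]
targets = [2.0 + 0.5 * ph - 0.1 * bod for ph, bod in rows]
coefficients = fit_mlr(rows, targets)
print(coefficients)
```

Because the toy targets follow an exact linear rule, the fitted coefficients come back close to 2.0, 0.5, and -0.1; on real monitoring data the fit would only approximate the observations.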

2.2 Artificial neural network (ANN)

An ANN is a mathematical model for processing information that resembles the brain in its learning process and synaptic weights (Kuo-lin Hsu et al., 1995; Muhammad et al., 2014). In an ANN, information processing occurs at many simple elements called nodes (neurons). Signals are passed between nodes through connection links, and each link has an associated weight that represents its connection strength. To determine the output signal, an activation function, usually a nonlinear transformation, is applied at each node (Committee 2000). ANNs can be categorized in terms of learning method, flow of information, and objective function. Among the various classes of ANN, the feed-forward neural network (FFNN) with back propagation (BP) is the most widely used. Each training input flows through the network to the output layer, where an error is generated; this error is propagated back through the network, and the process is repeated until the desired output is achieved. The main concept is to minimize the error so that the network learns the training data (Committee 2000; Nourani et al., 2013; Muhammad et al., 2014). Detailed information about BP can be found in (Sharifi et al., 2009; Committee 2000; Muhammad et al., 2014; Nourani et al., 2013). As shown in Fig. 2, a three-layer FFNN trained with BP has been used to estimate and simulate functions; it provides a general framework for representing non-linear functional mappings between a set of input and output parameters (Nourani et al., 2015).

Fig. 2 A three-layered feed-forward neural network with BP (Nourani et al., 2015)
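The forward flow and the backward propagation of the error described above can be sketched for a three-layer network as follows. This is a minimal illustration, not the authors' implementation: the tanh hidden activation, linear output node, stochastic gradient descent, learning rate, network size, and the four training samples are all assumptions chosen for the example.

```python
import math
import random


class ThreeLayerFFNN:
    """Input layer, one tanh hidden layer, and a single linear output node."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # The last weight in every row multiplies a constant bias input of 1.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                         for _ in range(n_hidden)]
        self.w_output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [math.tanh(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        output = sum(w * h for w, h in zip(self.w_output, hidden + [1.0]))
        return hidden, output

    def train_step(self, x, target, learning_rate):
        hidden, output = self.forward(x)
        error = output - target  # derivative of 0.5 * error^2 w.r.t. the output
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Propagate the output error back through the tanh hidden nodes.
        deltas = [error * self.w_output[j] * (1.0 - hidden[j] ** 2)
                  for j in range(len(hidden))]
        for j in range(len(hb)):
            self.w_output[j] -= learning_rate * error * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= learning_rate * deltas[j] * xb[i]


def mean_squared_error(network, samples):
    return sum((network.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)


# Hypothetical normalised inputs and targets, for illustration only.
samples = [((0.0, 0.0), 0.1), ((0.0, 1.0), 0.6), ((1.0, 0.0), 0.4), ((1.0, 1.0), 0.9)]
network = ThreeLayerFFNN(n_inputs=2, n_hidden=3)
initial_mse = mean_squared_error(network, samples)
for _ in range(500):
    for x, target in samples:
        network.train_step(x, target, learning_rate=0.1)
final_mse = mean_squared_error(network, samples)
print(initial_mse, final_mse)
```

In the notation of Table 2, this toy network would be written (2-3-1): two inputs, three hidden nodes, and one output.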

2.3 Adaptive Neuro-Fuzzy Inference System (ANFIS)

Fuzzy logic is a machine-intelligence approach to dealing with uncertainty, imprecision, and vagueness, and was introduced by Zadeh. A fuzzy rule is a mathematical expression that describes the relationship between the input and the output of a system in terms of linguistic variables and if-then statements (Zadeh, 1965; Kisi and Murat, 2012). ANFIS is widely used as an estimator and is capable of approximating real functions.



The fuzzy database, the fuzzifier, and the defuzzifier are the main parts of the system. The fuzzy inference system (FIS) and the fuzzy rules are essential components of fuzzy logic (Parmar and Bhardwaj, 2015). Like all other soft computing tools, ANFIS, as a hybrid learning algorithm, has several drawbacks: the approach is more complex and is suitable only for some inference systems, such as Takagi-Sugeno-Kang (Akrami et al., 2014). The main fuzzy rule systems are classified into Mamdani and Takagi-Sugeno-Kang, whose rule outputs are normally expressed as linguistic variables and mathematical functions, respectively. Defuzzification is needed in the Mamdani system but not in the Sugeno system (Takagi and Sugeno, 1985). More information about the ANFIS technique and its architectures can be found in (Parmar and Bhardwaj, 2015; Jang, 1993). The general structure of ANFIS is shown in Fig. 3.

Fig 3. ANFIS structure

Assume the FIS contains two inputs, x and y, and one output, f. A first-order Sugeno fuzzy model then has the following rules:

Rule 1: if $\mu(x)$ is $A_1$ and $\mu(y)$ is $B_1$ then $f_1 = p_1 x + q_1 y + r_1$

Rule 2: if $\mu(x)$ is $A_2$ and $\mu(y)$ is $B_2$ then $f_2 = p_2 x + q_2 y + r_2$

$A_1$, $B_1$, $A_2$, $B_2$ are the membership functions for the x and y inputs, and $p_1$, $q_1$, $r_1$, $p_2$, $q_2$, $r_2$ are the parameters of the outlet functions. The structure and formulation of ANFIS follow a five-layer neural network arrangement.

Layer 1: In this layer, every node i is an adaptive node with the node function

$$Q_i^1 = \mu_{A_i}(x) \ \text{for } i = 1, 2 \quad \text{or} \quad Q_i^1 = \mu_{B_{i-2}}(y) \ \text{for } i = 3, 4 \tag{4}$$

where $Q_i^1$ is the membership grade for input x or y. The membership function chosen was Gaussian because it has the lowest prediction error.

Layer 2: In this layer, the inputs of every rule are connected by the T-norm operator, which acts as an 'AND' operator.

$$Q_i^2 = w_i = \mu_{A_i}(x) \cdot \mu_{B_i}(y), \quad i = 1, 2 \tag{5}$$

Layer 3: In this layer, every neuron is labelled Norm, and the output is called the 'normalized firing strength'.

$$Q_i^3 = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{6}$$

Layer 4: Every node i in this layer is an adaptive node and performs the consequent of the rules.

$$Q_i^4 = \bar{w}_i (p_i x + q_i y + r_i) = \bar{w}_i f_i \tag{7}$$

where $p_i$, $q_i$, $r_i$ are irregular parameters referred to as 'consequent parameters'.

Layer 5: In this layer, the overall output is computed as the summation of all incoming signals.

$$Q^5 = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{8}$$
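The five layers can be traced in code for the two-rule, two-input case. The sketch below only performs the forward pass with fixed parameters; it does not include the hybrid learning step that tunes them. The Gaussian membership form, the function names, and all parameter values are hypothetical choices for illustration.

```python
import math


def gaussian_mf(value, centre, sigma):
    """Gaussian membership grade of `value` for a fuzzy set (centre, sigma)."""
    return math.exp(-0.5 * ((value - centre) / sigma) ** 2)


def anfis_forward(x, y, premise_a, premise_b, consequents):
    """Forward pass of a two-input first-order Sugeno ANFIS.

    premise_a, premise_b: one (centre, sigma) pair per rule for inputs x and y.
    consequents: one (p, q, r) triple per rule.
    """
    # Layer 1: membership grades of each input.
    mu_a = [gaussian_mf(x, c, s) for c, s in premise_a]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise_b]
    # Layer 2: firing strength of each rule (product T-norm as 'AND').
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3: normalized firing strengths.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order consequents.
    weighted = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: overall output as the sum of the incoming signals.
    return sum(weighted)


# Hypothetical premise and consequent parameters, for illustration only.
premise_a = [(0.0, 0.5), (1.0, 0.5)]
premise_b = [(0.0, 0.5), (1.0, 0.5)]
consequents = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
output = anfis_forward(0.5, 0.5, premise_a, premise_b, consequents)
print(output)
```

At x = y = 0.5 the two rules fire equally in this toy setup, so the output is the plain average of the two rule consequents; moving the inputs toward either membership centre shifts the weighting toward that rule.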

3. Results and Discussion

A suitable combination of input variables for ANN, ANFIS, and MLR is important for predicting water quality parameters in a river. A sensitivity analysis of the input parameters was performed, and the correlation coefficient between the input variables was determined to identify the dominant parameters. It indicated that, apart from DO itself, pH had the most effect on the dissolved oxygen, with the highest correlation at the middle and downstream locations. The data were normalized and partitioned into calibration and verification sets to simulate the DO. For model development, different combinations of input parameters were selected. First, all data from the three Agra stations (upstream, middle, and downstream) except the DO value at the downstream were considered. Second, all data from the middle and downstream stations except the DO at the downstream were used. Finally, only the downstream data, except the DO value, were considered. MLR, ANN, and ANFIS models were built for each combination. Table 2 shows the training results of the three models and compares their performance in calibration and verification based on DC and RMSE.

Table 2. Trained results of ANFIS, ANN, and MLR

| Model | Inputs | Structure | Calibration DC | Calibration RMSE | Validation DC | Validation RMSE |
|---|---|---|---|---|---|---|
| ANFIS | 11 | trimf, 2 | 0.99 | 0.001 | 0.66 | 1.88 |
| ANFIS | 7 | trimf, 2 | 0.99 | 0.014 | 0.28 | 2.74 |
| ANFIS | 3 | trimf, 2 | 0.15 | 2.54 | 0.05 | 3.15 |
| ANN | 11 | (11-14-1) | 0.92 | 0.81 | 0.7 | 1.72 |
| ANN | 7 | (7-7-1) | 0.94 | 0.7 | 0.81 | 1.38 |
| ANN | 3 | (3-3-1) | 0.16 | 2.29 | 0.05 | 3.31 |
| MLR | 11 | - | 0.6 | 1.74 | 0.84 | 1.28 |
| MLR | 7 | - | 0.62 | 1.71 | 0.88 | 1.13 |
| MLR | 3 | - | 0.06 | 2.68 | 0.02 | 3.21 |
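The three input combinations and the data preparation step can be sketched as below. The paper does not state the normalization formula or the calibration/verification ratio, so the min-max scaling and the 75% chronological split used here are assumptions, as are the function and station names.

```python
PARAMETERS = ["DO", "pH", "BOD", "WT"]
TARGET = ("downstream", "DO")  # the quantity being predicted


def input_names(stations):
    """List every (station, parameter) pair except the target itself."""
    return [(s, p) for s in stations for p in PARAMETERS if (s, p) != TARGET]


def min_max_normalise(values):
    """Scale a series to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def split_calibration_verification(rows, calibration_fraction=0.75):
    """Chronological split; the 75% share is an assumption, not from the paper."""
    cut = int(len(rows) * calibration_fraction)
    return rows[:cut], rows[cut:]


combinations = [
    ["upstream", "middle", "downstream"],
    ["middle", "downstream"],
    ["downstream"],
]
combination_sizes = [len(input_names(stations)) for stations in combinations]
print(combination_sizes)  # [11, 7, 3]

# Hypothetical monthly DO series in ppm, for illustration only.
scaled = min_max_normalise([0.0, 7.3, 14.6])
calibration, verification = split_calibration_verification(list(range(8)))
```

The sizes 11, 7, and 3 match the input counts listed in Table 2, since each station contributes four parameters and the downstream DO is always held out as the target.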

Table 2 shows the difference in performance of the three techniques across the input combinations. The combinations of 11, 7, and 3 inputs were defined according to the division of the streams (upper, middle, and down). For ANFIS, the first model (11 inputs) with triangular membership functions gave the best result, while for ANN and MLR the second model was the best. This indicates that a large number of inputs may increase the complexity of the model, which leads to overfitting. Table 2 shows that the ANFIS model performed better than the other two in training, while in validation MLR scored slightly higher than ANN and ANFIS. On average, however, the ANN model (7-7-1) performed better than ANFIS and MLR across both phases, owing to overestimation by ANFIS and underestimation by MLR. Fig. 4(a-c) shows the time series plots of the three techniques for the best model of each type.
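One way to read the "on average" comparison is to average the calibration and validation DC of the best model of each technique, using the values reported in Table 2. The averaging rule is an assumption made here for illustration; the paper does not state how the overall ranking was computed.

```python
# (calibration DC, validation DC) for the best model of each technique, from Table 2.
best_models = {
    "ANFIS (11 inputs)": (0.99, 0.66),
    "ANN (7-7-1)": (0.94, 0.81),
    "MLR (7 inputs)": (0.62, 0.88),
}

# Assumed summary: the plain mean of the two phases.
mean_dc = {name: sum(dc) / len(dc) for name, dc in best_models.items()}
best_overall = max(mean_dc, key=mean_dc.get)

for name, value in mean_dc.items():
    print(f"{name}: mean DC = {value:.3f}")
print("Highest mean DC:", best_overall)
```

Under this reading the means are roughly 0.825 for ANFIS, 0.875 for ANN, and 0.750 for MLR, which agrees with the statement that ANN (7-7-1) is the most balanced of the three.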


Fig 4a. Observed and Predicted DO value for MLR

Fig 4b. Observed and Predicted DO value for ANN

Fig 4c Observed and Predicted DO value for ANFIS



4. Conclusion

River water quality modelling is important for safeguarding aquatic life. Estimating water variables with linear function methods is difficult and time consuming because of the complexity of the processes involved. Soft computing techniques have become a vital tool for modelling and simulating the nonlinear interaction of parameters. MLR, ANN, and ANFIS were used to model the DO concentration downstream of Agra city, and their performance was determined and compared using DC and RMSE. The ANN result from the middle and downstream inputs was slightly better than that of the ANFIS model and outperformed the MLR model. The results indicate that, for predicting DO concentration at the Agra station, the combination of middle and downstream data is suitable for a good model simulation. They also indicate that ANFIS achieves high accuracy when the three stations are combined.

References

Abba, S. I., Said, Y. S., Bashir, A., 2015. Assessment of Water Quality Changes at Two Location of Yamuna River Using the National Sanitation Foundation of Water Quality. Journal of Civil Engineering and Environmental Technology, 2(8), 730–733.
Ahour, M., Sadeghian, M. S., 2013. The Study of Artificial Neural Network (ANN) Efficiency with, 2(August), 30–38.
Akrami, S. A., Nourani, V., Hakim, S. J. S., 2014. Development of Nonlinear Model Based on Wavelet-ANFIS for Rainfall Forecasting at Klang Gates Dam. Water Resources Management, 28(10), 2999–3018.
Areerachakul, S., Junsawang, P., Pomsathit, A., 2011. Prediction of dissolved oxygen using artificial neural network. International Conference on Computer Communication and Management, 5, 524–528.
Chen, W., Liu, W., 2015. Water Quality Modeling in Reservoirs Using Multivariate Linear Regression and Two Neural Network Models. Advances in Artificial Neural Systems, 2015, ID 521721, 12 pp.
Committee, A. T., 2000. Artificial neural networks in hydrology. I: preliminary concepts. Journal of Hydrologic Engineering, 5(2), 115–123.
El-Shafie, A., Taha, M. R., Noureldin, A., 2007. A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resources Management, 21(3), 533–556.
Farhad Yousefabadi, L. O., Shariati, F., Alireza, Mardookhpour., 2013. A Comparison of water quality indices for Haraz River. Department of Environmental Engineering, Lahijan Branch, Islamic, 3(3), 30–36.
Jang, J. S. R C. T., Sun, E. M., 2001. Book Reviews (Vol. 38).
Jain, B. P. K. J. K., 2014. Wastewater Engineering (Including Air Pollution). (B. Pumia, Ed.) (Second). India: Laxmi Publications (P) LTD.
Jang, J. R., 1993. ANFIS: Adaptive-Network-Based Fuzzy Inference System, 23(3).
Kisi, O., Murat, A., 2012. Comparison of ANN and ANFIS Techniques in Modeling Dissolved Oxygen. Sixteenth International Water Technology Conference, 1–10.
Kuo-lin Hsu, Hoshin Vijai Gupta, S., Sorooshian., 1995. Artificial Neural Networks Modelling of the rainfall-runoff process. Journal of Hydrologic Engineering, 31(10), 2517–2530.
Lewis, M. E., 2005. Geological Survey Techniques of Water-Resources Investigations. U.S.
Muhammad Sani Gaya, Abdul Wahaba, N., Sama Y. M., S. I. S., 2014. ANFIS Modelling of Carbon and Nitrogen Removal in Domestic Wastewater Treatment Plant. Jurnal Teknologi, 67(5), 439–446.
Nourani, V., Alami, M. T., Vousoughi, F. D., 2015. Wavelet-entropy data pre-processing approach for ANN-based groundwater level modeling. Journal of Hydrology, 524, 255–269.
Nourani, V., Khanghah, T. R., Sayyadi, M., Prof, A., Student, M. S., Student, B. S., 2013. Application of the Artificial Neural Network to monitor the quality of treated water, 3(1), 38–45.
Parmar, K. S., Bhardwaj, R., 2015. River Water Prediction Modeling Using Neural Networks, Fuzzy and Wavelet Coupled Model. Water Resources Management, 29(1), 17–33.
Pollution, C., Board, C., 2006. Water quality status of Yamuna river (1999–2005). Assessment and Development of River, 136.
Sarkar, A., Pandey, P., 2015. River Water Quality Modelling Using Artificial Neural Network Technique. Aquatic Procedia, 4(Icwrcoe), 1070–1077. https://doi.org/10.1016/j.aqpro.2015.02.135
Shaghaghian, M. R., 2010. Prediction of Dissolved Oxygen in Rivers Using a Wang-Mendel Method – Case Study of Au Sable River. World Academy of Science, Engineering and Technology, 4(2), 676–683.
Sharifi, S. S., Delirhasannia, R., Nourani, V., Sadraddini, A. A., Ghorbani, A., 2009. Using Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference System (ANFIS) for Modeling and Sensitivity Analysis of Effective Rainfall, (2008), 133–139.
Soyupak, S., Karaer, F., Gurbuz, H., Kivrak, E., Senturk, E., Yazici, A., 2003. A neural network-based approach for calculating dissolved oxygen profiles in reservoirs. Neural Computing & Applications, 12(3–4), 166–172.
Takagi, T., Sugeno, M., 1985. Fuzzy identification of systems and its applications to modeling and control. IEEE, (15), 116–32.
Yetilmezsoy, K., Ozkaya, B., Cakmakci, M., 2011. Artificial intelligence-based prediction models, 193–218.
Zadeh, L. A., 1965. Fuzzy Sets. Information Control, 8(3), 38–53.
