Atmospheric Research 171 (2016) 21–30
Selection of meteorological parameters affecting rainfall estimation using neuro-fuzzy computing methodology
Roslan Hashim a,b,⁎, Chandrabhushan Roy a, Shervin Motamedi a,b, Shahaboddin Shamshirband c, Dalibor Petković d, Milan Gocic e, Siew Cheng Lee a
a Department of Civil Engineering, Faculty of Engineering, University of Malaya, 50603 Kuala Lumpur, Malaysia
b Institute of Ocean and Earth Sciences, University of Malaya, 50603 Kuala Lumpur, Malaysia
c Department of Computer System and Technology, Faculty of Computer Science and Information Technology, University of Malaya, 50603 Kuala Lumpur, Malaysia
d Faculty of Mechanical Engineering, Department for Mechatronics and Control, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia
e Faculty of Civil Engineering and Architecture, University of Nis, Aleksandra Medvedeva 14, 18000 Nis, Serbia
Article info
Article history: Received 7 April 2015; Received in revised form 2 December 2015; Accepted 4 December 2015; Available online 15 December 2015
Keywords: Rainfall; Forecasting; Meteorological data; ANFIS; Variable selection
Abstract
Rainfall is a complex atmospheric process that varies over time and space. Researchers have used various empirical and numerical methods to enhance estimation of rainfall intensity. In this study, we developed a novel prediction model, with emphasis on accuracy, to identify the most significant meteorological parameters affecting rainfall. For this, we used five input parameters: wet day frequency (dwet), vapor pressure (ea), maximum and minimum air temperatures (Tmax and Tmin), and cloud cover (cc). The data were obtained from the Indian Meteorological Department for the city of Patna, Bihar, India. Further, a type of soft-computing method, known as the adaptive neuro-fuzzy inference system (ANFIS), was applied to the available data. In this respect, the observation data from 1901 to 2000 were employed for testing, validating, and estimating monthly rainfall via the simulated model. In addition, the ANFIS process for variable selection was implemented to detect the predominant variables affecting the rainfall prediction. Finally, the performance of the model was compared to other soft-computing approaches, including the artificial neural network (ANN), support vector machine (SVM), extreme learning machine (ELM), and genetic programming (GP). The results revealed that ANN, ELM, ANFIS, SVM, and GP had R² of 0.9531, 0.9572, 0.9764, 0.9525, and 0.9526, respectively. Therefore, we conclude that the ANFIS is the best of these methods for predicting monthly rainfall. Moreover, dwet was found to be the most influential parameter for rainfall prediction and the best single predictor in terms of accuracy. This study also identified the sets of two and three meteorological parameters that give the best predictions. © 2015 Elsevier B.V. All rights reserved.
1. Introduction
Increasing population and industrialization have put tremendous pressure on natural processes (e.g., Bawa and Seidler, 2015; Brown, 2015; Gupta, 2014; Heald and Spracklen, 2015; Lemaire et al., 2014). In addition, the frequency of natural hazards (e.g., droughts, floods, hurricanes, cyclones and tsunamis) has increased in recent years (Gong and Forrest, 2014; Guan et al., 2015; Kumar et al., 2015; Sowmya et al., 2015). Therefore, temporal prediction of natural processes, such as rainfall, is helpful for the planning and management of urban areas. For example, rainfall determines the groundwater level, which in turn supports the water supply to urban and rural populations (Dhakal and Sullivan, 2014; Van Eekelen et al., 2015). In addition, rainfall affects various human-ecology interactions such as agriculture, thermal comfort, and
⁎ Corresponding author. E-mail address:
[email protected] (R. Hashim).
http://dx.doi.org/10.1016/j.atmosres.2015.12.002 0169-8095/© 2015 Elsevier B.V. All rights reserved.
evaporation (Arvor et al., 2014; de Abreu-Harbich et al., 2015; Müller et al., 2014; Singh et al., 2014). Rainfall is not easy to predict, because it varies across space and time scales. Researchers consider rainfall a stochastic process (Kundu et al., 2014; Ramesh et al., 2013; Schleiss et al., 2014). In this regard, various empirical and numerical models have been developed to investigate the non-linear trend of rainfall (Buzzi et al., 2014; Dai et al., 2015; Kumar et al., 2014; Nicholson, 2014; Nielsen et al., 2014). Recently, the accuracy of conventional prediction models has been improved through meteorological and satellite observations (Mehran and AghaKouchak, 2014; Zhou et al., 2014). The efficiency of soft-computing techniques has been investigated for non-linear natural processes, such as rainfall (Nastos et al., 2014), drought (Deo and Şahin, 2015), oceanic wave (Sayemuzzaman et al., 2015), and sediment transport (Makarynskyy et al., 2015). In this view, the use of soft-computing methods for predicting rainfall can be beneficial. Rainfall models that employ soft-computing techniques can be classified into two groups. The first group uses historical time series data of a station to train the model and predict rainfall (e.g., Ramana et al., 2013;
R. Hashim et al. / Atmospheric Research 171 (2016) 21–30
Shamshirband et al., 2015a,b). The second group uses multiple meteorological parameters to make predictions (e.g., Ortiz-García et al., 2014; Nastos et al., 2014). Soft-computing methods can effectively recognize relationships between dependent and independent variables for non-linear natural processes, such as rainfall (Datta et al., 2014; Geetha and Nasira, 2014; Maheswaran and Khosa, 2014; Sharma and Bose, 2014; Valverde et al., 2014). For example, Ramana et al. (2013) combined the wavelet technique with the ANN to predict the rainfall time series satisfactorily. They used monthly rainfall gauge data for Darjeeling, India, to calibrate the developed model, and reported that the wavelet neural network models were superior to the ANN models. A recent study compared the performance of ANFIS-based and support vector regression (SVR)-based rainfall prediction models (Shamshirband et al., 2015a,b). The results show that the ANFIS predicts rainfall better than the SVR. That study employed only the historical monthly rainfall data of 29 stations in Serbia, measured from 1946 to 2012, as input to train the models. The authors suggested that even with one input, the ANFIS and SVR can make accurate monthly predictions. A similar study used historical daily rainfall data from 1987 to 2001 at three stations in Turkey to estimate one-day-ahead rainfall using the wavelet-neuro-fuzzy technique. The results showed an R² of 0.9 for the rainfall predicted by the wavelet-neuro-fuzzy network, while the conventional neuro-fuzzy method yielded a very low R² (0.1). The authors concluded that the results of the wavelet-neuro-fuzzy technique compare well with other conventional soft-computing techniques, such as ANN and multi-linear regression (MLR) methods. In addition to time series rainfall data, many soft-computing studies have attempted to predict rainfall using multiple meteorological parameters. For example, Ortiz-García et al. (2014) compared rainfall estimation by the SVM, multi-layer perceptron, extreme learning machine (ELM), and other classical algorithms, such as K-nearest neighbor classifiers and decision trees. They used parameters such as total precipitable water, air temperature, humidity, wind speed, and wind direction. The results indicated that the SVM outperforms other classification
techniques. Nastos et al. (2014) used the monthly total rain and corresponding rain days (dwet, wet day) in the ANN model to predict rainfall intensity in Athens, Greece. They used meteorological data for 11 years, obtained from the National Observatory of Athens. The rainfall predictions for the 4-month observation period were in good agreement with the field data. Furthermore, the RMSE and R² proved the robustness of the models, with a statistical significance of P < 0.01. Based on the above discussion, it can be concluded that soft-computing models predict rainfall satisfactorily. In addition, previous research works have employed multiple meteorological parameters, but until now, no study has investigated the effect of the input parameters on the accuracy of rainfall prediction. In this study, we developed a prediction model using the ANFIS to identify the most influential parameters for accurate rainfall estimation. The process, which is called variable selection, includes a number of ways to discover a subset of the total recorded parameters that shows good prediction capability. The ANFIS network was used to perform a variable search and, thereafter, to examine how five input parameters influence rainfall prediction: monthly cloud cover (cc), average monthly vapor pressure (ea), monthly wet day frequency (dwet), and monthly maximum and minimum air temperatures (Tmax and Tmin). This study obtained the average monthly meteorological observations from 1901 to 2002 for the district of Patna in the state of Bihar, India. The collected data were used to calibrate the model and estimate monthly rainfall.
2. Methodology
This section describes the study area and provides details of the input parameters, observation data and the ANFIS method. Section 2.1 describes the study area, and Section 2.2 presents the input and output meteorological data together with the ANFIS model that was used for predicting rainfall. The schematic flow of this research work is depicted in Fig. 1.
Fig. 1. Flow of research works in this study. (Flowchart stages: Geography; Climate; Meteorological Variables; Description of the Study Site; Process of Data Collection; Process of ANFIS Simulation; Comparison with Other Soft Computing Methods; Results and Discussions; Experiment; Simulation; Conclusion.)
Fig. 2. Geographic map showing the city of Patna in the state of Bihar, India.
2.1. Description of study area
In this study, we used the meteorological measurements from the district of Patna in the state of Bihar, India. Patna is located in the highly fertile Indo-Gangetic plains at 25° 36′ 0″ N and 85° 5′ 60″ E (Fig. 2). The dataset consisted of five meteorological parameters recorded from 1901 to 2002. The data were made available by the Indian Meteorological Department (IMD, http://indiawaterportal.org/met_data). Patna is the capital city of the state of Bihar, which contributes significantly to India's agricultural production. Therefore, accurate rainfall prediction for this region is important. Fig. 2 shows a map of the state of Bihar. The state has the Himalayas to its north and the river Ganges running from west to east. A fertile alluvial plain is characteristic of the Ganges plain. Sone, Ghaghara, Kosi, Falgu, Poonpoon, and Karmanasa are other important rivers flowing in this region. The economy of this region is mainly based on agriculture, with 70% of the population engaged in cultivation-related work. Rice and wheat are two important agricultural products, produced during the months of November to March, and May to October, respectively. This area receives significant mean annual rainfall (1137 mm) (Subash et al., 2011). Unfortunately, in recent years, the area has been affected by several floods and droughts (Parry, 2007).
2.2. Simulation using ANFIS technique
This section presents the ANFIS model that was used for predicting the monthly rainfall. First, the input and output variables used for modeling are presented in the following sub-section.
Table 1
Input parameters.
Input      Parameter description                    Abbreviation
Input 1    Monthly average vapor pressure (hPa)     ea
Input 2    Monthly minimum air temperature (°C)     Tmin
Input 3    Monthly maximum air temperature (°C)     Tmax
Input 4    Monthly wet day frequency (day)          dwet
Input 5    Monthly cloud cover (%)                  cc
2.2.1. Input and output variables
Table 1 shows the five input parameters selected for analysis: ea, dwet, Tmax, Tmin, and cc. These parameters are considered potentially influential in the rainfall prediction (output parameter; see Table 2) (Bellerby et al., 2000; Kuligowski and Barros, 1998; Kutzbach, 1967; Maqsood et al., 2005; Nastos et al., 2014; Ortiz-García et al., 2014). Previous research works predicted rainfall from several meteorological parameters, such as actual moisture content, cloud cover, maximum and minimum air temperatures, vapor pressure and cloud texture (Bellerby et al., 2000; Kuligowski and Barros, 1998; Kutzbach, 1967; Maqsood et al., 2005; Nastos et al., 2014; Ortiz-García et al., 2014). Of the five meteorological parameters selected in our study (Table 1), four are ea, cc, Tmax and Tmin. In addition, the wet day frequency (dwet) was introduced as an extra input parameter. The reason for choosing this parameter is explained below. Sometimes, only historical rainfall data are used to estimate the rainfall (e.g., Mehran and AghaKouchak, 2014; Nastos et al., 2014; Shamshirband et al., 2015a,b). This suggests that prediction of rainfall for a certain month is based on the past trend of rainfall. Every day can be categorized into one of two classes based on the occurrence of rainfall: wet day or dry day. Basically, on wet days, precipitation occurs with a mean rainfall of more than 1.0 mm/day, while dry days have no rainfall. Moreover, dwet is the total number of wet days in the dataset.

Table 2
Output parameter.
Output      Parameter description          Abbreviation
Output 1    Monthly average rain (mm)      Pa
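As a minimal sketch of the wet/dry classification described in Section 2.2.1, the following Python snippet counts the wet days in a sequence of daily rainfall totals. The function name `wet_day_frequency`, the sample values, and the reading of the 1.0 mm/day figure as a strict threshold are assumptions made for illustration; the preprocessing actually used for the IMD data is not described at this level of detail.

```python
def wet_day_frequency(daily_rain_mm, threshold_mm=1.0):
    """Count days whose rainfall exceeds threshold_mm (assumed wet-day rule)."""
    return sum(1 for rain in daily_rain_mm if rain > threshold_mm)


# Hypothetical daily totals (mm) for a ten-day stretch, not observed data.
sample = [0.0, 0.4, 1.0, 2.5, 12.3, 0.0, 0.0, 7.1, 0.9, 3.0]
d_wet = wet_day_frequency(sample)
print(d_wet)
```

Applying the same count to each calendar month would give a monthly series comparable in form to Input 4 of Table 1.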
Table 3
Statistical analysis of measurement data for the input and output variables.
Meteorological parameter    Max       Min      Mean     Standard deviation
Rainfall (mm)               534.69    0.00     90.95    121.95
ea (hPa)                    35.83     10.76    21.88    8.29
Tmin (°C)                   30.25     6.78     19.69    6.71
Tmax (°C)                   43.27     21.60    31.97    5.29
dwet (day)                  18.03     0.00     4.37     4.82
cc (%)                      78.55     3.42     38.89    19.34

Fig. 3. List of meteorological parameters selected as input for rainfall prediction.

Even though dwet is not usually employed in similar applications, it signifies rainfall occurrence. Nastos et al. (2014) predicted the rainfall intensity in Athens, Greece, by using the wet day through an ANN model. They intended to find the influence of these input variables on the accuracy of rainfall forecast. Consequently, they considered dwet together with other meteorological input parameters to forecast monthly rainfall. Fig. 3 shows all the meteorological parameters selected to predict rainfall.
2.2.2. Process of variable selection using ANFIS
To build a system with the best characteristics, it is necessary to identify the most relevant and influential subset of parameters and subject them to analysis. This process is usually called variable selection. Its purpose is to find a subset of the total set of recorded parameters that shows good prediction capability (Andersson et al., 2000; Castellano and Fanelli, 2000; Cibas et al., 1996; Dieterle et al., 2003; Kariminia et al., 2015a). Essentially, with the neural network as the foundation, we modeled the complex system as a function approximation and regression problem. A neural network is an architecture made up of massively parallel adaptive processing elements that are connected through a structured network. Therefore, the accuracy of the neural network models created from the data relies heavily on how accurately the chosen data represent the system. For the successful creation of a model capable of estimating a particular process output, the selection of the subset of parameters is crucial. The problems faced in the process of parameter selection could possibly be resolved by integrating prior knowledge to segregate and
remove irrelevant parameters. Alternatively, a more sophisticated approach is to treat the problem as an optimization procedure through the use of genetic algorithms. The aim is to select proper explanatory (input) parameters and thereby minimize the error between the modeled and the observed values of the explained variable. Among several neural network systems, the ANFIS is one of the most widely used and powerful. Therefore, this study employed the ANFIS for the variable selection (Chan et al., 2011; Kwong et al., 2009; Kariminia et al., 2015b). In order to determine how the meteorological parameters affect the rainfall prediction, a parameter search was conducted via the ANFIS. The ANFIS (Jang, 1993) is a hybrid intelligent system that increases the capability of learning and adapting automatically. Researchers have used it for many different purposes in a variety of engineering systems, such as modeling (Al-Ghandoor and Samhouri, 2009; Petković et al., 2012a,b; Singh et al., 2012), prediction (Hosoz et al., 2011; Khajeh et al., 2009; Sivakumar and Balu, 2010), and control (Areed et al., 2010; Kurnaz et al., 2010; Petković et al., 2012a; Ravi et al., 2011; Tian and Collins, 2005). This neuro-adaptive learning methodology allows the fuzzy modeling process to obtain information from the gathered data (Aldair and Wang, 2011; Dastranj et al., 2011). This is the foundational idea underlying all neuro-adaptive learning methodologies. The ANFIS methodology aims to establish the fuzzy inference system (FIS) by analyzing the input/output data pairs (Grigorie and Botez, 2009; Manoj, 2011). The fuzzy logic enables adjustment of the membership function (MF) parameters and allows the associated FIS to detect and trace the given input/output data (Akcayol, 2004; Sayemuzzaman et al., 2015). Generating the predetermined input–output subsets requires construction of a set of fuzzy 'IF/THEN' rules alongside suitable MFs. The ANFIS can serve as the foundation of such a construction. The input–output data are converted into membership functions. In accordance with the collection of input–output data, the ANFIS takes the initial FIS and adjusts it through a back propagation algorithm. Three components comprise the FIS: (1) a rule base, (2) a database, and (3) a reasoning mechanism. The rule base consists of a choice of fuzzy
Fig. 4. Diagram showing the ANFIS structure. (Five layers, Layer 1 to Layer 5; inputs x and y; membership functions A, B, C and D; firing strengths w1 and w2; output f.)
Fig. 5. Variation of (a) P s , (b) ea , (c) Tmin and Tmax, (d) dwet, and (e) cc from the year 1901 to 2002.
rules. The database assigns the MFs, which are employed in the fuzzy rules. Finally, the last component is the reasoning mechanism that infers from the rules and input data to come to a feasible outcome. Such intelligent systems combine knowledge, methods, and techniques from a variety of sources. They possess human-like intelligence within a specific domain and adjust to perform better in changing environments. The ANFIS recognizes patterns and assists in adapting to changing environments. The FIS integrates human comprehension, performs interfacing, and makes decisions. The FIS in MATLAB was employed in the whole process of training and evaluation. An ANFIS network for two inputs is depicted in Fig. 4. The fuzzy IF-THEN rules of Takagi and Sugeno's class, with two inputs for the first-order Sugeno model, were employed in this study:

if $x$ is $A$ and $y$ is $C$ then $f_1 = p_1 x + q_1 y + r_1$    (1)

Fig. 6. Influence of meteorological input parameters on the rainfall prediction.

The first layer consists of the MFs of the input parameters and passes the input values to the following layer. Each node here is an adaptive node with a node function $O = \mu_{AB}(x)$ or $O = \mu_{CD}(y)$, where $\mu_{AB}(x)$
Table 4
Regression errors from the ANFIS method for rainfall estimation for one- and two-parameter combinations. Each cell gives trn / chk (training / checking error); cells on the diagonal correspond to a single parameter.
              ea (hPa)             Tmin (°C)            Tmax (°C)             dwet (day)           cc (%)
ea (hPa)      66.1127 / 59.7926    57.9560 / 53.9889    55.3249 / 51.7218     22.3380 / 24.0997    56.8204 / 54.7496
Tmin (°C)     –                    97.9453 / 93.7733    58.1862 / 54.8328     27.6809 / 28.2871    51.5888 / 49.4364
Tmax (°C)     –                    –                    100.4038 / 95.2570    28.9663 / 30.0784    51.2351 / 48.7869
dwet (day)    –                    –                    –                     30.6495 / 30.3958    26.1936 / 25.7421
cc (%)        –                    –                    –                     –                    57.3400 / 53.4546
and $\mu_{CD}(y)$ are MFs. The bell-shaped MFs, with a maximum value of 1.0 and a minimum value of 0.0, are selected as

$\mu(x) = \mathrm{bell}(x; a_i, b_i, c_i) = \dfrac{1}{1 + \left[ \left( \dfrac{x - c_i}{a_i} \right)^{2} \right]^{b_i}}$    (2)
where $\{a_i, b_i, c_i\}$ is the set of parameters, designated as premise parameters. The second layer is the membership layer, which determines the weight of every membership function. It receives the signals from the preceding layer and acts as the membership function to represent the fuzzy sets of each input variable. The nodes in the second layer are non-adaptive. Each node multiplies the incoming signals and sends out the product as

$w_i = \mu_{AB}(x)\,\mu_{CD}(y)$    (3)

The third layer is the rule layer. All the neurons here act as pre-conditions to match the fuzzy rules, i.e., each rule's activation level is calculated, whereby the number of nodes in this layer equals the number of fuzzy rules. The nodes in the third layer are also non-adaptive. Each node calculates the normalized weight, that is, the ratio of the rule's firing strength to the sum of the firing strengths of all rules:

$\bar{w}_i = \dfrac{w_i}{w_1 + w_2}, \quad i = 1, 2.$    (4)

The outcomes are referred to as the normalized firing strengths. The fourth layer provides the output values that result from the inference of the rules. This layer is also known as the defuzzification layer. Every node in this layer is an adaptive node with the node function

$O^{4}_{i} = \bar{w}_i f_i = \bar{w}_i \left( p_i x + q_i y + r_i \right)$    (5)
where $\{p_i, q_i, r_i\}$ is the parameter set, designated as consequent parameters. The fifth and final layer is the output layer, which adds up the inputs coming from the preceding layer and thereby converts the fuzzy classification outcomes into a crisp value. The single node in this layer is non-adaptive. It calculates the total output as the sum of all incoming signals:

$O^{5}_{i} = \sum_i \bar{w}_i f_i = \dfrac{\sum_i w_i f_i}{\sum_i w_i}$    (6)
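The five layers described by Eqs. (1) to (6) can be traced in a short forward pass. The Python sketch below is illustrative only: it assumes two rules, one bell-shaped MF per input per rule, and hypothetical parameter values; the names `bell` and `anfis_forward` are introduced here and are not taken from the MATLAB implementation used in the study. Training of the premise and consequent parameters is not shown.

```python
def bell(x, a, b, c):
    """Generalized bell MF of Eq. (2): 1 / (1 + (((x - c) / a)^2)^b)."""
    return 1.0 / (1.0 + (((x - c) / a) ** 2) ** b)


def anfis_forward(x, y, premise_x, premise_y, consequents):
    """Forward pass of a two-input, first-order Sugeno ANFIS.

    premise_x, premise_y: one (a, b, c) tuple per rule for the MFs on x and y.
    consequents: one (p, q, r) tuple per rule.
    """
    # Layer 1: membership grades of the two inputs.
    mu_x = [bell(x, *params) for params in premise_x]
    mu_y = [bell(y, *params) for params in premise_y]
    # Layer 2: firing strength of each rule, Eq. (3).
    w = [mx * my for mx, my in zip(mu_x, mu_y)]
    # Layer 3: normalized firing strengths, Eq. (4).
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted first-order rule outputs, Eq. (5).
    o4 = [wb * (p * x + q * y + r) for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: crisp output as the sum of incoming signals, Eq. (6).
    return sum(o4)


# Hypothetical premise and consequent parameters, chosen only for illustration.
premise_x = [(2.0, 2.0, 0.0), (2.0, 2.0, 5.0)]
premise_y = [(3.0, 1.0, 0.0), (3.0, 1.0, 10.0)]
consequents = [(1.0, 0.5, 0.0), (0.2, 0.1, 3.0)]
print(anfis_forward(1.0, 2.0, premise_x, premise_y, consequents))
```

Because the normalized firing strengths sum to one, the output is always a weighted average of the individual rule outputs, which is why the result stays between the smallest and largest rule consequent for a given input pair.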
3. Results and discussion
This section presents the results of the measurements and the ANFIS simulation. First, in Section 3.1, we present an analysis of the meteorological measurements for the study area; the results of the ANFIS simulation are then presented and discussed in Section 3.2. Finally, in Section 3.3, we compare the results of the ANFIS method with other conventional soft-computing methods, namely ELM, SVM, GP, and ANN. The comparative study suggests that the ANFIS method predicts rainfall better than the other conventional soft-computing techniques.
3.1. Analysis of measurement data
The statistical summary of all the input and output meteorological data from 1901 to 2002 for the study area is presented in Table 3. The statistical analysis of the observed meteorological data shows that Patna receives an average monthly rainfall of 90.95 mm, with a recorded maximum monthly rainfall of 534.69 mm. In addition, the average monthly vapor pressure in this region varies from 10.76 to 35.83 hPa, with an average of 21.88 hPa. In the last century, this region had a maximum air temperature of 43.27 °C and a minimum of 6.78 °C. Consequently, it can be established that this region has a wide range of temperature change. In the last century, the region showed significant rainfall. The area had a maximum monthly frequency of 18 wet days and a mean monthly frequency of 4.37 wet days. The region had an average monthly cloud cover of 38.89% and a maximum monthly cloud cover of 78.55%. Fig. 5(a–e) shows the temporal distribution of rainfall, ea, Tmin and Tmax, dwet, and cc from 1901 to 2002.
Table 5
ANFIS regression errors for rainfall prediction for three-parameter combinations.
Input combination                          trn        chk
ea (hPa) – Tmin (°C) – Tmax (°C)           52.7809    53.4810
ea (hPa) – Tmin (°C) – dwet (day)          21.1622    23.0165
ea (hPa) – Tmin (°C) – cc (%)              49.9383    66.9012
ea (hPa) – Tmax (°C) – dwet (day)          20.5157    22.6228
ea (hPa) – Tmax (°C) – cc (%)              48.4034    71.2558
ea (hPa) – dwet (day) – cc (%)             21.0219    22.8693
Tmin (°C) – Tmax (°C) – dwet (day)         24.8394    27.1548
Tmin (°C) – Tmax (°C) – cc (%)             48.4166    59.5954
Tmin (°C) – dwet (day) – cc (%)            25.3907    25.8150
Tmax (°C) – dwet (day) – cc (%)            24.5479    25.5062
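Tables 4 and 5 report training (trn) and checking (chk) errors for every one-, two- and three-input combination. The exhaustive search behind such tables can be sketched as follows. This is not the authors' ANFIS code: an ordinary least-squares fit stands in for the one-epoch ANFIS model so that the example stays self-contained, the data are synthetic rather than the Patna record, and the helper names (`fit_linear`, `exhaustive_search`, `select`) are hypothetical.

```python
from itertools import combinations
import math
import random


def fit_linear(X, y):
    """Least-squares fit with intercept via the normal equations (stand-in model)."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    M = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[piv] = M[piv], M[col]
        for k in range(n):
            if k != col:
                f = M[k][col] / M[col][col]
                M[k] = [a - f * b for a, b in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def rmse(coef, X, y):
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))


def select(X, idx):
    """Keep only the columns listed in idx."""
    return [[row[i] for i in idx] for row in X]


def exhaustive_search(names, X_trn, y_trn, X_chk, y_chk, k):
    """Return (trn RMSE, chk RMSE, names) for every k-input subset, best first."""
    results = []
    for idx in combinations(range(len(names)), k):
        coef = fit_linear(select(X_trn, idx), y_trn)
        results.append((rmse(coef, select(X_trn, idx), y_trn),
                        rmse(coef, select(X_chk, idx), y_chk),
                        tuple(names[i] for i in idx)))
    return sorted(results)


# Synthetic data in which the target depends mainly on dwet and, less so, on ea.
random.seed(0)
names = ["ea", "Tmin", "Tmax", "dwet", "cc"]
X = [[random.uniform(0, 1) for _ in names] for _ in range(200)]
y = [3.0 * r[3] + 1.0 * r[0] + random.gauss(0, 0.05) for r in X]
ranked = exhaustive_search(names, X[:150], y[:150], X[150:], y[150:], k=2)
for trn, chk, combo in ranked[:3]:
    print(combo, round(trn, 4), round(chk, 4))
```

With five candidate inputs the search covers 5 single inputs, 10 pairs and 10 triples, which is small enough to enumerate fully; comparing the trn and chk columns for each subset is what allows overfitting to be checked.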
Fig. 7. Checking and training errors for two selected input parameters (ea and dwet) for rainfall prediction.
Fig. 8. The relationship for the combinations of the two-input parameters predicted by ANFIS. (Panels (a)–(f): predicted rainfall (mm) plotted against pairs of input parameters, with axes labelled ea (hPa), Tmin (°C), Tmax (°C), dwet (day) and cc (%).)
3.2. Results of ANFIS analysis
A complete ANFIS search was carried out on the dataset to select the optimal combination of input parameters, that is, the combination with the most influence on the rainfall prediction (output). The ANFIS model was generated such that it includes a function for each combination, and each function was trained for a single epoch. In this manner, the input parameter most influential on the estimation of rainfall was determined. Fig. 6 shows the influence of the meteorological input parameters on the rainfall prediction. In this figure, the input parameter dwet has the lowest error while Tmax has the highest error. This indicates that dwet and Tmax have the most and the least relevance, respectively, to the output parameter (rainfall). From Figs. 5 and 6, one can observe that dwet, followed by cc, are the two most influential parameters for the rainfall prediction. From Fig. 6, it is evident that the checking and training errors are comparable. This is an indication that no overfitting exists. In other words, selecting more than one meteorological input parameter in the generation of the ANFIS model is worth exploring. For verification purposes, the best combination of two input parameters was explored. The results summarized in Tables 4 and 5 identify the optimal combinations of two and three input parameters for the estimation of rainfall. In addition, from
these tables, among all the parameters examined, the combination of ea and dwet is the best predictor in terms of accuracy and the most influential on the rainfall prediction. Table 5 shows a combination of three parameters (ea, Tmin, and dwet) as the optimal combination for the rainfall prediction. Previous studies have also reported that ea and Tmin successfully predict rainfall. For example, Yong et al. (2010) used ea and Tmin, along with other meteorological parameters, to forecast precipitation using an ANN. In another work, O'Gorman and Schneider (2009) stated that rainfall extremes can be better predicted by ea. In this study, we found that dwet is also a good predictor of monthly rainfall. The proposed model has a simple structure. We do not recommend using more than two inputs in the generation of the ANFIS model (Motamedi et al., 2015a,b). For this reason, the combination of two input parameters was used for further examination using ANFIS. In this respect, the chosen input parameters were extracted from the initial checking and training datasets. To speed up the ANFIS input-selection process, the function for each candidate combination was trained for a single epoch. Subsequently, once the input parameters were fixed, a total of 100 epochs were used for the ANFIS training process. Fig. 7 shows the error curves for these 100 epochs of checking and training for ea and dwet. In this figure, the checking and training errors are illustrated by solid and dashed curves, respectively. To compare the predictions of the ANFIS method with those of a linear regression model, the root mean square error (RMSE) was analyzed against the checking data. The results indicate that the ANFIS regression error was 22.33 while the linear regression error was 29.66. Thus, on the basis of these findings, the ANFIS model could provide a more accurate prediction than the linear regression model. Fig. 8 shows the non-linear and monotonic decision surfaces of the ANFIS input–output relationship for the prediction of rainfall. From Fig. 8(b), it can be observed that the combination of the two input parameters ea and dwet shows the highest variation of all the combinations. Therefore, the combination of these two input parameters is the most influential on the prediction of rainfall.

Table 6
User-defined parameters for the ELM, ANN and GP models.
Model    Parameter              Value
ELM      Number of layers       3
ELM      Neurons                Input: 5; Hidden: 3, 6, 10; Output: 1
ELM      Learning rule          ELM for SLFNs
ANN      Number of layers       3
ANN      Neurons                Input: 5; Hidden: 3, 6, 10; Output: 1
ANN      Number of iteration    1000
ANN      Activation function    Sigmoid function
ANN      Learning rule          Back propagation
GP       Neurons                Output: 1
GP       Population size        512
GP       Head size              5–9
GP       Chromosomes            20–30
GP       Number of genes        2–3
GP       Mutation rate          91%
GP       Crossover rate         31%
GP       Inversion rate         109%

Table 7
Comparative performance statistics of the ANFIS, ELM, SVM, GP, and ANN predictive models.
Soft-computing model    RMSE        R²        r
ANFIS                   21.16220    0.9764    0.993424
ELM                     25.28897    0.9572    0.978349
SVM                     26.55853    0.9525    0.975980
GP                      26.56385    0.9526    0.976035
ANN                     26.58873    0.9531    0.976248
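The statistical indicators reported in Table 7 (RMSE, R² and r) are not defined by formula in the text. The sketch below assumes their common definitions: root mean square error, the coefficient of determination computed as 1 − SS_res/SS_tot, and the Pearson correlation coefficient. The function names and the sample observed/predicted values are hypothetical and are not taken from the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mo = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly rainfall values (mm), for illustration only.
observed = [0.0, 10.0, 50.0, 200.0, 120.0]
predicted = [2.0, 8.0, 55.0, 190.0, 125.0]
print(rmse(observed, predicted), r_squared(observed, predicted),
      pearson_r(observed, predicted))
```

A lower RMSE and values of R² and r closer to 1 indicate closer agreement between predictions and observations, which is the sense in which the models in Table 7 are ranked.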
3.3. Comparative study
To show the advantages of the suggested ANFIS approach on a more tangible and definite basis, its accuracy was compared to that of the ANN (Kalteh, 2013), ELM (Huang et al., 2004), GP (Koza, 1992), and SVM (Vapnik et al., 1996) methods. RMSE, R² and r were employed for comparison. The parameters of the ELM, ANN, and GP modeling frameworks employed in this study are presented in Table 6. The radial basis function (RBF), as the most common function, was applied as the kernel function for the SVM model. The three parameters associated with RBF kernels are C, γ, and ε. SVM model accuracy is dependent on the selection of these model parameters. The optimal values of the user-defined SVM parameters are C = 2.47, γ = 0.67, and ε = 0.62. The prediction accuracy results for the test datasets are summarized in Table 7. It can be noted from Table 7 that the ANFIS model yielded better results than the other soft-computing models. Based on the RMSE analysis, it can be established that the ANFIS model outperforms the benchmark models.
4. Conclusions
This study followed a systematic approach to select the most dominant weather parameters for predicting rainfall using the ANFIS methodology. The ANFIS network was used to explore how ea, dwet, cc, Tmin, and Tmax influence the rainfall forecast. This study reveals that the order of parameters from the most to the least effect on the accuracy of rainfall forecast is dwet, ea, cc, Tmin, and Tmax. Furthermore, the combination of ea and dwet provides the best two-parameter prediction, and the set of Tmin, dwet, and ea has the best prediction capacity among the three-parameter sets. Further, the performance of the ANFIS model was compared to the other soft-computing methods, and it can be concluded that the ANFIS best predicts the monthly rainfall. There can be drawbacks in including several input variables in the prediction model. Some of the drawbacks are (1) the difficulty of explaining the model and (2) distractions and inaccuracies caused by irrelevant parameters, which deteriorate the generalization capacity of the model and make data collection more time consuming. However, methods that reduce the number of input variables can be devised. By reducing the complexity of the model, such methods are very useful: they yield better predictions and provide insight into the relevance of the variables. Some of the main advantages of the ANFIS are as follows: (1) it is adaptable to optimization and adaptive methods, (2) it is computationally efficient, and (3) it can handle more complex parameters.
Acknowledgments
The authors wish to express their gratitude for the funding support provided by the University of Malaya, Ministry of Higher Education,
High Impact Research grant (no. UM.C/HIR/MOHE/ENG/34) and the ICT COST Action IC1408 Computationally intensive methods for the robust analysis of non-standard data (CRoNoS).

References

Akcayol, M.A., 2004. Application of adaptive neuro-fuzzy controller for SRM. Adv. Eng. Softw. 35, 129–137.
Al-Ghandoor, A., Samhouri, M., 2009. Electricity consumption in the industrial sector of Jordan: application of multivariate linear regression and adaptive neuro-fuzzy techniques. JJMIE 3.
Aldair, A.A., Wang, W.J., 2011. Design an intelligent controller for full vehicle nonlinear active suspension systems. Int. J. Smart Sens. Intell. Syst. 4, 224–243.
Andersson, F.O., Åberg, M., Jacobsson, S.P., 2000. Algorithmic approaches for studies of variable influence, contribution and selection in neural networks. Chemom. Intell. Lab. Syst. 51, 61–72.
Areed, F.G., Haikal, A.Y., Mohammed, R.H., 2010. Adaptive neuro-fuzzy control of an induction motor. Ain Shams Eng. J. 1, 71–78.
Arvor, D., Dubreuil, V., Ronchail, J., Simões, M., Funatsu, B.M., 2014. Spatial patterns of rainfall regimes related to levels of double cropping agriculture systems in Mato Grosso (Brazil). Int. J. Climatol. 34, 2622–2633.
Bawa, K.S., Seidler, R., 2015. Deforestation and sustainable mixed-use landscapes: a view from the eastern Himalaya. Ann. Mo. Bot. Gard. 100, 141–149.
Bellerby, T., Todd, M., Kniveton, D., Kidd, C., 2000. Rainfall estimation from a combination of TRMM precipitation radar and GOES multispectral satellite imagery through the use of an artificial neural network. J. Appl. Meteorol. 39, 2115–2128.
Brown, D.P., 2015. Garbage: how population, landmass, and development interact with culture in the production of waste. Resour. Conserv. Recycl. 98, 41–54.
Buzzi, A., Davolio, S., Malguzzi, P., Drofa, O., Mastrangelo, D., 2014. Heavy rainfall episodes over Liguria in autumn 2011: numerical forecasting experiments. Nat. Hazards Earth Syst. Sci. 14, 1325–1340.
Castellano, G., Fanelli, A.M., 2000. Variable selection using neural-network models. Neurocomputing 31, 1–13.
Chan, K.Y., Ling, S.-H., Dillon, T.S., Nguyen, H.T., 2011. Diagnosis of hypoglycemic episodes using a neural network based rule discovery system. Expert Syst. Appl. 38, 9799–9808.
Cibas, T., Soulié, F.F., Gallinari, P., Raudys, S., 1996. Variable selection with neural networks. Neurocomputing 12, 223–248.
Dai, Q., Rico-Ramirez, M.A., Han, D., Islam, T., Liguori, S., 2015. Probabilistic radar rainfall nowcasts using empirical and theoretical uncertainty models. Hydrol. Process. 29, 66–79.
Dastranj, M.R., Ebroahimi, E., Changizi, N., Sameni, E., 2011. Control DC motorspeed with adaptive neuro-fuzzy control (ANFIS). Aust. J. Basic Appl. Sci. 5, 1499–1504.
Datta, B., Mitra, S., Pal, S., 2014. Estimation of average monthly rainfall with neighbourhood values: comparative study between soft computing and statistical approach. Int. J. Artif. Intell. Soft Comput. 4, 302–317.
de Abreu-Harbich, L.V., Labaki, L.C., Matzarakis, A., 2015. Effect of tree planting design and tree species on human thermal comfort in the tropics. Landsc. Urban Plan. 138, 99–109.
Deo, R.C., Şahin, M., 2015. Application of the extreme learning machine algorithm for the prediction of monthly Effective Drought Index in eastern Australia. Atmos. Res. 153, 512–525.
Dhakal, A.S., Sullivan, K., 2014. Shallow groundwater response to rainfall on a forested headwater catchment in northern coastal California: implications of topography, rainfall, and throughfall intensities on peak pressure head generation. Hydrol. Process. 28, 446–463.
Dieterle, F., Busche, S., Gauglitz, G., 2003. Growing neural networks for a multivariate calibration and variable selection of time-resolved measurements. Anal. Chim. Acta 490, 71–83.
Geetha, A., Nasira, G., 2014. Rainfall prediction using logistic regression technique. Artif. Intell. Syst. Mach. Learn. 6, 246–250.
Gong, Z., Forrest, J.Y.-L., 2014. Special issue on meteorological disaster risk analysis and assessment: on basis of grey systems theory. Nat. Hazards 71, 995–1000.
Grigorie, T., Botez, R., 2009. Adaptive neuro-fuzzy inference system-based controllers for smart material actuator modelling. Proc. Inst. Mech. Eng. [G] 223, 655–668.
Guan, Y., Zheng, F., Zhang, P., Qin, C., 2015. Spatial and temporal changes of meteorological disasters in China during 1950–2013. Nat. Hazards 75, 2607–2623.
Gupta, M.D., 2014. Population, poverty, and climate change. World Bank Res. Obs. lkt009.
Heald, C.L., Spracklen, D.V., 2015. Land use change impacts on air quality and climate. Chem. Rev.
Hosoz, M., Ertunc, H.M., Bulgurcu, H., 2011. An adaptive neuro-fuzzy inference system model for predicting the performance of a refrigeration system with a cooling tower. Expert Syst. Appl. 38, 14148–14155.
Huang, G.-B., Zhu, Q.-Y., Siew, C.-K., 2004. Extreme learning machine: a new learning scheme of feedforward neural networks. Neural Networks, 2004. Proceedings. 2004 IEEE International Joint Conference on, IEEE, pp. 985–990.
Jang, J.-S.R., 1993. ANFIS: adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern. 23, 665–685.
Kalteh, A.M., 2013. Monthly river flow forecasting using artificial neural network and support vector regression models coupled with wavelet transform. Comput. Geosci. 54, 1–8.
Kariminia, S., Motamedi, S., Shamshirband, S., Petković, D., Roy, C., Hashim, R., 2015a. Adaptation of ANFIS model to assess thermal comfort of an urban square in moderate and dry climate. Stoch. Env. Res. Risk A. 1–15.
Kariminia, S., et al., 2015b. Modelling thermal comfort of visitors at urban squares in hot and arid climate using NN-ARX soft computing method. Theor. Appl. Climatol. 1–14.
Khajeh, A., Modarress, H., Rezaee, B., 2009. Application of adaptive neuro-fuzzy inference system for solubility prediction of carbon dioxide in polymers. Expert Syst. Appl. 36, 5728–5732.
Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.
Kuligowski, R.J., Barros, A.P., 1998. Localized precipitation forecasts from a numerical weather prediction model using artificial neural networks. Weather Forecast. 13, 1194–1204.
Kumar, P., Kishtawal, C., Pal, P., 2014. Impact of satellite rainfall assimilation on weather research and forecasting model predictions over the Indian region. J. Geophys. Res.-Atmos. 119, 2017–2031.
Kumar, S., Sharma, A., Shaw, R., Chauhan, S., 2015. Indigenous resilience and adaptation in high altitude arid zone communities. Mountain Hazards and Disaster Risk Reduction. Springer, pp. 177–197.
Kundu, P.K., Marks, D.A., Travis, J.E., 2014. Statistical intercomparison of idealized rainfall measurements using a stochastic fractional dynamics model. J. Geophys. Res.-Atmos. 119, 10,139–10,159.
Kurnaz, S., Cetin, O., Kaynak, O., 2010. Adaptive neuro-fuzzy inference system based autonomous flight control of unmanned air vehicles. Expert Syst. Appl. 37, 1229–1234.
Kutzbach, J.E., 1967. Empirical eigenvectors of sea-level pressure, surface temperature and precipitation complexes over North America. J. Appl. Meteorol. 6, 791–802.
Kwong, C., Wong, T., Chan, K.Y., 2009. A methodology of generating customer satisfaction models for new product development using a neuro-fuzzy approach. Expert Syst. Appl. 36, 11262–11270.
Lemaire, G., Franzluebbers, A., de Faccio Carvalho, P.C., Dedieu, B., 2014. Integrated crop–livestock systems: strategies to achieve synergy between agricultural production and environmental quality. Agric. Ecosyst. Environ. 190, 4–8.
Maheswaran, R., Khosa, R., 2014. A wavelet-based second order nonlinear model for forecasting monthly rainfall. Water Resour. Manag. 28, 5411–5431.
Makarynskyy, O., Makarynska, D., Rayson, M., Langtry, S., 2015. Combining deterministic modelling with artificial neural networks for suspended sediment estimates. Appl. Soft Comput. 35, 247–256.
Manoj, S.B.A., 2011. Identification and control of nonlinear systems using soft computing techniques. Int. J. Model. Optim. 1, 24.
Maqsood, I., Khan, M.R., Huang, G.H., Abdalla, R., 2005. Application of soft computing models to hourly weather analysis in southern Saskatchewan, Canada. Eng. Appl. Artif. Intell. 18, 115–125.
Mehran, A., AghaKouchak, A., 2014. Capabilities of satellite precipitation datasets to estimate heavy precipitation rates at different temporal accumulations. Hydrol. Process. 28, 2262–2270.
Motamedi, S., Roy, C., Shamshirband, S., Hashim, R., Petković, D., Song, K.I., 2015b. Prediction of ultrasonic pulse velocity for enhanced peat bricks using adaptive neuro-fuzzy methodology. Ultrasonics 61, 103–113.
Motamedi, S., Shamshirband, S., Hashim, R., Petković, D., Roy, C., 2015a. Estimating unconfined compressive strength of cockle shell–cement–sand mixtures using soft computing methodologies. Eng. Struct. 98, 49–58.
Müller, N., Kuttler, W., Barlag, A.-B., 2014. Counteracting urban climate change: adaptation measures and their effect on thermal comfort. Theor. Appl. Climatol. 115, 243–257.
Nastos, P., Paliatsos, A., Koukouletsos, K., Larissi, I., Moustris, K., 2014. Artificial neural networks modeling for forecasting the maximum daily total precipitation at Athens, Greece. Atmos. Res. 144, 141–150.
Nicholson, S.E., 2014. The predictability of rainfall over the greater horn of Africa. Part I: prediction of seasonal rainfall. J. Hydrometeorol. 15, 1011–1027.
Nielsen, J.E., Thorndahl, S., Rasmussen, M.R., 2014. A numerical method to generate high temporal resolution precipitation time series by combining weather radar measurements with a nowcast model. Atmos. Res. 138, 1–12.
O'Gorman, P.A., Schneider, T., 2009. The physical basis for increases in precipitation extremes in simulations of 21st-century climate change. Proc. Natl. Acad. Sci. 106, 14773–14777.
Ortiz-García, E., Salcedo-Sanz, S., Casanova-Mateo, C., 2014. Accurate precipitation prediction with support vector classifiers: a study including novel predictive variables and observational data. Atmos. Res. 139, 128–136.
Parry, M.L., 2007. Climate Change 2007: Impacts, Adaptation and Vulnerability: Contribution of Working Group II to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press.
Petković, D., Issa, M., Pavlović, N.D., Pavlović, N.T., Zentner, L., 2012a. Adaptive neuro-fuzzy estimation of conductive silicone rubber mechanical properties. Expert Syst. Appl. 39, 9477–9482.
Petković, D., Issa, M., Pavlović, N.D., Zentner, L., Ćojbašić, Ž., 2012b. Adaptive neuro fuzzy controller for adaptive compliant robotic gripper. Expert Syst. Appl. 39, 13295–13304.
Ramana, R.V., Krishna, B., Kumar, S., Pandey, N., 2013. Monthly rainfall prediction using wavelet neural network analysis. Water Resour. Manag. 27, 3697–3711.
Ramesh, N., Thayakaran, R., Onof, C., 2013. Multi-site doubly stochastic Poisson process models for fine-scale rainfall. Stoch. Env. Res. Risk A. 27, 1383–1396.
Ravi, S., Sudha, M., Balakrishnan, P., 2011. Design of intelligent self-tuning GA ANFIS temperature controller for plastic extrusion system. Model. Simul. Eng. 2011, 12.
Sayemuzzaman, M., Mekonnen, A., Jha, M.K., 2015. Diurnal temperature range trend over North Carolina and the associated mechanisms. Atmos. Res. 160, 99–108.
Schleiss, M., Chamoun, S., Berne, A., 2014. Stochastic simulation of intermittent rainfall using the concept of “dry drift”. Water Resour. Res. 50, 2329–2349.
Shamshirband, S., et al., 2015a. Soft-computing methodologies for precipitation estimation: a case study. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 8, 1353–1358.
Shamshirband, S., Tavakkoli, A., Roy, C.B., Motamedi, S., Song, K.I., Hashim, R., Islam, S.M., 2015b. Hybrid intelligent model for approximating unconfined compressive strength of cement-based bricks with odd-valued array of peat content (0–29%). Powder Technol. (pp.).
Sharma, A., Bose, M., 2014. Rainfall prediction using k-NN based similarity measure. Recent Advances in Information Technology. Springer, pp. 125–132.
Singh, D., Tsiang, M., Rajaratnam, B., Diffenbaugh, N.S., 2014. Observed changes in extreme wet and dry spells during the South Asian summer monsoon season. Nat. Clim. Chang. 4, 456–461.
Singh, R., Kainthola, A., Singh, T., 2012. Estimation of elastic constant of rocks using an ANFIS approach. Appl. Soft Comput. 12, 40–45.
Sivakumar, R., Balu, K., 2010. ANFIS based distillation column control. Int. J. Comput. Appl. Spec. Issue Evol. Comput. 2, 67–73.
Sowmya, K., John, C., Shrivasthava, N., 2015. Urban flood vulnerability zoning of Cochin City, southwest coast of India, using remote sensing and GIS. Nat. Hazards 75, 1271–1286.
Subash, N., Singh, S., Priya, N., 2011. Variability of rainfall and effective onset and length of the monsoon season over a sub-humid climatic environment. Atmos. Res. 99, 479–487.
Tian, L., Collins, C., 2005. Adaptive neuro-fuzzy control of a flexible manipulator. Mechatronics 15, 1305–1320.
Valverde, M., Araujo, E., Velho, H.C., 2014. Neural network and fuzzy logic statistical downscaling of atmospheric circulation-type specific weather pattern for rainfall forecasting. Appl. Soft Comput. 22, 681–694.
Van Eekelen, M., et al., 2015. A novel approach to estimate direct and indirect water withdrawals from satellite measurements: a case study from the Incomati basin. Agric. Ecosyst. Environ. 200, 126–142.
Vapnik, V., Golowich, S.E., Smola, A., 1996. Support vector method for function approximation, regression estimation, and signal processing. Advances in Neural Information Processing Systems 9. Citeseer.
Yong, W., et al., 2010. The study of rainfall forecast based on neural network and GPS precipitable water vapor. Environmental Science and Information Application Technology (ESIAT), 2010 International Conference on, IEEE, pp. 17–20.
Zhou, T., Nijssen, B., Huffman, G.J., Lettenmaier, D.P., 2014. Evaluation of real-time satellite precipitation data for global drought monitoring. J. Hydrometeorol. 15, 1651–1660.