Integrating an artificial intelligence approach with k-means clustering to model groundwater salinity: the case of Gaza coastal aquifer (Palestine) Jawad S. Alagha, Mohammed Seyam, Md Azlin Md Said & Yunes Mogheir

Hydrogeology Journal Official Journal of the International Association of Hydrogeologists ISSN 1431-2174 Hydrogeol J DOI 10.1007/s10040-017-1658-1


PAPER

Integrating an artificial intelligence approach with k-means clustering to model groundwater salinity: the case of Gaza coastal aquifer (Palestine)

Jawad S. Alagha¹ & Mohammed Seyam² & Md Azlin Md Said³ & Yunes Mogheir⁴

Received: 18 August 2016 / Accepted: 5 August 2017
© Springer-Verlag GmbH Germany 2017

Abstract Artificial intelligence (AI) techniques have increasingly become efficient alternative modeling tools in the water resources field, particularly when the modeled process is influenced by complex and interrelated variables. In this study, two AI techniques, artificial neural networks (ANNs) and support vector machine (SVM), were employed to achieve a deeper understanding of the salinization process (represented by chloride concentration) in complex coastal aquifers influenced by various salinity sources. Both models were trained using 11 years of groundwater quality data from 22 municipal wells in Khan Younis Governorate, Gaza, Palestine. Both techniques showed satisfactory prediction performance: the mean absolute percentage error (MAPE) and correlation coefficient (R) for the test data set were, respectively, about 4.5% and 99.8% for the ANNs model, and 4.6% and 99.7% for the SVM model. The performance of the developed models was further noticeably improved by preprocessing the wells data set using a k-means clustering method and then applying the AI techniques separately to each cluster. The models developed with clustered data were associated with higher performance, ease of use and simplicity. They can be employed as an analytical tool to investigate the influence of input variables on coastal aquifer salinity, which is of great importance for understanding salinization processes, leading to more effective water-resources-related planning and decision making.

Keywords Hydrology . Water resources management . Artificial neural networks (ANNs) . Support vector machine (SVM) . Palestine

* Mohammed Seyam
Jawad S. Alagha
Md Azlin Md Said
Yunes Mogheir

1 Islamic University of Gaza, Gaza, Palestine
2 Faculty of Engineering, Civil Engineering Department, University of Buraimi, P.O. Box 890, P.C. 512, Al Buraimi, Sultanate of Oman
3 School of Civil Engineering, Universiti Sains Malaysia, 14300 Nibong Tebal, Pulau Pinang, Malaysia
4 Civil Engineering Department, Engineering Faculty, Islamic University of Gaza, Gaza, Palestine

Introduction

About 50% of the world's population lives in coastal areas within 60 km of the coastline; moreover, the majority of the largest cities in the world are located in these areas. This results in high water demand, which places enormous pressure on water resources. In arid and semi-arid regions the problem is more severe because groundwater is usually the only source of water and, additionally, periods of high water demand coincide with periods of low groundwater recharge (Narayan et al. 2007; Post 2005). Coastal aquifers are complex, dynamic and heterogeneous hydrogeological systems. These aquifers are often hydraulically connected with the adjacent seas or oceans; therefore, over-exploitation of water from these aquifers can lead to saltwater intrusion. This is a widespread problem worldwide that gradually degrades groundwater quality, notably by increasing its salinity (Petalas et al. 2009; Singh 2012; Werner et al. 2013). Groundwater salinity is


usually represented by the concentration of chloride (Cl−) ions (Melloul and Collin 2000). Cl− is a negative ion of the element chlorine (NSE 2008). It is one of the most common constituents of water and soil (Huang and Pang 2011). In groundwater, it is naturally present, especially in deep bedrock aquifers, and mostly exists in the form of sodium chloride (NaCl; Aichele 2004; Sarala and Babu 2012). Cl− ions are unlikely to be influenced by changes in geochemical conditions such as oxidizing or reducing conditions and pH. The Cl− ion is also more persistent than other ions, because it is neither absorbed nor readily decayed by microorganisms (Aichele 2004; Mathur and Jayawardena 2005). As Cl− precipitates only at very high concentrations, it is considered a conservative tracer of water cycle processes (Huang and Pang 2011). The concentration of Cl− in groundwater is usually reduced by dilution, dispersion, or diffusion along the flow path (Mizumura 2003).

Seawater intrusion is often caused by a drop in groundwater level in coastal aquifers as a result of over-pumping, land-use changes, and climate variations (Werner et al. 2013). In addition to seawater intrusion, salinity of coastal aquifers may originate from other sources, including migration of saline water from adjacent aquifers/sub-aquifers or from deeper aquifer layers, dissolution of bedrock layers that contain chloride, and anthropogenic sources due to ground-surface industrial, municipal and agricultural activities (Metcalf and Eddy 2000; Vengosh and Rosenthal 1994). Elevated salinity in drinking water may cause renal stones and the development of hypertension, especially in persons suffering from heart or kidney problems (Aichele 2004; Virkutyte and Sillanpää 2006); therefore, most water quality standards, e.g., that of the World Health Organization (WHO), state that chloride concentration in drinking water has to be less than 250 mg/L (World Health Organization 2008).
Salinity impacts the agriculture sector as well, where high-salinity soil and irrigation water negatively affect plant yield and growing rates (Abyaneh et al. 2005; Kincaid and Findlay 2009). Modeling coastal aquifer salinity is a highly complex and non-linear process (Sreekanth and Datta 2010). It is usually characterized by a lack of sufficient knowledge of the interrelated influencing variables, and in some cases imperfect understanding of the fundamental form of the model that governs saline water transport in heterogeneous systems (Mayer et al. 2002). Management of any coastal aquifer usually aims at maximizing the groundwater pumping rates while satisfying specific constraints pertaining to coastal aquifer protection from saline migration (George and Mantoglou 2012). Generally, the key component of the coastal aquifer management process is the seawater intrusion management through effective mitigation measures (Werner et al. 2013). The standard approach for managing seawater intrusion is to combine seawater intrusion simulation models with optimization algorithms in order to

determine the optimal operation strategy (e.g. pumping scheme) for the formulated optimization problem (Baú 2012; George and Mantoglou 2012; Rao et al. 2004); furthermore, economic aspects are usually included in the groundwater optimization models. Reducing remediation costs has become one of the objective functions in optimization models (Cunha 2002). Therefore coastal aquifer management includes a compromise between technical, environmental, economic and social criteria. Management options for coastal aquifers have been extensively investigated during the last few decades (Singh 2012). Physical and numerical models have been widely utilized for simulation of seawater intrusion—for example, Narayan et al. (2007) utilized a variable-density flow and solute transport model, SUTRA, for modeling seawater intrusion under various pumping and recharge conditions. Others applied SEAWAT code to solve the density-dependent groundwater flow and solute transport governing equations (Lin et al. 2009). The computer codes MODFLOW and MT3DMS have been utilized for the same purpose (Shammas and Jacks 2007). As for optimization models, linear programming (LP) and dynamic programming (DP) are very popular techniques for solving seawater intrusion optimization problems (Singh 2012). Surrogate modeling tools have also been utilized for coastal aquifer management. For instance, genetic programming (GP) and modular neural network (MNN) have been utilized and linked to a multi-objective genetic algorithm (MOGA) to derive the optimal pumping strategies in coastal aquifers (Sreekanth and Datta 2010). 
It is evident that process-based models have been the default groundwater modeling tools worldwide (Javadi and Al-Najjar 2007); however, the accuracy of these models depends heavily on the understanding of the underlying physical processes and on the availability of detailed and accurate data about the hydrological system. Such data are often unavailable, especially in developing countries, owing to technical and financial constraints, which results in unsatisfactory model performance (Coppola et al. 2005; Krishna et al. 2008). These limitations led to the adoption of a totally different approach, artificial intelligence (AI), for hydrological modeling purposes. AI techniques have shown satisfactory performance, notably when the hydrological process is difficult to describe accurately and/or when the available data are insufficient for applying numerical and physical models, which is usually the case for groundwater quality problems such as salinity (Trichakis et al. 2011). AI techniques have become very popular and efficient tools for modeling complicated processes at relatively low cost and effort. The strength of AI techniques originates from their ability to simulate the human brain's behavior for solving complex problems (Chen et al. 2008; Dixon 2005; Iliadis and Maris 2007; May and Sivakumar 2009; Rajanayaka et al. 2002; Seyam et al. 2016).


Artificial neural networks (ANNs), one of the AI techniques, are structures composed of an input layer, an output layer, and at least one hidden layer, each of which has one or more simple interconnected adaptive elements called 'nodes' that are highly capable of performing a huge number of parallel computations (Basheer and Hajmeer 2000; Yesilnacar et al. 2008). Each node in any layer is connected to all nodes in the next layer. Each of these links is given a weight that represents its connection strength (Singh et al. 2009). Compared to ANNs, a support vector machine (SVM) is a newer learning system, developed on the basis of statistical learning theory, that aims at minimizing the generalized model error rather than just the training error, which consequently increases its generalization ability (Asefa et al. 2006; Behzad et al. 2010; Seyam and Mogheir 2011).

In recognition of their high capabilities, the use of AI techniques in hydrological applications has increased widely. For example, ANNs have been successfully used in modeling river-water quality variables (Singh et al. 2009), spatiotemporal groundwater level simulation (Nourani et al. 2008), assessing seawater quality parameters (Palani et al. 2008), modeling and spatially interpolating arsenic concentration (Chowdhury et al. 2010), and stream flow prediction (Seyam and Othman 2014). Taormina et al. (2012) employed ANNs to simulate the hourly groundwater levels in a coastal aquifer system of the Venice lagoon, Italy, and Khaki et al. (2014) employed ANNs and a neuro-fuzzy system for the assessment of groundwater quality. Nadiri et al. (2014) employed the Bayesian AI Model for hydraulic conductivity estimation, while Alagha et al. (2014) employed AI for nitrate concentration modeling in groundwater in Gaza coastal aquifer (Palestine). Khaki et al. (2015) used AI to simulate groundwater level in Langat River basin, Malaysia. Gholami et al. (2015) integrated dendrochronology and ANN for simulating groundwater level fluctuations in alluvial aquifers.

Likewise, the hydrological applications of SVMs and the comparison of their performance with other tools have recently gained greater attention. For instance, SVM has been employed for stream flow predictions (Asefa et al. 2006), flood forecast modeling (Yu et al. 2006), and river flow discharge modeling (Wang et al. 2009). Yoon et al. (2010) compared the ANNs and SVM performance in predicting groundwater level fluctuation, and found that SVM slightly outperformed ANNs. Behzad et al. (2010) reached the same conclusion when applying both techniques for predicting groundwater level at various prediction horizons. On the other hand, other studies reported that ANNs slightly outperformed SVM when applied to identification of wells contaminated by nitrate. Based on the literature, fewer SVM applications than ANN applications have tackled groundwater contamination processes; furthermore, comparison of ANNs and SVM for different hydrological systems is still an attractive research area.
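The layered structure described above (every node connected to all nodes of the next layer, each link carrying a weight) can be illustrated with a minimal sketch of one forward pass through a network with a single hidden layer. This is only an illustration: the function name `mlp_forward`, the tanh activation, the linear output node, and all weight values are assumptions made for the example and are not taken from the study.

```python
import math

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a fully connected network with one hidden layer.

    hidden_weights[j][i] is the weight of the link from input i to hidden node j.
    The tanh activation and the linear output node are assumptions for this sketch.
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The single output node combines all hidden-node activations linearly.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Hypothetical two-input, two-hidden-node network (weights chosen by hand).
prediction = mlp_forward(
    inputs=[0.5, -0.5],
    hidden_weights=[[1.0, 0.0], [0.0, 1.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[1.0, 1.0],
    output_bias=0.3,
)
```

In a trained model the weights would be fitted to data rather than set by hand; the sketch only shows how the weighted links combine the inputs.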

To the best of the authors' knowledge, no study to date has compared the performance of ANNs and SVM techniques for modeling groundwater chloride concentration; moreover, there is an increasing research trend in relation to techniques for improving AI prediction ability. This paper aims to achieve a better understanding of groundwater salinization processes in coastal aquifers based on a small number of monitoring data sets using two AI techniques, namely ANNs and SVM. Moreover, a clustering technique was integrated in the modeling process to improve the AI models' performance and prediction ability, which in turn deepens the understanding of groundwater salinization processes. Such understanding leads to a more rational groundwater-resources management strategy. The study investigates all potential sources of groundwater salinity, including seawater intrusion and lateral flow from other aquifers, in addition to anthropogenic sources. The originality of the paper comes from integrating the k-means clustering technique with an artificial intelligence approach for modeling coastal aquifer salinity, which is a new topic to the best of the authors' knowledge. Data from Gaza coastal aquifer in Palestine, which is a very complex hydrogeological system, are used for the models' application and validation. Developing areas such as the Gaza Strip often suffer from a lack of financial and technical capabilities; hence, the major challenge in these regions is to understand groundwater trends and to model the most sensitive groundwater quality parameters with cost-effective techniques that require only a few monitoring data. This challenge is addressed here: the developed model uses only six readily available monitoring variables as inputs.

Methods

Study area

The Gaza Strip is a part of the Palestinian occupied territories located on the eastern coast of the Mediterranean Sea between longitudes 34°2′ and 34°25′ east, and latitudes 31°16′ and 31°45′ north, as shown in Fig. 1a (Aish 2011; United Nations Environment Programme 2003). It is one of the most densely populated areas in the world, where 1.4 million inhabitants live in an area of 365 km² with an average density of about 4,000 inhabitants per km² (Palestinian Central Bureau of Statistics 2006). The Gaza Strip is located in an arid to semiarid region with an annual average rainfall of about 325 mm. The rainy season lasts from October to March; the remaining months are dry (United Nations Environment Programme 2003). The mean temperature ranges from 25 °C in summer to 13 °C in winter (Qahman and Larabi 2006). The land surface of the Gaza Strip slopes gently from about 90 m above mean sea level in the east to mean sea level in the west (United Nations Environment Programme 2003).


Fig. 1 a–b Khan Younis Governorate location map and Gaza coastal aquifer (GCA) layout; these figures show current political borders based on the United Nations (2004) map. c Hydrogeological cross section A–A of GCA


Agriculture is the main economic activity in the Gaza Strip, where agricultural areas constitute more than 60% of the overall area (Almasri and Ghabayen 2008). The Gaza Strip is administratively divided into five governorates, among which Khan Younis Governorate, the study area, has the largest area of about 112 km² with a total population of 280,000 inhabitants (Palestinian Central Bureau of Statistics 2006; United Nations Environment Programme 2003). Gaza coastal aquifer (GCA) is the only natural source of water in the Gaza Strip. Water is pumped from the aquifer by more than 3,000 municipal and agricultural wells, of which about 1,100 are in Khan Younis Governorate (Qahman and Larabi 2006). Gaza coastal aquifer, which is a highly heterogeneous hydrogeological system, is a part of the coastal aquifer that extends from the Gaza Strip in the south about 120 km northward along the Mediterranean coastline. The GCA's width varies from 3 to 10 km in the north to about 20 km in the south (Fig. 1b; Yakirevich et al. 1998). Gaza coastal aquifer thickness varies from about 120 m in the west (at the shoreline) to a few meters in the east (Baalousha 2006); meanwhile, the depth to water level ranges from about 60 m below ground surface in the east to a few meters near the coastline (United Nations Environment Programme 2003). Gaza coastal aquifer is composed of layers of dune sand, sandstone, calcareous sandstone, and silt. It also contains several silty-clayey impermeable layers that partially intercalate and subdivide it into sub-aquifers (Baalousha 2006; Melloul and Collin 2000). Groundwater flow in GCA as a whole is generally from the southeast to the northwest; however, flow direction may change due to high abstraction rates from some wells (Al-Agha and El-Nakhal 2004; Almasri and Ghabayen 2008). There is a connection between GCA and the Eocene aquifer, which is located in the east. This connection leads to an increase in GCA groundwater salinity in the eastern part. A typical cross section of GCA at Khan Younis area is depicted in Fig. 1c (Alagha et al. 2012; Yakirevich et al. 1998).

Data collection and model development

Based on general knowledge of GCA salinity sources, and to develop the input–output response matrix between the potential influencing variables and the groundwater salinity, datasets for the case-study wells were collected from the databases of the related governmental institutions working in the water and environmental sector. The available data covered the period between 1999 and 2010. Table 1 presents an example of the collected data for well P/154 in November 2010. During model development, several available inputs (shown in Table 1), such as well age and total rainfall, were tried, but the best final model included only six input variables: two are constants (well screen depth, and well distance to Khan Younis Center, KYC);

Table 1 Example of the input–output response matrix of well P/154 in November 2010. KYC Khan Younis Center

Parameter (units)                       Value
Clf (mg/L)                              188
Clo (mg/L)                              186
LCLURC                                  0.87
SRC                                     0.7
Overall recharge (m³)                   94,547
Municipal abstraction (m³)              196,617
Agricultural abstraction (m³)           69,216
Distance from KYC (km)                  2.45
Aquifer thickness (m)                   31
Bottom of the well screen (m)           69
Groundwater level elevation (m)         −2.00 a
Well age (years)                        8.5
Rainfall depth, RF (mm)                 146
Effective rainfall recharge, Rr (m³)    58,311
Estimated surface Cl load (kg)          22,395

a (−) below the mean sea level

another two are obtained from regular monitoring data (chloride concentration, and total monthly well abstraction); and the remaining two variables are related to groundwater recharge and can be easily obtained by a simple and quick calculation method (Thiessen polygon technique). Therefore, simplicity and ease of use are the most significant advantages of the AI models developed in this study.

The hydrological monitoring system has been significantly damaged by the three wars that have taken place in Gaza (2009, 2012 and 2014). In recent years, the available hydrological data have been very limited. Including these data in the modeling process would reduce the models' performance; hence, only historical records from 1999 to 2010 were used. The Gaza Strip is an extreme example of how an unstable political environment, a disastrous economic situation, decaying environmental conditions and unplanned human activities have combined to further deteriorate the groundwater quality (Shomar 2011).

A schematic example of the time intervals used for input-variable calculations is illustrated in Fig. 2. Time lags were also considered: recharge and abstraction were additionally calculated over longer time periods and used as extra model inputs. This was done because a variable may take a long time to affect a hydrological phenomenon (Yoon et al. 2010). The following are descriptions of the data collected in Table 1:

• Clf: Cl concentration in November 2010.
• Clo: the past record of Cl concentration, that is, in May 2010.

Fig. 2 Schematic of time intervals used for input-variable calculations

• Land-cover land-use recharge coefficient (LCLURC) in the well's area. This was obtained by analyzing three aerial photos of the study area (1999, 2003, 2007) via ERDAS IMAGINE 11 and ArcGIS 10 software. The area inside each Thiessen polygon was assigned to three LCLU categories (open, urban, and agricultural areas). Open area refers to non-utilized areas. Then LCLURC for each polygon was calculated as the weighted average of the LCLURC of each LCLU category obtained from a previous study carried out on the same study area (Hamdan et al. 2007).
• Soil recharge coefficient (SRC), which depends on the soil type and texture in the well's catchment area. It was obtained by calculating the weighted average of SRC for each LCLU category inside the Thiessen polygon; then the weighted average SRC for the Thiessen polygon as a whole was calculated. The basic Gaza Strip soil-type-classification maps and their associated recharge coefficient values were prepared by Metcalf and Eddy (2000).
• Cumulative (overall) recharge from the surface to the aquifer for the past 6 months (in this example from 15 May to 15 November), due to all recharge sources including rainfall, leakage of water distribution networks, unsewered areas, etc., considering SRC and LCLURC. Calculations of recharge due to leakage of water distribution networks and unsewered areas were based on determination of the built-up areas and population within each Thiessen polygon. The parameters required to calculate these recharge components, such as the average per capita water pumped, wastewater production rates, percentage of water network leakage, and other parameters, were obtained from previous studies and reports (Al-Mahallawi 2005; Palestinian Central Bureau of Statistics 2006; Palestinian Water Authority 2006; World Bank 2009).
• Cumulative municipal abstraction for each well for the past 6 months.
• Cumulative agricultural abstraction for the past 6 months. This was estimated as the average irrigation quantity per hectare multiplied by the total agricultural area, in hectares, within the Thiessen polygon, which was obtained from the LU analysis.

• Distance to Khan Younis Center (KYC), which accounts for the effects of both seawater intrusion and lateral flow from the adjacent eastern aquifer. As a general trend, groundwater salinity increases as the distance to the shoreline decreases, and likewise as the distance to the eastern border decreases; therefore, the distance between the well and the 'KYC line' was used as an input expressing the well's location. The KYC line was drawn at the middle of the maximum width of the study area (12.5 km). The distance from the well to the KYC line was then obtained by measuring the length of the perpendicular line between the well and the KYC line (Fig. 5).
• Aquifer or sub-aquifer thickness: this was measured using available aquifer profiles.
• Depth to the bottom of the well screen from the ground surface: this accounts for the potential for highly saline water lenses to contaminate the well, and was obtained from the well's data sheets.
• The stable groundwater level of the municipal well at the middle of the time interval (in this example, August 2010).
• Well age, determined by calculating the duration from the date the well started operating to the middle of the time interval.
• Cumulative rainfall depth (in mm) in the past 6 months.
• Effective rainfall recharge during the past 6 months. This was calculated for the three LCLU categories (open, agricultural and urban) by the following formula:

$$R_r = \sum_{\mathrm{LCLU}=1}^{3} RF \times A_{\mathrm{LCLU}} \times SRC_{\mathrm{LCLU}} \times LCLURC_{\mathrm{LCLU}} \tag{1}$$

where R_r is the total effective groundwater recharge due to rainfall for each Thiessen polygon (m³); RF is the cumulative rainfall depth over the period (listed in mm in Table 1); A_LCLU is the area of each LCLU category within the Thiessen polygon (m²); SRC_LCLU is the weighted average soil recharge coefficient for each LCLU category; and LCLURC_LCLU is the land-cover land-use recharge coefficient.
• Estimated surface Cl load. This was calculated by assessing the chloride loads (in kg) produced by the potential anthropogenic sources, such as leakage from unsewered areas, water distribution networks, etc.
• Other data were also collected, such as eight water quality parameters which were used for well clustering: electrical conductivity (EC), total dissolved solids (TDS), nitrate (NO3), sulfate (SO4), hardness (Hard), calcium (Ca), magnesium (Mg), and sodium (Na).
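As a worked illustration of the effective rainfall recharge formula (Eq. 1), the following minimal sketch sums the contribution of each LCLU category inside one Thiessen polygon. The function name, the category areas and the coefficient values are hypothetical; the conversion of rainfall depth from mm to m (so that the result is in m³) is an assumption of this sketch rather than something stated explicitly in the text.

```python
def effective_rainfall_recharge(rainfall_mm, categories):
    """Eq. 1 for one Thiessen polygon.

    categories: iterable of (area_m2, src, lclurc) tuples, one per LCLU category.
    Assumption: rainfall depth is converted from mm to m to obtain a volume in m3.
    """
    rainfall_m = rainfall_mm / 1000.0
    return sum(rainfall_m * area * src * lclurc for area, src, lclurc in categories)

# Hypothetical polygon; areas and coefficients are illustrative, not study data.
polygon_categories = [
    (200_000.0, 0.8, 0.9),  # open area
    (300_000.0, 0.7, 0.8),  # agricultural area
    (100_000.0, 0.5, 0.3),  # urban area
]
recharge_m3 = effective_rainfall_recharge(146.0, polygon_categories)
```

With these illustrative numbers the three categories contribute roughly 21,000, 24,500 and 2,200 m³, so the low-coefficient urban area adds little recharge despite its size.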

All data matrix values were normalized to have zero mean and unit variance, which is standard practice when using multivariate statistical and artificial intelligence techniques. Data normalization is usually conducted to avoid misclassification attributed to large differences in variable ranges, and it eliminates the effect of different units (Samani et al. 2007; Sundaray 2010; Trichakis et al. 2011). Both ANNs and SVM models were first applied to the 22 wells as one group; hereinafter this group is referred to as unclustered. Afterward, the k-means clustering technique was applied to cluster the 22 wells according to their similarity with respect to various chemical parameters. Cluster analysis (CA) is an exploratory multivariate statistical tool that aims at a better understanding of the data by arranging the observations into groups (or clusters) based on their similarities, such that elements included in the same group have a high degree of association (Desai et al. 2010; Pejman et al. 2009; Singh et al. 2008). Next, the AI models were applied separately to each cluster, and these separate models were assembled to form an aggregated clustered model. The k-means technique is one of the simplest and most widely used clustering techniques for classifying samples into distinct clusters, where the number of clusters is predetermined at the start of the analysis. This technique produces k different clusters with the greatest possible distinction by minimizing variability within clusters and maximizing variability between clusters (Güler et al. 2002).
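The normalization and clustering steps described above can be sketched as follows. This is a minimal, standard-library illustration and not the authors' implementation: the helper names, the use of the population standard deviation, the random initialization, and the six hypothetical wells (with only two of the water quality parameters, EC and Cl) are all assumptions made for the example.

```python
import random

def zscore_columns(rows):
    """Scale each column to zero mean and unit variance (population std assumed)."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [(sum((v - m) ** 2 for v in col) / n) ** 0.5
            for col, m in zip(columns, means)]
    return [[(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
            for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd iteration: assign to nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

# Hypothetical wells described by [EC, Cl]; values are illustrative only.
wells = [[1000.0, 150.0], [1100.0, 180.0], [1050.0, 160.0],
         [4000.0, 900.0], [4200.0, 950.0], [3900.0, 880.0]]
scaled = zscore_columns(wells)
labels, centroids = kmeans(scaled, k=2)

# Group well indices by cluster so that a separate model could be fitted per cluster.
wells_by_cluster = {}
for index, label in enumerate(labels):
    wells_by_cluster.setdefault(label, []).append(index)
```

Normalizing first matters here because EC values are several times larger than Cl values and would otherwise dominate the distance calculation.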

Selection of the appropriate GCA salinity modeling technique

No single modeling technique can explore all the hydrological modeling and prediction processes, particularly those with complex processes such as GCA salinity. The wide range of modeling techniques depend on different aspects of modeling mechanisms, and thus no single technique can fully investigate all the features of complex hydrological processes (Sivakumar and Berndtsson 2010). Many modeling techniques could be adopted for modeling GCA salinity, and they can generally be categorized into two main approaches: process-based models and data-driven models (DDMs). Data-driven models include statistical methods and artificial intelligence (AI) techniques, including advanced modeling techniques such as ANNs, SVM, fuzzy-rule-based systems (FRBS), and genetic algorithms (GAs). Each of these techniques has a specific set of advantages and disadvantages, based on data availability and modeling conditions. The decision to choose any of the GCA salinity modeling techniques mainly depends on two elements: first, the level of understanding of the physical GCA salinity processes; and second, the availability of hydrological data that describe the GCA salinity and related variables. Figure 3 shows the appropriateness of modeling techniques based on these two elements. When both the understanding of the physical processes and the available data are sufficient, process-based models are the suitable modeling technique. The model in this case is developed from the understanding of the physical processes and may then be calibrated using the available data. When the data are sufficient but the understanding of the physical processes is insufficient, AI-based models are the suitable modeling technique. If both data availability and understanding of the physical process are insufficient, statistical methods can be applied only as an initial tool to improve the general understanding of the hydrological processes (Basheer and Hajmeer 2000; Sivakumar and Berndtsson 2010). After data collection in the study area, it was noticed that the description of the physical characteristics of the GCA salinity and the related parameters had not yet been sufficiently addressed. However, enough data had been successfully collected for the GCA salinity and the related variables; accordingly, AI-based models are the most suitable modeling technique for GCA salinity prediction and modeling based on the availability of sufficient modeling

Fig. 3 Appropriateness of GCA salinity modeling techniques. Adapted from Basheer and Hajmeer (2000)


data. In this study, modeling of GCA salinity based on a few monitoring data sets was performed using two AI techniques, namely ANNs and SVM.

All 28 municipal wells in Khan Younis Governorate were intended to be used for applying the AI models; however, only 22 were considered (Fig. 4) because these 22 wells had periodic water quality records for at least 3 years, whereas the remaining 6 wells were either newly constructed or had many missing records. The Thiessen polygons technique was used to delineate the catchment area around each well, which was used for calculations of recharge and other variables. This technique has been widely used in many hydrological applications (Bekesi and McConchie 2002; Bekesi et al. 2009; Goovaerts 2000; Hajhamad and Almasri 2009). All municipal and agricultural wells in Khan Younis Governorate (about 1,100 wells) were plotted together on the map, and Thiessen polygons were then created. The polygons belonging to the 22 considered municipal wells were used for further analyses. It is worth mentioning that routine water quality analyses are performed in Gaza Strip municipal wells twice a year, in spring (May) and in autumn (November), by the Ministry of Health.

Performance evaluation criteria

Four different performance evaluation criteria were used to evaluate the models: the correlation coefficient (R) and three error indicators, namely mean absolute error (MAE; Daren Harmel and Smith 2007), mean absolute percentage error (MAPE; Yu and Yang 2000), and root mean square error (RMSE; Yoon et al. 2010). The formulae to calculate the error indicators are:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - P_i\right| \tag{2}$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{O_i - P_i}{O_i}\right| \times 100\% \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2} \tag{4}$$

where $n$ = number of data pairs (observations); $O_i$ = the ith observed value; $P_i$ = the ith predicted value; $\bar{O}$ = mean of the observed values.
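As an illustration of Eqs. 2–4, the following minimal Python sketch computes the three error indicators and the correlation coefficient from paired observed and predicted values. The function names and the four sample concentrations are made up for demonstration only and are not data from this study; R is computed here as the Pearson correlation coefficient, which is an assumption, since its formula is not given above.

```python
import math

def mae(obs, pred):
    """Mean absolute error (Eq. 2), in the units of the data."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def mape(obs, pred):
    """Mean absolute percentage error (Eq. 3), in percent."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square error (Eq. 4), in the units of the data."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def pearson_r(obs, pred):
    """Pearson correlation coefficient (assumed definition of R)."""
    o_mean = sum(obs) / len(obs)
    p_mean = sum(pred) / len(pred)
    cov = sum((o - o_mean) * (p - p_mean) for o, p in zip(obs, pred))
    var_o = sum((o - o_mean) ** 2 for o in obs)
    var_p = sum((p - p_mean) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical chloride concentrations in mg/L (illustrative only).
observed = [250.0, 500.0, 1000.0, 2000.0]
predicted = [260.0, 480.0, 1050.0, 1900.0]

print("MAE  (mg/L):", mae(observed, predicted))
print("MAPE (%):   ", mape(observed, predicted))
print("RMSE (mg/L):", rmse(observed, predicted))
print("R:          ", pearson_r(observed, predicted))
```

Because MAPE divides by each observed value, it is undefined for zero observations and weights errors at low-salinity wells more heavily than MAE or RMSE do, which is one reason to report the indicators together.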

Results and discussion

Chloride concentrations analysis

Groundwater chloride concentrations in the case study wells were very high compared with the WHO drinking water standard; only four of the 22 case study wells had mean Cl concentrations below this standard, as depicted in Fig. 4. Salinity levels (as Cl concentration) in groundwater in many Gaza Strip areas exceed 2,000 mg/L. Furthermore, less than 5% of Gaza Strip municipal water wells meet the WHO chloride standard. Khan Younis Governorate has the most serious groundwater salinity situation: the highest chloride concentration in the Gaza Strip, 2,652 mg/L, was recorded in one of the Khan Younis wells, which is about 10 times the WHO standard (Shomar et al. 2010). The salinity of GCA has constantly increased over time due to over-exploitation, which has resulted in seawater migration towards the major abstraction centers in urban areas (Qahman and Larabi 2006). Additional sources of GCA salinity are: (1) saline water flux from the neighboring eastern Eocene aquifer, (2) salty water that exists in lenses in many locations of GCA at deeper levels, and (3) various ground-surface Cl sources such as recharge of irrigation water and percolation of landfill leachate (Yakirevich et al. 1998; Zoller et al. 1998).

Fig. 4 Mean Cl concentration in the case study wells between 1999 and 2010 compared with the WHO standard

Result of the clustering process using k-means

Applying the k-means clustering technique to the 22 case study wells grouped them into three clusters based on their water quality characteristics, as shown in Table 2 and in Fig. 5. Well cluster No. 1 is characterized by relatively low salinity compared with the overall mean. Built-up (urban) areas with high population are the dominant land use for all cluster No. 1 wells (except for well 189A). For well cluster No. 2, salinity is relatively high compared with the overall mean. These wells differ in location and are characterized by mixed land use. The concentrations of all chemical parameters, including Cl, in cluster No. 3 wells are relatively low. Land use in these wells' areas primarily comprises open areas associated with some agricultural activities; therefore, the areas of cluster No. 3 wells are characterized by high groundwater recharge potential, attributed to the existence of sandy soils and open areas.

Table 2 Clustering of case study wells by the k-means technique, with land-use categories in each well's area

Well cluster No.   Well ID    Open areas (%)   Agricultural areas (%)   Urban areas (%)
1                  L/127      33               14                       53
1                  L/159A     47               3                        50
1                  L/189A     43               52                       5
2                  L/43       16               20                       64
2                  L/87       6                11                       83
2                  M/12       21               36                       43
2                  L/190      15               70                       15
2                  L/41       12               23                       65
2                  L/I/286    43               49                       8
2                  SP/1       30               47                       23
2                  MN/1       37               33                       30
2                  L/86A      40               15                       45
2                  NJ1        41               34                       25
2                  L 187      31               65                       4
3                  L/181      76               21                       3
3                  P/146      70               30                       3
3                  P/154      73               23                       4
3                  L/176      80               16                       4
3                  L/182      79               20                       1
3                  L/184      53               45                       2
3                  K/179      70               23                       7
3                  K/19       42               33                       15

Fig. 5 Spatial location of the three clusters of municipal wells within the Khan Younis Governorate

ANNs results and discussion

Different network architectures and various training algorithms were evaluated and optimized. The architecture that gave the best results for both unclustered and aggregated clustered models was the multi-layer perceptron (MLP) feed-forward neural network with one hidden layer. The Levenberg-Marquardt (LM) technique was used as the training algorithm. Results of the best ANNs models are shown in Table 3. The test set was used to evaluate the models (Dixon 2005; Iliadis and Maris 2007). RMSE, MAE, MAPE, and R of the best model were 21.0 mg/L, 15.1 mg/L, 3.7%, and 99.8%, respectively. Comparison between the predicted and the observed chloride concentrations in the case study wells in 2009 (Fig. 6) indicated the high simulation performance of the model.

Table 3 Evaluation of both unclustered and clustered ANNs models

Model                          RMSE (mg/L)   MAE (mg/L)   MAPE (%)   R (%)
Overall model performance
  Unclustered model            23.5          17.8         4.18       99.8
  Aggregated clustered model   21.1          15.88        3.98       99.8
  Improvement (%)              11.5          12.38        5.88       -
Test model performance
  Unclustered model            25.1          19.0         4.5        99.8
  Aggregated clustered model   21.0          15.1         3.7        99.8
  Improvement (%)              19.6          25.7         20.5       -

Fig. 6 Observed vs predicted Cl concentrations in the case study wells in 2009

The error indicators of the aggregated clustered model (test set) were about 20% lower than those of the unclustered model, indicating the merit of well clustering for ANNs performance. The clustering-induced improvement could be related to the fact that well clustering arranges wells into groups such that wells falling in the same group have a high degree of similarity and common characteristics. Accordingly, when applying the ANNs model to each cluster separately, it is easier for the model to grasp the common variables that affect the output; moreover, the relative weight (influence) of input variables on the model's output is almost the same within a group.

The clustering process also improved the generalization of the ANNs model, which in turn increases the model's suitability for future prediction. A model's generalization ability is usually measured by evaluating its results for the test data set. For this study, Table 3 shows that the error indicators of the aggregated clustered model for the test data set were slightly lower than those for the overall data set. Meanwhile, the performance of the unclustered model for the overall data set was better than that for the test data set.

The effect of clustering could be more obvious by investigating the input variables of each separate model as shown in Table 4. The five input variables of the unclustered model, ordered according to their weights, were: Clo, overall recharge, municipal abstraction, distance to KYC, and screen-bottom depth. Meanwhile, for the cluster No. 1 model, the LU recharge coefficient replaced distance to KYC. Almost all cluster No. 1 wells have the same distance to KYC; hence, this variable was insignificant for this cluster. Moreover, built-up areas are the dominant land use for this cluster, so the recharge capacity basically depends on the LU recharge coefficient. It was also noticed that screen-bottom depth was ranked in second place, indicating its high significance, especially because deeper saline-water lenses commonly exist in the areas of cluster No. 1 wells.

Input variables of cluster No. 2 were the same as those of the unclustered model but with different relative weights. These wells had relatively high salinity compared with the overall mean and are spatially scattered over the study area. The ranking of the input variables expressed the three main sources of elevated salinity levels, which are seawater intrusion, lateral flow, and the effect of saline lenses. The first two sources were expressed by distance to KYC; meanwhile, screen-bottom depth was the indicator of the effect of saline lenses.

The input variables of the cluster No. 3 model comprised only Clo, municipal abstraction, and overall recharge, which are the three input variables common to all models. Salinity values of these wells were relatively low, and the wells' areas were characterized by high groundwater recharge. Other variables were insignificant because these wells had nearly the same distance to KYC and almost all had relatively small screen-bottom depths. The results of all models revealed the high influence of Clo, which was the most influential input variable in every model.

The results of the best ANNs model in this study are more accurate than the results obtained by Seyam and Mogheir (2011). They developed an ANNs-based salinity model for GCA, and the study area of the current study was included in their study area. Their results for MAPE and R were 14% and 98.6%, respectively. It is believed that the superiority of the model developed in this study is related to accounting for other input variables, considering land-cover land-use changes, well clustering, and dependence on the Thiessen polygon area for variable calculations.

To achieve deeper understanding of groundwater salinization processes, the effects of different input variables on well salinity were assessed by investigating response graphs as illustrated in Fig. 7. As the overall recharge (Fig. 7a) increases, salinity decreases; meanwhile, as municipal abstraction (Fig. 7a), Clo (Fig. 7b) and screen-bottom depth (Fig. 7d) increase, salinity increases too. The response graph for the distance to KYC (Fig. 7c) indicates that when the distance is small, salinity is high, which may be related to high municipal abstraction in KYC coupled with low recharge rates due to the high population density in this area, which is characterized by high urban activity. Meanwhile, as distance to KYC increases, the population density declines and open and agricultural areas increase; consequently, recharge increases, leading to a decrease in salinity. At distances of more than 2 km from KYC on both sides, salinity starts to increase again as the distance to KYC increases, due to seawater intrusion in the west and lateral flow in the east.
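The well-grouping step that underlies the clustered models can be sketched as follows. This is a minimal, standard-library k-means illustration, not the implementation used in this study: the nine two-parameter records are made up, the z-score standardization and the deterministic farthest-point initialization are assumptions added for the example, and the helper names (standardize, farthest_point_init, kmeans) are hypothetical.

```python
import statistics

def standardize(rows):
    """Z-score each column so that parameters with large units do not dominate."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic start: first point, then repeatedly the point farthest
    from the centroids chosen so far (an assumption, not the study's method)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points,
                  key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(nxt)
    return [list(c) for c in centroids]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, until labels stop changing."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up records: two hypothetical water-quality parameters per well.
features = [
    (300, 40), (320, 45), (310, 50),      # low on both parameters
    (900, 200), (920, 195), (880, 205),   # high second parameter
    (1800, 60), (1780, 65), (1810, 55),   # high first parameter
]

labels, centroids = kmeans(standardize(features), k=3)
print(labels)
```

In a workflow like the one described in this section, a separate model would then be trained on the records of each cluster, and the per-cluster predictions pooled to score the aggregated clustered model against the unclustered one.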

Table 4 Ranking of the input variables' weights of unclustered and aggregated clustered ANNs models

Model           Clo   Overall recharge   Municipal abstraction   Distance to KYC   Screen-bottom depth   LU recharge coefficient (RC_LCLU)
Unclustered     1     2                  3                       4                 5                     -
Cluster No. 1   1     4                  5                       -                 2                     3
Cluster No. 2   1     5                  4                       2                 3                     -
Cluster No. 3   1     3                  2                       -                 -                     -

Fig. 7 Response graphs of input variables of the unclustered ANNs model. a Overall recharge and abstraction; b Clo; c distance to KYC; d screen-bottom depth

SVM results and discussion

Different SVM models were evaluated and optimized until the best performance was achieved. The radial basis function (RBF) was used as the kernel function, and SVM hyperparameters were optimized by minimizing 10-fold cross-validation estimates of the prediction error. The performance results of the best unclustered and aggregated clustered SVM models are shown in Table 5. It is worth mentioning here that the input variables used for the SVM models were the same as those used for the ANNs models. The results indicated that the aggregated clustered model gave relatively better results compared with the unclustered model; however, the improvement resulting from clustering was smaller for SVM than for ANNs. Figure 8 depicts RMSE and MAPE for both ANNs and SVM models for the test data set.

Table 5 Evaluation of both unclustered and aggregated clustered SVM models

Model                          RMSE (mg/L)   MAE (mg/L)   MAPE (%)   R (%)
Overall model performance
  Unclustered model            23.4          17.7         4.3        99.8
  Aggregated clustered model   21.6          17.1         4.2        99.8
  Improvement (%)              8.2           3.0          2.0        -
Test model performance
  Unclustered model            24.2          19.3         4.6        99.7
  Aggregated clustered model   24.3          17.4         4.1        99.8
  Improvement (%)              0.4           10.7         10.8       -

Fig. 8 Evaluation of the simulation performance of both ANNs and SVM models for the test data set

Comparison between ANNs and SVM

Both ANNs and SVM showed a high ability to capture the complex relationship between input variables and groundwater salinity levels in a highly complex hydrogeological system, as depicted in Fig. 9. Comparing the results of the ANNs and SVM models indicated that both techniques performed almost the same for the unclustered models. However, ANNs slightly outperformed SVM for the aggregated clustered model.

Fig. 9 Observed vs predicted Cl concentrations resulting from the a aggregated clustered ANNs model, b aggregated clustered SVM model

General discussion

Developing an accurate, simple and cost-effective model is one of the main concerns of hydrologists, especially in developing regions where detailed and complete data sets about hydrological processes are usually unavailable. The most significant advantages of the AI models developed in this study are their simplicity and ease of use. These models require only six easily obtained input variables, among which two are constants (well screen depth and well distance to KYC), another two are obtained from regular monitoring data (Clo and monthly well abstraction), and the remaining two are related to groundwater recharge and can be easily obtained by a simple and quick calculation method (Thiessen polygon technique).

In the developed models, all potential salinity sources and influencing variables in the study area were investigated, including the effect of seawater intrusion, water flux from the adjacent saline aquifer, the effect of salty water in lenses at deeper aquifer layers, and ground-surface sources such as recharge of irrigation water. All aforementioned variables were investigated and included as model input variables; then only the most influential ones were retained in the final model based on the sensitivity analysis. Moreover, groundwater recharge and ground-surface Cl load were calculated based on the analysis of actual sequential aerial photos, considering the effect of LCLU change. Additionally, a geographic information system (GIS) and statistical analysis were integrated with the AI techniques to obtain highly accurate input-variable calculations and to achieve the best model performance, which are considered among the advantages of this research.

Limitations of the study

The limitations and obstacles faced during this research work can be summarized as follows: (1) irregularity in the routine monitoring of groundwater quality parameters, for which many records are missing, which in turn may negatively affect the prediction ability of the models; (2) numerous uncertainties pertaining to the nature of the available data, the effect of which was mitigated by further investigation and double checking of data obtained from multiple sources, as well as by conducting site visits to well locations; (3) low resolution of the available aerial photos, which may negatively affect the accuracy of map classification; the effect of this problem was mitigated by further investigation, field visits to the areas around the wells, and manual correction of the classification results of the aerial photos.

Conclusions

Despite their simplicity, both ANNs and SVM were able to predict groundwater salinity and to capture the complex relationship between input variables and groundwater salinity levels in a highly complex hydrogeological system. The study demonstrated the effectiveness of well clustering as a pre-modeling technique for improving the models' performance, especially for ANNs. This technique more accurately captured the input-output relationships due to the relative similarity in characteristics among wells grouped in the same cluster. The positive effect of data clustering was obvious even though the clustered models had sparser data sets than the original unclustered model, which was expected to negatively affect model performance. Accordingly, it is recommended to cluster wells, stations, and sampling points before applying AI techniques, particularly for heterogeneous systems; however, the number of clusters has to be kept to a minimum, such that data scarcity associated with clustering does not affect the model's performance.

Analysis of the effect of the models' input variables on groundwater salinity, obtained through sensitivity analysis, can help in deepening the understanding of physical salinity processes and their actual influencing sources. This will enable further utilization of the developed model as a decision support tool for assessing the consequences of different management scenarios related to aquifer salinity, which will be very helpful when drawing up policies and strategies for groundwater salinity management. Furthermore, a reliable simulation model is the main prerequisite for developing a groundwater optimization model; therefore, given its high accuracy and simplicity, the simulation approach adopted in this study can be applied to complex groundwater modeling problems, particularly in developing regions that usually suffer from financial and technical constraints, leading to rational groundwater management.

It is recommended to utilize other AI techniques and AI-based hybrid techniques and to assess their performance for modeling groundwater quality. It is also recommended to develop AI hybrid models that incorporate AI models with regional climate change models in order to predict the effects of climate change on future groundwater quality and quantity. The utilized modeling approach could be applied to any similar coastal aquifer, especially those around the Mediterranean Sea, given the similarity of climate and topographic features.

Compliance with ethical standards

Conflict of interest The authors declare that they have no conflict of interest.


References

Abyaneh HZ, Nazimi AH, Neyshabori MR, Majzoobi GH (2005) Chloride estimation in ground water from electrical conductivity measurement. Tarim Bilimleri Dergisi 11(1):110–114
Aichele S (2004) Arsenic, nitrate, and chloride in groundwater, Oakland County, Michigan. US Geol Surv Fact Sheet 20043120. Available at http://pubs.usgs.gov/sir/2004/5060/. Accessed August 2017
Aish AM (2011) Water quality evaluation of small-scale desalination plants in the Gaza Strip, Palestine. Desalin Water Treat 29(1–3):164–173
Al-Agha MR, El-Nakhal HA (2004) Hydrochemical facies of groundwater in the Gaza Strip, Palestine/Faciès hydrochimiques de l'eau souterraine dans la Bande de Gaza, Palestine. Hydrol Sci J 49(3)
Al-Mahallawi K (2005) Modeling interaction of land use, urbanization and hydrological factors for the analysis of groundwater quality in Mediterranean zone: example of the Gaza strip, Palestine. PhD Thesis, University of Lille for Science and Technology, Lille, France
Alagha J, Said M, Mogheir Y (2014) Modeling of nitrate concentration in groundwater using artificial intelligence approach: a case study of Gaza coastal aquifer. Environ Monit Assess 186(1):35–45
Alagha JS, Said MAM, Mogheir Y, Seyam M (2012) Modelling of chloride concentration in coastal aquifers using artificial neural networks: a case study: Khanyounis Governorate Gaza Strip-Palestine. Caspian J Appl Sci Res 2(AICCE'12 & GIZ'12):158–165
Almasri MN, Ghabayen SMS (2008) Analysis of nitrate contamination of Gaza coastal aquifer, Palestine. J Hydrol Eng 13:132
Asefa T, Kemblowski M, McKee M, Khalil A (2006) Multi-time scale stream flow predictions: the support vector machines approach. J Hydrol 318(1–4):7–16
Baalousha H (2006) Desalination status in the Gaza Strip and its environmental impact. Desalination 196(1–3):1–12
Basheer I, Hajmeer M (2000) Artificial neural networks: fundamentals, computing, design, and application. J Microbiol Methods 43(1):3–31
Baú DA (2012) Planning of groundwater supply systems subject to uncertainty using stochastic flow reduced models and multi-objective evolutionary optimization. Water Resour Manag 26(9):2513–2536
Behzad M, Asghari K, Coppola EA Jr (2010) Comparative study of SVMs and ANNs in aquifer water level prediction. J Comput Civ Eng 24:408
Bekesi G, McConchie J (2002) The use of aquifer-media characteristics to model vulnerability to contamination, Manawatu region, New Zealand. Hydrogeol J 10(2):322–331
Bekesi G, McGuire M, Moiler D (2009) Groundwater allocation using a groundwater level response management method: Gnangara groundwater system, Western Australia. Water Resour Manag 23(9):1665–1683
Chen SH, Jakeman AJ, Norton JP (2008) Artificial intelligence techniques: an introduction to their use for modelling environmental systems. Math Comput Simul 78(2):379–400
Chowdhury M, Alouani A, Hossain F (2010) Comparison of ordinary kriging and artificial neural network for spatial mapping of arsenic contamination of groundwater. Stoch Env Res Risk A 24(1):1–7
Coppola EA Jr, Rana AJ, Poulton MM, Szidarovszky F, Uhl VW (2005) A neural network model for predicting aquifer water level elevations. Ground Water 43(2):231–241
Cunha MDC (2002) Groundwater cleanup: the optimization perspective (a literature review). Eng Optim 34(6):689–702
Daren Harmel R, Smith PK (2007) Consideration of measurement uncertainty in the evaluation of goodness-of-fit in hydrologic and water quality modeling. J Hydrol 337(3–4):326–336
Desai AM, Rifai H, Helfer E, Moreno N, Stein R (2010) Statistical investigations into indicator bacteria concentrations in Houston metropolitan watersheds. Water Environ Res 82(4):302–318
Dixon B (2005) Applicability of neuro-fuzzy techniques in predicting ground-water vulnerability: a GIS-based sensitivity analysis. J Hydrol 309(1–4):17–38
George K, Mantoglou A (2012) Development of a multi-objective optimization algorithm using surrogate models for coastal aquifer management. J Hydrol
Gholami V, Chau KW, Fadaee F, Torkaman J, Ghaffari A (2015) Modeling of groundwater level fluctuations using dendrochronology in alluvial aquifers. J Hydrol 529(Part 3):1060–1069
Goovaerts P (2000) Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. J Hydrol 228(1):113–129
Güler C, Thyne GD, McCray JE, Turner KA (2002) Evaluation of graphical and multivariate statistical methods for classification of water chemistry data. Hydrogeol J 10(4):455–474
Hajhamad L, Almasri MN (2009) Assessment of nitrate contamination of groundwater using lumped-parameter models. Environ Model Softw 24(9):1073–1087
Hamdan SM, Troeger U, Nassar A (2007) Stormwater availability in the Gaza Strip, Palestine. Int J Environ Health 1(4):580–594
Huang T, Pang Z (2011) Estimating groundwater recharge following land-use change using chloride mass balance of soil profiles: a case study at Guyuan and Xifeng in the Loess Plateau of China. Hydrogeol J 19(1):177–186
Iliadis LS, Maris F (2007) An artificial neural network model for mountainous water-resources management: the case of Cyprus mountainous watersheds. Environ Model Softw 22(7):1066–1072
Javadi A, Al-Najjar M (2007) Finite element modeling of contaminant transport in soils including the effect of chemical reactions. J Hazard Mater 143(3):690–701
Khaki M, Yusoff I, Islami N (2014) Application of the artificial neural network and neuro-fuzzy system for assessment of groundwater quality. Clean Soil Air Water 43(4):551–560
Khaki M, Yusoff I, Islami N (2015) Simulation of groundwater level through artificial intelligence system. Environ Earth Sci. https://doi.org/10.1007/s12665-014-3997-8
Kincaid DW, Findlay SEG (2009) Sources of elevated chloride in local streams: groundwater and soils as potential reservoirs. Water Air Soil Pollut 203(1):335–342
Krishna B, Satyaji Rao Y, Vijaya T (2008) Modelling groundwater levels in an urban coastal aquifer using artificial neural networks. Hydrol Process 22(8):1180–1188
Lin J, Snodsmith JB, Zheng C, Wu J (2009) A modeling study of seawater intrusion in Alabama Gulf Coast, USA. Environ Geol 57(1):119–130
Mathur S, Jayawardena L (2005) Modelling migration of contaminants from waste disposal facility. Int J Environ Stud 62(1):15–34
May DB, Sivakumar M (2009) Prediction of urban stormwater quality using artificial neural networks. Environ Model Softw 24(2):296–302
Mayer AS, Kelley C, Miller CT (2002) Optimal design for problems involving flow and transport phenomena in saturated subsurface systems. Adv Water Resour 25(8):1233–1256
Melloul A, Collin M (2000) Sustainable groundwater management of the stressed coastal aquifer in the Gaza region. Hydrol Sci J 45(1):147–159
Metcalf and Eddy Inc. (2000) Coastal Aquifer Management Plan (CAMP). Final model report (task 7). USAID Study Task 3, vol 1. USAID, Washington, DC
Mizumura K (2003) Chloride ion in groundwater near disposal of solid wastes in landfills. J Hydrol Eng 8(4):204–213
Nadiri A, Chitsazan N, Tsai F, Moghaddam A (2014) Bayesian artificial intelligence model averaging for hydraulic conductivity estimation. J Hydrol Eng 19(3):520–532
Narayan KA, Schleeberger C, Bristow KL (2007) Modelling seawater intrusion in the Burdekin Delta irrigation area, North Queensland, Australia. Agric Water Manag 89(3):217–228
Nourani V, Mogaddam AA, Nadiri AO (2008) An ANN-based model for spatiotemporal groundwater level forecasting. Hydrol Process 22(26):5054–5066
NSE (2008) The drop on water: chloride. Guidelines for drinking-water quality, vol 1, 3rd edn. https://novascotia.ca/nse/water/docs/droponwaterFAQ_Chloride.pdf. Accessed August 2017
Palani S, Liong SY, Tkalich P (2008) An ANN application for water quality forecasting. Mar Pollut Bull 56(9):1586–1597
Palestinian Central Bureau of Statistics (2006) Small area population, revised estimates 2004–2006. Palestinian Central Bureau of Statistics, Ramalla, Palestine
Palestinian Water Authority P (2006) Environmental assessment of North Gaza Emergency Sewage Treatment Plant project (final report). Palestinian Water Authority, Gaza Strip
Pejman A, Bidhendi GRN, Karbassi A, Mehrdadi N, Bidhendi ME (2009) Evaluation of spatial and seasonal variations in surface water quality using multivariate statistical techniques. Int J Environ Sci Technol 6(3):467–476
Petalas C, Pisinaras V, Gemitzi A, Tsihrintzis VA, Ouzounis K (2009) Current conditions of saltwater intrusion in the coastal Rhodope aquifer system, northeastern Greece. Desalination 237(1):22–41
Post V (2005) Fresh and saline groundwater interaction in coastal aquifers: is our technology ready for the problems ahead? Hydrogeol J 13(1):120–123
Qahman K, Larabi A (2006) Evaluation and numerical modeling of seawater intrusion in the Gaza aquifer (Palestine). Hydrogeol J 14(5):713–728
Rajanayaka C, Samarasinghe S, Kulasiri D (2002) Solving the inverse problem in stochastic groundwater modelling with artificial neural networks. 1st Int. Congress on Environ. Modelling and Software, Lugano, Switzerland, June 2002
Rao S, Sreenivasulu V, Bhallamudi SM, Thandaveswara B, Sudheer K (2004) Planning groundwater development in coastal aquifers/Planification du développement de la ressource en eau souterraine des aquifères côtiers. Hydrol Sci J 49(1):155–170
Samani N, Gohari-Moghadam M, Safavi A (2007) A simple neural network model for the determination of aquifer parameters. J Hydrol 340(1–2):1–11
Sarala C, Babu R (2012) Assessment of groundwater quality parameters in and around Jawaharnagar, Hyderabad. Int J Sci Res Publ 2(10)
Seyam M, Mogheir Y (2011a) Application of artificial neural networks model as analytical tool for groundwater salinity. J Environ Prot 2(01):56
Seyam M, Mogheir Y (2011b) A new approach for groundwater quality management. Islam Univ J (Ser Nat Stud Eng) 19(1):157–177
Seyam M, Othman F (2014) The influence of accurate lag time estimation on the performance of stream flow data-driven based models. Water Resour Manag 28(9):2583–2597
Seyam M, Othman F, El-Shafie A (2016) RBFNN versus empirical models for lag time prediction in tropical humid rivers. Water Resour Manag 28(9):2583–2597
Shammas MI, Jacks G (2007) Seawater intrusion in the Salalah Plain Aquifer, Oman. Environ Geol 53(3):575–587
Shomar B (2011) Groundwater contaminations and health perspectives in developing world case study: Gaza Strip. Environ Geochem Health 33(2):189–202
Shomar B, Fkher S, Yahya A (2010) Assessment of groundwater quality in the Gaza strip, Palestine using GIS mapping. J Water Resour Protect 2(2):93–104
Singh A (2012) An overview of the optimization modelling applications. J Hydrol 466:167–182
Singh KP, Basant A, Malik A, Jain G (2009) Artificial neural network modeling of the river water quality: a case study. Ecol Model 220(6):888–895
Singh UK, Kumar M, Chauhan R, Jha PK, Ramanathan A, Subramanian V (2008) Assessment of the impact of landfill on groundwater quality: a case study of the Pirana site in western India. Environ Monit Assess 141(1):309–321
Sivakumar B, Berndtsson R (2010) Advances in data-based approaches for hydrologic modeling and forecasting. World Scientific, Singapore, pp 463–477
Sreekanth J, Datta B (2010) Multi-objective management of saltwater intrusion in coastal aquifers using genetic programming and modular neural network based surrogate models. J Hydrol 393(3):245–256
Sundaray SK (2010) Application of multivariate statistical techniques in hydrogeochemical studies: a case study: Brahmani–Koel River (India). Environ Monit Assess 164(1):297–310
Taormina R, Chau K-W, Sethi R (2012) Artificial neural network simulation of hourly groundwater levels in a coastal aquifer system of the Venice Lagoon. Eng Appl Artif Intell 25(8):1670–1676
Trichakis IC, Nikolos IK, Karatzas G (2011) Artificial neural network (ANN) based modeling for karstic groundwater level simulation. Water Resour Manag 25(4):1143–1152
United Nations Environment Programme (2003) Desk study on the environment in the occupied Palestinian territories. UNEP, Geneva
United Nations (2004) General maps, no. 3584, Rev. 2, January 2004. UN Geospatial Information Section. http://www.un.org/Depts/Cartographic/map/profile/israel.pdf. Accessed August 2017
Vengosh A, Rosenthal E (1994) Saline groundwater in Israel: its bearing on the water crisis in the country. J Hydrol 156(1):389–430
Virkutyte J, Sillanpää M (2006) Chemical evaluation of potable water in eastern Qinghai Province, China: human health aspects. Environ Int 32(1):80–86
Wang WC, Chau KW, Cheng CT, Qiu L (2009) A comparison of performance of several artificial intelligence methods for forecasting monthly discharge time series. J Hydrol 374(3):294–306
Werner AD, Bakker M, Post VEA, Vandenbohede A, Lu C, Ataie-Ashtiani B, Simmons CT, Barry DA (2013) Seawater intrusion processes, investigation and management: recent advances and future challenges. Advan Water Resour 51:3–26
World Bank (2009) Assessment of restrictions on Palestinian water sector development. World Bank, Washington, DC
Yakirevich A, Melloul A, Sorek S, Shaath S, Borisov V (1998) Simulation of seawater intrusion into the Khan Yunis area of the Gaza Strip coastal aquifer. Hydrogeol J 6(4):549–559
Yesilnacar MI, Sahinkaya E, Naz M, Ozkaya B (2008) Neural network prediction of nitrate in groundwater of Harran plain, Turkey. Environ Geol 56(1):19–25
Yoon H, Jun SC, Hyun Y, Bae GO, Lee KK (2010) A comparative study of artificial neural networks and support vector machines for predicting groundwater levels in a coastal aquifer. J Hydrol 396(1):128–138
Yu PS, Yang TC (2000) Fuzzy multi-objective function for rainfall-runoff model calibration. J Hydrol 238(1–2):1–14
Yu PS, Chen ST, Chang I (2006) Support vector regression for real-time flood stage forecasting. J Hydrol 328(3–4):704–716
Zoller U, Goldenberg LC, Melloul AJ (1998) The "short-cut" enhanced contamination of the Gaza Strip coastal aquifer. Water Res 32(6):1779–1788