
Water Resour Manage (2009) 23:2031–2049 DOI 10.1007/s11269-008-9368-z

Optimal Dynamic Monitoring Network Design and Identification of Unknown Groundwater Pollution Sources

Bithin Datta · Dibakar Chakrabarty · Anirban Dhar

Received: 3 October 2007 / Accepted: 14 October 2008 / Published online: 6 November 2008 © Springer Science + Business Media B.V. 2008

Abstract The identification of unknown pollution sources is a prerequisite for designing a remediation strategy. In most real-world situations, it is difficult to identify the pollution sources without a scientifically designed, efficient monitoring network. The locations of the contaminant concentration measurement sites determine, to a large extent, the efficiency of the unknown source identification process. Therefore, a coupled and iterative framework for sequential source identification and dynamic monitoring network design is developed. The coupled approach provides a framework for the necessary sequential exchange of information between the monitoring network and the source identification methodology. The preliminary identification of unknown sources, based on limited concentration data from existing arbitrarily located wells, provides the initial rough estimate of the source fluxes. These identified source fluxes are then utilized for designing an optimal monitoring network for the first stage. Source identification and monitoring network design are then repeated sequentially, with each providing feedback information to the other. In the optimal source identification model, the Jacobian matrix, which determines the search direction in the nonlinear optimization model, links the groundwater flow-transport simulator and the optimization method. For the optimal monitoring network design, the integer programming based optimal design model requires simulated sets of concentration data as input. In the proposed methodology, the concentration measurement data from the designed and

B. Datta · A. Dhar
School of Engineering, James Cook University, Townsville, Australia

D. Chakrabarty
Department of Civil Engineering, National Institute of Technology, Silchar, India

B. Datta (corresponding author)
Discipline of Civil and Environmental Engineering, School of Engineering, James Cook University, Townsville, QLD 4811, Australia
e-mail: [email protected]


B. Datta et al.

implemented monitoring network are used as feedback information for sequential identification of unknown pollution sources. The potential applicability of the developed methodology is demonstrated for an illustrative study area.

Keywords Groundwater pollution · Source identification · Monitoring network design · Optimization models

Notation The following symbols are used in this paper:

$b$  thickness of aquifer
$C$  dissolved mass fraction
$C^{*}$  solute concentration of fluid sources
$\mathbf{c}$  concentration vector
$\mathbf{c}_L$  lower bound of concentration vector
$\mathbf{c}_U$  upper bound of concentration vector
$c_i^k$  concentration at spatiotemporal location $(i, k)$
$D_m$  apparent molecular diffusivity of solute in solution in a porous medium, including tortuosity effects
$\mathbf{D}$  dispersion tensor
$f(\cdot)$  flow and transport simulation function
$f$  volumetric adsorbate source
$\mathbf{g}$  gravitational acceleration
$\mathbf{I}$  identity tensor
$i$  corresponds to spatial location
$j$  corresponds to source location
$K_{xx}$  longitudinal hydraulic conductivity
$K_{xx}/K_{yy}$  hydraulic conductivity ratio
$k$  corresponds to temporal location
$\mathbf{k}$  solid matrix permeability tensor
$k_r$  relative permeability to fluid flow
$l$  corresponds to design stage
$N_{dp}$  number of disposal periods
$N_{dl}$  number of disposal locations
$N$  number of total potential monitoring locations during a particular design stage $\Lambda$
$P_l$  maximum number of wells permitted for installation during stage $l$
$p$  fluid pressure
$Q_p$  fluid mass source
$\mathbf{q}$  source characterization decision vector
$\mathbf{q}_L$  lower bound of source characterization decision vector
$\mathbf{q}_U$  upper bound of source characterization decision vector
$q_i^k$  flux at spatiotemporal location $(i, k)$
$S_w$  water saturation
$T$  cumulative number of time steps at the end of a particular monitoring network design stage $\Lambda$
$t$  time

Unknown groundwater pollution sources

$\mathbf{v}$  average fluid velocity
$w_i^k$  weight corresponding to spatiotemporal location $(i, k)$
$Z_c$  set of spatiotemporal concentration observation locations
$\alpha_L$  longitudinal dispersivity
$\alpha_T$  transverse dispersivity
$\Gamma_w$  solute mass source in fluid due to production reactions
$\tau$  error objective function
$\Lambda$  design implementation stage
$\Omega_j^{sl}$  set of all potential monitoring well locations which are in the resultant direction of flow velocity for a particular source location $j$
$\varepsilon$  porosity
$\delta$  standard normal random variate for concentration
$\eta$  weight constant for concentration
$\kappa$  last time step index in the preceding monitoring network design stage $(\Lambda - 1)$
$\mu$  fluid viscosity
$\xi$  error factor
$\rho$  fluid density
$\omega$  standard normal variate
$\chi_i^{\Lambda}$  binary decision variable, a value of 1 indicating installation at potential monitoring location $i$ during stage $\Lambda$
act  actual value
est  estimated value
obs  observed value
pert  perturbed value
SD  standard deviation value
sim  simulated value
trim  trimmed value

1 Introduction

The efficiency and reliability of groundwater pollution source identification depend on the availability and adequacy of observation data. Pollution in an aquifer is quite often detected by a few sparsely and arbitrarily distributed water supply wells. The amount of concentration data collected from these wells is generally limited. Moreover, the locations of these wells may not be optimal for identifying the unknown pollution sources in terms of magnitude, location, and disposal period. Once pollution is detected in the groundwater system, the first step towards remediation involves detection of the spatiotemporally distributed unknown pollution sources. Also, the performance of any remediation strategy greatly depends on feedback information over the management horizon. Thus, an adequate amount of observed spatiotemporal concentration data is necessary for reliable identification of unknown groundwater pollution sources. A source identification methodology combined with dynamic monitoring network design is proposed. It uses a classical nonlinear optimization model externally linked to a flow and transport simulation model. It is demonstrated through illustrative problems that source identification can be improved by sequentially designing an optimal monitoring network and using the data from this network as feedback information for the identification.


In real-life situations, an appropriate strategy for identifying unknown pollution sources with a limited amount of concentration data, obtained from arbitrarily located wells, would be to first make a preliminary estimation of the pollution sources. These preliminary source estimates may then be utilized for designing an efficient quality monitoring network for a certain design period. Once such a monitoring network is implemented, data subsequently collected from these wells, along with the already available concentration data, can be utilized for better identification of the unknown pollution sources. This combined procedure of monitoring network design and source identification can be continued until all the unknown pollution source characteristics are estimated with a certain degree of accuracy or reliability.

Identification problems essentially belong to the category of inverse problems, which are often ill-posed (Yeh 1986), so a unique solution does not necessarily exist. Also, the solutions are sensitive to small changes in the input data (Liu and Ball 1999). Comprehensive reviews of source identification methodologies can be found in Atmadja and Bagtzoglou (2001b), Michalak and Kitanidis (2004), and Sun et al. (2006a). Numerous approaches to pollution source identification are available, such as least-squares regression and linear programming with a response matrix approach (Gorelick et al. 1983), statistical pattern recognition (Datta et al. 1989), a random walk based backward tracking model (Bagtzoglou et al. 1992), nonlinear maximum likelihood estimation (Wagner 1992), nonlinear optimization with an embedding technique (Mahar and Datta 1997, 2000, 2001), correlation coefficient optimization (Sidauruk et al. 1997), a backward probabilistic model (Neupauer and Wilson 1999), a geostatistical inversion approach (Snodgrass and Kitanidis 1997; Butera and Tanda 2003; Michalak and Kitanidis 2004), Tikhonov regularization (Skaggs and Kabala 1994; Liu and Ball 1999), quasi-reversibility (Skaggs and Kabala 1995; Bagtzoglou and Atmadja 2003), the marching-jury backward beam equation (Atmadja and Bagtzoglou 2001a; Bagtzoglou and Atmadja 2003), genetic algorithm based approaches (Aral et al. 2001; Mahinthakumar and Sayeed 2005; Singh and Datta 2006), artificial neural network approaches (Singh and Datta 2004, 2007; Singh et al. 2004), a constrained robust least-squares approach (Sun et al. 2006a, b), and a robust geostatistical approach (Sun 2007). However, only a few studies have incorporated monitoring network design within the source identification framework.

Monitoring network design methodologies have been proposed in studies such as Massmann and Freeze (1987), Meyer and Brill (1988), McKinney and Loucks (1992), Cieniawski et al. (1995), Hudak et al. (1995), Datta and Dhiman (1996), Mahar and Datta (1997), Reed et al. (2000), Reed and Minsker (2004), and Wu et al. (2005). In-depth reviews of monitoring network design studies can be found in Loaiciga et al. (1992) and ASCE Task Committee (2003). However, the optimal networks designed in these studies are not dynamic in nature. To date, only a few studies have stressed the need for time-variant network design. Loaiciga (1989) solved a mixed integer programming problem to minimize the variance of the estimation error in a spatiotemporal sense. Grabow et al. (2000) proposed a sequential groundwater monitoring network design procedure. The empirically based model required the solution of inverse problems and was conceptually Bayesian, in that the model operated within uncertainty bounds and the model parameters were updated with new information. Montas et al. (2000) reported a space-time design to specify monitoring well locations and a sampling schedule that minimizes plume characterization error, while satisfying constraints on the maximum number of wells and the allowable number of active wells. Mugunthan and


Shoemaker (2004) designed monitoring networks for sequential installation based on contaminant mass estimation. Nunes et al. (2004a, b) solved the problem of temporal redundancy reduction in monitoring networks based on historical time series data. Recently, Dhar and Datta (2007) developed a multiobjective dynamic monitoring network design methodology. They showed that it is economically more efficient to design a dynamic monitoring network that can be implemented in stages to characterize the transient pollutant plume in the aquifer. A variation of the dynamic monitoring network design methodology is also reported in Sreenivasulu and Datta (2008).

Optimal identification of unknown groundwater pollution sources needs pollution plume monitoring data, and the reliability of the optimal characterization of pollution sources depends on these data. An optimal monitoring network designed solely for improving the identification process is therefore an important aspect. The methodology proposed in this study addresses this aspect by combining an optimal source identification model with a specifically designed optimal monitoring network that is dynamic in nature, with the aim of improving the efficiency of source identification. The proposed methodology is an improvement over the combined source identification and monitoring network design model presented by Mahar and Datta (1997). It combines a linked optimization-simulation model, which uses an external independent simulator for source identification, with an integer programming based optimization model for dynamic optimal monitoring network design. The potential utility of the developed methodology is demonstrated for an illustrative study area.

2 Methodology

The proposed methodology involves an iterative procedure using an optimal source identification model and an optimal dynamic monitoring network design model. The methodology consists of three distinct steps. In the first step, an optimal source identification model is utilized for preliminary identification of unknown pollution sources, using measured concentration data collected from sparsely and arbitrarily located initially existing wells. In the second step, these preliminary identification results are utilized for designing an optimal pollutant monitoring network. The integer programming based monitoring network design model requires sets of time-varying simulated concentration data at potential monitoring well locations. The designed monitoring network is then implemented for pollutant concentration data collection. In the third step, the data collected from the implemented monitoring network, along with the existing data, are utilized for better identification of the unknown pollution sources. This completes one stage of the proposed methodology. The process is repeated in the next stage with the optimal set of observation wells now available for measuring concentrations. The developed methodology is capable of designing optimal monitoring networks which evolve with time. The objective is to improve identification of the unknown pollution sources. The developed model thus prescribes a dynamic monitoring network design within the source identification procedure. A schematic representation of the proposed methodology is shown in Fig. 1.
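The three steps can be read as a loop over design stages. The following Python sketch illustrates only that control flow. The callables `identify_sources`, `design_network`, and `collect_data`, and the tuple layout of an observation, are hypothetical placeholders supplied by the caller; they are not part of the developed model, and no particular simulator or optimizer is implied.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical layout of one observation: (well id, time step, concentration)
Observation = Tuple[int, int, float]


def run_sequential_design(
    initial_data: Sequence[Observation],
    wells_per_stage: Sequence[int],
    identify_sources: Callable[[List[Observation]], List[float]],
    design_network: Callable[[List[float], List[int], int], List[int]],
    collect_data: Callable[[List[int], int], List[Observation]],
) -> Tuple[List[int], List[List[float]]]:
    """Alternate source identification and network design over the stages."""
    data = list(initial_data)
    wells: List[int] = []
    # Step 1: preliminary identification from the initially existing wells.
    fluxes = identify_sources(data)
    history = [fluxes]
    for stage, max_new in enumerate(wells_per_stage, start=1):
        # Step 2: design (extend) the network from the latest flux estimates.
        wells = design_network(fluxes, wells, max_new)
        # Step 3: add the new measurements and identify the sources again.
        data = data + collect_data(wells, stage)
        fluxes = identify_sources(data)
        history.append(fluxes)
    return wells, history
```

In the illustrative application reported in Section 3, `wells_per_stage` would correspond to four wells in the first stage and eight additional wells in the second.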

Fig. 1 Schematic representation of the developed methodology

2.1 Simulation Model

The simulation model SUTRA (Voss 1984) is used in this study for simulating the flow and transport processes. In principle, other numerical simulation models can be incorporated. SUTRA employs a two-dimensional hybrid Galerkin finite element and integrated finite-difference method to approximate the governing partial differential equations. It is capable of simulating fluid density dependent saturated or unsaturated groundwater flow, and either single-species reactive solute transport or thermal energy transport. However, this study considers the simulation of steady or transient groundwater flow and transport of a conservative single-species solute for a fluid of constant density. Groundwater fluid mass balance is expressed as (Voss 1984):

$$\frac{\partial (\varepsilon S_w \rho)}{\partial t} - \nabla \cdot \left[ \left( \frac{\mathbf{k}\, k_r\, \rho}{\mu} \right) \cdot \left( \nabla p - \rho\, \mathbf{g} \right) \right] = Q_p \tag{1}$$

Solute transport of a single-species contaminant is simulated using (Voss 1984):

$$\frac{\partial (\varepsilon S_w \rho C)}{\partial t} = -f - \nabla \cdot \left( \varepsilon S_w \rho\, \mathbf{v}\, C \right) + \nabla \cdot \left[ \varepsilon S_w \rho \left( D_m \mathbf{I} + \mathbf{D} \right) \cdot \nabla C \right] + \varepsilon S_w \rho\, \Gamma_w + Q_p C^{*} \tag{2}$$

The flow and transport simulation models together simulate the concentrations in space and time for the study area.

2.2 Source Identification Model

The aim of the source identification model is to determine the unknown groundwater pollution source characteristics (location, disposal duration, and solute mass flux or volume disposal rates). The objective is to minimize the error, defined as the weighted sum of the squared deviations between observed and simulated concentrations over all spatiotemporal measurements. The optimal source identification model can be formulated as:

$$\text{Minimize} \quad \tau \tag{3}$$

Subject to:

$$\tau - \sum_{(i,k) \in Z_c} w_i^k \left( c_{i\,\mathrm{obs}}^{k} - c_i^k \right)^2 = 0 \tag{4}$$

$$\mathbf{c} = f(\mathbf{q}) \tag{5}$$

$$\mathbf{c}_L \le \mathbf{c} \le \mathbf{c}_U \tag{6}$$

$$\mathbf{q}_L \le \mathbf{q} \le \mathbf{q}_U \tag{7}$$

This formulation ensures that the derivatives of the objective function are known analytically (the objective is the single variable $\tau$), which reduces the computational burden. This is particularly advantageous where the constraints are nonlinear (Chakrabarty 2001). Constraint set (5) is the externally linked flow and transport simulation model, which transforms the source mass fluxes or volume disposal rates, at various potential source locations and disposal periods, into spatiotemporal concentrations in the aquifer as a function of $\mathbf{q}$. The decision variables of the optimization model are the unknown source fluxes, i.e. $\mathbf{q}$. The simulated concentrations $\mathbf{c}$ obtained from the linked simulation model are also treated as variables in the optimization model. These simulated concentrations at specified observation locations are included as variables for defining the objective function as per Eq. 4. The linked simulation model computes the concentration values based on candidate solutions generated by the optimization algorithm. The observed concentration values at observation locations and the hydrologic flow and transport parameter values are specified inputs to the linked simulation-optimization model. The initial and boundary conditions for the study area are also specified as inputs. In order to solve the identification problem, it is necessary to iterate between an optimization method and a flow and transport simulator. Therefore, the constraint set represented by Eq. 5 is an implicit set of constraints in the optimization model.

Constraint set (6) ensures that once the resulting concentrations are evaluated for an assumed set of source characteristics, only those sets of $\mathbf{q}$ are considered acceptable which result in simulated concentrations $c_i^k$ within predefined lower and upper bounds. For example, these lower and upper bounds may be calculated by subtracting and adding, respectively, some tolerances to the corresponding observed concentration values. The lower and upper bounds on the sources (Eq. 7) ensure that only practically acceptable source fluxes are considered. In the source identification models presented here, all the lower bounds on the decision variables $\mathbf{q}$ are specified as zero. The formulation requires specification of the weights $w_i^k$. In all the performance evaluations reported in this study, the following weights are chosen:

$$w_i^k = \frac{1}{\left( c_{i\,\mathrm{obs}}^{k} + \eta \right)^2} \tag{8}$$

According to Keidser and Rosbjerg (1991), it is preferable to add a constant ($\eta$) to the measured concentration to prevent large differences at low concentrations from dominating the solution. In this study, the value of $\eta$ is assumed to be 100.0 ppm. In the groundwater pollution source identification model, either spatiotemporally varying source mass fluxes or volume disposal rates, together with the solute concentrations, are considered as unknown decision variables. In this study, the nonlinear optimization algorithm NPSOL (Gill et al. 1986) is used to solve the source identification problem. NPSOL is designed to minimize a nonlinear objective function subject to linear and/or nonlinear constraints. Repeated calls to the flow and transport simulator SUTRA are essential for estimating the Jacobian and the gradients of the objective function whenever they are required. These required modifications are implemented as part of the solution algorithm for the developed linked simulation-optimization model. The proposed methodology is generic in nature. NPSOL is used here only for the solution of an illustrative problem; other nonlinear programming algorithms can also be utilized.
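As a minimal illustration of Eqs. 4 and 8, the Python sketch below evaluates the weights and the weighted squared error, and approximates the gradient with respect to the source fluxes by repeated calls to a simulator. The `simulate` argument stands in for the external flow and transport model; the forward-difference scheme and the step size are assumptions made for this sketch and are not stated in the text as the scheme used with NPSOL and SUTRA.

```python
from typing import Callable, List, Sequence


def weights(c_obs: Sequence[float], eta: float = 100.0) -> List[float]:
    """Eq. 8: w = 1 / (c_obs + eta)^2, with eta in the same units as c_obs."""
    return [1.0 / (c + eta) ** 2 for c in c_obs]


def weighted_error(c_obs: Sequence[float], c_sim: Sequence[float],
                   eta: float = 100.0) -> float:
    """Eq. 4: weighted sum of squared deviations (the value of tau)."""
    return sum(w * (o - s) ** 2
               for w, o, s in zip(weights(c_obs, eta), c_obs, c_sim))


def forward_difference_gradient(
    simulate: Callable[[Sequence[float]], List[float]],
    q: Sequence[float],
    c_obs: Sequence[float],
    eta: float = 100.0,
    step: float = 1e-4,  # assumed perturbation size, not from the paper
) -> List[float]:
    """Approximate d(tau)/dq_j with one extra simulator call per flux."""
    base = weighted_error(c_obs, simulate(q), eta)
    gradient = []
    for j in range(len(q)):
        bumped = list(q)
        bumped[j] += step
        gradient.append((weighted_error(c_obs, simulate(bumped), eta) - base) / step)
    return gradient
```

For example, with observed concentrations of 100 and 300 ppm, simulated values of 120 and 260 ppm, and $\eta = 100$ ppm, each term contributes 0.01, so `weighted_error([100.0, 300.0], [120.0, 260.0])` evaluates to approximately 0.02. The cost of one gradient grows linearly with the number of unknown fluxes, which is why the number of simulator calls matters in the linked approach.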


2.3 Optimal Monitoring Network Design Model

The dynamic monitoring network design model is based on integer programming. This model is an improved version of the models developed by Mahar and Datta (1997). Given a set of potential monitoring well locations and a maximum number of monitoring wells to be installed, the model selects those locations for which the sum of trimmed mean concentrations over all time steps of the design stage is maximized. Estimation of trimmed mean concentrations requires simulation of concentrations at each potential monitoring well location at each time step of a design stage. Each set of simulated concentrations corresponds to a perturbed set of source identification results obtained by solving the source identification model in the preceding stage.

The concept of trimmed mean concentrations is incorporated in the monitoring network design model with a view to minimizing the effect of extreme concentrations resulting from randomly generated fluxes. These unacceptable or outlier concentrations at a potential monitoring well location may arise from randomly generated source fluxes lying in the tail regions of the distribution used for random generation. Use of trimmed mean concentrations may not be required if the distribution used for random generation of fluxes can incorporate specified upper and lower bounds. A trimmed mean is basically a compromise between the sample mean and the sample median. It is less sensitive to outliers than the sample mean but more sensitive than the sample median. A trimming percentage between five and twenty-five is generally recommended by statisticians (Devore and Farnum 1999).

The stage-wise monitoring network design model for dynamic implementation of a monitoring network can be stated as an optimization model to be solved using integer programming. The formulation of the monitoring network design model for a particular stage of implementation, $\Lambda$, is presented below:

$$\text{Maximize} \quad \sum_{i=1}^{N} \sum_{k=\kappa+1}^{T} c_{i\,\mathrm{trim}}^{k}\, \chi_i^{\Lambda} \tag{9}$$

Subject to:

$$\chi_i^{\Lambda} \ge \chi_i^{\Lambda-1}, \quad \forall\, \Lambda \ge 1 \tag{10}$$

$$\sum_{i=1}^{N} \chi_i^{\Lambda} \le \sum_{l=1}^{\Lambda} P_l \tag{11}$$

$$\sum_{i \in \Omega_j^{sl}} \chi_i^{\Lambda} \ge \Lambda, \quad \forall\, j \tag{12}$$

$$\chi_i^{\Lambda} \in \{0, 1\} \tag{13}$$
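To make the structure of Eqs. 9 to 13 concrete, the Python sketch below solves a very small instance by exhaustive enumeration. This is an illustrative stand-in, not the integer programming solver used in the study, and it is practical only for a handful of candidate locations. The function name, the argument names, and the choice to pass the required number of wells per source direction as the parameter `min_per_source` are assumptions of this sketch.

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Set, Tuple


def design_stage(
    trimmed: Sequence[Sequence[float]],
    installed: Set[int],
    max_total_wells: int,
    downstream: Dict[str, Set[int]],
    min_per_source: int,
) -> Optional[Tuple[List[int], float]]:
    """Pick wells maximizing the summed trimmed mean concentrations (Eq. 9).

    trimmed[i] holds the trimmed mean concentrations at potential location i
    for the time steps of the current stage. Returns None if infeasible.
    """
    scores = [sum(series) for series in trimmed]
    free = [i for i in range(len(trimmed)) if i not in installed]
    # Eq. 11: cumulative limit on the number of wells up to this stage.
    max_new = max_total_wells - len(installed)
    best: Optional[Tuple[List[int], float]] = None
    for n_new in range(max_new + 1):
        for extra in combinations(free, n_new):
            # Eq. 10: wells installed in earlier stages stay in the network.
            chosen = installed | set(extra)
            # Eq. 12: enough wells along the flow direction of every source.
            if any(len(chosen & wells) < min_per_source
                   for wells in downstream.values()):
                continue
            value = sum(scores[i] for i in chosen)
            if best is None or value > best[1]:
                best = (sorted(chosen), value)
    return best
```

A call such as `design_stage(trimmed, set(), 2, {"S1": {0, 3}, "S2": {1, 2, 4}}, 1)` returns the selected location indices together with the objective value. For the 198 candidate locations of the illustrative study area, a proper integer programming solver would be needed instead of enumeration.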

In the above formulation, it is assumed that once a particular potential monitoring well location is identified as optimal by the monitoring network design model at a particular design stage, the same well location continues to be used as a monitoring well in the subsequent design stages. Constraint set (10) implements this condition. Constraint set (11) specifies the upper limit on the number of monitoring wells to be implemented. Constraint set (12) ensures that in each design stage, at least one monitoring well is installed along the resultant direction of velocity for each potential source location.

2.4 Estimation of Trimmed Mean Concentrations

The integer programming based monitoring network design model requires as input the trimmed mean concentration at each potential monitoring well location at each time step of a particular monitoring network design stage. All simulated concentrations at a particular potential monitoring well location at a given time, resulting from the set of perturbed source fluxes, are first ranked in order of magnitude. A specified number of the smallest and the largest concentration data are then deleted (trimmed) from each end of the ranked concentration data set. The number of concentration data to be deleted depends on the trimming percentage assumed. The mean of the remaining concentration data gives the trimmed mean for the given set of concentration data. In the performance evaluations reported here, fifteen percent trimming is assumed.

2.5 Incorporating Measurement Errors

Measurement errors are incorporated by perturbing the simulated concentrations, which is analogous to collecting and then testing multiple samples of contaminated groundwater at each spatiotemporal observation location. It is assumed that each perturbed datum can be sampled from a normal distribution. The mean of the normal distribution is the exact (simulated) datum, and the standard deviation is equal to some fraction ($\xi$) of the magnitude of the datum. Therefore, the observation data used for evaluation are obtained using the following relationship:

$$c_{i\,\mathrm{obs}}^{k} = c_{i\,\mathrm{sim}}^{k} + \xi\, c_{i\,\mathrm{sim}}^{k}\, \delta, \quad \forall\, i, k \tag{14}$$

It is also possible to incorporate other forms of error model (Skaggs and Kabala 1994).

2.6 Simulation of Pollution Plume Realizations Using Perturbed Source Fluxes

Statistical perturbation is used to account for uncertainties and inaccuracies in the estimated source fluxes. The source fluxes identified at a particular stage are perturbed by adding a random error term to each of them. The perturbed fluxes are assumed to follow a normal distribution whose mean is the respective identified source flux and whose standard deviation is equal to the standard deviation of all recently identified source fluxes. The perturbation of source fluxes can, in principle, be performed using any other suitable distribution without any loss of generality of the proposed methodology. However, if the distributions used have specific upper and lower bounds, trimmed mean values may be replaced by ordinary mean values. The perturbed source fluxes are obtained as:

$$q_{i\,\mathrm{pert}}^{k} = q_{i\,\mathrm{est}}^{k} + q_{i\,\mathrm{SD}}^{k}\, \omega, \quad \forall\, i, k \tag{15}$$
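The statistical steps of Sections 2.4 to 2.6 can be sketched in a few lines of Python using only the standard library. The sketch makes several assumptions that the text does not settle: the trimming percentage is applied to each end of the ranked data, the standard deviation in Eq. 15 is the population standard deviation of all identified fluxes, identified fluxes are non-negative (consistent with the zero lower bounds), and a negative perturbed flux is handled by redrawing the standard normal deviate.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence


def add_measurement_error(c_sim: Sequence[float], xi: float,
                          rng: random.Random) -> List[float]:
    """Eq. 14: c_obs = c_sim + xi * c_sim * delta, delta ~ N(0, 1)."""
    return [c + xi * c * rng.gauss(0.0, 1.0) for c in c_sim]


def perturb_fluxes(q_est: Sequence[float], rng: random.Random) -> List[float]:
    """Eq. 15 with redraws so that every perturbed flux is >= 0."""
    # Assumption: one common standard deviation over all identified fluxes.
    sd = pstdev(q_est)
    perturbed = []
    for q in q_est:
        candidate = q + sd * rng.gauss(0.0, 1.0)
        while candidate < 0.0:  # reject deviates giving a negative flux
            candidate = q + sd * rng.gauss(0.0, 1.0)
        perturbed.append(candidate)
    return perturbed


def trimmed_mean(values: Sequence[float], trim_percent: int = 15) -> float:
    """Mean after dropping trim_percent of the ranked data from each end."""
    ranked = sorted(values)
    n_cut = (trim_percent * len(ranked)) // 100
    return mean(ranked[n_cut:len(ranked) - n_cut])
```

With twenty realizations and fifteen percent trimming, three values are dropped from each end under this reading, so `trimmed_mean(range(1, 21))` averages the values 4 to 17. A single extreme realization, for instance one value of 1000 among nineteen values of 10, leaves the trimmed mean at 10.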


It may, however, be noted that during perturbation of source fluxes, only those standard normal deviates are accepted which result in perturbed source fluxes greater than or equal to zero.

2.7 Performance Evaluation Criteria

In order to quantify the performance of the proposed source identification methodology, a normalized error estimate for source fluxes ($\mathrm{NEE}_f$) is considered in this study. The $\mathrm{NEE}_f$ in percent is defined as:

$$\mathrm{NEE}_f\,(\%) = \frac{\displaystyle\sum_{k=1}^{N_{dp}} \sum_{i=1}^{N_{dl}} \left| q_{i\,\mathrm{est}}^{k} - q_{i\,\mathrm{act}}^{k} \right|}{\displaystyle\sum_{k=1}^{N_{dp}} \sum_{i=1}^{N_{dl}} q_{i\,\mathrm{act}}^{k}} \times 100 \tag{16}$$
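Eq. 16 can be checked directly against the flux values reported in Table 2. The short Python sketch below does this; the function and variable names are chosen here for illustration, and the lists are the Actual, Initial, Stage-I, and Stage-II columns of Table 2, ordered by disposal period and then by source location.

```python
from typing import Sequence


def normalized_error_estimate(q_est: Sequence[float],
                              q_act: Sequence[float]) -> float:
    """Eq. 16: sum of absolute flux errors over sum of actual fluxes, in %."""
    numerator = sum(abs(est - act) for est, act in zip(q_est, q_act))
    return 100.0 * numerator / sum(q_act)


# Table 2 columns: disposal periods 1 to 4, sources S1 to S4 within each period
ACTUAL = [47.00, 0.00, 30.00, 52.80, 15.00, 0.00, 58.80, 45.60,
          37.00, 0.00, 0.00, 32.00, 0.00, 0.00, 35.00, 29.50]
INITIAL = [46.83, 6.00, 35.52, 33.92, 25.09, 0.88, 44.93, 11.36,
           15.78, 1.89, 20.06, 2.94, 11.77, 12.67, 30.14, 0.00]
STAGE_I = [40.16, 0.70, 39.86, 55.77, 31.16, 0.00, 31.41, 47.14,
           20.60, 0.00, 19.97, 32.50, 4.18, 3.62, 34.01, 21.22]
STAGE_II = [44.12, 0.71, 34.69, 46.99, 24.17, 0.27, 41.31, 57.44,
            22.21, 0.47, 20.15, 22.50, 7.63, 0.61, 28.01, 29.91]

for label, column in (("Initial", INITIAL), ("Stage-I", STAGE_I),
                      ("Stage-II", STAGE_II)):
    print(label, round(normalized_error_estimate(column, ACTUAL), 2))
```

Rounded to two decimals, the three values reproduce the $\mathrm{NEE}_f$ row of Table 2: 57.66, 31.20, and 29.63.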

3 Application of Developed Methodology

The combined and iterative monitoring network design and pollution source identification methodology uses two models at each design stage. A linked optimization-simulation model is utilized for identifying the unknown pollution sources. The sources identified by the optimal source identification model are perturbed to generate a set of different source fluxes. The number of sets of source fluxes used in this study for the illustrative application is twenty. The randomly generated sets of source fluxes are used for generating an equal number of simulated concentration realizations at potential monitoring well locations for a particular monitoring network design stage. The flow and contaminant transport simulator SUTRA (Voss 1984) is used for generating these concentration realizations. The SUTRA model is also used as an external independent module for the optimal source identification process. In the performance evaluation reported in this study, the number of pollutant plume realizations is kept constant at twenty.

The simulated concentration realizations, corresponding to different perturbed sets of identified source fluxes, are then processed to determine the trimmed mean concentrations at different potential monitoring well locations at different times of the monitoring network design stage. These trimmed mean concentrations are then used for designing an optimal monitoring network using the integer programming based model. After implementation of this optimal monitoring network, additional concentration data are obtained at the installed monitoring wells at specified time intervals. These additional data, along with the available concentration measurement data at existing wells, are then utilized for improving the source identification results. This iterative procedure of monitoring network design and source identification can continue for a number of stages, until the source characteristics are reliably estimated.

The performance of the proposed methodology is evaluated for an illustrative study area shown in Fig. 2. This represents a two-dimensional, homogeneous, isotropic, confined aquifer. The values of the various parameters, along with the grid and time step sizes, are given in Table 1. The boundary conditions, four potential source locations, and three initially existing observation wells are shown in Fig. 2.


Fig. 2 Plan view of the study area

Potential monitoring wells are numbered from 1 to 198; the number of a well is calculated as 11 × (Column Number − 1) + Row Number. The sources of a typical conservative pollutant are assumed to be active during the first 16 months of the eleven-year time horizon. Each potential source location is associated with four source fluxes. The disposal period of each source flux is assumed to be four months. Therefore, the total number of source fluxes unknown to the source identification model is 16. Steady-state flow and transient transport are considered for these performance evaluations. The background concentration in the aquifer is assumed to be 100 ppm. The source fluxes are identified at the ends of 3, 6, and 11 years since the sources became active. The first stage covers a period of 3 years (fourth year to sixth year), while the second stage covers a period of 5 years (7th year to 11th year).

3.1 Preliminary Estimation of Sources

The preliminary estimation of source fluxes for the illustrative study area is performed using data collected from three arbitrarily located existing wells. The limited amount of data available at 2-month intervals for the initial years (first to third year) is utilized for identifying the source fluxes at the end of the 3-year period. The error factor ($\xi$) is specified to be 0.10. The source identification results are given in Table 2.
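The well-numbering rule given at the start of this section is easy to invert. The Python sketch below maps between well numbers and (column, row) pairs and, under a stated assumption, to plan coordinates. The block corner (350 m, 150 m) and the 50 m spacing are the values reported in Section 3.2; the orientation (columns advancing along x and rows along y from that corner) is an assumption of this sketch, chosen because 18 columns and 11 rows at 50 m spacing match the reported block extents.

```python
from typing import Tuple

ROWS_PER_COLUMN = 11  # from the rule 11 x (column - 1) + row
N_COLUMNS = 18        # 198 potential wells / 11 rows per column
X0, Y0, SPACING = 350.0, 150.0, 50.0  # corner and spacing from Section 3.2


def well_number(column: int, row: int) -> int:
    """Well number from 1-based column and row indices."""
    return ROWS_PER_COLUMN * (column - 1) + row


def column_row(number: int) -> Tuple[int, int]:
    """Inverse of well_number: 1-based (column, row) of a well."""
    column0, row0 = divmod(number - 1, ROWS_PER_COLUMN)
    return column0 + 1, row0 + 1


def well_xy(number: int) -> Tuple[float, float]:
    """Plan coordinates in metres.

    Assumption: columns advance along x and rows along y from (X0, Y0).
    """
    column, row = column_row(number)
    return X0 + SPACING * (column - 1), Y0 + SPACING * (row - 1)
```

For instance, `column_row(69)` gives column 7 and row 3, and `well_xy(198)` gives (1200.0, 650.0), which coincides with the far corner of the potential monitoring block.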

Table 1 Aquifer and discretization parameters

Parameter         Unit    Value
$K_{xx}$          m/s     $1.10 \times 10^{-4}$
$K_{xx}/K_{yy}$   –       1.00
$\varepsilon$     –       0.25
$\alpha_L$        m       40.00
$\alpha_T$        m       9.60
$b$               m       30.50
$\Delta x$        m       50.00
$\Delta y$        m       50.00
$\Delta t$        month   2.00


Table 2 Source fluxes identified with $\xi$ = 0.10

Disposal period   Source location   Actual   Initial   Stage-I   Stage-II
1                 S1                47.00    46.83     40.16     44.12
1                 S2                 0.00     6.00      0.70      0.71
1                 S3                30.00    35.52     39.86     34.69
1                 S4                52.80    33.92     55.77     46.99
2                 S1                15.00    25.09     31.16     24.17
2                 S2                 0.00     0.88      0.00      0.27
2                 S3                58.80    44.93     31.41     41.31
2                 S4                45.60    11.36     47.14     57.44
3                 S1                37.00    15.78     20.60     22.21
3                 S2                 0.00     1.89      0.00      0.47
3                 S3                 0.00    20.06     19.97     20.15
3                 S4                32.00     2.94     32.50     22.50
4                 S1                 0.00    11.77      4.18      7.63
4                 S2                 0.00    12.67      3.62      0.61
4                 S3                35.00    30.14     34.01     28.01
4                 S4                29.50     0.00     21.22     29.91
$\mathrm{NEE}_f$ (%)                         57.66     31.20     29.63

$\mathrm{NEE}_f$: normalized error estimate for source fluxes.

The $\mathrm{NEE}_f$ is estimated to be 57.66% for these initial estimates, which are based on data from the three initially existing, arbitrarily located wells.

3.2 Monitoring Network Design (First Stage)

Based on the source fluxes identified at the end of the initial three years, a monitoring network is designed for the first design stage (fourth to sixth year). The illustrative study area, along with the potential monitoring well locations and the existing observation wells, is shown in Fig. 2. In this performance evaluation, it is assumed that the number of potential monitoring well locations is 198. The numbers assigned to these potential monitoring wells are also shown in Fig. 2. The coordinates of the four corners of the potential monitoring network block are (350 m, 150 m), (350 m, 650 m), (1200 m, 150 m), and (1200 m, 650 m). The distance between two nearest potential monitoring wells is 50 m. The integer programming formulation is utilized for determining the optimal monitoring well locations corresponding to different specified numbers of monitoring wells. The solution results are presented in Table 3. In this table, the optimal monitoring well locations are represented by the respective potential monitoring well numbers.

3.3 Source Identification at the End of 6 Years

Only four new optimally located monitoring wells identified in the first stage are assumed to be installed at the end of the first three-year period for the purpose of performance evaluation. The data collected from these four wells, along with the available data for the initial 3-year period from the three initially existing wells, are utilized for identifying the source fluxes at the end of six years. The concentration data are considered to be available at two-month intervals. An error factor ($\xi$) of 0.10 is assumed for incorporating the measurement errors in the concentration data. The $\mathrm{NEE}_f$ estimated for this case is 31.20%. Therefore, a substantial improvement in the identification of source fluxes is achieved at the end of the first stage of the monitoring network design.

3.4 Monitoring Network Design (Second Stage)

The second-stage (7th to 11th year) optimal monitoring network is designed on the basis of the source fluxes identified at the end of 6 years (first stage). The source fluxes estimated at the end of 6 years are perturbed to generate 20 sets of perturbed source fluxes. These sets are then utilized for generating twenty realizations of pollutant plumes. The integer programming based optimization model is utilized again for designing an optimal monitoring network. Table 3 presents the optimal monitoring well locations corresponding to different maximum numbers of monitoring wells permitted.

3.5 Source Identification at the End of 11 Years

Eight additional monitoring wells, identified in the second stage, are assumed to be installed at the end of 6 years (beginning of the second stage). As pointed out earlier, the concentration data from the four optimal monitoring wells assumed to have been installed at the beginning of the first design stage are also utilized. The concentration data from these 12 optimally located wells for the two different periods (at an interval of 2 months), along with the data from the three arbitrarily located existing wells for the first 3 years, are utilized for identification of sources at the end of eleven years. Measurement errors using an error factor ($\xi$) of 0.10 are incorporated into the observed concentration data for identifying the source fluxes. The source identification results

Table 3 Monitoring network obtained as solution of the optimization model

Stage  P   Optimal well locations
I      4   69, 75, 82, 84
I      5   69, 75, 82, 84, 85
I      6   69, 74, 75, 82, 84, 85
I      8   69, 74, 75, 82, 84, 85, 95, 96
I      10  63, 69, 73, 74, 75, 82, 84, 85, 95, 96
I      12  63, 69, 73, 74, 75, 82, 84, 85, 86, 95, 96, 107
I      15  63, 64, 69, 73, 74, 75, 82, 84, 85, 86, 95, 96, 97, 106, 107
I      18  62, 63, 64, 69, 73, 74, 75, 82, 83, 84, 85, 86, 94, 95, 96, 97, 106, 107
I      20  62, 63, 64, 69, 72, 73, 74, 75, 82, 83, 84, 85, 86, 94, 95, 96, 97, 105, 106, 107
II     4   190, 192, 194, 196
II     5   183, 190, 192, 194, 196
II     6   183, 190, 192, 194, 195, 196
II     8   172, 183, 184, 190, 192, 194, 195, 196
II     10  172, 173, 183, 184, 190, 192, 193, 194, 195, 196
II     12  161, 172, 173, 182, 183, 184, 190, 192, 193, 194, 195, 196
II     15  160, 161, 162, 171, 172, 173, 182, 183, 184, 190, 192, 193, 194, 195, 196
II     18  150, 160, 161, 162, 171, 172, 173, 181, 182, 183, 184, 185, 190, 192, 193, 194, 195, 196
II     20  150, 151, 160, 161, 162, 171, 172, 173, 174, 181, 182, 183, 184, 185, 190, 192, 193, 194, 195, 196

P: maximum number of monitoring wells permitted


are given in Table 2. The estimated value of NEE_f is 29.63%. This shows that only a marginal improvement in the source identification error is achieved compared with the error in the preceding stage. The improvement in identification errors is substantial when the first-stage monitoring network is assumed to be installed, but it is not very evident at the end of the second stage. However, the stage-wise errors depend to a large extent on the study area chosen and on the initially existing arbitrary well locations. The identification errors also depend on the assumed measurement error characteristics. These performance evaluations for a two-stage design establish the potential applicability of the proposed methodology.

The linked optimization-simulation model for source identification is computationally much more efficient than the embedded optimization model (Mahar and Datta 1997). The CPU time required for solving the proposed source identification problem at any given stage varies between 4.6 and 6.5 min on an Ultra Sparc II (400 MHz, SunOS 5.8) system. The number of calls made to SUTRA ranges from 1,624 to 2,100.
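The geometry of the candidate-well block described in Section 3.2 and the designs listed in Table 3 can be cross-checked with a short script. The sketch below is only a consistency check, not part of the design model. It assumes that the candidate wells lie on a regular 50 m lattice that includes the four corner points, and that the well numbers run from 1 to the number of candidates; the actual numbering is the one shown in Fig. 2 and is not reproduced here. The function name `check_designs` is hypothetical.

```python
# Consistency check for the candidate-well grid (Section 3.2) and Table 3.
# Assumptions (not stated in the text): regular 50 m lattice including the
# corner points, and well numbers running from 1 to the number of candidates.

X_MIN, X_MAX = 350, 1200   # m
Y_MIN, Y_MAX = 150, 650    # m
SPACING = 50               # m

grid = [(x, y)
        for x in range(X_MIN, X_MAX + 1, SPACING)
        for y in range(Y_MIN, Y_MAX + 1, SPACING)]

# Optimal well numbers copied from Table 3, keyed by stage and then by P.
TABLE_3 = {
    "I": {
        4: [69, 75, 82, 84],
        5: [69, 75, 82, 84, 85],
        6: [69, 74, 75, 82, 84, 85],
        8: [69, 74, 75, 82, 84, 85, 95, 96],
        10: [63, 69, 73, 74, 75, 82, 84, 85, 95, 96],
        12: [63, 69, 73, 74, 75, 82, 84, 85, 86, 95, 96, 107],
        15: [63, 64, 69, 73, 74, 75, 82, 84, 85, 86, 95, 96, 97, 106, 107],
        18: [62, 63, 64, 69, 73, 74, 75, 82, 83, 84, 85, 86, 94, 95, 96, 97,
             106, 107],
        20: [62, 63, 64, 69, 72, 73, 74, 75, 82, 83, 84, 85, 86, 94, 95, 96,
             97, 105, 106, 107],
    },
    "II": {
        4: [190, 192, 194, 196],
        5: [183, 190, 192, 194, 196],
        6: [183, 190, 192, 194, 195, 196],
        8: [172, 183, 184, 190, 192, 194, 195, 196],
        10: [172, 173, 183, 184, 190, 192, 193, 194, 195, 196],
        12: [161, 172, 173, 182, 183, 184, 190, 192, 193, 194, 195, 196],
        15: [160, 161, 162, 171, 172, 173, 182, 183, 184, 190, 192, 193, 194,
             195, 196],
        18: [150, 160, 161, 162, 171, 172, 173, 181, 182, 183, 184, 185, 190,
             192, 193, 194, 195, 196],
        20: [150, 151, 160, 161, 162, 171, 172, 173, 174, 181, 182, 183, 184,
             185, 190, 192, 193, 194, 195, 196],
    },
}


def check_designs(designs, n_candidates):
    """True if each design has P distinct valid wells and designs are nested."""
    previous = set()
    for p in sorted(designs):
        wells = set(designs[p])
        if len(wells) != p or len(designs[p]) != p:
            return False          # wrong count or duplicated well number
        if not all(1 <= w <= n_candidates for w in wells):
            return False          # well number outside the candidate set
        if not previous <= wells:
            return False          # smaller design is not contained in larger
        previous = wells
    return True


print("candidate wells:", len(grid))
for stage, designs in TABLE_3.items():
    print("stage", stage, "consistent:", check_designs(designs, len(grid)))
```

Under these assumptions the lattice has 18 columns and 11 rows, which reproduces the 198 potential locations quoted in the text. Within each stage, every design listed in Table 3 contains the design obtained for the next smaller P.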

4 Discussion and Conclusions

In real-world scenarios, the source identification process often has to start with data collected from a few randomly located wells. These limited data may not be sufficient for solving the source identification problem. Under such circumstances, the existing arbitrary monitoring network needs to be augmented in an optimal way, so that the data collected from the augmented network are useful in improving the source identification results. To address this issue of augmenting the monitoring network, a combined source identification and monitoring network design methodology is proposed. This coupled approach helps to improve the source identification process. The source identification model uses the nonlinear optimization algorithm NPSOL to obtain solutions. The monitoring network design model is dynamic in nature, reflecting transient pollutant plume movement. The performance of the combined source identification and monitoring network design model is evaluated for an illustrative study area with two design stages.

The source identification problem as defined here is a nonlinear optimization problem, because the governing flow and transport equations are incorporated as binding constraints. Therefore, the global optimality of any solution cannot be guaranteed. In addition, the issue of non-uniqueness of the solution exists, even if the solutions are globally optimal. The objective function ensures that the resulting spatiotemporal concentrations are close to the observed concentrations. However, the performance of the methodology is tested using contaminant sources that are actually known for evaluation purposes, but not known to the optimization model. Such an evaluation can establish the potential applicability of the methodology. It is also helpful to use multiple initial solutions to test the optimal solution for optimality.
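The idea of testing a solution by restarting the search from several initial solutions can be illustrated on a toy problem. The sketch below is not the source identification model: the one-dimensional objective is a hypothetical nonconvex stand-in for the data-misfit function, and plain gradient descent replaces NPSOL as the local solver. All function names are assumptions introduced for illustration.

```python
import random


def objective(x):
    # Hypothetical nonconvex stand-in for the data-misfit objective.
    # It has two local minima, one on each side of x = 0.
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    # Analytical derivative of the toy objective.
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_search(x0, step=0.01, iters=5000):
    # Fixed-step gradient descent: a simple local solver, not NPSOL.
    x = x0
    for _ in range(iters):
        x -= step * gradient(x)
    return x


def multi_start(n_starts=10, seed=0):
    # Run the local solver from several random initial solutions and
    # return (objective value, solution, initial solution), best first.
    rng = random.Random(seed)
    results = []
    for _ in range(n_starts):
        x0 = rng.uniform(-2.0, 2.0)
        x = local_search(x0)
        results.append((objective(x), x, x0))
    return sorted(results)


if __name__ == "__main__":
    for value, x, x0 in multi_start():
        print(f"start {x0:+.3f} -> x = {x:+.4f}, objective = {value:+.4f}")
```

Starts that fall in the same basin converge to the same local minimum. Agreement among many starts therefore increases confidence in a solution, but it does not prove global optimality.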
Although global optimality cannot be guaranteed, a number of different initial solutions led to similar solution results.

The proposed methodology is presented for designing a dynamic monitoring network for stage-wise augmentation of an initially existing arbitrary network of observation wells. The developed monitoring network design model is dynamic in nature, as it evolves with time. The goal is better identification of sources. The monitoring network is designed for two stages, spanning 3 and 5 years, respectively. Source identification results are obtained using sequentially increasing amounts of data at the end of 3, 6, and 11 years, respectively. All these evaluations are performed for an illustrative study area.

Augmentation of an existing monitoring network also augments the measured concentration data available to the source identification model. This helps in improving the source identification results. The monitoring network design is illustrated for two different design stages, spanning two different design periods. Substantial improvement in the source identification result is observed after installation of the first-stage optimal monitoring network and subsequent collection of pollutant concentration data from it. NEE_f is reduced to 31.20% from 57.66%. Improvement in the source identification results is only marginal after implementation of the second-stage monitoring network. This indicates that the identification of sources may not always improve substantially, even when a larger number of monitoring wells is installed. However, the stage-wise errors depend to a large extent on the study area chosen and on the initially existing arbitrary observation well locations. The identification errors also depend on the assumed measurement error characteristics. These two-stage evaluations show the potential applicability of the proposed methodology.

The actual states, as defined by the spatiotemporal concentrations, are the basis for estimation of the sources. The identified sources, if not accurate, would certainly predict a state different from the existing one in the field.
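The contrast between the two design stages becomes clearer when the reported NEE_f values are expressed as relative reductions. The short calculation below uses only the three values reported in the text (57.66%, 31.20%, and 29.63% at the end of 3, 6, and 11 years); the helper name `relative_reduction` is hypothetical.

```python
# NEE_f values (percent) reported at the end of 3, 6, and 11 years.
nee_f = {3: 57.66, 6: 31.20, 11: 29.63}


def relative_reduction(before, after):
    """Percentage reduction of `after` relative to `before`."""
    return 100.0 * (before - after) / before


stage_1 = relative_reduction(nee_f[3], nee_f[6])    # effect of first stage
stage_2 = relative_reduction(nee_f[6], nee_f[11])   # effect of second stage

print(f"first stage:  {stage_1:.1f}% relative reduction in NEE_f")
print(f"second stage: {stage_2:.1f}% relative reduction in NEE_f")
```

The first stage lowers the error by about 46% of its initial value, whereas the second stage lowers it by only about 5% of the first-stage value.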
The feedback information utilized in the identification of the source fluxes in the next stage consists of the actual observed concentrations, which are different from those simulated on the basis of the identified sources. The feedback information is thus the set of concentrations at the designed monitoring locations. Therefore, the methodology as proposed uses feedback information on the actual state of the system, but only at locations in the designed monitoring network. As a result, the source identification and the design of the network improve sequentially with the feedback information.

The potential feasibility and efficiency of iteratively using a coupled monitoring network design model and an optimal source identification model are demonstrated by an illustrative example problem. The performance of the methodology is evaluated for a conservative pollutant. The proposed methodology appears to be useful for designing an optimal monitoring network for the purpose of improving the unknown pollution source identification process. The solution results for an illustrative study area show that it is indeed possible to improve the efficiency of the source identification process using an optimally designed monitoring network that is implemented sequentially and provides concentration observation feedback information.

Other formulations of the optimal monitoring network design problem are also possible, incorporating multiple objectives of operation, and these may be more appropriate in terms of improving the efficiency of source identification. Some modifications of the proposed formulation may be necessary depending on practical circumstances. The proposed monitoring network design methodology incorporates uncertainties as random errors. However, a more rigorous analysis may be accomplished by adopting soft computing techniques such as fuzzy logic (Ning and Chang 2004) and grey system theory (Chang and Hernandez 2008).
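How random measurement errors enter the concentration data can be sketched as follows. The error model is not restated in this section, so the form used here is an assumption: each simulated concentration c is perturbed as c + ξ·ε·c, where ε is a standard normal deviate and ξ is the error factor (0.10 in the evaluations above). Clipping negative values to zero is a further assumption. The function name and the sample concentrations are hypothetical.

```python
import random


def perturb_concentrations(simulated, xi=0.10, seed=None):
    """Return concentrations with multiplicative random error added.

    Assumed error model: c_obs = c + xi * eps * c, with eps ~ N(0, 1).
    Negative results are clipped to zero (an additional assumption).
    """
    rng = random.Random(seed)
    return [max(0.0, c + xi * rng.gauss(0.0, 1.0) * c) for c in simulated]


if __name__ == "__main__":
    # Hypothetical simulated concentrations at one well (arbitrary units).
    simulated = [12.0, 15.5, 21.3, 18.7, 9.4, 0.0]
    observed = perturb_concentrations(simulated, xi=0.10, seed=42)
    for c_sim, c_obs in zip(simulated, observed):
        print(f"simulated {c_sim:6.2f} -> perturbed {c_obs:6.2f}")
```

With this form the error scales with the concentration, so a zero concentration stays zero and larger concentrations carry larger absolute errors.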


References

Aral MM, Guan JI, Maslia ML (2001) Identification of contaminant source location and release history in aquifers. J Hydrol Eng 6(3):225–234. doi:10.1061/(ASCE)1084-0699(2001)6:3(225)
ASCE Task Committee (2003) Long-term groundwater monitoring: the state of the art, Reston, VA
Atmadja J, Bagtzoglou AC (2001a) Pollution source identification in heterogeneous porous media. Water Resour Res 37(8):2113–2125. doi:10.1029/2001WR000223
Atmadja J, Bagtzoglou AC (2001b) State of the art report on mathematical methods for groundwater pollution source identification. Environ Forensics 2(3):205–214. doi:10.1006/enfo.2001.0055
Bagtzoglou AC, Atmadja J (2003) Marching-jury backward beam equation and quasi-reversibility methods for hydrologic inversion: application to contaminant plume spatial distribution recovery. Water Resour Res 39(2). SBH 10–1:10–14
Bagtzoglou AC, Dougherty DE, Tompson AFB (1992) Application of particle methods to reliable identification of groundwater pollution sources. Water Resour Manag 6:15–23. doi:10.1007/BF00872184
Butera I, Tanda MG (2003) A geostatistical approach to recover the release history of groundwater pollutants. Water Resour Res 39(12):1372. doi:10.1029/2003WR002314
Chakrabarty D (2001) Identification of unknown groundwater pollution sources and simultaneous parameter estimation using linked optimization-simulation approach. PhD Dissertation, I.I.T. Kanpur, India
Chang NB, Hernandez EA (2008) Optimal expansion strategy for a sanitary sewer system under uncertainty. Environ Model Assess 13(1):93–113
Cieniawski SE, Eheart JW, Ranjithan S (1995) Using genetic algorithm to solve a multiple objective groundwater monitoring problem. Water Resour Res 31(2):399–409
Datta B, Dhiman SD (1996) Chance-constrained optimal monitoring network design for pollutants in groundwater. J Water Resour Plan Manage 122(3):180–188. doi:10.1061/(ASCE)0733-9496(1996)122:3(180)
Datta B, Beegle JE, Kavvas ML, Orlob GT (1989) Development of an expert-system embedding pattern-recognition techniques for pollution-source identification, Technical Report: PB-90–185927/XAB, OSTI ID: 6855981, Dept. of Civil Engineering, California Univ., Davis, CA (USA)
Devore JL, Farnum NR (1999) Applied statistics for engineers and scientists. Brooks/Cole Publishing Company, USA
Dhar A, Datta B (2007) Multiobjective design of dynamic monitoring networks for detection of groundwater pollution. J Water Resour Plan Manage 133(4):329–338. doi:10.1061/(ASCE)0733-9496(2007)133:4(329)
Gill PE, Murray W, Saunders MA, Wright MH (1986) User’s guide for NPSOL (version 4.0): a Fortran package for nonlinear programming, Technical Report SOL 86-2, Dept. of Operations Research, Stanford University, Stanford, CA
Gorelick SM, Evans B, Remson I (1983) Identifying sources of groundwater pollution: an optimization approach. Water Resour Res 19(3):779–790. doi:10.1029/WR019i003p00779
Grabow G, Yoder DC, Mote CR (2000) An empirically-based sequential ground water monitoring network design procedure. J Am Water Resour Assoc 36(3):549–566. doi:10.1111/j.1752-1688.2000.tb04286.x
Hudak PF, Loaiciga HA, Marino MA (1995) Regional-scale ground water quality monitoring via integer programming. J Hydrol (Amst) 164(1–4):153–170. doi:10.1016/0022-1694(94)02559-T
Keidser A, Rosbjerg D (1991) A comparison of four inverse approaches to groundwater flow and transport parameter identification. Water Resour Res 27(9):2219–2232. doi:10.1029/91WR00990
Liu C, Ball WP (1999) Application of inverse methods to contaminant source identification from aquitard diffusion profiles at Dover AFB, Delaware. Water Resour Res 35(7):1975–1985. doi:10.1029/1999WR900092
Loaiciga HA (1989) An optimization approach for groundwater quality monitoring network design. Water Resour Res 25(8):1771–1782. doi:10.1029/WR025i008p01771
Loaiciga HA, Charbeneau RJ, Everett LG, Fogg GE, Hobbs BF, Rouhani S (1992) Review of ground-water quality monitoring network design. J Hydraul Eng 118(1):11–37. doi:10.1061/(ASCE)0733-9429(1992)118:1(11)
Mahar PS, Datta B (1997) Optimal monitoring network and ground-water-pollution source identification. J Water Resour Plan Manage 123(4):199–207. doi:10.1061/(ASCE)0733-9496(1997)123:4(199)


Mahar PS, Datta B (2000) Identification of pollution sources in transient groundwater system. Water Resour Manage 14(6):209–227. doi:10.1023/A:1026527901213
Mahar PS, Datta B (2001) Optimal identification of ground-water pollution sources and parameter estimation. J Water Resour Plan Manage 127(1):20–29. doi:10.1061/(ASCE)0733-9496(2001)127:1(20)
Mahinthakumar G, Sayeed M (2005) Hybrid genetic algorithm—local search methods for solving groundwater source identification inverse problems. J Water Resour Plan Manage 131(1):45–57. doi:10.1061/(ASCE)0733-9496(2005)131:1(45)
Massmann J, Freeze RA (1987) Groundwater contamination from waste management sites: the interaction between risk-based engineering design and regulatory policy. I: Methodology. Water Resour Res 23(2):351–367. doi:10.1029/WR023i002p00351
McKinney DC, Loucks DP (1992) Network design for predicting groundwater contamination. Water Resour Res 28(1):133–147. doi:10.1029/91WR02397
Meyer PD, Brill ED Jr (1988) A method for locating wells in a groundwater pollution monitoring network under conditions of uncertainty. Water Resour Res 24(8):1277–1282. doi:10.1029/WR024i008p01277
Michalak AM, Kitanidis PK (2004) Estimation of historical groundwater contaminant distribution using the adjoint state method applied to geostatistical inverse modeling. Water Resour Res 40:W08302. doi:10.1029/2004WR003214
Montas HJ, Mohtar RH, Hassan AE, AlKhal FA (2000) Heuristic space-time design of monitoring wells for contaminant plume characterization in stochastic flow fields. J Contam Hydrol 43(3–4):271–301. doi:10.1016/S0169-7722(99)00108-4
Mugunthan P, Shoemaker CA (2004) Time varying optimization for monitoring multiple contaminants under uncertain hydrogeology. Bioremediat J 8(3–4):129–146. doi:10.1080/10889860490887509
Neupauer RM, Wilson JL (1999) Adjoint method for obtaining backward-in-time location and travel probabilities of a conservative groundwater contaminant. Water Resour Res 35(11):3389–3398. doi:10.1029/1999WR900190
Ning SK, Chang NB (2004) Optimal expansion of water quality monitoring network by fuzzy optimization approach. Environ Monit Assess 91(1–3):145–170. doi:10.1023/B:EMAS.0000009233.98215.1f
Nunes LM, Cunha MC, Ribeiro L (2004a) Groundwater monitoring network optimization with redundancy reduction. J Water Resour Plan Manage 130(1):33–43. doi:10.1061/(ASCE)0733-9496(2004)130:1(33)
Nunes LM, Cunha MC, Ribeiro L (2004b) Optimal space-time coverage and exploration costs in groundwater monitoring networks. Environ Monit Assess 93(1–3):103–124. doi:10.1023/B:EMAS.0000016795.91968.13
Reed P, Minsker BS (2004) Striking the balance: long-term groundwater monitoring design for conflicting objectives. J Water Resour Plan Manage 130(2):140–149. doi:10.1061/(ASCE)0733-9496(2004)130:2(140)
Reed P, Minsker BS, Valocchi AJ (2000) Cost-effective long term groundwater monitoring design using genetic algorithm and global mass interpolation. Water Resour Res 36(12):3731–3741. doi:10.1029/2000WR900232
Sidauruk P, Cheng AH-D, Ouazar D (1997) Ground water contaminant source and transport parameter identification by correlation coefficient optimization. Ground Water 36:208–214. doi:10.1111/j.1745-6584.1998.tb01085.x
Singh RM, Datta B (2004) Groundwater pollution source identification and simultaneous parameter estimation using pattern matching by artificial neural network. Environ Forensics 5(3):143–159. doi:10.1080/15275920490495873
Singh RM, Datta B (2006) Identification of groundwater pollution sources using GA-based linked simulation optimization model. J Hydrol Eng 11(2):101–109. doi:10.1061/(ASCE)1084-0699(2006)11:2(101)
Singh RM, Datta B (2007) Artificial neural network modeling for identification of unknown pollution sources in groundwater with partially missing concentration observation data. Water Resour Manage 21(3):557–572. doi:10.1007/s11269-006-9029-z
Singh RM, Datta B, Jain A (2004) Identification of unknown groundwater pollution sources using artificial neural networks. J Water Resour Plan Manage 130(6):506–514. doi:10.1061/(ASCE)0733-9496(2004)130:6(506)
Skaggs TH, Kabala ZJ (1994) Recovering the release history of a groundwater contaminant. Water Resour Res 30(1):71–79. doi:10.1029/93WR02656


Skaggs TH, Kabala ZJ (1995) Recovering the release history of a groundwater contaminant plume: method of quasi-reversibility. Water Resour Res 31(11):2669–2673. doi:10.1029/95WR02383
Snodgrass MF, Kitanidis PK (1997) A geostatistical approach to contaminant source identification. Water Resour Res 33(4):537–546. doi:10.1029/96WR03753
Sreenivasulu C, Datta B (2008) Dynamic optimal monitoring network design for transient transport of pollutants in groundwater aquifers. Water Resour Manage 22(6):651–670. doi:10.1007/s11269-007-9184-x
Sun AY (2007) A robust maximum likelihood approach to contaminant source identification. Water Resour Res 43(2):W02418. doi:10.1029/2006WR005106
Sun AY, Painter SL, Wittmeyer GW (2006a) A constrained robust least squares approach for contaminant source release history identification. Water Resour Res 42(4):W04414. doi:10.1029/2005WR004312
Sun AY, Painter SL, Wittmeyer GW (2006b) A robust approach for contaminant source location and release history recovery. J Contam Hydrol 88(3–4):29–44. doi:10.1016/j.jconhyd.2006.06.006
Voss CI (1984) A finite-element simulation model for saturated-unsaturated, fluid-density-dependent ground-water flow with energy transport or chemically-reactive single-species solute transport: U.S. Geological Survey Water-Resources Investigations Report 84–4369, 409
Wagner BJ (1992) Simultaneous parameter estimation and contaminant source characterization for coupled groundwater flow and contaminant transport modeling. J Hydrol (Amst) 135:275–303. doi:10.1016/0022-1694(92)90092-A
Wu J, Zheng C, Chien CC (2005) Cost-effective sampling network design for contaminant plume monitoring under general hydrogeological conditions. J Contam Hydrol 77(1–2):41–55. doi:10.1016/j.jconhyd.2004.11.006
Yeh WW-G (1986) Review of parameter identification procedures in groundwater hydrology: the inverse problem. Water Resour Res 22:95–108. doi:10.1029/WR022i002p00095