WATER RESOURCES RESEARCH, VOL. 40, W02406, doi:10.1029/2003WR002469, 2004
Groundwater nitrate monitoring network optimization with missing data

L. M. Nunes,1 E. Paralta,2 M. C. Cunha,3 and L. Ribeiro4

Received 9 July 2003; revised 28 December 2003; accepted 13 January 2004; published 24 February 2004.
[1] A method for designing groundwater monitoring networks to define the extent of agricultural contamination is proposed. The method is particularly well suited to reducing existing networks where data are missing from time series records. A simulated annealing optimization algorithm is used to minimize the variance of the estimation error obtained by kriging in combinatorial problems, created by selecting an optimal subset of stations from the original set. Optimization is performed for several measuring times, obtaining an equal number of optimized small-dimension networks; stations that repeat more often in these networks are selected to make part of the final network. A compliance groundwater nitrate monitoring network in the south of Portugal is used to illustrate the method. The original 89-station network was reduced to 16 stations. Results show that considerable reductions in operating costs (about 80%) are compatible with a cost-effective network capable of detecting noncompliance with national and European norms.

INDEX TERMS: 1894 Hydrology: Instruments and techniques; 1869 Hydrology: Stochastic processes; 1831 Hydrology: Groundwater quality; 1848 Hydrology: Networks; KEYWORDS: groundwater, heuristics, monitoring, nitrates, simulated annealing

Citation: Nunes, L. M., E. Paralta, M. C. Cunha, and L. Ribeiro (2004), Groundwater nitrate monitoring network optimization with missing data, Water Resour. Res., 40, W02406, doi:10.1029/2003WR002469.
1. Introduction

[2] This article focuses on the design of a groundwater monitoring network (GMN) to detect contamination with nitrates of agricultural origin. This issue has been on the minds of the general public, scientists, governmental agencies, and legislators for some time now. The concern is well justified if one looks at European statistics: although the data are as yet incomplete, more than 25% of the regions monitored so far exceed the 25 mg/L national guide level for groundwater nitrate concentration (Portuguese Law 236/98), and in 13% of the regions the 50 mg/L European water quality standard for drinking water (Drinking Water Directive 98/83/EC [European Union (EU), 1998]) is exceeded in more than 25% of the monitoring stations [Scheidleder et al., 1999].

[3] The European Union legislated for groundwater protection in 1980 (Groundwater Directive 80/68/EEC [EU, 1980]), and in 1991 a specific legislative framework was established for nitrates of agricultural origin (Nitrate Directive 91/676/EEC [EU, 1991]). The latter establishes goals for preventing pollution and lowering existing nitrate concentrations. To monitor the effectiveness of such actions, Member States must monitor groundwater quality regularly.

1 Faculty of Marine and Environmental Sciences, University of Algarve, Campus de Gambelas, Faro, Portugal.
2 Hydrogeology Department, Geological and Mining Institute, Portuguese Geological Survey, Estrada da Portela, Alfragide, Portugal.
3 Civil Engineering Department, University of Coimbra, Coimbra, Portugal.
4 Instituto Superior Técnico, Lisbon Technical University, Lisbon, Portugal.
In Portugal a national groundwater monitoring network, with hundreds of stations, has been operating since the 1980s, but unfortunately its records contain many gaps. Three major aquifer systems were considered particularly vulnerable to nitrate contamination and have been monitored in detail since 1996. The case study presented in this article is not in any of those systems, but it nevertheless shows nitrate concentrations consistently above 50 mg/L. The existing monitoring network consists of 89 stations distributed randomly over a 50 km2 area. The aim of the study is to reduce the number of stations to arrive at a more cost-effective network. A variance-reduction-based objective function is defined and optimized with a heuristic optimization algorithm (simulated annealing).

[4] Given the high costs associated with the implementation of an environmental monitoring network, it is essential to develop efficient procedures for designing one. Moreover, network design depends on the spatial and temporal distribution of the physicochemical parameters, which may be unknown before sampling, or which may evolve into a different distribution under the influence of an external event (e.g., the introduction of a pollution source or an alteration of the flow direction). Thus a monitoring network should not be considered a static measuring tool; instead, it should be checked for efficiency from time to time and adjusted if needed. Loaiciga et al. [1992] categorized the general approaches to network design as hydrogeologic, when no advanced geostatistical method is applied, and statistical otherwise. This classification is somewhat limiting, because it excludes promising advanced statistical techniques such as those derived from information theory [Harmancioglu et al., 1998]. It is nevertheless useful, since it helps to distinguish between techniques that do not incorporate spatial statistical information and those
that do. Loaiciga et al. [1992] classified statistical methods as simulation, variance-based, and probability-based. The main difference between these methods lies in the formulation of the objective function to be optimized, in the way the constraints are handled, and in the way the optimization is undertaken. Probability-based methods include variants of the others by including the values taken by the stated variable and the probability of not detecting high values. In our review no such distinction was made, and probability-based methods were included in both the simulation and the variance-based methods.

[5] Simulation methods consider uncertainty in the hydraulic conductivity field and therefore uncertain head distribution and velocity fields. Random hydraulic conductivity fields are usually estimated by conditional simulation, given a spatial covariance model obtained from the existing field data. The resulting differences in the estimated velocity fields will affect the optimal distribution of monitoring stations according to some objective function (e.g., one depending on the probability of failing to detect contamination). Examples of this method are given by Massmann and Freeze [1987a, 1987b] and Wagner and Gorelick [1989].

[6] Variance-based methods, also known as variance reduction methods, consider that the uncertainty associated with a given monitoring network may be determined by the variance of the estimation error obtained by kriging. A given spatial distribution of stations has an uncertainty that depends on the particular locations. If one station is removed or another is added, the accuracy will usually decrease in the first case and increase in the second. Also, if the number of stations is not changed and only their locations are altered, the accuracy will change. The variance of the errors of estimation is therefore used as an objective function. Two alternatives have been proposed: one based directly on the kriging variance as calculated from the kriging system, and another based on the variance of the errors, calculated with the estimated and known values at each point. Journel [1987] and Delhomme [1978] argue that the first is not a measure of local uncertainty at the estimated points and therefore should not be used for this purpose; Delhomme proposes that the latter be used instead. Despite this controversy, the kriging variance has been used extensively for monitoring network design. Examples are given by Bras and Rodríguez-Iturbe [1976], Rouhani [1985], Loaiciga [1989], Rouhani and Hall [1988], and Pardo-Igúzquiza [1998]. Rouhani and Fiering [1986] showed that these methods are robust ("insensitive to design errors, random or otherwise, in the estimates of those parameters affecting design choice") and resilient (having the "ability to accommodate surprises and to survive under unanticipated perturbations").

[7] The reduction of the dimension of an existing monitoring network is a combinatorial optimization problem: which subset of stations should be kept in the new design? When the number of stations is large, the number of possible combinations is too great for all of them to be tested, even on very fast computers. A common practical decision is to accept a good solution even if it is not the optimal one. Several heuristic optimization algorithms can deal with combinatorial problems. First there is a general class of strictly descending algorithms that includes sequential exchange with node swap or node substitution,
downhill simplex, and search with multiple randomly generated starting solutions. A second general class of algorithms, not strictly descending, includes simulated annealing and tabu search, and combinations of the two. Other algorithms inspired by nature, such as genetic algorithms and ant colony optimization (see, e.g., the introductory article by Dorigo et al. [1996]), are also showing very good results. Most of these methods have been applied to monitoring network design with varying degrees of success. However, simulated annealing is the technique that has been applied most frequently to monitoring network design, probably because it is the oldest of the non-strictly-descending methods and has been shown to be very efficient [Lee and Ellis, 1996].

[8] GMN optimization has been widely applied, with two basic objectives: (1) network augmentation and (2) original network design. Optimization for network reduction is a much less common application. Examples of the latter are described by Knopman and Voss [1989], Meyer et al. [1994], and Reed et al. [2000], for point sources using flow and transport models. Grabow et al. [1993] proposed a method for network reduction without the need to simulate mass transport, stated as being applicable to both point and diffuse sources, though used by the authors only for a point source. The method proposed below is directed at diffuse contamination, particularly agricultural contaminants, and is applied to nitrogen (measured as nitrate).
2. Case Study

[9] The case study area (50 km2) is located in the Alentejo region (southern Portugal), in an area known as the Gabbro of Beja aquifer system (Figure 1), which covers about 350 km2 in the Ossa-Morena geotectonic unit. Only the most relevant geologic and hydrogeologic aspects are reviewed here. Complete information is available from Quesada et al. [1994], Duque [1997], Paralta [2001], and Paralta and Ribeiro [2001].

[10] The gabbro-dioritic shallow aquifer is one of the most productive formations in the Alentejo region when compared with other hard-rock aquifers. The aquifer is unconfined in the altered zones nearer the surface, to a depth of about 30 m; below the altered zone, in the unaltered rock, water circulation occurs mainly through secondary porosity.

[11] In the study area, well productivity ranges from 1.5 to 18 L/s, with the most frequent average values around 6.5 L/s. The most frequent transmissivity values vary between 50 and 100 m2/day. Recharge is estimated at between 10% and 20% of the average annual rainfall (584 mm/yr). Hydrochemical characterization indicates that these waters are highly mineralized, with total dissolved solids ranging from 400 mg/L to 900 mg/L, occurring as calcium-magnesium bicarbonate and magnesium-calcium bicarbonate water types.

[12] The current groundwater quality is a long-term consequence of major changes in agriculture in Alentejo since 1930-1940, when cereal crops replaced indigenous forest. The land is mainly used to produce wheat and sunflower, with maize as an alternative crop. With such intensive cereal cultivation, the usual fertilizer application is estimated to be between 100 and 120 kg N/ha/yr. Since the 1970s, high nitrate concentrations have been
detected in some public wells near Beja. The average nitrate content in public and private wells nowadays is often over 50 mg/L.

Figure 1. Study area (shaded, 50 km2).
3. Methods

3.1. Reducing the Dimension of an Existing Groundwater Monitoring Network

[13] It is proposed to solve the problem of reducing an existing groundwater monitoring network by (1) defining a fixed number of stations to be included in a first-approximation GMN; (2) at each measuring time, estimating the subset of stations from the original complete set that has the highest accuracy given the first-approximation GMN dimension; (3) repeating the previous step for all measuring times; (4) computing the frequency with which stations appear in the estimated first-approximation GMN sets; and (5) including the stations found most often in the final GMN.

[14] The first-approximation GMN dimension may be a tricky parameter. If there are too many data missing from the time series, the number of stations may be too low,
leading to a very small final GMN, whereas if it is too high, a very large final GMN will result. A good approximation is to use a value similar to the expected final GMN dimension, which can be determined by classical sampling statistics [Thompson, 1992; Cochran, 1977]. Step 2 is handled in the context of variance reduction techniques, together with optimization by simulated annealing (SA). The estimation error variance is the measure of accuracy used as the objective function to be minimized by SA when iteratively selecting subsets from the original complete set. Both methods are discussed in more detail below. First-approximation GMN optimization is then carried out for each measuring time. In step 4 a final GMN is obtained by choosing the set of stations that appear most frequently in the optimized first-approximation GMNs.

[15] If such a small fraction of values is missing from the time series data that they may be considered complete, other variance reduction methods for GMN optimization may be used instead, particularly those based on space-time analysis [Buxton and Pate, 1994; Pardo-Igúzquiza, 1998] and on time-only analysis [Amorocho and Espildora,
1973; Caselton and Husain, 1980; Harmancioglu and Yevjevich, 1987].

3.2. Estimation Error Variance

[16] The values of a stated variable z(x) at the n sampled points in the field can be considered as realizations of a set of random variables Z(x) defined in a field Ω. A set of random variables Z(x_i) defined in a field Ω is a random function Z(x):

    Z(x) = \{ Z(x_i) \}_{x_i \in \Omega}

Z(x_i) represents a random variable, and z(x_i) is a realization of the random function. The most common theory considers that the random function is invariant under translation. Strictly, the restrictive hypotheses apply only to the first two moments; hence these are only required to exist and to be independent of the space coordinates (second-order stationarity), or the spatial covariance of the variable Z should depend only on the separation h between two coordinates. In the latter case only the spatial increments have to be stationary (intrinsic stationarity). If these increments are taken at step h, the resulting expression is called the variogram.

[17] The mean value of a stated variable in an area A is

    V = \frac{1}{A} \int_A Z(x)\, dx    (1)

A linear estimation of V can be obtained from n data points by

    V^* = \sum_{i=1}^{n} \lambda_i Z(x_i)

which is unbiased if the sum of the weights λ_i is one. This is a common requirement in several estimation methods, and also in kriging. Kriging has been chosen because the λ_i are determined so as to minimize the variance of the estimation error. The ordinary kriging system is [Journel and Huijbregts, 1978]

    \sum_{j=1}^{n} \lambda_j \gamma(h_{ij}) + \mu = \bar{\gamma}(h_{iA}), \quad i = 1, \ldots, n; \qquad \sum_{i=1}^{n} \lambda_i = 1

where n is the number of samples used to estimate the value at A, μ is the Lagrange parameter, and γ̄(h_iA) is the average variogram between the point i and the area A when one extreme of the vector h is fixed at x_i and the other extreme describes the area A independently. The average variogram is

    \bar{\gamma}(h_{iA}) = \frac{1}{A} \int_A \gamma(x_i, u)\, du

which may be approximated numerically by

    \bar{\gamma}(h_{iA}) \approx \frac{1}{M} \sum_{j=1}^{M} \gamma(x_i, x_j), \quad x_j \in A, \; x_i \in \Omega

with M being the number of points used to discretize A.

[18] The estimation variance is expressed by [Journel and Huijbregts, 1978]

    \sigma_E^2(A) = \sum_{i=1}^{n} \lambda_i \bar{\gamma}(h_{iA}) - \bar{\gamma}(h_{AA}) + \mu

The estimation variance for A is a measure of the estimation accuracy of V. Because σ_E^2(A) depends only on the geometric configuration of the data points once a variogram model is defined, it is possible to change data locations and calculate the estimation variance again without ever taking a sample. The spatial arrangement of points that minimizes σ_E^2(A) has the lowest estimation error for A and therefore best reflects the spatial correlation introduced in the variogram model.

[19] As the estimation variance is a local measure of accuracy (at A), a global measure is needed to allow the comparison of two alternative spatial monitoring configurations. Here a simple mean of the estimation variance is used,

    \bar{\sigma}_E^2 = \frac{1}{w} \sum_{i=1}^{w} \sigma_E^2(A_i)    (2)

where A_i denotes each of the areas to be estimated. Woldt and Bogardi [1992] proposed alternatively the use of the estimation variance weighted by prior suspicion of the value V, and the average kriged contamination level weighted by the estimation variance, as the global measure.

[20] After setting the number of stations, w, a leave-one-out method is used to calculate the estimation variance at each station (location), by solving the kriging system w times with w - 1 stations. By "leave one out" we mean that one station out of the w stations is removed and its value (and the kriging variance) is estimated using the remaining stations; this is repeated until every station has been left out once. Only then is the average kriging variance calculated.

[21] Equation (2) reflects the accuracy of estimating with a given set of stations: the more accurate the estimates, the lower the mean estimation variance. Therefore stations whose contribution to the estimation variance is low will be preferred over those whose contribution is higher.

3.3. Optimizing the Monitoring Network

[22] The optimization problem can be stated in the following way: what is the "best" subset of w stations (chosen according to their location), out of the initial set Ω of stations, in terms of the mean estimation variance? This can be stated as

    \min \bar{\sigma}_E^2 \quad \text{subject to} \quad w = \text{constant}    (3)

[23] If the number of stations is small, then all combinations of w stations in Ω can be tested; if it is large, however, the number of combinations becomes very large and the problem is classified as a difficult combinatorial optimization problem, for which an exhaustive search of all possible combinations is not possible in a reasonable amount of time.
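Before turning to the heuristic search, the following Python sketch shows how the objective of equations (2) and (3) can be evaluated for a candidate subset of stations. It implements the leave-one-out procedure of paragraph [20], treating each left-out station as the point to be estimated (point kriging rather than block kriging over areas A_i) and assuming an already fitted spherical variogram; the function and variable names (spherical, loo_mean_kriging_variance) are ours, and this is a minimal illustration, not the code used in the paper.

```python
import numpy as np

def spherical(h, c0, c, r):
    """Spherical variogram: nugget c0, partial sill c, range r (gamma(0) = 0)."""
    h = np.asarray(h, dtype=float)
    g = np.where(h <= r, c0 + c * (1.5 * h / r - 0.5 * (h / r) ** 3), c0 + c)
    return np.where(h == 0.0, 0.0, g)

def loo_mean_kriging_variance(coords, c0, c, r):
    """Mean leave-one-out ordinary kriging variance (equation (2)) for a
    candidate subset of station coordinates, array of shape (w, 2)."""
    coords = np.asarray(coords, dtype=float)
    w = len(coords)
    variances = []
    for i in range(w):                            # leave station i out
        others = np.delete(coords, i, axis=0)
        m = len(others)
        # ordinary kriging system: [Gamma 1; 1' 0] [lambda; mu] = [gamma_i; 1]
        d = np.linalg.norm(others[:, None, :] - others[None, :, :], axis=2)
        A = np.zeros((m + 1, m + 1))
        A[:m, :m] = spherical(d, c0, c, r)
        A[:m, m] = 1.0
        A[m, :m] = 1.0
        b = np.zeros(m + 1)
        b[:m] = spherical(np.linalg.norm(others - coords[i], axis=1), c0, c, r)
        b[m] = 1.0
        sol = np.linalg.solve(A, b)
        lam, mu = sol[:m], sol[m]
        variances.append(float(lam @ b[:m] + mu))  # sigma_E^2 at left-out station
    return float(np.mean(variances))
```

Lower values of the returned mean variance correspond to subsets that better reproduce the spatial correlation structure encoded in the variogram, which is what the optimization described next seeks to achieve.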
However, these problems may be tackled by heuristic algorithms that iteratively look for better solutions by trial and error. One of these is the well-known simulated annealing (SA) algorithm. It is one of the threshold algorithms included in the class of local search algorithms. The other two, as defined by Aarts and Korst [1990], are iterative improvement, where only neighbors that reduce σ̄_E^2 are accepted, and threshold accepting, where a deterministic nonincreasing threshold sequence is used, allowing neighbor solutions with larger σ̄_E^2 to be accepted, but only to a limited extent, because the threshold value is fixed and always decreases according to a very rigid control on the size of the objective function difference Δσ̄_E^2. Simulated annealing uses a more flexible control on the values of the threshold, allowing transitions out of a local minimum at nonzero temperatures.

[24] SA was first introduced by Kirkpatrick et al. [1983] as an algorithm to solve well-known combinatorial optimization problems while reducing the risk of falling into local minima (or metastable solutions) common to iterative improvement methods. These authors proposed the use of the Metropolis et al. [1953] procedure from statistical mechanics. This procedure generalizes iterative improvement by incorporating controlled uphill steps (to worse solutions). The procedure states the following: consider that the change in the objective function is Δσ̄_E^2; if Δσ̄_E^2 ≤ 0, then the change in the system is accepted and the new configuration is used as the starting point in the next step; if Δσ̄_E^2 > 0, then the probability that the change is accepted is given by P(Δσ̄_E^2) = exp(-Δσ̄_E^2/t), where t is a control parameter usually termed temperature; a random number uniformly distributed in the interval (0, 1) is drawn and compared with this probability; if the number is lower than P(Δσ̄_E^2), then the change is accepted. The SA algorithm runs as follows: (1) the system is melted at a high temperature (initial temperature, t1); (2) the temperature is decreased gradually until the system freezes (because no better solutions are found and the probability of uphill steps is near zero); (3) at each iteration the Metropolis procedure is applied; and (4) if any of the stopping criteria is reached, the algorithm is stopped and the best solution found is presented.

[25] The generic SA algorithm for a minimization, considering a neighborhood structure N and an objective function σ̄_E^2, has the following pseudocode.
[26] 1. Select an initial solution Xbest.
[27] 2. Select an initial temperature t1 > 0.
[28] 3. Select a temperature reduction factor, a.
[29] 4. Repeat.
  4.1. Repeat.
    4.1.1. Randomly select X ∈ N(Xbest); Δσ̄_E^2 = σ̄_E^2(X) - σ̄_E^2(Xbest).
    4.1.2. If Δσ̄_E^2 < 0, then Xbest = X; otherwise, generate random z uniformly in (0, 1).
    4.1.3. If z < exp(-Δσ̄_E^2/t), then Xbest = X;
  until iterations = max_iterations.
  4.2. Set t = a·t;
until stopping condition = true.
[30] 5. Xbest is the best solution found.

[31] Several improvements have been proposed to speed up the process, namely, limiting the number of iterations at each temperature, i.e., defining the number max_iterations. The dimension of the Markov chain has been proposed as a
function of the dimension of the problem [Kirkpatrick et al., 1983]: the temperature is maintained until 100N solutions (iterations), or 10N successful solutions, have been tested, whichever comes first, where N stands for the number of stations in the problem. These authors also proposed that the annealing be stopped (stopping criterion) if, after three consecutive temperatures, the required number of acceptances is not achieved. If the average value of the objective function does not change after a preestablished number of temperature decreases (RSTOP), the annealing may also be stopped. These parameters control the time spent at each temperature and the total running time. Along with these dynamic criteria, a static one may be used to halt the process when a minimum temperature, tmin, is reached. This guarantees that the annealing stops if none of the dynamic criteria is fulfilled, even before the total number of iterations is attained. In our algorithm both the dynamic and the static criteria were implemented.

[32] The initial temperature, t1, is calculated by running a fast (rapid temperature decrease) schedule and picking the temperature for which more than 95% of the iterations are accepted. The temperature is usually decreased at a constant rate, a, usually close to one (e.g., 0.90 or higher). Aarts and Korst [1990] showed that SA can find optimal solutions if equilibrium is attained at each temperature (constant objective function mean and variance) and proposed a temperature schedule dependent on the objective function variance to guarantee this. Despite this very attractive characteristic, such a schedule tends to converge too slowly. Other temperature schedules that guarantee optimality have also been proposed, by Geman and Geman [1984], Hajek [1988], and Siarry [1997]; these, however, may take too long for many practical problems [Cohn and Fielding, 1999]. The wealth of practical experience with the faster temperature schedule used here indicates that the solutions found should be good local optima. In practical terms, it is a compromise: relatively good solutions are obtained in an amount of time significantly smaller than that necessary to guarantee the very high quality solutions provided (in theory) by slower schedules.

[33] Simulated annealing has had many applications in water management studies and in related fields, and has proved to be a good optimization algorithm for difficult combinatorial problems. Examples are given by Kuo et al. [2001, 2003] and Berkowitz and Hansen [2001] in the optimization of crop irrigation; by Fallat and Dosso [1998] and Popov and He [2000] in geophysical data inversion for groundwater detection; by Kuo et al. [1992], Rizzo and Dougherty [1996], Skaggs et al. [2001a, 2001b], and Johnson and Rogers [2000] for groundwater remediation optimization schemes; by Cunha and Sousa [1999] for water distribution network design; by Johnson and Rogers [2001], Wang and Zheng [1998], Cunha [1999], Dougherty and Marryott [1991], and Marryott et al. [1993] for groundwater management; and by Ferreyra et al. [2002], Brus et al. [2000], Dixon et al. [1999], Meyer et al. [1994], Storck et al. [1997], and Lee and Ellis [1996] for sampling optimization.
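To tie the pieces together, the sketch below implements the annealing loop of paragraphs [25]-[30] in Python for the station-subset problem, using an objective such as loo_mean_kriging_variance from the earlier sketch. The chain length (about 100N iterations per temperature) and the geometric cooling follow the text above, but the function and parameter names (anneal_subset, chain_len, rstop), the swap neighborhood, and the simplified stopping rule are our own assumptions; it is an illustration, not the authors' implementation.

```python
import math
import random

def anneal_subset(all_ids, w, objective, t1, alpha=0.95,
                  chain_len=None, t_min=1e-3, rstop=3):
    """Simulated annealing over subsets of w stations, minimizing the
    mean leave-one-out kriging variance returned by `objective`."""
    n = len(all_ids)
    chain_len = chain_len or 100 * n             # Markov chain length ~ problem size
    current = random.sample(list(all_ids), w)
    f_curr = objective(current)
    best, f_best = list(current), f_curr
    t, stalled = t1, 0
    while t > t_min and stalled < rstop:         # static and dynamic stopping criteria
        improved = False
        for _ in range(chain_len):
            # neighbour move: swap one selected station for an unselected one
            candidate = list(current)
            candidate[random.randrange(w)] = random.choice(
                [s for s in all_ids if s not in current])
            delta = objective(candidate) - f_curr
            # Metropolis rule: accept improvements, sometimes accept uphill moves
            if delta < 0 or random.random() < math.exp(-delta / t):
                current, f_curr = candidate, f_curr + delta
                if f_curr < f_best:
                    best, f_best = list(current), f_curr
                    improved = True
        stalled = 0 if improved else stalled + 1
        t *= alpha                                # geometric cooling schedule
    return best, f_best
```

In the procedure of section 3.1, a run of this kind would be repeated for each measuring time, with the variogram model fitted to that time (Table 1 below), and the stations appearing most frequently across the resulting optimized subsets would be retained in the final network.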
4. Application to the Case Study

[34] A specific hydrochemical monitoring campaign was carried out by the Mining and Geological Institute of Portugal (IGM) to assess the spatial and temporal variability of nitrate concentrations in the Gabbro of Beja aquifer system, in the vicinity of the city of Beja (Figure 1).
Table 1. Parameters of Spherical Space Variogram Models for the 19 Measuring Times

Measuring Time Tm    Number of Stations    C0, (mg/L)2    C, (mg/L)2    r, m
July 1997                    28                345.0          700.0     3000.0
September 1997               19                962.0         1931.4     2762.0
January 1998                 19                609.0         2107.0     4336.5
March 1998                   36                975.0         1440.0     1404.0
May 1998                     41                742.5          987.4     2538.0
June 1998                    36                675.0         1084.5     1969.0
July 1998                    50                862.0         1341.5     3450.0
August 1998                  50                865.0         1405.0     3342.0
November 1998                53                836.8         1073.7      994.0
January 1999                 47                849.2         1124.2     2887.0
March 1999                   50                540.1          637.6     2020.0
May 1999                     52                317.3         1477.6     1977.0
July 1999                    48                530.8         1156.4     1870.0
October 1999                 59                270.6          644.9     1700.0
November 1999                63                715.8          986.1     1240.0
January 2000                 65                487.4         1007.5     2400.0
March 2000                   65                308.0         1368.7     1177.0
May 2000                     59                250.0         1289.3     3000.0
July 2000                    67                393.6          850.9     1427.0
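As a small worked example (ours, not the paper's), the first row of Table 1 defines a spherical variogram with nugget C0 = 345 (mg/L)2, partial sill C = 700 (mg/L)2, and range r = 3000 m, which can be evaluated with the spherical() helper sketched in section 3.2:

```python
# Spherical variogram for the July 1997 campaign (first row of Table 1),
# reusing the spherical() helper from the section 3.2 sketch.
gamma_jul97 = lambda h: spherical(h, c0=345.0, c=700.0, r=3000.0)

print(gamma_jul97(1500.0))   # semivariance at 1.5 km separation (within range)
print(gamma_jul97(3500.0))   # beyond the range: C0 + C = 1045 (mg/L)^2
```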
Monitoring was undertaken between July 1997 and July 2000 in deep (30-40 m) and shallow (