WATER RESOURCES RESEARCH, VOL. 46, W12553, doi:10.1029/2009WR008953, 2010

Optimization of water level monitoring network in polder systems using information theory

Leonardo Alfonso,1 Arnold Lobbrecht,1,2 and Roland Price1

Received 27 November 2009; revised 26 August 2010; accepted 1 September 2010; published 30 December 2010.

[1] A method for siting water level monitors based on information theory measurements is presented. The first measurement is joint entropy, which evaluates the amount of information content that a monitoring set is able to collect, and the second measurement is total correlation, which evaluates the level of dependency or redundancy among monitors in the set. In order to find the most convenient set of places to put monitors from a large number of potential sites, a multiobjective optimization problem is posed under two different considerations: (1) taking into account the costs of placing new monitors and (2) considering the cost of placing monitors too close to hydraulic structures. In both cases, the joint entropy of the set is maximized and its total correlation is minimized. The costs are considered in terms of information theory units, for which additional terms affecting the objective functions are introduced. The proposed method is applied in a case study of the Delfland region, Netherlands. Results show that total correlation is an effective way to measure multivariate independency and that it must be combined with joint entropy to get results that cover a significant proportion of the total information content of the system. The maximization of joint entropy gives results that cover between 82% and 85% of the total information content. Citation: Alfonso, L., A. Lobbrecht, and R. Price (2010), Optimization of water level monitoring network in polder systems using information theory, Water Resour. Res., 46, W12553, doi:10.1029/2009WR008953.

1. Introduction

[2] In order to manage any water system properly, a reliable hydrometric network for collection of the hydrologic cycle variables is required. Several methods to design these networks have been proposed in the literature. A comprehensive review can be found in the work by Mishra and Coulibaly [2009], in which statistically based methods, information theory‐based methods, user survey, hybrid methods, and physiographic components and sampling strategies are mentioned. According to the purpose of the monitoring network, the authors classified the networks as surface water (precipitation or streamflow), groundwater, or water quality networks. In this paper, emphasis is given to water level gauges used in controlled (polder) systems, which are essentially flat, and where water level fluctuations are kept within a predefined range using hydraulic structures such as pumps and weirs. In this kind of system, the information provided by the gauges plays a significant role in the decision making process for the operation of the structures. These water level gauges, moreover, are expensive to operate and maintain and do not necessarily provide all the information required to adequately describe the state of a water system, so the size and scope of the network must be carefully considered. Such a consideration requires the use of methods to estimate the amount of information that can be collected at a given location in order to place new monitors optimally or to relocate existing monitors within specific constraints.

[3] The main idea behind the design (or optimization) of any monitoring network is to reduce the uncertainty associated with the estimation of the variables of interest at the nonmonitored locations. Although the concept of uncertainty has been traditionally linked with statistical variance because of its simplicity, Amorocho and Espildora [1973] first introduced information theory (IT) to water resources applications, pointing out that statistical variance was not an objective index of quality when comparing predicted values of a hydrologic model and the series of data records. Diverse authors have applied IT concepts to the design or evaluation of monitoring networks for different purposes, such as water quality, groundwater, or rainfall gauging stations. Husain [1989] presented a methodology to estimate regional hydrologic uncertainty and information for gauged and ungauged areas in a basin, in order either to select an optimum number of stations from a dense network or to expand a sparse network. The first was done by maximizing the mutual information and the second was done by identifying minimum hydrologic information zones. In order to evaluate rainfall gauge networks, Krstanovic and Singh [1992] used the principle of maximum entropy. After deriving expressions for entropy, joint entropy, and transinformation, the authors presented criteria to keep or eliminate a rain gauge that depend on a reduction or gain of information at a particular rain gauge. Yang and Burn [1994] presented an approach to data collection design in which a measure of information flow between hydrometric gauges was used; this information flow was measured with entropy theory. A comprehensive literature review on the use of information theory in water resources can be found in work by Singh [1997].

1 Hydroinformatics and Knowledge Management, UNESCO‐IHE, Delft, Netherlands.
2 HydroLogic BV, Amersfoort, Netherlands.

Copyright 2010 by the American Geophysical Union. 0043‐1397/10/2009WR008953

1.1. Monitoring Networks in Polders

[4] A polder is a low‐lying area that is artificially disconnected from the hydrologic regime of neighboring regions by means of hydraulic structures and dikes in order to keep the water level within convenient ranges in the area of interest. Polders drain excess water independently into higher areas or to the sea, forming a system that requires observation of water levels in every polder so that a proper management of the hydraulic structures that connect them is achieved. Therefore, significant financial effort is needed to measure the water levels in these systems adequately, so ways to optimize these corresponding monitoring networks are of interest.

[5] The problem of locating water level monitors in polders was first addressed by Alfonso et al. [2010], who developed an information theory‐based approach for the design of water level monitoring stations using three pairwise criteria for measuring independency among the monitors. The present paper has two main objectives: (1) to explore a multiobjective optimization solution to basically the same problem presented by Alfonso et al. [2010] and (2) to include additional, practical constraints in the original problem. The new solution is complementary to the method described by Alfonso et al. [2010] and is designed to include additional constraints that cannot be accommodated by the first method.

1.2. Information Theory

[6] IT provides useful expressions to measure information, understood as the reduction in uncertainty. In this paper two expressions are used: (1) entropy (H), to estimate the information content of a random variable, and (2) total correlation (C), to measure the nonlinear dependency among two or more random variables. Details of each expression are presented below.

1.2.1. Entropy (Information Content)

[7] Entropy H(X) indicates how surprising it is, on average, to get a symbol x from a random variable X that can take the possible symbols x1, x2, …, xn, each with probability p(xi):

$$H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i) \tag{1}$$

Units are nats if the base of the logarithm is e and bits if it is 2. In this paper the latter will be used. It is important to note that some authors (e.g., T. Schneider, Information theory primer with an appendix on logarithms, 2000, available at http://www.ccrnp.ncifcrf.gov/∼toms/paper/primer/) call H(X) “uncertainty” instead of “entropy” to avoid confusion with the thermodynamic entropy. In this paper, the probabilities p(xi) are estimated by the histogram‐based (relative frequency) method with a given bin size or number of classes or intervals as used, for example, by Markus et al. [2003], Maruyama et al. [2005], and Mishra et al. [2009]. Further explanation on the estimation of probabilities is given in section 2.2.

[8] The amount of information that is contained in two variables X, Y is given by the joint entropy,

$$H(X,Y) = -\sum_{i=1}^{n} \sum_{j=1}^{m} p(x_i, y_j) \log p(x_i, y_j) \tag{2}$$

in which p(xi, yj) is the joint probability of the variables X and Y.

1.2.2. Total Correlation

[9] The concept of total correlation [McGill, 1954; Watanabe, 1960], C(X1,X2,…,XN), provides a direct and effective way of assessing the dependency among multiple variables:

$$C(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i) - H(X_1, X_2, \ldots, X_N) \tag{3}$$

It can be noted that for the case of N = 2, total correlation is equivalent to the well‐known transinformation (or mutual information). The term H(X1,X2,…,XN) is the multivariate joint entropy (equation (2) for the case of N variables) of the set X1,X2,…,XN. Total correlation can be calculated by following the grouping property of mutual information [Kraskov et al., 2005], which for the three variables X, Y, and Z is as follows.

[10] 1. A new variable A is built up by agglomerating X and Y in such a way that H(A) = H(X,Y). The procedure of “agglomeration” consists of placing in A a unique value for every unique combination of the corresponding records in X and Y. For instance, if X = [1,2,1,2,1] and Y = [2,3,1,3,2], then one of the options to agglomerate X and Y to build the variable A is to put all the corresponding digits (or symbols) of X and Y together, i.e., A = [12,23,11,23,12].

[11] 2. Following the same concept, a new variable B is built by the agglomeration of A and Z, also with the condition H(B) = H(A,Z).

[12] 3. The mutual information values between the selected pairs for agglomeration, i.e., C(X,Y) = H(X) + H(Y) − H(A) and C(A,Z) = H(A) + H(Z) − H(B), are calculated.

[13] 4. The total correlation of X, Y, and Z is calculated by summing up the partial total correlations obtained for each built variable, i.e., C(X,Y,Z) = C(X,Y) + C(A,Z).

[14] As can be noted, this method does not need to assess H(X1,X2,…,XN); therefore, the estimation of the joint probability distribution p(x1,x2,…,xN) is not needed. After having calculated C(X1,X2,…,XN) by following the steps above, the multivariate joint entropy H(X1,X2,…,XN) can then be calculated from equation (3) as

$$H(X_1, X_2, \ldots, X_N) = \sum_{i=1}^{N} H(X_i) - C(X_1, X_2, \ldots, X_N) \tag{4}$$

Although Alfonso et al. [2010] used the total correlation concept to check the dependency of the monitors obtained by using pairwise criteria, in this paper total correlation is used directly as an objective in the optimization process, as described in section 2.
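As an illustration, the measures of section 1.2 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`entropy`, `agglomerate`, `total_correlation`, `joint_entropy`) are hypothetical, the records are assumed to be already discretized, probabilities are taken as relative frequencies, and agglomeration uses tuples instead of concatenated digits so that two different combinations can never produce the same symbol. The example reuses X = [1,2,1,2,1] and Y = [2,3,1,3,2] from the description of the agglomeration procedure.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Entropy in bits (equation (1)), with p(x_i) as relative frequencies."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def agglomerate(x, y):
    """Build A with one unique symbol per unique (x, y) pair, so H(A) = H(X, Y)."""
    return list(zip(x, y))


def total_correlation(series):
    """Total correlation by successive agglomeration.

    For three variables this is C(X, Y) + C(A, Z), where A agglomerates X and Y.
    """
    agg = list(series[0])
    c = 0.0
    for nxt in series[1:]:
        merged = agglomerate(agg, nxt)
        # Mutual information between the current agglomerate and the next variable.
        c += entropy(agg) + entropy(nxt) - entropy(merged)
        agg = merged
    return c


def joint_entropy(series):
    """Joint entropy from equation (4): sum of marginals minus total correlation."""
    return sum(entropy(s) for s in series) - total_correlation(series)


x = [1, 2, 1, 2, 1]
y = [2, 3, 1, 3, 2]
a = agglomerate(x, y)
print(a)
print(entropy(x), entropy(y), entropy(a))
print(total_correlation([x, y]), joint_entropy([x, y]))
```

In this small example every value of Y corresponds to a single value of X, so H(A) equals H(Y) and C(X,Y) equals H(X): X adds no information once Y is known.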

2. Optimization of Joint Entropy and Total Correlation in Monitoring Network Design [15] Regarding the use of information theory in the design of monitoring networks, the analysis is based on looking for


those locations where the information content about a particular water‐related variable is a maximum, so that a monitoring device placed there has “potential information” in the sense that once placed, it would reduce uncertainty by providing information [Mogheir and Singh, 2002]. For more than one variable (i.e., more than one monitoring device), joint entropy is used because it represents the information content of the set of monitoring devices, which can be maximized, as Caselton and Husain [1980] did for the case of reducing the size of an existing rainfall monitoring network. [16] Additionally, minimizing transinformation between monitors is the basis of designing monitoring networks by applying information theory. Naturally, the placement of two monitors that provide exactly the same information is not optimum. In other words, the redundancy of the monitors should be as small as possible [Mishra and Coulibaly, 2009]. In this paper, total correlation is used to measure redundancy among multiple variables. [17] The main contribution of this paper is that joint entropy and total correlation are independent objectives that must be optimized. A Venn diagram can be used to illustrate this idea, where the information content of a monitor is represented by the area of a circle; the information shared between variables corresponds to their overlapping areas, and the information content of the set of variables is described by the total area covered by the circles [Cover and Thomas, 1991; Ruddell and Kumar, 2009]. Suppose, for example, that 10 potential locations are available to place three monitors; the equivalent Venn diagram might appear as in Figure 1a, which represents the information content of 10 variables and their common information. The task is to select the most informative set of three variables that, simultaneously, are least interdependent; that is, the summation of the overlapped areas is a minimum. 
Three possible solutions for this generic example are shown in Figures 1b, 1c, and 1d. The relationships in terms of information content for each solution are shown in the first row of Figure 1; the area representing the joint entropy of each solution set is shown by the total covered area in the second row of Figure 1 (to be maximized), and the overlapping areas representing total correlation are shown in the third row (to be minimized).

2.1. Optimal Water Level Monitoring Location

[18] The aim is to find the set of new stations X = {X1,X2,…,XM} that optimally complement the set of N already existing stations E = {E1,E2,…,EN} in such a way that the joint set of M + N stations S = {E, X} provides the maximum possible information content with the minimum shared information between them. The first objective is described by the joint entropy, equation (4), whereas the second is described by the total correlation, which is estimated following the steps described previously in section 1.2.2. To understand why the second objective is needed, the situation in which two water level gauges are located extremely close to each other can be considered: they would, in effect, record the same time series; they would have the same information content; and therefore both stations would be completely dependent (redundant).

[19] Additionally, every single station should be placed where the highest information content can be extracted. Under this consideration, water system points with constant or quasi‐constant water level records should not be selected because they do not provide any further information (i.e.,


water level value does not change, so just one record would be enough to describe the state of that point). This consideration can be added as a constraint in the multiobjective optimization problem (MOOP) by excluding low‐entropy points from the decision variable set, which in turn implies a reduction in the search space and therefore a reduction of the computational effort. The definition of the threshold to identify low‐entropy points is described in section 2.3.

[20] Taking into account the previous considerations, the MOOP is mathematically formulated as follows:

$$\begin{aligned} &\min \{ C(S_{M,N}) = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) \} \\ &\max \{ H(S_{M,N}) = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) \} \end{aligned} \tag{5}$$

subject to H(Xi) greater than the low‐entropy threshold and where N plus M equals the number of monitors to place. 2.2. Methods and Tools [21] It is considered that every point within a water system is, in principle, a potential location for a monitoring point. For this purpose, the methodology considers the use of hydrodynamic and rainfall‐runoff models to generate a water level time series at a dense, finite set of calculation points. In this way, a number of water level records are generated with a predefined record length, from which the information‐related measures are calculated. [22] The procedure to estimate the probabilities required for the calculation of equations (1), (2), and (4) follows a frequency analysis applied for the transformed water level time series using the floor function, yielding a regular discretization of water levels (in terms of level) at irregular intervals of time. In this way, high‐frequency, low‐amplitude water level changes produced, for example, by dynamic waves or pumping stations are filtered out. A detailed analysis and the implications are presented by Alfonso et al. [2010]. [23] It is important to note that the solutions obtained by solving equation (5) will only be optimal for the time step at which the data records are generated by the model. Therefore, the model must be manipulated according to the final aim and use of the monitoring network by considering the proper time and space steps. Similarly, the resultant monitoring network will be adequate for capturing the information content of the physics of the runoff process associated with the rainfall event used to produce the water level time series. That is, every single rainfall event has associated with it an optimal monitoring network, because different pump stations start pumping at different times and for different periods. For this study a rainfall event of a 5 year return period was used (much less extreme than the one used by Alfonso et al. 
[2010]) as it is of particular interest in assessing regular flood risks for the region. The MOOP is solved using the nondominated sorting genetic algorithm II (NSGA‐II) by Deb et al. [2002] and is implemented in the NSGAX software [Barreto et al., 2006].

2.3. Case Study

[24] A case study in a subdistrict of Delfland, Netherlands, is considered. It has an area of 18.80 km² that is divided into 130 hydrologic response units (artificial subcatchments connected by small canals, which have their water levels below sea level and the main storage basin), 14 pump stations (9 with a capacity larger than 15 m³/h), 21 weirs, and 40 different target water levels (Figure 2). The pumps are


Figure 1. Venn diagrams illustrating the proposed optimization problem. (a) Information content of 10 variables and their common information. (b) Possible solutions for the selection of (top) three monitor locations, obtained by maximizing (middle) joint entropy and minimizing (bottom) total correlation.

Figure 2. Canal network and target water levels in the polder system, Pijnacker region, Delfland, Netherlands. Water is drained from low areas to high areas through pump stations to the storage basin.

Figure 3. Definition of low‐entropy points to be discarded from the search space for the optimization, according to the relative frequency of the entropy of the points in the Delfland system.

operated to keep the water levels between predefined limits, according to interests such as flood control, navigation, agriculture, and recreation. The region is mainly rural, with some urban areas (5.7 km²) and glasshouses (2.82 km²). A more detailed description of Delfland water management can be found in work by Lobbrecht [1997]. The existing water level monitoring network fails to provide the proper information about the state of the system, especially under extreme weather conditions. This is because the measurements are basically used to check whether the current water levels are between the on/off levels of the pumps and how to operate them accordingly. The measurement points of the nine biggest pumping stations are hereafter identified as existing pump monitors (EPM). The hydrodynamic model used to generate the water level time series was run with a typical rainfall event for the area. Even though the rainfall event lasted about 19 h, the model was run for a simulation period of 8 days and 17 h, with a time step of 15 min, in order to take into account all of the operations of the hydraulic structures before and after the event. Under these conditions, 1520 water level time series, each with a length of 810 records, were obtained, one for every computational point in the water system.

[25] The threshold to define low‐entropy points to be discarded from the search space in equation (5) was defined by looking at the relative frequency of the entropies in the system. Figure 3 shows that more than 50% of the points have an entropy value below 0.1 bits, which is less than 7% of the entropy of the point with maximum entropy. This implies that the search space is dramatically reduced, by a factor of several hundred. This threshold value (0.1 bits) is used in the constraints of equations (6) and (7), introduced in section 3.
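The screening of low‐entropy points described in sections 2.2 and 2.3 can be sketched as follows. This is only an illustrative sketch: the quantization `floor(level / bin_width)` is one plausible form of a floor‐based transformation (the exact transformation is given by Alfonso et al. [2010] and is not reproduced here), and `bin_width`, the function names, and the toy records are hypothetical. Only the 0.1 bit threshold is taken from the text.

```python
from collections import Counter
from math import floor, log2


def quantize(levels, bin_width):
    """Map each water level to an integer class using the floor function.

    Small fluctuations that stay inside one class are filtered out.
    """
    return [floor(v / bin_width) for v in levels]


def entropy_bits(symbols):
    """Marginal entropy in bits from relative frequencies (equation (1))."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def candidate_points(series_by_point, bin_width, threshold_bits=0.1):
    """Return {point: entropy} for points whose entropy exceeds the threshold."""
    kept = {}
    for point, levels in series_by_point.items():
        h = entropy_bits(quantize(levels, bin_width))
        if h > threshold_bits:
            kept[point] = h
    return kept


# Hypothetical toy records in metres; real series would come from the
# hydrodynamic model (1520 points with 810 records each in the case study).
records = {
    1: [-2.52, -2.52, -2.53, -2.52, -2.52, -2.52],  # quasi-constant level
    2: [-2.52, -2.41, -2.33, -2.38, -2.47, -2.52],  # fluctuating level
}
kept = candidate_points(records, bin_width=0.05)
print(kept)
```

With these toy records, point 1 falls into a single class (zero entropy) and is discarded, while point 2 is retained as a candidate location.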

3. Approaches to Pose the Optimization Problem for Delfland Case [26] Two practical situations occur in the Delfland water system: the need to introduce financial restrictions on installing new monitors (approach a), and the problem of the accuracy of measurements taken near hydraulic structures because of small water level fluctuations (approach b).

Although both situations should be considered simultaneously, they are applied separately in order to facilitate the analysis of the results. For both situations it is assumed that nine monitors in total need to be located in the system.

3.1. Approach a

[27] For the first situation we define an additional term u*M, representing the cost (in informative units) of having to place M new monitors. The parameter u is a constant with cost units of bits per new monitor. The term u*M is then added to the total correlation (u bits of redundancy are added to the set) and subtracted from the joint entropy (u bits of joint information are subtracted from the set) every time the set contains a new monitor. In this way, the optima are kept separate for both objectives as M increases. Although u may differ according to the location of a particular monitor, a constant value is considered to simplify the problem. For the subsequent experiments, u is defined equal to 1 bit per new monitor; a sensitivity analysis of this parameter is presented at the end of the paper. The resultant optimization problem can be written as

$$\begin{aligned} &\min \{ F_1 = C(S_{M,N}) + uM = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) + uM \} \\ &\max \{ F_2 = H(S_{M,N}) - uM = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) - uM \} \end{aligned} \tag{6}$$

subject to H(Xi) > 0.1 bits and 0 ≤ M ≤ 9; 0 ≤ N ≤ 9; M + N = 9.

3.2. Approach b

[28] For the second situation we introduce the term q*v, which represents the cost, in informative units, of having to place monitors close to hydraulic structures. The number of times the distance ds (minimum distance between a monitor and a structure evaluated over all possible combinations of monitors and structures) is violated by a particular solution is v; q is a constant with cost units of bits per violation of minimum distance. It is assumed ds = 50 m and q = 1 bit per violated ds. As in the first situation, the term q*v is added to the total correlation and is subtracted from the joint entropy of each evaluated set to keep the optima away from the ideal point (min C and max H) as v increases. The resultant optimization problem can be written as

$$\begin{aligned} &\min \{ F_1 = C(S_{M,N}) + qv = C(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) + qv \} \\ &\max \{ F_2 = H(S_{M,N}) - qv = H(X_1, X_2, \ldots, X_M, E_1, E_2, \ldots, E_N) - qv \} \end{aligned} \tag{7}$$

subject to H(Xi) > 0.1 bits and 0 ≤ M ≤ 9; 0 ≤ N ≤ 9; M + N = 9.

[29] In order to solve equations (6) and (7), NSGA‐II [Deb et al., 2002] was used with the following evolutionary parameters: crossover probability 90%, mutation probability 1/9. The population size and the number of generations were set after several experiments with different values; a population size of 500 and 2000 generations were found convenient, because with bigger values the solutions do not improve significantly for the two situations.
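The penalized objectives of equations (6) and (7) can be sketched in a few lines. This is a hedged illustration and not an NSGA‐II implementation: the function names, the planar coordinates in metres, and the toy values of C, H, and v are assumptions introduced here, and `pareto_front` is a plain nondominated filter used only to show how penalized sets compare. Violations are counted over all monitor and structure pairs, as described for ds.

```python
from math import hypot


def objectives_a(c, h, m_new, u=1.0):
    """Equation (6): return (F1 to minimize, F2 to maximize) for M new monitors."""
    return c + u * m_new, h - u * m_new


def count_violations(monitors_xy, structures_xy, ds=50.0):
    """Count monitor-structure pairs closer than the minimum distance ds."""
    return sum(
        1
        for mx, my in monitors_xy
        for sx, sy in structures_xy
        if hypot(mx - sx, my - sy) < ds
    )


def objectives_b(c, h, v, q=1.0):
    """Equation (7): return (F1 to minimize, F2 to maximize) for v violations."""
    return c + q * v, h - q * v


def dominates(p, r):
    """True if p is no worse than r in both objectives and better in one."""
    return p[0] <= r[0] and p[1] >= r[1] and (p[0] < r[0] or p[1] > r[1])


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(r, p) for r in points)]


# Hypothetical coordinates (metres): one monitor lies near two structures.
monitors = [(0.0, 0.0), (100.0, 0.0)]
structures = [(30.0, 0.0), (0.0, 40.0), (500.0, 500.0)]
v = count_violations(monitors, structures)

# Hypothetical candidate sets given as (C, H, v) in bits and violation counts.
candidates = [(0.2, 2.5, 0), (1.0, 4.0, 0), (0.9, 4.2, 1), (0.3, 2.0, 0)]
scored = [objectives_b(c, h, n_viol) for c, h, n_viol in candidates]
front = pareto_front(scored)
print(v, front)
```

In this toy case the third set has the highest raw joint entropy, but its single violation moves it behind the second set in both objectives, which is the intended effect of the penalty term.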

4. Analysis of Results

[30] In order to facilitate the analysis, the calculation points of the system have been labeled with integer numbers from 1 to 1520. It is worth noting that the total joint entropy of the system (i.e., the joint information contained in these 1520 points as a single set) is Hsys = H(X1,X2,…,X1520) = 4.91 bits, a value that represents the ideal amount of information that the network of monitors S should provide. The results obtained are evaluated with respect to this value. It must be noted that in the authors’ previous paper, the value for Hsys = 9.2 bits was obtained using a rainfall event with a much higher return period.

4.1. Results for Approach a

[31] The Pareto‐optimal set of solutions obtained for the first situation is presented in Figure 4, where the solutions are characterized according to the number of existing monitors that were picked up in the optimization process (0, 1, 2, and 3 EPM). From this point onward, the notation SM,N will be used to show that the set of monitors S is composed of M new monitors and N existing monitors.

[32] The solution S0,9 (corresponding to the full set of EPM) is not presented in Figure 4 because it lies outside the figure scale, but its location can be observed in Figure 5. This solution has a very small total correlation (close to 0) but also a relatively small information content (1.04 bits); one record taken at each of these monitors would jointly provide slightly more than 1 bit of information on average, or about 20% of the information of the state of the system, meaning that the current monitoring network is far from optimal.

[33] Several interesting facts can be mentioned (see Figures 4 and 5).

[34] 1. Solutions for N = 4, 5, 6, 7, 8, and 9 are always dominated by other solutions, so they are not found to be part of the Pareto front.

[35] 2. Existing monitors 541, 1337, 669, and 842 are not selected in any scenario.

[36] 3. Solutions for S3,6 always include monitors 1187, 465, and 56. These N = 6 (new) monitors make the joint entropy vary between 3 and 3.5 bits (between 60% and 70% of Hsys) and between 0.2 and 0.6 bits in total correlation terms.

[37] 4. All the previously discussed solutions (S3,6) are always dominated by solutions that consider fewer existing monitors and more new monitors in the final set.

[38] 5. For the scenario S2,7 we found solutions with joint entropy between 3 and 4 bits that range between 0.2 and 1.1 bits of total correlation. Only three combinations of two existing monitors (1187, 56), (1187, 465), and (1203, 56) are part of the Pareto front of optimal solutions.

[39] 6. From Figure 4 it is clear that this Pareto front is closer to the ideal value (C = 0, HJ = 4.91), and therefore it dominates the previously discussed solution S3,6.

[40] 7. For the case of S1,8, the resultant Pareto front dominates practically all the solutions obtained for the previously discussed scenarios.

[41] In order to characterize the solutions, the extremes of the Pareto front are analyzed (Figure 4), where Xa identifies the solution that maximizes the joint entropy (bottom right extreme) and Ya identifies the solution that minimizes the total correlation (top left extreme). The subindex a is provided to distinguish the solutions obtained for approach a and approach b. First, the solution at Xa, which maximizes the (negative) joint entropy, places (all new) monitors at S9,0 = (587, 991, 286, 1030, 57, 458, 42, 175, 1204), with joint entropy of 4.31 bits (85% of Hsys) and total correlation of 1.19 bits. Second, the solution at Ya, which minimizes the total correlation, corresponds to the selection of (all new) monitors S9,0 = (827, 1490, 1078, 1265, 620, 704, 891, 394, 1151), which is a set with total correlation of 0.0 and joint entropy of 1.51 bits (30% of Hsys).

[42] These two sets of monitors, Xa and Ya, as well as the solution for which S = EPM, are located in the map of the system in Figure 5. As expected, the monitoring sets are different in spatial terms.
The most important monitors of each extreme, in terms of information, are the monitors Xa6 (point 458) and Ya7 (point 891), because they are located in a zone with high marginal entropy. However, in both extremes Xa and Ya (and regardless of having excluded the points with the lowest entropy), points with low entropy are included in the solutions. On one hand, at the extreme Xa, the point Xa2 = 991 has a low information content, which means that placing monitors at the eight remaining points would be enough. This situation may lead to a refined criterion to determine the number of monitors to be placed, so that no assumptions in this regard would be needed and the second constraint of equations (6) and (7) would not need to be considered.

[43] On the other hand, all the monitors of the solution at extreme Ya are located at sites with very low information content, with the exception of point Ya6 = 704 with H(Ya6) = 1.5 bits. This explains why this solution has such a low total correlation: low‐entropy points are more independent from the rest than high‐entropy points. Naturally, this solution is far from being a good set for monitoring because it does not provide significant joint information.

4.2. Results for Approach b

[44] Following the same identification pattern used for approach a, the extremes of the Pareto front in Figure 6 are used, Xb being the solution that maximizes the joint entropy, and Yb being the solution that minimizes the total correlation. In addition, Figure 6 discriminates the solutions by the number of violations v of the minimum distance ds. Several observations can be made. First, it can be noticed that,


Figure 4. Pareto‐optimal set of solutions discriminated by EPM, approach a. Extremes Xa and Ya are indicated for further analysis. Results obtained with WMP method [Alfonso et al., 2009] are also indicated.

Figure 5. Delfland water system with location of solutions for approach a obtained at the extremes Xa and Ya of the Pareto frontier of Figure 4. Solution for S = EPM is also included. Scale represents the marginal entropy at each system point estimated with equation (1).

Figure 6. Pareto‐optimal front, approach b, discriminated by the number of times the minimum distance ds is violated by the solution set. Extremes Xb and Yb are indicated for further analysis. Results obtained with WMP method [Alfonso et al., 2009] are also indicated.

despite having only nine monitors to place, some of the solutions have violated the distance rule 10 times, which means that at least one monitor was close to more than one weir or pump station. This was expected because of the high density of hydraulic structures in the area. Second, solutions with low total correlation are found only when no violations take place. This implies that nonredundant sets of monitors can only be placed away from hydraulic structures. However, the price of this independency is that jointly they collect relatively little information (less than 60% of the information content of the system); the trade‐off between the two information quantities is again evident. Third, three solutions give the highest joint entropy and correspond to solutions with 3, 4, and 5 distance violations.

[45] A detailed analysis of the optimization is presented in Table 1, which categorizes the 500 solutions obtained for approach b in terms of violations of the minimum distance due to the presence of pumps and/or weirs. For solutions that combine violations by pumps and by weirs, the majority of violations are caused by pumps. For instance, for six violations, the combined possibilities are five pumps plus one weir (3 solutions), four pumps plus two weirs (22 solutions), and three pumps plus three weirs (18 solutions) (Table 1). In other words, the number of violations due to the proximity of the monitors to the pumps is generally larger than the number of violations due to their proximity to the weirs. This can be explained by the fact that a pump operation adds entropy to the upstream and downstream neighboring points, while a fixed weir in
modular regime stabilizes the water levels in a way that downstream points reduce their marginal entropy. However, no trivial pattern in the Pareto front was identified on the basis of the number of pumps or the combination of pumps and weirs.

Table 1. Number of Solutions for Approach b With Minimum Distance Violations by Pumps and Weirs

| Violations by weirs \ Violations by pumps | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Total |
|---|---|---|---|---|---|---|---|---|
| 0 | 79 | 21 | 8 | 6 | 4 | ‐ | ‐ | 118 |
| 1 | 30 | 33 | 30 | 30 | 18 | 3 | ‐ | 144 |
| 2 | 5 | 16 | 15 | 19 | 22 | 23 | ‐ | 100 |
| 3 | ‐ | ‐ | 14 | 18 | 26 | 31 | 7 | 96 |
| 4 | ‐ | ‐ | ‐ | ‐ | 8 | 19 | 15 | 42 |
| Total | 114 | 70 | 67 | 73 | 78 | 76 | 22 | 500 |

[46] The resulting monitoring sets obtained at both extremes of the Pareto front (for maximum joint entropy and for minimum total correlation) are shown on the map of the water system in Figure 7. The solution for the extreme Xb gives a maximum joint entropy of 4.04 bits, about 82% of Hsys. However, two points appear to carry little information: these are Xb2 (point 776) and Xb9 (point 170). Similarly to the first situation, it has been found that nine monitors are not necessary: indeed, seven are enough to describe the information content of the system. For the case of the extreme Yb, we find again a situation similar to the first: the majority of the computational points have negligible information content, with the exception of points Yb3 (point 1030) and Yb7 (point 801). This result shows again that points with very low entropy contribute to decreasing the total correlation of the set. However, as in the previous set, this solution should not be taken into account because it provides low information content (about 2.0 bits or 40% of Hsys).

Figure 7. Delfland water system with location of solutions for approach b obtained at the extremes Xb and Yb of the Pareto frontier of Figure 6. Location of existing hydraulic structures is also included. Scale represents marginal entropy values at each system point (bits).

4.3. Comparison of Results With WMP Approach

[47] The sets of monitoring networks obtained through the MOOP for approaches a and b are compared with the water level monitoring design in polders (WMP) method [Alfonso et al., 2010], in which three pairwise criteria, namely, transinformation Ixy and the directional information transfers DITxy and DITyx [Yang and Burn, 1994], are used to evaluate the independency of the monitoring set. The WMP is a step‐based method that can be classified as a greedy algorithm, in which the next best monitor (with high information content and low dependency) is selected at each step, given the total number of monitors. For the sake of comparing results, the WMP method was run for the same rainfall event used in this paper. It must be noted that the WMP method was run with no constraints, so the resultant
monitoring network is composed of only new devices, and their proximity to hydraulic structures is not taken into account.

[48] The information theory characteristics of the monitoring sets obtained with the WMP method are included in the Pareto‐optimal solutions of Figures 4 and 6. The result for Ixy is almost identical to the result for DITyx, so they are presented as a single point in the graphs, and the result for DITxy is out of the scale of both graphs. The immediate conclusion is that the WMP method provides solutions that are part of both the Pareto‐optimal front obtained for sets composed of new monitors only (Figure 4) and the Pareto‐optimal front obtained for sets that do not violate the minimum distance to hydraulic structures.

Figure 8. Progress of information quantities as new monitors are added. Analysis for extremes Xa and Ya of Figure 4.

4.3.1. Priority of Monitors’ Placement

[49] One of the characteristics of multiobjective optimization with genetic algorithms is that during each step of the process, all the variables (monitors) that belong to each solution are generated at the same time, regardless of their individual significance in informative terms. Nevertheless, this is an important issue when placing the monitors, because it is expected that during their implementation, some monitors will have a higher priority than others. In order to prioritize the monitors, again from the information standpoint, it is possible to sort them either by total correlation (in ascending order, so the monitor that adds the least C to the set is placed first) or by joint entropy (in descending order, so the monitor that adds the biggest JH to the set is placed first).
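The two orderings described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy series, and the alphabetical tie‐breaking rule are hypothetical, and the series are assumed to be water levels that have already been quantized. The first monitor is taken to be the one with the highest marginal entropy.

```python
import math
from collections import Counter

def joint_entropy(series):
    """Joint entropy in bits of one or more equally long discrete series."""
    states = list(zip(*series))            # one tuple of readings per time step
    n = len(states)
    return -sum(c / n * math.log2(c / n) for c in Counter(states).values())

def total_correlation(series):
    """Sum of the marginal entropies minus the joint entropy, in bits."""
    return sum(joint_entropy([s]) for s in series) - joint_entropy(series)

def prioritize(data, by="joint_entropy"):
    """Return monitor ids in placement order (ties broken alphabetically)."""
    remaining = sorted(data)
    # Assumption: start with the monitor of highest marginal entropy.
    order = [max(remaining, key=lambda k: joint_entropy([data[k]]))]
    remaining.remove(order[0])
    while remaining:
        def score(k):
            subset = [data[m] for m in order + [k]]
            if by == "joint_entropy":
                return -joint_entropy(subset)    # largest joint entropy first
            return total_correlation(subset)     # smallest total correlation first
        best = min(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order

# Hypothetical quantized water levels at four candidate points.
levels = {
    "A": [0, 1, 2, 3, 0, 1, 2, 3],   # 2 bits
    "B": [0, 0, 1, 1, 0, 0, 1, 1],   # 1 bit, fully determined by A
    "C": [5, 5, 5, 5, 5, 5, 5, 5],   # constant, 0 bits
    "D": [0, 0, 0, 0, 1, 1, 1, 1],   # 1 bit, independent of A
}

by_jh = prioritize(levels, by="joint_entropy")
by_c = prioritize(levels, by="total_correlation")
print(by_jh, by_c)
```

With these toy series, sorting by joint entropy places the constant series C last, whereas sorting by total correlation places it second, right after the most informative point. A zero‐entropy point is an attractive choice when only independence is rewarded, even though it contributes no information.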

4.3.2. Approach a

[50] Figure 8 shows the behavior of the information theory values for the solutions of approach a obtained at the extremes Xa (Figure 8, left) and Ya (Figure 8, right), sorting them by total correlation in ascending order (Figure 8, top) and by joint entropy in descending order (Figure 8, bottom). In all cases, the first monitor is the one with the highest marginal entropy. Figure 8 (top left) shows that the set starts with monitor 42, H(42) = 1.5 bits, and that it is followed by point 991, with H(991) ∼ 0. This means that C(42,991) ∼ 0 and H(42,991) = 1.5, or, in other words, that although monitor 991 is independent from monitor 42, it adds no information beyond what can be inferred from 42 alone. For the case of Ya (Figure 8, top right), the first monitor is 704, with a marginal entropy of H(704) = 1.5 bits, and all the subsequent monitors add zero total correlation to the set, but at the same time they do not provide any additional information content; in other words, only monitor 704 has information content in this case, so the curve in Figure 8 (top right) is flat.

[51] Figure 8 (bottom) shows the same results at each extreme, but now the monitors are sorted by the highest addition in joint entropy. For the case of Xa (Figure 8, bottom left), the second point is 1030, with H(1030) = 1 bit. However, H(42,1030) is not the same as H(42) + H(1030) because C(42,1030) > 0 and the total correlation curve rises. As expected, monitor 991 is the last selected since it does not add information to the set (and therefore only the first eight monitors would be placed). For the case of Ya (Figure 8, bottom right), the sorting makes no difference since the only point that adds information to the set is point 704. This analysis has profound implications for the final number of monitors to be located.

Figure 9. Progress of information quantities as new monitors are added for solution of approach b for extremes Xb and Yb of Figure 6.

Figure 10. Sensitivity of the maximum (left) joint entropy and (right) total correlation due to variations of the parameter u, discriminated by the number of new monitors in the solution.

4.3.3. Approach b

[52] Figure 9 presents the previous analysis for the results obtained for the second situation. In this case it is evident that monitors 776 and 170, obtained in the solution for the extreme Xb, do not add any information to the set: jointly, H(1229) = H(1229,776,170), so they have no capacity to reduce the total correlation of the set. As in the first situation, the extreme for which total correlation is a minimum (extreme Yb) shows that only points 1030 and 801 are informative points, H(1030,801), and therefore only seven points would be needed. This confirms again that the solutions located at this extreme of the Pareto front should not be considered for the final selected monitoring set.
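The reasoning behind these observations can be written out with the standard definition of total correlation [Watanabe, 1960] and the usual bounds on joint entropy. This is a worked restatement of the values quoted above, not an additional result; X and Y are generic placeholder variables.

```latex
% Total correlation of N monitors
C(X_1,\dots,X_N) = \sum_{i=1}^{N} H(X_i) - H(X_1,\dots,X_N)

% Bounds on the joint entropy of two monitors
\max\{H(X),\,H(Y)\} \le H(X,Y) \le H(X) + H(Y)

% Extreme Xa: H(42) = 1.5 bits and H(991) \approx 0
1.5 \le H(42,991) \le 1.5 + H(991) \approx 1.5
C(42,991) = H(42) + H(991) - H(42,991) \approx 1.5 + 0 - 1.5 = 0

% Extreme Xb: chain rule applied to H(1229) = H(1229,776,170)
H(1229,776,170) = H(1229) + H(776,170 \mid 1229)
\;\Rightarrow\; H(776,170 \mid 1229) = 0
```

A point with near‐zero marginal entropy is therefore squeezed between the two bounds: it cannot raise the joint entropy, and it cannot raise the total correlation either.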

4.3.4. Sensitivity Analysis of the Parameters u and q

[53] The assumptions of u = 1 and q = 1 were initially chosen to express costs in informative units, and their selection may affect the outcomes. The sensitivity analysis was carried out by solving equation (6) for values of u of 0.1, 0.5, 1, 2, and 5 bits per new monitor. To complete the calculations for the sensitivity analysis in reasonable time, 100 populations and 500 generations were used with NSGA‐II. The extremes of the obtained Pareto fronts were analyzed to evaluate the quality of the results.

[54] For the parameter u, Figure 10 and Table 2 present the sensitivity of the maximum joint entropy and the minimum total correlation to variations of u, according to the number of new monitors in the solution. Note that the joint entropy in Table 2 for the case u = 1 and EPM = 0 differs marginally from the value calculated above because of the reduced NSGA‐II parameters. However, this does not affect the relative behavior of the parameters u and q.

Table 2. Sensitivity Analysis for Parameter u

| Criteria for evaluation of solutions | EPM | u = 0.1 | u = 0.5 | u = 1 | u = 2 | u = 5 |
|---|---|---|---|---|---|---|
| Maximum joint entropy | 0 | 4.12 | 4.26 | 4.31 | 4.27 | 4.38 |
| Maximum joint entropy | 1 | 3.52 | 3.84 | 3.73 | 4.20 | 4.37 |
| Maximum joint entropy | 2 | 3.04 | 3.54 | 2.95 | 4.12 | 4.30 |
| Maximum joint entropy | 3 | 2.75 | 2.85 | 2.10 | 3.71 | 4.18 |
| Maximum joint entropy | 4 | ‐ | ‐ | ‐ | 3.36 | 2.61 |
| Minimum total correlation | 0 | 1.27 | 2.40 | 1.91 | 1.60 | 2.57 |
| Minimum total correlation | 1 | 0.47 | 0.85 | 1.06 | 1.87 | 2.10 |
| Minimum total correlation | 2 | 0.19 | 0.58 | 0.89 | 1.88 | 2.54 |
| Minimum total correlation | 3 | 0.13 | 0.14 | 0.07 | 1.43 | 1.88 |
| Minimum total correlation | 4 | ‐ | ‐ | ‐ | 1.34 | 0.52 |

u is given in bits per new monitor.

[55] It can be observed that for solutions having only new monitors, good results are achieved regardless of the value of u (joint entropy between 4.12 and 4.38 bits or between 84% and 89% of Hsys). However, solutions considering some existing monitors become less informative as u decreases. This is because there are no restrictions on placing the new monitors and better combinations of monitors are found. An additional effect is that no solution that considers four EPM was found within the solutions for u = 0.1, 0.5, and 1.0. Conversely, high values of u force the inclusion of existing monitors, which generates solutions of lower quality (lower information content; see Figure 10a). When looking at the distribution of the total correlation (Figure 10b), solutions with independent monitors are obtained when more existing monitors are considered, which again supports the conflicting nature of these objectives.

Figure 11. Sensitivity of the maximum (a) joint entropy and (b) total correlation due to variations of the parameter q, discriminated by the number of new monitors in the solution.

[56] A similar analysis was carried out for the parameter q; see Figure 11 and Table 3. In general, solutions obtained with low values of q seem to be less informative and less independent, regardless of the number of violations of distance to hydraulic structures. Additionally, a value of q = 0.1 gives solutions that do not include a high number of violations (see Table 3).
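The extremes analyzed in this sensitivity study are the end points of a Pareto front in which joint entropy is maximized and total correlation is minimized. A minimal sketch of that filtering step is given below. It is a brute‐force dominance check rather than NSGA‐II itself, the function name is hypothetical, and the candidate values are made up for illustration; they are not taken from Tables 2 or 3.

```python
def pareto_front(solutions):
    """Keep the (joint_entropy, total_correlation) pairs that no other pair dominates.

    A pair dominates another if it has joint entropy at least as high and
    total correlation at least as low, and is strictly better in one of them.
    """
    front = []
    for i, (jh_i, c_i) in enumerate(solutions):
        dominated = any(
            jh_j >= jh_i and c_j <= c_i and (jh_j > jh_i or c_j < c_i)
            for j, (jh_j, c_j) in enumerate(solutions)
            if j != i
        )
        if not dominated:
            front.append((jh_i, c_i))
    return sorted(front)

# Illustrative candidate sets as (joint entropy, total correlation), in bits.
candidates = [(4.0, 1.9), (3.5, 1.0), (3.0, 1.2), (2.0, 0.1), (3.9, 2.5), (2.0, 0.4)]

front = pareto_front(candidates)
x_extreme = max(front, key=lambda s: s[0])   # maximum joint entropy
y_extreme = min(front, key=lambda s: s[1])   # minimum total correlation
print(front, x_extreme, y_extreme)
```

In this toy example three of the six candidates survive, and the two extremes are the most informative set and the most independent set, which is the pair of solutions compared for each value of u and q.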

Table 3. Sensitivity Analysis for Parameter q

| Criteria for evaluation of solutions | Number of violations | q = 0.1 | q = 0.5 | q = 1 | q = 2 | q = 5 |
|---|---|---|---|---|---|---|
| Maximum joint entropy | 0 | 3.93 | 3.81 | 4.06 | 4.11 | 4.06 |
| Maximum joint entropy | 1 | 4.26 | 3.91 | 3.97 | 4.27 | 4.30 |
| Maximum joint entropy | 2 | 4.26 | 4.00 | 3.93 | 4.29 | 4.08 |
| Maximum joint entropy | 3 | 4.26 | 3.94 | 3.72 | 4.29 | 4.33 |
| Maximum joint entropy | 4 | ‐ | 3.97 | 3.92 | 4.01 | 4.09 |
| Maximum joint entropy | 5 | ‐ | 3.91 | 3.95 | 4.09 | 3.98 |
| Maximum joint entropy | 6 | ‐ | 3.86 | 3.87 | 4.08 | 3.90 |
| Maximum joint entropy | 7 | ‐ | 3.76 | 3.84 | 4.09 | 3.89 |
| Maximum joint entropy | 8 | ‐ | ‐ | 3.78 | 4.10 | 3.22 |
| Minimum total correlation | 0 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Minimum total correlation | 1 | 0.38 | 0.38 | 0.32 | 0.07 | 0.00 |
| Minimum total correlation | 2 | 0.49 | 0.52 | 0.18 | 0.03 | 0.00 |
| Minimum total correlation | 3 | 0.77 | 0.53 | 0.19 | 0.12 | 0.00 |
| Minimum total correlation | 4 | ‐ | 0.50 | 0.10 | 0.11 | 0.00 |
| Minimum total correlation | 5 | ‐ | 0.61 | 0.23 | 0.07 | 0.00 |
| Minimum total correlation | 6 | ‐ | 0.61 | 0.24 | 0.10 | 0.00 |
| Minimum total correlation | 7 | ‐ | 0.69 | 0.25 | 0.01 | 0.00 |
| Minimum total correlation | 8 | ‐ | ‐ | 0.22 | 0.10 | 0.00 |

q is given in bits per violation.

5. Summary and Conclusions

[57] An alternative method for siting a set of water level monitors in polders on the basis of information quantities is presented. The first quantity is joint entropy, which evaluates the amount of information content that the set is able to collect; the second is total correlation, which evaluates the level of dependency or redundancy among monitors in the set. In order to find the most convenient locations to put the monitors from a large number of potential sites, a multiobjective optimization procedure (MOOP) was posed under different considerations: one that takes into account the costs of placing a completely new monitor and another that considers the cost of placing monitors too close to hydraulic structures. In both cases, the joint entropy of the set is maximized and its total correlation is minimized. The costs are considered in terms of information theory units, for which additional terms u*M and q*v were introduced into the objective functions.

[58] The following conclusions can be drawn.

[59] 1. The information measures of total correlation and joint entropy are two conflicting quantities because as the first improves (i.e., monitors are mutually independent), the second deteriorates (i.e., monitors collect less information content), and vice versa.

[60] 2. The existing monitoring network, which reduces to the measurements taken at the pumping stations to control the switching of the pumps, is not optimal from the information theory point of view.

[61] 3. The solutions for which total correlation was considered as a single objective (i.e., without joint entropy) are not satisfactory in terms of monitoring because most of the monitors in the set are placed at sites with no information content. This means that, among a high number of interdependent points, points with no information content are the only ones that add the least total correlation (independency). The Pareto solutions located at the extreme where the total correlation is minimum should, therefore, be neglected. However, the maximization of joint entropy gives useful results, as the solutions found can cover between 82% and 85% of the total information content of the system by selecting fewer points than the originally proposed nine points.

[62] 4. The results obtained with the WMP method are part of the optimal set of networks obtained by solving the MOOP posed in this paper. The solution of the MOOP, however, has two main advantages over the WMP method: first, it gives a complete picture in terms of options to select a monitoring set; second, it allows adding constraints to the problem so a wider range of situations can be tested.

[63] 5. In terms of the practical situations analyzed separately in this paper, namely, the financial constraint of having to place new monitors and the accuracy constraint of having to place them near hydraulic structures, it can be concluded that the best solutions in information units are numerically very similar (joint entropy for the first situation equal to 4.18 bits and for the second situation equal to 4.04 bits) but spatially different.

[64] Ongoing research evaluates the parameters u and q of equations (6) and (7) to find the value of 1 bit of information in the framework of decision making in a given water system. Furthermore, the concept of accuracy for the case of measurements close to hydraulic structures could also be strengthened by analyzing the propagation of errors and considering dynamic waves.

References

Alfonso, L., A. Lobbrecht, and R. Price (2010), Information theory‐based approach for location of monitoring water level gauges in polders, Water Resour. Res., 46, W03528, doi:10.1029/2009WR008101.

Amorocho, J., and B. Espildora (1973), Entropy in the assessment of uncertainty in hydrologic systems and models, Water Resour. Res., 9, 1511–1522.

Barreto, W. J., et al. (2006), Approaches to multi‐objective multi‐tier optimization in urban drainage planning, paper presented at International Conference on Hydroinformatics, Int. Assoc. for Hydroenviron. Eng. and Res., Nice, France.

Caselton, W. F., and T. Husain (1980), Hydrologic networks: information transmission, J. Water Resour. Plan. Manage., 106, 503–520.

Cover, T. M., and J. A. Thomas (1991), Information Theory, John Wiley, New York.

Deb, K., et al. (2002), A fast and elitist multiobjective genetic algorithm: NSGA‐II, IEEE Trans. Evol. Comput., 6, 182–197.

Husain, T. (1989), Hydrologic uncertainty measure and network design, Water Resour. Bull., 25, 527–534.

Kraskov, A., H. Stoegbauer, R. G. Andrzejak, and P. Grassberger (2005), Hierarchical clustering based on mutual information, Europhys. Lett., 70, 278.

Krstanovic, P. F., and V. P. Singh (1992), Evaluation of rainfall networks using entropy: I. Theoretical development, Water Resour. Manage., 6, 279–293.

Lobbrecht, A. H. (1997), Dynamic Water‐System Control: Design and Operation of Regional Water‐Resources Systems, A. A. Balkema, Rotterdam, Netherlands.

Markus, M., et al. (2003), Entropy and generalized least square methods in assessment of the regional value of streamgages, J. Hydrol., 283, 107–121.

Maruyama, T., et al. (2005), Entropy‐based assessment and clustering of potential water resources availability, J. Hydrol., 309, 104–113.

McGill, W. J. (1954), Multivariate information transmission, Psychometrika, 19, 97–116.

Mishra, A. K., and P. Coulibaly (2009), Developments in hydrometric network design: A review, Rev. Geophys., 47, RG2001, doi:10.1029/2007RG000243.

Mishra, A. K., et al. (2009), An entropy‐based investigation into the variability of precipitation, J. Hydrol., 370, 139–154.

Mogheir, Y., and V. P. Singh (2002), Application of information theory to groundwater quality monitoring networks, Water Resour. Manage., 16, 37–49.

Ruddell, B. L., and P. Kumar (2009), Ecohydrologic process networks: 1. Identification, Water Resour. Res., 45, W03419, doi:10.1029/2008WR007279.

Singh, V. P. (1997), The use of entropy in hydrology and water resources, Hydrol. Processes, 11, 587–626.

Watanabe, S. (1960), Information theoretical analysis of multivariate correlation, IBM J. Res. Dev., 4, 66–82.

Yang, Y., and D. H. Burn (1994), An entropy approach to data collection network design, J. Hydrol., 157, 307–324.

L. Alfonso, A. Lobbrecht, and R. Price, Hydroinformatics and Knowledge Management, UNESCO‐IHE, Westvest 7, NL‐2611 AX Delft, Netherlands. (l.alfonsosegura@unesco‐ihe.org)
