Risk-Based Sensor Placement for Contaminant

Risk-Based Sensor Placement for Contaminant Detection in Water Distribution Systems Martin Weickgenannt1; Zoran Kapelan2; Mirjam Blokker3; and Dragan A. Savic4 Abstract: A method for optimizing sensor locations to effectively and efficiently detect contamination in a water distribution network is presented here. The problem is formulated and solved as a twin-objective optimization problem with the objectives being the minimization of the number of sensors and minimization of the risk of contamination. Unlike past approaches, the risk of contamination is explicitly evaluated as the product of the likelihood that a set of sensors fails to detect contaminant intrusion and the consequence of that failure 共expressed as volume of polluted water consumed prior to detection兲. A novel importance-based sampling method is developed and used to effectively determine the relative importance of contamination events, thus reducing the overall computation time. The above problem is solved by using the nondominated sorting genetic algorithm II. The methodology is tested on a case study involving the water distribution system of Almelo 共The Netherlands兲 and the potential intrusion of E. coli bacteria. The results obtained show that the algorithm is capable of efficiently solving the above problem. The estimated Pareto front suggests that a reasonable level of contaminant protection can be achieved using a small number of strategically located sensors. DOI: 10.1061/共ASCE兲WR.1943-5452.0000073 CE Database subject headings: Optimization; Sampling; Water distribution systems; Water pollution; Risk management. Author keywords: Optimization; Sampling; Water distribution systems; Contamination; Risk analysis.

Introduction A water distribution system 共WDS兲 is the vital part of any city. Due to its size and complexity, it is likely that water contamination can occur at any time. The causes for contamination can be accidental 共e.g., pipe bursts, chemical accident兲 or deliberate 共e.g., terrorist attacks兲. Therefore, an effective and efficient contaminant detection system is required to prevent the population from being affected. Obviously, when designing such a system, a basic trade-off exists between the level of safety achieved and the cost of providing it. Bearing in mind the typical network size and budgets available, this means that only a limited number of locations in the WDS can or should be equipped with contamination sensors. Therefore, sensor locations in the WDS have to be chosen carefully. In recent years, there have been a number of publications dealing with the problem of optimal contamination sensor location with respect to various objective functions. Kessler et al. 共1998兲 1 Diplom-Ingenieur, Institut fuer Systemdynamik 共ISYS兲, Universitaet Stuttgart, Pfaffenwaldring 9, 70569 Stuttgart, Germany 共corresponding author兲. E-mail: [email protected] 2 Senior Lecturer, Centre for Water Systems, Univ. of Exeter, Harrison Bldg., North Park Rd., Exeter EX4 4QF, U.K. E-mail: z.kapelan@ exeter.ac.uk 3 KWR, P.O. Box 1072, 3430 BB Nieuwegein, The Netherlands. E-mail: [email protected] 4 Professor, Centre for Water Systems, Univ. of Exeter, Harrison Bldg., North Park Rd., Exeter EX4 4QF, U.K. E-mail: [email protected] Note. This manuscript was submitted on February 20, 2009; approved on January 8, 2010; published online on January 11, 2010. Discussion period open until April 1, 2011; separate discussions must be submitted for individual papers. This paper is part of the Journal of Water Resources Planning and Management, Vol. 136, No. 6, November 1, 2010. ©ASCE, ISSN 0733-9496/2010/6-629–636/$25.00.

used a single-objective formulation as a basis for sensor location optimization with the aim of finding a set of sensors which can provide a given level of service. The level of service is defined as the volume of polluted water allowed to be consumed prior to detection of contamination. The problem is addressed by creating a pollution matrix which indicates if a node can detect the contamination before the given level of service is reached. The optimization problem is solved by finding the minimum number of sensors needed to provide a maximum coverage of the pollution matrix 共i.e., finding a minimal set of sensors which can detect all possible contamination events before the level of service is reached兲. The solution is reached by formulating and solving a set covering problem. Ostfeld and Salomons 共2004兲 used a similar approach and solved the maximum coverage problem by a Genetic Algorithm. The results of Kessler et al. 共1998兲 are discussed in Kumar et al. 共1999兲 who suggested considering the level of service in terms of detection time instead of volume of polluted water. A number of researchers developed methodologies to solve the problem of optimal contamination sensor locations formulated as part of the recently held Battle of the Water Sensor Networks 共BWSN兲 competition in the United States 共Ostfeld et al. 2008兲. The BWSN task was formulated as a four-objective problem aiming at the minimization of contaminant detection time, population affected, demand of contaminated water, and, at the same time, maximization of contaminant detection likelihood. A number of different approaches were proposed for solving this problem; a summary of the methodologies presented and the results obtained can be found in Ostfeld et al. 共2008兲. The approach presented here differs from those presented at the BWSN competition in that a novel, explicit risk formulation of the problem analyzed, which merges the two risk components, i.e., the nondetection probability and the volume of consumed water prior to detection, into a single objective, thus giving a weighted estimate of the volume of

JOURNAL OF WATER RESOURCES PLANNING AND MANAGEMENT © ASCE / NOVEMBER/DECEMBER 2010 / 629

Downloaded 25 Oct 2010 to 128.206.129.31. Redistribution subject to ASCE license or copyright. Visithttp://www.ascelibrary.org

water which is consumed prior to detection. This enables more explicit evaluation and, hence, more straightforward comparison of different solutions in terms of the contamination risk involved. It also reduces the number of objectives considered in the associated optimization problem thus enabling a single optimization model run to address the trade-off between the contamination risk and the sampling cost. In Shastri and Diwekar 共2006兲, the optimal sensor placement under parameter uncertainties is examined. The uncertainties considered are the location of intrusion 共i.e., contamination node兲 and the water demand at the time of the intrusion. By changing the demands, flow directions can change. Thus, initially unaffected nodes can be contaminated if the demands change slightly. The writers consider demand changes of ⫾25%. The objective function for optimization stated in Berry et al. 共2005兲 is modified to consider different demand pattern scenarios. The sum of objective functions 共i.e., risk of contamination兲 over all samples 共samples differ in demand pattern兲 is minimized. For each sample, a hydraulic and a quality analysis has to be performed because the hydraulic conditions change. The suggested approach requires a large amount of computational time, so it is suitable only for small networks. As the network considered in this contribution is large, this method is not applicable. Carr et al. 共2004兲 also addressed the effect of various types of uncertainty concerning the optimal sensor placement in WDS. The problem is tackled by formulating objective functions as mixed-integer programs which are robust against uncertainties. The resulting objective functions depend on uncertainty assumptions. The robustness of the proposed approaches is shown theoretically. In McKenna et al. 共2006兲, an interesting issue is raised, namely the impact of a sensor detection threshold on the number of sensors needed for contamination detection. The problem is formulated as a twin-objective optimization task, with the number of sensors and the number of detected scenarios as objectives. Subsequent optimization runs for various sensor detection limits are performed. The results show that if the detection limit is lower than 10% of the average node concentration, the effect of lowering the limit on detection possibility is rather small. In the course of this contribution, the detection limit was set to zero, i.e., only perfect sensors were used. This is due to the fact that the aim was to test the novel risk formulation and the sampling method rather than show the robustness of the network description with respect to uncertainties. Berry et al. 共2009兲 also raise the issue of imperfect sensors. While sensors can detect any nonzero contamination, the effect of false sensor readings is taken into account. The optimization criterion is to minimize the total impact of contamination on the network for a given set of scenarios. The inclusion of false negative and false positive sensor readings leads to a nonlinear formulation of the optimization problem. For three example networks of various sizes, the optimization problem is solved using nonlinear programming, mixed-integer programming, and a local search heuristic. They prove the applicability of their formulation by simulation and show that false sensor readings can have a significant impact on network safety. The optimal sensor location problem is formulated and solved as a mixed-integer program in Berry et al. 共2005, 2006兲. The single objective is the fraction of the population being affected by the contamination. It considers the density of the population 共which is related to the demand兲 at each node and the node’s attack probability. A case study is conducted by using real world distribution networks. Berry et al. 共2009兲 solve the mixed-integer

programming formulation and carry out a sensitivity analysis based on changes in the nodal demands. The results presented in the paper show that the single-objective sensor location problem can be efficiently solved for large-scale problems. Here, the optimal sensor location problem is formulated as a twin-objective optimization problem: 共1兲 minimize the number of sensors 共surrogate for sensor related costs兲 and 共2兲 minimize the risk of contamination. The risk of contamination is explicitly evaluated as the product of contaminant nondetection likelihood and the corresponding consequence 共water demand consumed兲. In addition, it is assumed that the sensors used are not perfect, i.e., that they can detect contaminant only above some threshold concentration value. The possibility of false sensor readings is neglected. For a study including sensor errors 共e.g., false sensor readings兲, however, the algorithm presented here could be included in the fault detection framework presented in Berry et al. 共2009兲. The methodology developed and presented is used to solve the problem of optimal sensor location in the Dutch town of Almelo 共see Fig. 2兲. The aim was to protect it against a potential waterborne outbreak of E. coli. The hydraulic and water-quality analysis were performed by using the dynamic link library 共DLL兲 developed as part of this project to replicate the functionality of the ALEID simulation software 关KWR, ALEID software 共http:// www.waterware.nl/Aleid/兲兴, which has been developed based on EPANET 共Rossman 2000兲. Samples for the quality analysis 共i.e., time and location of intrusion兲 were chosen via the newly proposed importance-based sampling method 共IBSM兲. The optimization problem is solved by the nondominated sorting genetic algorithm II 共NSGA-II兲共Deb et al. 2002兲 The paper is organized as follows: In the next section, the issue of water-quality modeling is briefly discussed. Subsequently, the problem of optimal sensor placement is formulated as an optimization problem. Then, contamination scenarios are defined and a novel IBSM is presented. The novel optimal sensor placement methodology is tested and verified on case study analyses in the subsequent section. Based on the results obtained, the main conclusions are summarized.

Water-Quality Modeling This section briefly reviews the governing equations of waterquality modeling in WDSs. A more comprehensive description of water-quality models can be found in Hua et al. 共1999兲, Powell et al. 共2000兲, Rossman 共2000兲, Clark and Sivaganesan 共2002兲 and Boccelli et al. 共2003兲. Beginning with an initial concentration of a contaminant C, the concentration decays over time which is due to transport and/or chemical reactions. In general, change in concentration which is due to reactions in pipes may be described by the partial differential equation

⳵C ⳵C =−u − 共kb + Rw兲C ⳵x ⳵t

共1兲

where C = contaminant concentration in a single pipe; t = time; u = flow velocity; x = spatial coordinate 共i.e., the position along the pipe兲; kb = first order rate constant of the bulk reaction; and Rw = flow-dependent wall reaction coefficient. Water mixing at junctions and in tanks is assumed to be instantaneous and complete and the contaminant concentration leaving junction k will be the flow weighted average of the con-

630 / JOURNAL OF WATER RESOURCES PLANNING AND MANAGEMENT © ASCE / NOVEMBER/DECEMBER 2010


taminant concentrations from all inflowing sources. In the case of sensor location optimization, a potentially large number of waterquality model runs may be required. These runs differ in the assumed contaminant intrusion time and location. Since large numbers of contaminant intrusion scenarios are likely to be considered 共to get accurate optimization results兲, the process of water-quality analysis is likely to be time consuming. As a consequence, necessary care has been taken here to consider efficiency of these computations when developing the software for determining the optimal sensor locations.

Table 1. Optimization Parameters 共NSGA-II兲 Symbol P Sm Sn R P _ crossover P _ mutation N _ 共optimization兲

Description Number of individuals Maximal number of sensors Minimal number of sensors Number of generations Crossover probability Mutation probability Number of runs for average

Value used for optimization 100–1000 40 1 500 0.5 0.1 20

Optimal Sensor Placement The problem of optimal sensor placement is formulated and solved as an optimization problem with sensor locations being the decision variables. Objectives Assuming a network with N nodes and a maximum number of S sensors, the following two objectives are to be minimized: 共1兲 number of sensors 共surrogate measure for a cost of sensors兲 and 共2兲 risk of contamination. As the two objectives clearly compete against each other, the optimization output will consist of a set of optimal compromise solutions, the so-called Pareto front 共Pareto 1896兲. Mathematically, the optimization problem is formulated as follows: Minimize f共Sគ 兲 =

冉冊 S R共Sគ 兲

共2兲

where Sគ = set 共or vector兲 of sensor locations; S = number of elements in Sគ ; and R共Sគ 兲 = associated contamination risk. The risk of contamination is defined as the product of the probability of not detecting the contaminant intrusion and the corresponding consequence 共expressed here as the average volume of water consumed prior to network shutdown兲

冉

m

冊冉兺 m

1 1 R共Sគ 兲 = 1 − max␰共S p,Kl兲 · min ␰共S p,Kl兲 m l=1 Sp苸Sគ m l=1 Sp苸Sគ

兺

冊

contamination. In the second step, however, this assumption is relaxed and a shutdown time of 4 h is assumed. Decision Variables The decision variables used here are the locations of contamination sensors. Each decision variable x is represented as a vector consisting of two parts 共of the same length兲: Part 1—denoting all possible sensor locations 共Lគ with elements Li 苸关1 , N兴兲, and Part 2—containing a binary list of sensor “switches” 共W គ with elements Wi 苸关0 , 1兴兲. Node “i” is equipped with a sensor if Wi = 1 共its switch is set to one兲, otherwise no sensor is attached to it. The length of the decision vector x is 2 · S, where S = maximum number of sensors available. An example of a decision vector is as follows: xគ = 关Lគ 兩 W គ 兴 = 关1 , 4 , 1 , 756, 269兩 1 , 0 , 0 , 1兴. In this scenario, the maximal number of sensors would be 4, but only nodes 1 and 269 would actually be equipped with a sensor. Optimization Method The NSGA-II algorithm introduced by Deb et al. 共2002兲 is used here to solve the above multiobjective optimization problem. The algorithm is not discussed in detail and the reader is referred to Deb et al. 共2002兲 for a detailed description of the NSGA-II optimization procedure. Parameters used for optimization are listed in Table 1.

共3兲

The set of actual sensor locations is described by Sគ ; S p = elements in Sគ ; m = number of contamination scenarios; and Kl = contamination scenario index. The function ␰共S p , Kl兲 is a binary function which describes if a scenario can be detected by the given sensor S p. Therefore, the first sum in Eq. 共3兲 is the number of scenarios in which the contamination can be detected by the chosen sensor locations. As this sum is divided by the number of scenarios and subtracted from 1, the nondetection probability is expressed. The second sum in the same equation is the volume of water that is consumed prior to network shutdown; the function ␹共S p , Kl兲 describes the volume of water that is consumed before the network can be shutdown following the intrusion detection at a specific sensor. The volume of water consumed prior to shutdown is calculated as the integral of polluted water emanating from all nodes 共in cubic meters兲 from time 共t兲 to time 共t + detection time+ shutdown time兲. The basis for this calculation is water-quality simulation. Assuming that every inhabitant of the area supplied by the given WDS consumes the same volume of water, the total of consumed polluted water is directly proportional to the affected population. In the first step, the assumption is made here that the network is shut down immediately after

Scenario Definition and Sampling Design Scenario Definition Before the stochastic optimization problem presented in the previous section can be solved, it is necessary to generate a 共large兲 number of possible contamination scenarios. This is required to accurately calculate the value of the objective function defined by Eq. 共3兲. Detailed information about the contaminant propagation in the network is obtained by repeatedly running the ALEID simulation software 关KWR, ALEID software 共http://www. waterware.nl/Aleid/兲兴. In this contribution, potential contaminant intrusion at node Ni is modeled as a constant contaminant concentration at that node. Pollutant concentration decay is neglected to enable a worst case scenario case study. The propagation of a contaminant in the network depends on both the intrusion location and the time of intrusion. Therefore, a contamination scenario 共i.e., event兲 is defined here as follows. Definition A contamination scenario K共Ni , t j兲 is the combination of time t j and place Ni where a contaminant E enters a water distribu-



tion network. Ni共i = 1 , . . . , N兲 is the intrusion node and t j 苸关Ti,0 , Ti,end兴 is the time of intrusion. N is the number of junctions in the network and Ti,0 and Ti,end are the earliest and latest possible time of intrusion, respectively. Note that other parameters, such as duration of contaminant intrusion, contaminant E 共i.e., E. coli bacteria兲 and the amount of contaminant in the network can also be seen as part of a contamination scenario. In this work, it is assumed these parameters are the same for all contamination events, although the methodology developed is general and can include variations of all the above parameters, e.g., one could increase the amount of intrusion for nodes in the vicinity of high risk locations. Assuming that a contamination event can occur every 10 min within 24 h in the Almelo water distribution network 共1,758 junctions兲, the number of possible scenarios is equal to 储K储 = 144· 1,758= 253,152. A water-quality analysis of all possible scenarios would take almost 6 days and would produce more than 12 GB of data for the calculation of objective matrices. This amount of time and data are undesirable for an optimization process. Therefore, a set of sample scenarios has to be chosen for which the quality analysis will be performed. Importance-Based Scenario Sampling The selection of scenarios for a water-quality analysis is an important task because of its obvious impact on the results. The situation is as follows: on the one hand, the most important scenarios, 共i.e., scenarios considering contamination occurring close to highly populated areas, i.e., high demand nodes兲 should be considered in the optimization process; on the other hand, the “unimportant” scenarios 共e.g., contamination occurring in low population areas兲 also have to be taken into account to provide a reasonable protection for the whole population. As a consequence, an importance-based selection procedure is developed here to ensure that both objectives are achieved. Let us assume that the time of contaminant intrusion is discretized with a time step of ⌬Ti. The set K of all possible contamination scenarios can then be represented by the following two-dimensional matrix:

K=

冢

共N1,t0兲共NN,t0 + ⌬t兲 ¯ 共N1,t j兲 ] ] ] ] 共Ni,t0兲共Ni,t0 + ⌬t兲 ¯ 共Ni,t j兲 ] ] ] ] 共NN,t0兲共NN,t0 + ⌬t兲 ¯ 共NN,t j兲

冣

共4兲

where Ni = intrusion node and t 苸 t0 , t j = intrusion time within the given time frame. A value can be assigned to each entry of this matrix which determines the importance I of the analyzed scenario. Here, the importance of each scenario is defined as the total volume of water that is polluted by this event in a given time interval ⌬T 共⌬T is the intrusion duration兲, i.e., the total volume of water emanating from the contaminated node in time t to t + ⌬T. The calculation of importance values can be accomplished by simulation. After assigning importance values to all contamination events, the importance matrix I can be written as follows:

冢

I共1,1兲 ¯ I共1,J兲 ] ] I= I共N,1兲 ¯ I共N,J兲

冣

Fig. 1. Importance matrix for the Almelo WDS

matrix as a selection probability. Scenarios which have polluted more water are more important and thus have a higher selection probability. After normalizing the matrix, a randomized process is developed to choose from the probability matrix a set of scenarios to be taken into account for optimization. The probability that a scenario is selected corresponds directly with the importance value. When large networks are considered, the normalization of Iគ can cause numerical problems, because some elements could be very close to zero. This can be circumvented by not choosing a value between zero and one as the selection variable, but between zero and the sum of all importance values. Then the scenario Ik 共k is a column-by-column index in I starting with 1兲 is chosen for simulation if for the random number R the following condition holds: k−1

兺 l=1

k

Il 艋 r ⬍

Il 兺 l=1

共6兲

Fig. 1 shows graphically the importance matrix for the Almelo WDS. The height of each point on the graph corresponds to the importance and the crosses mark the scenarios selected using the above described procedure. The benefit of the IBSM compared to Monte Carlo sampling is that even though all parts of the network are taken into account, more important parts have a higher weight 共probability to be selected兲 on the sensor location design. This is due to the fact that 共on average兲 a larger number of important scenarios are taken into account for optimization compared to Monte Carlo selection. This leads to a significant increase in the population’s safety with respect to contamination detection while still taking the whole network 共including less important parts兲 into account. The empirical 共simulation-based兲 proof of this claim is addressed in the “Results and Discussion” section.

Almelo Case Study 共5兲

After normalizing this matrix 共so that the sum of all elements is one兲, the scenario selection method regards each entry of this

Description This section presents the WDS of Almelo 共The Netherlands兲 for which the optimal sensor locations were determined with the aim



350

300 250

200 150

3

Contamination risk [m of contaminated water consumed]

400

Fig. 2. Almelo WDS 共screenshot from the ALEID software兲

100 50

0

5

10

15

20

25

Number of Sensors

to effectively and efficiently detect any potential E. coli outbreak. The ALEID model of the water distribution network 共see Fig. 2兲 has 2,307 pipes, 1,759 junctions, four tanks/reservoirs, two pumps, and six valves. As it can be seen from this figure, Almelo WDS covers three major geographical areas: the northeast, the southwest, and the central part. The lowest network node is at an elevation of 0 m 共sea level兲 while the highest is at an elevation of approximately 14 m 共above sea level兲. The network has a single source of water located on the west side of the system. Furthermore, the node demands are assumed to be deterministic values rather than stochastic parameters. In this framework, demand uncertainties can be included by resimulating the network for every run of the optimization algorithm and then building averages of the Pareto front. Van Lieverloo et al. 共2007兲 presented a simulation model study on the Almelo network which shows the spreading of E. coli over the network for varying intrusion nodes and times. They show that a careful choice of time and place of E. coli measurement is crucial for guaranteeing safety in terms of clean water supply. They also point out the necessity of a careful choice of the place and time of E. coli measurement. The description of all parameters used in optimization process is given in the appendix. Quality analysis is performed over a 72 h time period. An intrusion in the network can start every 20 min within the first 24 h of simulation. The intrusion duration is 16 h. During this time, the contaminant concentration at the intrusion node is 10 mg/L and the detection threshold for each sensor is 100 mg. The parameter values used for all water-quality analysis are summarized in Table 2.

Table 2. Water-Quality Analysis Parameters and Optimization Parameters Name Ct Ci M ␦tq T ⌬Ti Ti ⌬T

Value

Unit

Description

100 mg Sensor detection threshold 10 mg/L Contaminant concentration at intrusion node 1,000 L Number of contamination scenarios 10 min Time step for the quality analysis 72 h Simulation time for the quality analysis 20 min Time interval between possible intrusions 0 to 24 h Possible intrusion time 共from simulation start兲 16 h Intrusion duration

Fig. 3. Estimated Pareto front for IBSM 共immediate shutdown兲

Results and Discussion This section presents the sensor location optimization results. Figs. 3 and 7 show the average Pareto optimal front obtained from multiple runs with different random seeds by using the NSGA-II optimization method and IBSM when immediate network shutdown is assumed. To prove the good performance of the NSGA-II algorithm, an exhaustive search was performed for S = 1 and S = 2 共i.e., for the case of one sensor and the case of two sensors兲. The results obtained show that the solutions obtained by NSGA-II are very near 共S = 2兲 or equal 共S = 1兲 to the true optimum 共marked by squares in Fig. 3兲. The exhaustive search was not performed for more than two sensors 共S ⬎ 2兲 because it is too computationally demanding 共for S = 3 there are more than 900 million possible solutions to evaluate which could take more than 19 days of computational time on an average personal computer兲. Another thing to note from Fig. 3 is that the algorithm found Pareto front estimates for S = 1 – 25 sensors, i.e., the front coverage is very good, thus confirming good convergence of the algorithm. Fig. 4 shows the estimated Pareto front for both shutdown scenarios, namely the case when immediate action is taken after contaminant detection and the case of delayed shutdown. As can be expected, the risk of contamination is higher when the network cannot be shutdown immediately after pollution detection. When making the more realistic assumption of delayed shutdown, however, good results can still be achieved. The Pareto front coverage is still very good, as estimates were found for the number of sensors ranging from one to more than 25. For the case of immediate shutdown as well as for the case of delayed shutdown, the solutions that have been found for S ⬎ 10 differ from each other only slightly in terms of the contamination risk obtained. This implies that the risk of contamination cannot be decreased much more with a further increase in the number of sensors used. To analyze the relative importance of different sensor locations, a histogram is plotted in Fig. 5 共for the case of immediate shutdown兲. This histogram shows the number of occurrences of each sensor location 共i.e., node兲 in the optimal solutions obtained using NSGA-II. For the histogram, 600 solution candidates ob-



3


no shutdown delay 4 hrs shutdown delay

500

400

300

200

100

Fig. 6. Most important sensor locations in Almelo WDS 0 5

10

15

20

25

Number of Sensors

Fig. 4. Estimated Pareto front for IBSM 共both shutdown scenarios兲

tained in various optimization runs are considered. Therefore, the maximal number of occurrences for each node is 600. Fig. 6 shows the locations of the most important sensor identified. This figure also shows that sensors should particularly be placed in the northeast part of the network. This is no surprise since this part of the system is in the area with the highest water consumption. Even though large parts of the network seem to be unprotected 共i.e., without sensors兲, the location of sensors in the northeast area minimizes the risk of contamination by minimizing the consequence, i.e., impact component of the risk. For the case of the delayed shutdown, the distribution of the most important sensors is the same as for the immediate shutdown. Note that once the Pareto optimal front is determined by using the NSGA-II optimization method, the decision maker is presented with a curve showing the trade-off between the sampling cost 共as surrogated by the number of sensors兲 and the risk of E. coli contamination. This way, the decision maker is given the full

picture enabling him/her to select the preferred solution based on the budget available, the required minimum level of safety or by using some other approach 共e.g., identifying the point on the Pareto front where further substantial increase in cost is leading only to a small reduction of contamination risk兲. Importance-Based versus Monte Carlo Sampling As it can be seen from Fig. 7, the IBSM is more effective than Monte Carlo sampling in terms of finding the optimal set of sensor locations for a given number of contamination scenarios. The same figure shows that the IBSM Pareto optimal front 共solid line兲 has significant advantages over the Pareto front obtained with Monte Carlo sampling design 共dashed curve兲. The curves present the average Pareto fronts over several optimization runs 共the same number for IBSM and Monte Carlo runs兲. This results in slight inconsistencies in the Pareto front, i.e., some dominated solutions are shown 共and thus, are not part of the true Pareto front兲. This is due to the fact that Fig. 7 presents average Pareto fronts. In the context of the present contribution, a number of 1,000 contamination scenarios are used as a basis for sensor location


500 300

250

150

3

Occurrence

200

100

50

0

400 350 300 250 200 150 100 50 0

0

200

400

600

800

1000

1200

1400

1600

1800

Node number

Fig. 5. Optimal sensor locations frequency for 600 solution candidates

IBSM Monte Carlo Sampling

450

0

5

10

15

20

25

30

Number of Sensors

Fig. 7. Average Pareto fronts obtained for the IBSM and Monte Carlo sampling method



optimization. Compared to the maximal number of approximately 253,000 possible scenarios, computational time for quality analysis is reduced from almost 6 days to approximately 30 min. Furthermore, the amount of memory needed for storing objective matrices is reduced to approximately 50 MB for 1,000 scenarios compared to 12 GB for 253,000 scenarios.

Conclusions A novel methodology for efficient determination of optimal sensor locations for effective contamination detection in WDSs is presented here. The problem is formulated and solved as an optimization problem with two objectives, one being the minimization of the number of sensors used 共a surrogate for sensor related costs兲 and the other being minimization of the risk of contamination. Unlike previous approaches, the risk of contamination is defined and evaluated explicitly as the product of likelihood that a set of selected sensor locations fails to detect the contaminant intrusion and the consequence of that failure 共expressed as a volume of polluted water consumed prior to detection兲. The associated water-quality analyses are performed using a newly developed DLL based on the ALEID software. The above methodology is tested on the case study involving a WDS of the Almelo 共Netherlands兲 and the potential intrusion of E. coli bacteria. The following conclusions can be drawn from this study: • The explicit definition and evaluation enables a more direct comparison of the contamination risks involved. It also reduces the number of objectives to consider which, in turn, simplifies the optimization problem to be solved. • The case study results obtained show that the NSGA-II integrated with IBSM is capable of efficiently and effectively solving the sensor location problem for the sample network. For the case of one and two sensors, it was shown that the estimated Pareto front is very close to the real Pareto front. Also, the estimated Pareto fronts obtained suggest that a reasonable level of contaminant protection can be achieved by using a small number of strategically located sensors. • A novel IBSM is developed and used to effectively distinguish between contamination events based on their importance. The solutions obtained in this way are 共on average兲 dominating, i.e., better than the solutions obtained by using the conventional Monte Carlo type sampling 共for the same number of samples兲.

Acknowledgments The research work presented here was undertaken as part of the research project supported by the E.U. Socrates scheme, U.K. EPSRC Platform Grant, GR/T26054, and KWR, The Netherlands, which is gratefully acknowledged. The writers would also like to acknowledge the input provided by the three anonymous reviewers which helped to improve the quality of this paper.

Notation The following symbols are used in this paper: C ⫽ concentration of contaminant at a node; F ⫽ true Pareto front; f共 . 兲 ⫽ objective function for optimization;

I K Ki Lគ Li m N Ni R共.兲

⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽

r S Sគ Sp tj W គ Wi xគ ⌬T ␰共 · 兲 ␹共 · 兲

⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽ ⫽

importance matrix; contamination event matrix; element of K គ; vector of candidate sensor locations; element of vector Lគ ; number of samples for quality analysis; number of nodes in the network; index of intrusion node; risk of contamination 共m3 of contaminated water consumed兲; random variable in 共0,1兲; number of sensors used for contamination detection; vector of sensors used to detect contamination; element of vector Sគ ; intrusion time; vector of sensor switches; element of vector W គ; decision vector; intrusion duration; detection function; and contamination function.

References Berry, J. W., Carr, R. D., Hart, W. E., Leung, V. J., Phillips, C. A., and Watson, J. 共2009兲. “Designing contamination warning systems for municipal water networks using imperfect sensors.” J. Water Resour. Plann. Manage., 135共4兲, 253–263. Berry, J. W., Fleischer, L., Hart, W. E., Phillips, C. A., and Watson, J. P. 共2005兲. “Sensor placement in municipal water networks.” J. Water Resour. Plann. Manage., 131共3兲, 237–243. Berry, J. W., Hart, W. E., Phillips, C. A., Uber, J. G., and Watson, J. P. 共2006兲. “Sensor placement in municipal water networks with temporal integer programming models.” J. Water Resour. Plann. Manage., 132共4兲, 218–224. Boccelli, D. L., Tryby, M. E., Uber, J. G., and Summers, R. S. 共2003兲. “A reactive species model for chlorine decay and THM formation under rechlorination conditions.” Water Res., 37共11兲, 2654–2666. Carr, R. D., Greenberg, H. J., Hart, W. E., and Phillips, C. A. 共2004兲. “Addressing modeling uncertainties in sensor placement for community water systems.” Proc., World Water Congress 2004—Critical Transitions in Water and Environmental Resources Management. Clark, R. M., and Sivaganesan, M. 共2002兲. “Prediction chlorine residuals in drinking water: Second order model.” J. Water Resour. Plann. Manage., 128共2兲, 152–161. Deb, K., Pratap, A., Agarwal, S., and Meyarivan, T. 共2002兲. “A fast and elitist multiobjective genetic algorithm: NSGA-II.” IEEE Trans. Evol. Comput., 6共2兲, 182–197. Hua, F., West, J. R., Barker, R. A., and Forster, C. F. 共1999兲. “Modelling of chlorine decay in municipal water supplies.” Water Res., 33共12兲, 2735–2746. Kessler, A., Ostfeld, A., and Sinai, G. 共1998兲. “Detecting accidental contaminations in municipal water networks.” J. Water Resour. Plann. Manage., 124共4兲, 192–198. Kumar, A., Kansal, M. L., and Arora, G. 共1999兲. “Detecting accidental contaminations in municipal water networks—Discussion.” J. Water Resour. Plann. Manage., 125共5兲, 308–309. McKenna, S. A., Hart, D. B., and Yarrington, L. 共2006兲. “Impact of sensor detection limits on protecting water distribution systems from contamination events.” J. Water Resour. Plann. Manage., 132共4兲, 305–309. Ostfeld, A., et al. 共2008兲. “The battle of the water sensor networks 共BWSN兲: A design challenge for engineers and algorithms.” J. Water Resour. Plann. Manage., 134共6兲, 556–568.



Ostfeld, A., and Salomons, E. 共2004兲. “Optimal layout of early warning detection stations for water distribution systems security.” J. Water Resour. Plann. Manage., 130共5兲, 377–385. Pareto, V. 共1896兲. Cours D’Economie Politique, Univ. de Lausanne, Lausanne, Switzerland. Powell, C. J., West, J. R., Hallam, N. B., Forster, C. F., and Simms, J. 共2000兲. “Performance of various kinetic models for chlorine decay.” J. Water Resour. Plann. Manage., 126共1兲, 13–20. Rossman, L. A. 共2000兲. “Epanet2.” Users’ manual, Risk Reduction

Engineering Laboratory, U.S. EPA, Cincinnati. Shastri, Y., and Diwekar, U. 共2006兲. “Sensor placement in water networks: A stochastic programming approach.” J. Water Resour. Plann. Manage., 132, 192–203. van Lieverloo, J. H. M., Mesman, G. A. M., Bakker, G. L., Baggelaar, P. K., Hamed, A., and Medema, G. 共2007兲. “Probability of detecting and quantifying faecal contaminations of drinking water by periodically sampling for E. coli: A simulation model study.” Water Res., 41共19兲, 4299–4308.