Two-Tiered Sensor Placement for Large Water Distribution Network Models

Katherine A. Klise, Senior Member of the Technical Staff, Geoscience Research and Applications, Sandia National Laboratories, [email protected], PO Box 5800 MS 0751, Albuquerque, NM 87185

Cynthia A. Phillips, Senior Scientist, Discrete Math and Complex Systems, Sandia National Laboratories, [email protected], PO Box 5800 MS 1318, Albuquerque, NM 87185

Robert J. Janke, Research Scientist, United States Environmental Protection Agency, Water Infrastructure Protection Division, [email protected], Mail Stop NG-16/NB21F, 26 West Martin Luther King Dr., Cincinnati, Ohio 45268

Abstract

Water distribution network models for large municipalities have tens of thousands of interconnecting pipes and junctions with complex hydraulic controls. Many water security applications, including sensor placement optimization, require detailed simulation of potential contamination incidents. The post-simulation optimization problem can easily exceed memory on standard desktop computers. Large networks can be skeletonized to reduce computation; however, this alters network hydraulics and, therefore, sensor placement. The objective of this paper is to evaluate a two-tiered sensor placement approach that combines hydraulic and water quality simulations using “all-pipes”, or original, network models with subsequent geographic aggregation of time and impact values to reduce memory requirements. The two-tiered approach first places sensors on aggregated regions, then refines the solution to actual nodes in the original model. The two-tiered sensor placement approach is compared to results using the original network and skeletonized networks based on solution quality, memory use, and runtime. Results show that skeletonized networks introduce error in sensor placement. Two-tiered sensor placement using geographic aggregation replicates the original model solution to within 5% in most cases.

Subject Headings

Water distribution systems; Security; Terrorism; Water quality; Scale effects; Optimization; Public Health

Introduction

Water distribution networks are crucial for delivering safe drinking water to municipalities; however, they are also vulnerable to contamination by malicious attack or accidental incident. Sensor networks are one aspect of a warning system that could protect against a contamination event.
Due to the high expense of sensor installation and maintenance, optimization methods have been developed to help utilities choose locations for water quality sensors. Reviews of sensor placement methods can be found in Hart and Murray (2010), Ostfeld et al. (2008), and Berry et al. (2006). In this paper, we use the single-objective p-median formulation for sensor placement optimization described in Berry et al. (2006). The p-median formulation uses an impact array, generated from an ensemble of contaminant transport simulations, to minimize an objective value. For each contamination scenario, the impact array contains a list of locations in the network where a sensor might detect that contamination. For each such location, the impact array contains the detection time and the total number of people impacted given that a sensor at that location is the first to detect contamination from that scenario. In this study, the objective value is the mean impact on the population across all contamination scenarios, where impact is the population exposed to a given dose of contaminant. Alternate impact objectives include minimization of other population-based measures such as deaths or minimization of network pipe contamination. Memory requirements for sensor placement depend on the number of possible sensor locations in the water distribution system (WDS), the number of contamination scenarios in the design, and the size of the impact array. Sensor placement to protect against a large ensemble of contamination scenarios often requires large, powerful workstations. Low-memory techniques, including witness aggregation, incident aggregation, and heuristic solvers, have been developed to overcome memory limitations for sensor placement on large networks (Hart et al., 2008). There are several limitations to these techniques. Witness aggregation combines nodes that
detect contaminant at the same time, reducing the size of the impact array. However, each contaminant scenario is aggregated independently. This can increase the number of effective nodes in the sensor placement formulation, since groups of nodes acquire a node “name” for impact representation. Incident aggregation combines the impact of two contamination scenarios only if their trajectories match in restrictive ways. As a result, the memory savings using incident aggregation can be small. Sensor placement can exceed the physical memory on standard desktop computers even when heuristic solvers are used. These solvers must store the impact array internally, requiring increased memory for full integer and floating point representations and data structure overhead. Developing more space-efficient solvers could independently improve peak memory usage at the cost of potentially extreme increases in runtime. An “all-pipes” WDS model captures maximum detail for pipes, junctions, and hydraulic controls. For a large municipality, an “all-pipes” model can contain 50,000 or more junctions. Sensor placement on these networks can require a lot of memory. For example, in this research, the impact array for a 55,000 node network with a large ensemble of contamination scenarios takes 7 GB of disk space. Solving a sensor placement problem on that impact array using the p-median formulation required more than 40 GB of memory using a heuristic solver. One option to reduce this memory requirement is to reduce the size of the WDS model through skeletonization. Skeletonization attempts to eliminate pipes and junctions while maintaining the overall hydraulics of the system. The process effectively contracts a connected piece of the network into a single node, which we call a supernode. For example, a supernode might represent a whole neighborhood. All WDS models are skeletonized to some degree. 
The most detailed model of the network is generally referred to as an “all-pipes”, or original, model. There are several important differences between sensor placement using a skeletonized network and sensor placement using the original network. First, a single node in the skeletonized model often represents multiple nodes in the original network model. Sensor placement using a skeletonized model must be refined to a valid sensor placement by determining one location from within the selected supernode. While sensor placement on a supernode can be refined by randomly or manually selecting one node in the group, this will not generally lead to the optimal refinement. Second, skeletonization alters the hydraulics of the original model. This changes the timing and spread of contamination in the system. Finally, aggregating population statistics to a reduced set of nodes will change the sensor placement solution. Walski et al. (2004a) show that while transient pressure waves change as skeletonization increases, changes can be minimal until the number of pipes in the skeletonized network is less than 10 percent of the original network. However, Janke et al. (2007 and 2009) and Bahadur et al. (2006) describe the influence of skeletonization on water quality simulations and contaminant consequence assessment, and show that the assessment degrades as a function of increased skeletonization. Given that sensor placement relies on an accurate representation of system hydraulics and contaminant transport, we seek lower-memory sensor placement methods based on the hydraulics of an “all-pipes” model. The objective of this paper is to evaluate a low-memory sensor placement method for large water distribution networks based on solution quality, memory requirement, and runtime.
The proposed method is a two-tiered sensor placement approach that uses hydraulic and water quality simulations from the original WDS model but reduces the memory requirements of sensor placement by geographically aggregating the impact array. Unlike skeletonized networks, this method is not influenced by modified hydraulics since the impact array is computed from the original WDS model. Geographic aggregation combines nodes in the impact array based on their geographic proximity. Because contaminant transport has geographic correlations, we use the supernode membership from skeletonization to define geographic proximity. The two-tiered approach uses geographic aggregation to first identify promising regions for sensor placement. These regions are then used in a secondary optimization to place sensors in the original network. In this paper, the two-tiered sensor placement approach using geographic aggregation is compared to a more standard sensor placement approach using skeletonized networks directly. For the sake of direct comparison, the sensor placement solution from a skeletonized network is refined to the original network in the same secondary optimization step used after geographic aggregation. Secondary optimization can only be performed when the original network topology is known. This is a “best case” situation for sensor placement using skeletonized networks. Sensor placement using the original network model is considered “ground truth” for evaluating the quality of both sensor placement using geographic aggregation and sensor placement using skeletonized networks.
Water Distribution Networks

This study uses three real WDS networks; approximate descriptions follow. The first network, referred to as BWSN throughout the paper, is Network 2 from the Battle of the Water Sensor Networks (Ostfeld et al. 2008). BWSN serves 250,000 people over an area of 490 square kilometers. The network model contains 13,000 junctions and 13 hydraulic controls (reservoirs, tanks, pumps, and valves). There are 11,000 non-zero demand (NZD) nodes in the model. NZD nodes generally represent service connections. The BWSN network model is available for download from the Exeter Centre for Water Systems (http://emps.exeter.ac.uk/engineering/research/cws/). Two additional WDS networks are included to demonstrate the two-tiered approaches on even larger networks. For confidentiality, the network names and specific details are withheld. Network 1 serves 800,000 people over an area of 130 square kilometers. The network model contains 48,000 junctions and 3,700 hydraulic controls. There are 9,000 NZD nodes in the model. Network 2 serves 350,000 people over an area of 260 square kilometers. The model contains 55,000 junctions and 100 hydraulic controls. The network contains 26,000 NZD nodes.

Methods

In this section, we describe contaminant impact simulation and sensor placement methods, the two-tiered sensor placement approach, skeletonization procedures, and the experimental design.

Contaminant Impact Assessment and Sensor Placement

Given an ensemble of scenarios in a WDS model, we use EPANET (Rossman, 2000) to simulate the fate and transport of contaminants to create impact arrays. The Threat Ensemble Vulnerability Assessment, Sensor Placement Optimization Tool (TEVA-SPOT), version 2.3.0 (U.S. EPA 2010), is used to manage the EPANET simulations and compute sensor placements. The distributed version of TEVA-SPOT runs simulations of scenarios in parallel to complete the suite in a reasonable amount of time. The U.S.
Environmental Protection Agency, Sandia National Laboratories, and Argonne National Laboratory developed TEVA-SPOT. The toolkit version is available for download from https://software.sandia.gov/trac/spot/. All methods used in this research are documented in the user manual. Sensor placement is optimized using the p-median formulation. The formulation places p sensors in the network to minimize an objective value. Here, the objective value is the arithmetic mean of contaminant exposure considering all scenarios. One contamination scenario is simulated from each NZD node in the network. For each scenario, exposure is measured in “population dosed,” defined as the number of people who ingest contaminated tap water and receive a dose above a specified level. We examine two dose levels with respect to population impacts: 0.0001 mg and 1.0 mg. These dose levels relate to a contaminant with a high toxicity (results in health effects above a 0.0001 mg dose level) and a contaminant with a low toxicity (results in health effects above 1.0 mg). Additional information on the simulation and population impact assessment procedure is described in Davis and Janke (2011). For a given contamination scenario, the impact array specifies the time at which contaminant above a minimum detectable limit first reaches a node and the total impact across the whole network at that time. It also lists the impact for the scenario at the end of the simulation time horizon. This approximates the impact if no sensor detects that scenario. The impact array is a sparse representation; if a node does not observe the contaminant, the node is not listed for that scenario. The impact arrays, saved as text files, from the original WDS models are approximately 150 MB for BWSN, 1 GB for Network 1, and 7 GB for Network 2. The impact array for Network 2 is much larger than Network 1 due to the increased number of scenarios and increased number of nodes that witness each scenario. 
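As a rough illustration of how an impact array drives the p-median objective, the sketch below evaluates the mean impact of a candidate placement and searches exhaustively for the best one. The dictionary layout, the scenario and node names, the numeric values, and the use of the key `None` for the end-of-simulation (undetected) impact are all assumptions made for this example; they are not the TEVA-SPOT file format. The exhaustive search is a stand-in that is only practical for toy problems and is not the solver used in this study.

```python
from itertools import combinations
from statistics import mean

# Hypothetical sparse impact array: scenario -> {node: (detection_time, impact)}.
# Nodes that never see the contaminant are simply absent for that scenario.
# The key None holds the end-of-simulation impact (no sensor detects the scenario).
impact_array = {
    "S1": {"N1": (0, 10.0), "N2": (30, 40.0), None: (1440, 100.0)},
    "S2": {"N2": (0, 5.0), "N3": (60, 50.0), None: (1440, 80.0)},
    "S3": {"N3": (0, 20.0), None: (1440, 60.0)},
}


def scenario_impact(witnesses, sensors):
    """Impact of one scenario given perfect sensors and no response delay."""
    detections = [witnesses[n] for n in sensors if n in witnesses]
    if not detections:
        return witnesses[None][1]  # undetected: end-of-simulation impact
    # Tuples compare by time first, so min() picks the earliest detection.
    return min(detections)[1]


def mean_impact(impact_array, sensors):
    """Objective value: arithmetic mean of impact over all scenarios."""
    return mean(scenario_impact(w, sensors) for w in impact_array.values())


def best_placement(impact_array, candidates, p):
    """Exhaustive search over all p-subsets (illustration only)."""
    return min(
        combinations(sorted(candidates), p),
        key=lambda s: mean_impact(impact_array, s),
    )


if __name__ == "__main__":
    nodes = ["N1", "N2", "N3"]
    placement = best_placement(impact_array, nodes, 2)
    print(placement, mean_impact(impact_array, placement))
```

The number of p-subsets grows combinatorially with network size, which is one reason heuristic solvers are used on real networks.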
The formulation assumes that sensors are perfect; hence each will signal an alarm when it first detects contamination. The minimum detection limit is set at zero, i.e., any contaminant concentration above zero is detected. A general alarm is used, so the impact to the network stops after the first detection with no response delay. The GRASP local-search solver is used for sensor placement (Pitsoulis and Resende 2002). We verified the optimality of the GRASP solution, which results in an upper bound, by using a Lagrangian method that
computes a lower bound. All sensor placement optimization problems were run on the same workstation with 72 GB of RAM.

Two-tiered Sensor Placement

In this paper, we present a Two-Tiered Sensor Placement approach that uses Geographic Aggregation (TTSP-GeoA). This method is compared to sensor placement using skeletonized networks. With skeletonized networks, sensors are placed on supernodes. In this research, we assume the original network model is known and can be used for secondary optimization on this skeletonized solution. This allows for a direct comparison between TTSP-GeoA and a Two-Tiered Sensor Placement approach that uses Skeletonized Networks (TTSP-SN). For both TTSP-GeoA and TTSP-SN, the first tier (Tier 1) places sensors on aggregated regions of the network; however, the impact array used in this step differs between the two methods. With TTSP-GeoA, sensor placement is performed on an aggregated impact array generated from the original network. With TTSP-SN, sensor placement is performed on an impact array generated from the skeletonized network. The second tier (Tier 2) sensor placement refines the Tier 1 solution to original network nodes. This step is the same for both TTSP-GeoA and TTSP-SN. These methods are described in detail below. The simple 10-node network in Figure 1 is used to illustrate both two-tiered methods. The 10-node network was skeletonized based on branch trimming, series pipe merging, and parallel pipe merging. Details on the skeletonization process used in this study are given in the section titled “Skeletonization” below. Figure 2 outlines the steps used in TTSP-GeoA and TTSP-SN. The original and skeletonized network models and impact arrays for the Figure 1 network and for BWSN are available for download from https://software.sandia.gov/trac/spot/wiki/two-tier. These files include scripts to run TTSP-GeoA and TTSP-SN using TEVA-SPOT and Python codes to aggregate and refine the impact arrays.
TTSP-SN requires a two step process to place sensors on the original network. In Tier 1, hydraulic and water quality calculations are performed using the skeletonized network and sensors are placed on supernodes. At this stage, an original network node could be arbitrarily selected from inside each supernode for sensor placement. For the Figure 1 network, if Supernode 4 was selected in Tier 1, the sensor could be placed on Node 4, 5, 7, or 8. Tier 2 sensor placement requires detailed knowledge of both the skeletonized and original network, including an impact array based on the original network model. In Tier 2, the original impact array is refined to only include nodes within each supernode selected in Tier 1 sensor placement. Sensor placement is repeated using the refined impact array. For the Figure 1 network, the refined impact array only includes Nodes 4, 5, 7, and 8 with time and impact values from the original impact array. TTSP-GeoA also requires a two step process to place sensors on the original network. In Tier 1, the original impact array is aggregated by grouping nodes that are geographically similar. In this paper, geographic proximity is defined by skeletonizing the original network model. While other metrics could be used to define geographic proximity, standard skeletonization techniques are used to make nodes within each supernode as hydraulically compatible as possible. Using skeletonization to define geographic proximity also facilitates direct comparison between TTSP-GeoA and TTSP-SN. Geographic aggregation reduces the impact array to include only supernodes defined in the skeletonized network. From the skeletonized viewpoint, if we draw a circle around the nodes in a supernode (e.g. as shown in Figure 1), the contaminant hits a supernode when it first crosses the curve defining the supernode. Depending upon the direction the contaminant approaches, it will hit the nodes inside the supernodes in different orders. 
However, the impact array must contain a single impact value for the supernode. With geographic aggregation, we consider the minimum, maximum, mean, or median of the time and impact for the nodes in each supernode. Because population dosed is a cumulative metric, time and impact are correlated, with impact monotonically increasing with time. For a sensor on a single node in the supernode, the maximum statistic is pessimistic for scenarios that the node detects. Similarly, using the minimum is optimistic. We considered all four aggregation statistics to determine which best replicates sensor placement results using the original impact array. Figure 3 shows an original impact array, a geographic map that assigns original network nodes to skeletonized supernodes, and the resulting aggregated impact array. The geographic map is based on the original and skeletonized network in Figure 1. The aggregated impact array in this figure is based on taking the mean time and impact value for each supernode. For example, Supernode 4 in Figure 3 represents Nodes 4, 5, 7, and 8 in the original network. The
original network impact values for those nodes are 300, 250, 200, and 100, respectively. After geographic aggregation, the supernode impact value would be 100 using the minimum, 300 using the maximum, 212.5 using the mean (as shown in Figure 3), and 225 using the median. The time associated with each of these impacts is grouped the same way (using the minimum, maximum, mean or median). The new time and impact pair is the aggregated representation in the aggregated impact array for that supernode. This aggregation is performed for each scenario and each supernode. While this greatly reduces the size of the impact array, the number of scenarios is unchanged. TTSP-GeoA Tier 1 sensor placement uses the geographically aggregated impact array. This approach avoids solving sensor placement using the original impact array, a process that can exceed the memory available on a personal computer, while using hydraulic and water quality simulation results from the original network. A refined impact array is then generated that contains only nodes included in supernodes from the Tier 1 sensor placement. This refined impact array has time and impact values from the original impact array. Tier 2 sensor placement uses this refined impact array to place sensors in the original network. This is the same refinement and Tier 2 sensor placement method as used in TTSP-SN. For both TTSP-SN and TTSP-GeoA, we use the same number of sensors for both Tier 1 and Tier 2 sensor placement. For example, for a 5-sensor design, Tier 1 sensor placement selects 5 supernodes. These 5 supernodes might represent 100 nodes in the original model. Tier 2 sensor placement chooses any 5 of these 100 nodes as sensor locations. Thus it is possible the final placement will select multiple locations within a single selected supernode. For both approximation methods, relative error (RE) is calculated after Tier 1 and Tier 2. 
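The geographic aggregation and refinement steps described above can be sketched in a few lines. This is a minimal illustration, not the Python code distributed with the paper: the dictionary layout and the function names `aggregate` and `refine` are hypothetical, and the detection times are invented for the example (only the Supernode 4 impact values 300, 250, 200, and 100 come from the text).

```python
from statistics import mean, median

# The four aggregation statistics considered in the text.
STATISTICS = {"min": min, "max": max, "mean": mean, "median": median}

# Hypothetical layout: scenario -> {original node: (detection_time, impact)}.
# Impacts for Nodes 4, 5, 7, 8 follow the Supernode 4 example; times are invented.
impact_array = {
    "scenario_A": {
        1: (50, 350.0),
        4: (40, 300.0),
        5: (30, 250.0),
        7: (20, 200.0),
        8: (10, 100.0),
    }
}

# Geographic map: original node -> supernode (Supernode 4 = Nodes 4, 5, 7, 8).
geo_map = {1: 1, 4: 4, 5: 4, 7: 4, 8: 4}


def aggregate(impact_array, geo_map, stat="min"):
    """Tier 1 input: one (time, impact) pair per scenario and supernode."""
    f = STATISTICS[stat]
    aggregated = {}
    for scenario, witnesses in impact_array.items():
        groups = {}
        for node, pair in witnesses.items():
            groups.setdefault(geo_map[node], []).append(pair)
        # Time and impact are aggregated with the same statistic.
        aggregated[scenario] = {
            sn: (f([t for t, _ in pairs]), f([i for _, i in pairs]))
            for sn, pairs in groups.items()
        }
    return aggregated


def refine(impact_array, geo_map, selected_supernodes):
    """Tier 2 input: original entries for nodes inside the selected supernodes."""
    keep = set(selected_supernodes)
    return {
        scenario: {n: v for n, v in witnesses.items() if geo_map[n] in keep}
        for scenario, witnesses in impact_array.items()
    }


if __name__ == "__main__":
    for stat in STATISTICS:
        print(stat, aggregate(impact_array, geo_map, stat)["scenario_A"][4])
    print(refine(impact_array, geo_map, [4]))
```

Note that aggregation shrinks the number of entries per scenario but leaves the number of scenarios unchanged, while refinement keeps the original time and impact values untouched.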
RE is defined as (φapprox − φ) / φ, where φ is the objective value based on sensor placement using the original network and φapprox is the objective value based on sensor placement using an approximate two-tiered method. To calculate Tier 1 RE for TTSP-SN or TTSP-GeoA, one node must be selected from each supernode to place sensors on a node in the original network. We arbitrarily select the original node inside each supernode with the same name as the supernode (i.e., the supernode’s representative). For example, if Supernode 4 in Figure 1 is selected, the sensor is placed on Node 4 for Tier 1 sensor placement. Tier 1 φapprox results from evaluation of the sensor placement (Node 4 in this case) using the original impact array. The Tier 2 refinement places sensors on the original network using a subset of the original impact array. The sensor placement problem seeks to minimize the objective value; therefore, the true minimum φ is always less than or equal to an approximate minimum φapprox for both Tier 1 and Tier 2.

Skeletonization

A number of codes provide approximately equivalent network skeletonization: Haestad Skelebrator (Haestad Methods 2002), MWHSoft H2OMAP® (Boulos 2005a) and InfoWater® (Boulos 2005b), and TEVA-SPOT Skeleton (U.S. EPA 2011). For this paper, we use H2OMAP® to skeletonize the WDS networks. Networks are skeletonized using three steps in an iterative process to reduce the number of network nodes. These steps include branch trimming, series pipe merging, and parallel pipe merging (Walski et al. 2004b). Hydraulic equivalency calculations and demand redistribution are computed to maintain system dynamics so the reduced network replicates the original model as realistically as possible. Hydraulic equivalency maintains the headloss across merged pipes by calculating an equivalent diameter or pipe roughness. Junction demands and demand patterns are redistributed according to the nodes grouped in each skeletonization step.
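To make the idea of hydraulic equivalency concrete, the sketch below computes an equivalent diameter for pipes merged in series or in parallel. It assumes the Hazen-Williams headloss relation, in which headloss is proportional to L Q^1.852 / (C^1.852 D^4.871); the source does not state which headloss formula or which equivalency rule H2OMAP applies, so the exponents, function names, and the choice to solve for diameter at a fixed roughness coefficient are assumptions for illustration only.

```python
# Hazen-Williams exponents (assumed headloss model, not confirmed by the source).
HW_FLOW_EXP = 1.852
HW_DIAM_EXP = 4.871


def _resistance(length, diameter, c):
    """Resistance R in headloss = k * R * Q**1.852 (unit constant k omitted)."""
    return length / (c ** HW_FLOW_EXP * diameter ** HW_DIAM_EXP)


def series_equivalent_diameter(pipes, c_equiv):
    """Pipes in series carry the same flow, so their resistances add.

    pipes: list of (length, diameter, C). The equivalent pipe has the
    summed length and the chosen roughness coefficient c_equiv.
    """
    total_length = sum(length for length, _, _ in pipes)
    total_res = sum(_resistance(*p) for p in pipes)
    return (total_length / (c_equiv ** HW_FLOW_EXP * total_res)) ** (1 / HW_DIAM_EXP)


def parallel_equivalent_diameter(pipes, length_equiv, c_equiv):
    """Pipes in parallel share the same headloss, so their flows add.

    Since Q = (h / R)**(1/n), the terms R**(-1/n) add across the pipes.
    """
    conveyance = sum(_resistance(*p) ** (-1 / HW_FLOW_EXP) for p in pipes)
    res_equiv = conveyance ** (-HW_FLOW_EXP)
    return (length_equiv / (c_equiv ** HW_FLOW_EXP * res_equiv)) ** (1 / HW_DIAM_EXP)


if __name__ == "__main__":
    pipe = (100.0, 0.2, 120.0)  # length, diameter, roughness coefficient
    print(series_equivalent_diameter([pipe, pipe], 120.0))
    print(parallel_equivalent_diameter([pipe, pipe], 100.0, 120.0))
```

Two identical pipes in series reduce to a pipe of the same diameter and twice the length, while two identical pipes in parallel reduce to a single larger pipe of the same length.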
A pipe diameter threshold determines candidate pipes for skeletonization. Only pipes below the threshold can be removed from the network. The networks used in this study do not include predefined pressure zones, so this additional information is not included in the skeletonization process. The addition of pressure zones would ensure that demands are not redistributed across pressure boundaries. H2OMAP® creates an EPANET-compatible skeletonized network and a history log. The history log, which contains sequential skeletonization steps based on network pipes, is converted into a “geographic map” that maps original network model nodes into skeletonized supernodes. This post-processing step is necessary to transition between the original and skeletonized networks on a node-by-node basis. To aggregate and refine impact arrays, the user must know, for example, that Supernode 4 in the skeletonized network represents Nodes 4, 5, 7, and 8 in the original network.
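One way to turn a sequence of skeletonization steps into a geographic map is a union-find pass over the merges. The sketch below assumes a simplified, hypothetical log in which each step is a pair (removed node, retained node); the actual H2OMAP history log is pipe-based and its format is not described here, so the input structure and the merge order used for Supernode 4 are illustrative only.

```python
def build_geographic_map(all_nodes, merge_steps):
    """Map each original node to the supernode that finally absorbs it.

    merge_steps: sequential (removed_node, retained_node) pairs
    (hypothetical format, not the H2OMAP log layout).
    """
    parent = {n: n for n in all_nodes}

    def find(n):
        # Follow parent links to the surviving node, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for removed, retained in merge_steps:
        parent[find(removed)] = find(retained)

    return {n: find(n) for n in all_nodes}


def supernode_members(geo_map):
    """Invert the geographic map: supernode -> sorted list of original nodes."""
    members = {}
    for node, supernode in geo_map.items():
        members.setdefault(supernode, []).append(node)
    return {sn: sorted(nodes) for sn, nodes in members.items()}


if __name__ == "__main__":
    nodes = range(1, 11)  # a 10-node network, as in Figure 1
    # Invented merge order that yields Supernode 4 = Nodes 4, 5, 7, 8.
    steps = [(8, 7), (7, 5), (5, 4)]
    geo_map = build_geographic_map(nodes, steps)
    print(supernode_members(geo_map)[4])
```

The resulting dictionary is exactly the lookup needed to aggregate and refine impact arrays on a node-by-node basis.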
Under the skeletonization steps described above, there is a limit to how much a network can be reduced based on its topology. The number of nodes in BWSN is reduced by 66%, 76%, and 78% using a pipe diameter threshold of 8, 12, and 16 inches, respectively. For Network 1, the number of nodes in the network is reduced by 50%, 57%, and 60% using the same thresholds. Network 2 is reduced by 75%, 85%, and 87% using the same thresholds. Figure 4 shows a subset of Network 2 before and after skeletonization using a 16-inch pipe threshold. The difference in reduction between the networks is a function of the number of dead-end pipes and of pipes in series and parallel. For example, sections of the network with a grid structure will not reduce under these skeletonization steps. Additionally, the number of hydraulic controls influences skeletonization, as all pipes and nodes associated with these features are preserved.

Experimental Design

To track the influence of various design parameters on the quality of sensor placement using either two-tiered sensor placement approach, a range of pipe diameter thresholds (used to define geographic proximity for TTSP-GeoA and to skeletonize the network for TTSP-SN), sensor numbers, and dose levels are tested using the three WDS networks. Additionally, the minimum, maximum, mean, and median statistics are used to aggregate the impact array for the TTSP-GeoA method. Table 1 provides the input parameters for the experimental design.

Results

The following results compare solution quality, memory requirements, and computation runtime of solutions using the TTSP-GeoA and TTSP-SN methods to solutions using the original network.

Solution quality

We compare the objective value (population dosed) based on sensor placement using the original network to the objective value based on sensor placement using TTSP-GeoA and TTSP-SN to compute the RE after Tier 1 and Tier 2.
For TTSP-GeoA, the solution using each aggregation statistic (minimum, maximum, mean, or median) is evaluated using the RE after Tier 2 sensor placement. Based on results from the three networks, there is no single aggregation statistic that always produces the lowest RE. However, the minimum statistic yields the lowest RE 60% of the time. The mean and median statistics produce the lowest RE 14% and 26% of the time, respectively. The maximum statistic never results in the lowest RE. The difference in RE using mean and median aggregation is minimal. The RE can change by up to 18% based on use of the minimum, mean, or median option, so the choice of aggregation statistic is significant. In this paper, we report on results using the minimum aggregation statistic since it produces the lowest RE in a majority of cases. Specific cases where the RE could be reduced with the mean or median aggregation statistic are discussed. Figure 5 shows the RE after Tier 1 and Tier 2 for both TTSP-GeoA (based on the minimum aggregation statistic) and TTSP-SN. For each network, the high toxicity contaminant (Figure 5A,B) produces higher RE than the low toxicity contaminant (Figure 5C,D). The RE for high toxicity contaminants is more sensitive to an exact sensor placement. Tier 2 RE (Figure 5B,D) denotes the final solution for each method. RE tends to increase as the pipe diameter threshold increases. This is due to the modified hydraulics in the TTSP-SN method and the level of aggregation in the TTSP-GeoA method. The following results are collective across all pipe diameter thresholds and number of sensors, unless otherwise noted. The specific network, dose level, and sensor placement method are reported. For BWSN, the TTSP-GeoA method produces a Tier 2 RE below 7.4% and 2.9% for the high and low toxicity contaminant simulations, respectively. The RE is relatively high for the five-sensor design using the high toxicity contaminant.
In this case, the RE can be reduced to below 2% by using the mean or median statistic. For Network 1, the TTSP-GeoA method results in Tier 2 RE below 1.0% and 0.2% for the high and low toxicity contaminant simulations, respectively. For Network 2, TTSP-GeoA Tier 2 RE is below 19.2% and 1.8% for the high and low toxicity contaminant simulations, respectively. As with BWSN, the high RE for Network 2 is from the five-sensor design using the high toxicity contaminant. However, the error is still around 10% when using the mean or median statistic. Across all experiments, the TTSP-SN method results in higher Tier 2 RE. The RE using the TTSP-SN method is 2 to 120 times higher than the RE from the TTSP-GeoA method. This result emphasizes the error associated with skeletonized hydraulics. While the objective value is the sole metric used to
assess a sensor placement, the physical location was also compared using both methods. As compared to sensor placement using the original impact array, the physical sensor location is preserved 35% of the time, on average, using the TTSP-GeoA method. Using the TTSP-SN method, the physical location is preserved only 14% of the time. Without the Tier 2 refinement, sensor placement occurs on supernodes that represent a group of nodes in the original network model. To quantify the benefit of the refinement, we evaluate the Tier 1 solution by selecting supernode representatives and compare that solution to the Tier 2 placement. The difference between Tier 1 RE (Figure 5A,C) and Tier 2 RE (Figure 5B,D) isolates the benefit of refining the sensor placement solution to nodes in the original network model. Refining sensor locations that originated from skeletonized networks (TTSP-SN Tier 1 solutions) has nominal effect on the overall solution quality. While the RE changes as a result of refinement, the final sensor placement quality is still worse than the objective value for the sensor placement using the original network. This implies that the supernodes selected in the first round of sensor placements based on skeletonized hydraulics are not good candidates for secondary optimization. On the other hand, refining sensor locations that originated from geographic aggregation (TTSP-GeoA Tier 1 solutions) can improve the final placement. On average, the objective value for BWSN and Network 2 improves by 38% and 12% for the high and low toxicity contaminant, respectively. For both networks, sensor locations change over 60% of the time, on average, as a result of refining sensor placement to original network nodes. Because both networks are greatly reduced using skeletonization (the number of nodes is reduced up to 87%), each supernode in a skeletonized model represents a large number of nodes in the original network model. 
This results in more nodes included in the refinement and therefore more capacity for change between the Tier 1 and Tier 2 sensor placements. For Network 1, however, refining the impact array has little impact on the solution quality. On average, the objective value improves by less than 1% due to refining the sensor placement solution from supernodes to original network nodes. Sensor locations change only 20% of the time when refining sensor placement in Tier 2. For Network 1, the solution from Tier 1 adequately predicts the final solution. Compared to the other networks, Network 1 has a low degree of skeletonization. For every supernode selected in Tier 1, there are fewer original network nodes to refine the solution. In all cases, the two-tiered sensor placement method does not require that one sensor is placed per supernode. However, this is generally the case. The final placement selects one sensor per supernode 94% of the time.

Memory Footprint

The memory footprint, or maximum memory used, for sensor placement using the original network model for BWSN is 2.6 GB. The memory footprint for Network 1 and Network 2 is 22.6 and 42.2 GB, respectively. While the memory requirements for BWSN are manageable under certain conditions, Network 1 and Network 2 require large, powerful workstations to run. The memory footprint is a function of the size of the WDS model and the number of scenarios included in the impact array. For both TTSP-SN and TTSP-GeoA, memory is reduced because the size of the impact array is reduced. Both approximation methods require file manipulation, namely, geographic aggregation and/or refining the impact array after selecting supernodes. For the networks used in this study, maximum memory footprint for file manipulation is below 0.1 GB. This is smaller than the size of the impact array because we use streaming algorithms.
For both TTSP-SN and TTSP-GeoA, Tier 1 takes the most memory while Tier 2, using only candidate nodes from the first solution, uses much less memory. The memory footprint associated with Tier 1 is listed in Table 2. This is the maximum memory needed for the approximate methods. There are only slight differences in memory footprint using various numbers of sensors or contaminant thresholds. Memory used in Tier 2 is below 1 GB for all sensor placements.
Using geographic aggregation or simply skeletonizing the network reduces the maximum memory needed for sensor placement. In both cases, the memory reduction is a function of the pipe diameter threshold. Under the same skeletonization steps and parameters, BWSN and Network 2 are reduced more than Network 1, resulting in increased memory savings. Skeletonization results in fewer nodes and, therefore, fewer scenarios in the impact array used in the TTSP-SN method. For TTSP-GeoA, only the number of impacts is reduced; there is still a full scenario set. Therefore, TTSP-GeoA requires more memory than TTSP-SN for Tier 1 sensor placement. Maximum memory is reduced by up to 87% using the TTSP-GeoA method. This method reduces the memory footprint for BWSN to manageable levels on most personal computers; however, the memory requirements for Network 1 and Network 2 remain high.
While TTSP-SN requires less memory than TTSP-GeoA, the RE is greater. Figure 6 shows the tradeoff between maximum memory and solution quality. For TTSP-SN, an increase in skeletonization decreases the memory requirements. However, the solution quality generally declines as a result of the modified hydraulics and loss of scenarios. TTSP-GeoA is better suited to maintain solution quality over increased aggregation levels. However, because geographic aggregation does not reduce the number of scenarios in the impact array, it generally requires more memory than sensor placement using a skeletonized network.
Runtime
A two-tiered approach requires that sensor placement, a computationally expensive process, is solved twice. However, the combined computational runtime for Tier 1 and Tier 2 sensor placement is less than the runtime for sensor placement using the large original impact array. Nevertheless, the total runtime for the two-tiered approach, which includes streaming file manipulation, is greater than the runtime for sensor placement using the original impact array. For example, a 5-sensor placement design on Network 2 using the original impact array takes 18.5 minutes. The TTSP-GeoA method, based on the minimum statistic, takes 55.7 minutes. This includes geographic aggregation of the 7 GB original impact array (28.5 minutes), Tier 1 sensor placement (4.2 minutes), refinement of the original impact array (23.0 minutes), and Tier 2 sensor placement (0.1 minutes). A single geographic aggregation can be used for multiple sensor placement designs, so in many cases the runtime is reduced to 27.2 minutes (the total less the 28.5-minute aggregation step). Skeletonized networks do not require aggregation, and the impact array used in the initial sensor placement has fewer scenarios.
Therefore, a comparable placement using the TTSP-SN method takes 26.0 minutes for Tier 1 sensor placement, refinement, and Tier 2 sensor placement.
Conclusion
This paper presents a two-tiered sensor placement approach using geographic aggregation (TTSP-GeoA) that filters potential sensor locations in a first pass with coarse global information, and then performs secondary optimization on the candidates that survive the first pass. Geographic aggregation is used to reduce the size of the sensor placement optimization. The TTSP-GeoA method is compared to a more common method using skeletonized networks. Here, we assume the original network model is known and can be used for secondary optimization in both cases. This allows for a direct comparison between TTSP-GeoA and a two-tiered sensor placement method using skeletonized networks (TTSP-SN). Both methods benefit from using a lower-memory method to provide an approximate initial solution.
As compared to sensor placement directly on large networks, TTSP-GeoA is able to maintain solution quality to within 5% in most cases. In many cases, the TTSP-GeoA approach is able to select optimal sensor placement locations, equivalent to performing the sensor placement on the non-aggregated impact array. On the other hand, results using TTSP-SN highlight the error that is introduced into contamination consequence assessment due to the modified hydraulics that result from the skeletonization process. Results show that accurate sensor placement requires high-fidelity WDS models for contaminant transport simulations. The benefit of using geographic aggregation is that initial sensor placement solutions can be screened using a low-memory filtering step, thereby minimizing the amount of RAM required. However, the impact array from the original network model must still be calculated, a process that is computationally expensive, but requires little RAM.
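The filter-then-refine structure summarized above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the solver used in the study: a simple greedy heuristic stands in for the actual optimizer, the function names (`greedy_placement`, `two_tier_placement`) and data layout are hypothetical, and the impact of a scenario under a set of sensors is assumed to be the smallest impact among the placed sensors that detect it.

```python
from statistics import mean


def greedy_placement(impact, candidates, n_sensors, undetected):
    """Greedily pick locations that most reduce the summed impact.

    impact[s][loc]: impact of scenario s if it is detected at loc
        (locations that never detect s are simply absent).
    undetected[s]: impact of scenario s when no sensor detects it.
    """
    current = dict(undetected)  # best impact achieved so far, per scenario
    chosen = []
    for _ in range(n_sensors):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break

        def gain(loc):
            return sum(max(0.0, current[s] - row.get(loc, current[s]))
                       for s, row in impact.items())

        pick = max(remaining, key=gain)
        chosen.append(pick)
        for s, row in impact.items():
            current[s] = min(current[s], row.get(pick, current[s]))
    return chosen, mean(current.values())


def two_tier_placement(impact, coarse_impact, members, n_sensors, undetected):
    """Tier 1 on supernodes, then Tier 2 on the nodes inside chosen supernodes."""
    supernodes, _ = greedy_placement(coarse_impact, sorted(members), n_sensors, undetected)
    candidates = [node for sn in supernodes for node in members[sn]]
    return greedy_placement(impact, candidates, n_sensors, undetected)


# Toy data: four nodes in two supernodes; coarse impacts use the minimum statistic.
impact = {"s1": {1: 10.0, 2: 2.0}, "s2": {3: 5.0, 4: 50.0}}
coarse_impact = {"s1": {"A": 2.0}, "s2": {"B": 5.0}}
members = {"A": [1, 2], "B": [3, 4]}
undetected = {"s1": 100.0, "s2": 100.0}
nodes, mean_impact = two_tier_placement(impact, coarse_impact, members, 2, undetected)
```

In this sketch only the small `coarse_impact` array is needed for Tier 1, and Tier 2 touches only the columns of the original array that belong to the selected supernodes, which mirrors why the second pass is inexpensive.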
Results from this study suggest that networks that can be greatly reduced through skeletonization benefit the most from geographic aggregation and subsequent refinement. When skeletonized nodes represent numerous nodes in the original network, refining the solution has a greater impact on solution quality. Network 1, while representing a large WDS model, already reflects some degree of skeletonization. For this reason, we believe that BWSN and Network 2 more accurately depict the benefits of using the TTSP-GeoA method. Here, geographic aggregation is used to reduce the memory footprint for sensor placement by up to 87%. This is directly related to the size of the skeletonized network used to define geographic proximity.
In general, the impact array should be aggregated until Tier 1 sensor placement fits within the available RAM. However, many sensor placement optimizations may still be too large to run on standard desktop computers with limited RAM. Other methods may help further reduce the memory requirements. The TTSP-GeoA method is compatible with other solvers. For example, Tier 1 sensor placement could be performed using a low-memory Lagrangian solver. Geographic aggregation can also reduce memory use in the case of multi-objective optimization, where several impact arrays can be reduced using the same geographic proximity.
Skeletonization is not the only means to define geographic proximity used in the TTSP-GeoA method. Furthermore, different skeletonization methods will define different geographic aggregation approaches with varying quality. The size of WDS models can also be reduced based on hydraulic behavior (Ulanicki et al. 1996; Perelman and Ostfeld 2008). Future work will evaluate additional network reduction methods that can be used in sensor placement design to decrease memory requirements.
Acknowledgements
The U.S. Environmental Protection Agency (U.S. EPA), Office of Research and Development funded and participated in the research described here under an interagency agreement. The views expressed in this paper are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA. Mention of trade names or commercial products does not constitute endorsement or recommendation for use. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Company, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
References
Bahadur, R., Johnson, J., Janke, R., and Samuels, W.B. (2006). “Impact of model skeletonization on water distribution model parameters as related to water quality and contaminant consequence assessment.” Proceedings 8th Annual Water Distribution Systems Analysis Symposium, August 27-30, Cincinnati, OH.
Berry, J., Hart, W. E., Phillips, C. A., Uber, J. G., and Watson, J. (2006). “Sensor placement in municipal water networks with temporal integer programming models.” J. Water Resour. Plann. Manag., 132(4), 218–224.
Boulos, P. (2005a). “H2OMAP Skeletonizer users guide.” MHW Soft, Inc, 300 North Lake Ave, Suite 1200, Pasadena, CA.
Boulos, P. (2005b). “InfoWater Skeletonizer users guide.” MHW Soft, Inc, 300 North Lake Ave, Suite 1200, Pasadena, CA.
Davis, M. J., and Janke, R. (2011). “Patterns in potential impacts associated with contamination events in water distribution systems.” J. Water Resour. Plann. Manag., 137(1), 1-9.
Haestad Methods. (2002). “Automated skeletonization techniques.” Haestad Methods, Inc., Waterbury, CT.
Hart, W.E., Berry, J.W., Boman, E., Phillips, C.A., Riesen, L.A., and Watson, J.P. (2008). “Limited-Memory Techniques for Sensor Placement in Water Distribution Networks.” V. Maniezzo, R. Battiti and J.P. Watson, editors, Learning and Intelligent Optimization, pp. 125–137. Springer-Verlag Berlin, Heidelberg.
Hart, W.E. and Murray, R. (2010). “Review of Sensor Placement Strategies for Contamination Warning Systems in Drinking Water Distribution Systems.” J. Water Resour. Plann. Manag., 136(6), 1-9.
Janke, R., Murray, R., Uber, J., Bahadur, R., Taxon, T. and Samuels, W. (2007). “Using TEVA to assess impact of model skeletonization on contaminant consequence assessment and sensor placement design.” Proceedings of the World Environmental and Water Resources Congress, Tampa, FL, May 15-19.
Janke, R., Haxton, T., Grayman, W., Bahadur, R., Murray, R., Samuels, W., and Taxon, T. (2009). “Sensor network design and performance in water systems dominated by multi-story buildings.” Proceedings of the World Environmental and Water Resources Congress, Kansas City, MO, May 17-21.
Ostfeld, A., Uber, J. G., Salomons, E., Berry, J.W., Hart, W.E., Phillips, C.A., Watson, J.P., Dorini, G., Jonkergouw, P., Kapelan, Z., Di Pierro, F., Khu, S.T., Savic, D., Eliades, D., Polycarpou, M., Ghimire, S.R., Barkdoll, B.D., Gueli, R., Huang, J.J., Mac Bean, E.A., James, W., Krause, A., Lescovec, J., Isovitsch, S., Xu, J., Guestrin, C., Van Briesen, J., Small, M., Fischbeck, P., Preis, A., Propato, M., Piller, O., Trachtman, G.B., Yi Wu, Z., and Walski, T. (2008). “The battle of the water sensor networks (BWSN): a design challenge for engineers and algorithms.” J. Water Resour. Plann. Manag., 134(6), 556-568.
Perelman, L. and Ostfeld, A. (2008). “Water distribution system aggregation for water quality analysis.” J. Water Resour. Plann. Manag., 134(3), 303-309.
Pitsoulis, L., and Resende, M.G.C. (2002). “Greedy randomized adaptive search procedures.” P. M. Pardalos and M. G. C. Resende, editors, Handbook of Applied Optimization, pp. 168–181, Oxford University Press.
Rossman, L.A. (2000). EPANET 2 users manual, EPA/600/R-00/057, U.S. Environmental Protection Agency, National Risk Management Research Laboratory, Office of Research and Development, Cincinnati, Ohio.
U.S. EPA. (2010). “Threat Ensemble Vulnerability Analysis – Sensor Placement Optimization Tool (TEVASPOT) graphical user interface user’s manual,” version 2.3.0, EPA 600/R-08/147, Office of Research and Development, National Homeland Security Research Center, Water Infrastructure Protection Division, Cincinnati, OH. Available at http://www.epa.gov/nhsrc/pubs/600r08147.pdf
U.S. EPA. (2011). “TEVA-SPOT Toolkit and user's manual,” version 2.5, EPA 600/R-08/041C, Office of Research and Development, National Homeland Security Research Center, Water Infrastructure Protection Division, Cincinnati, OH.
Ulanicki, B., Zehnpfund, A., and Martinez, F. (1996). “Simplification of water network models.” Proceedings of 2nd International Conference on Hydroinformatics, Zurich, Switzerland, pp. 493-500.
Walski, T. M., Daviau, J., and Coran, S. (2004a). “Effect of skeletonization on transient analysis results.” Proceedings of the World Environmental and Water Resources Congress, Salt Lake City, UT, June 27-July 1.
Walski, T. M., Chase, D. V., Savic, D. A., Grayman, W., Beckwith, S., and Koelle, E. (2004b). Advanced water distribution modeling and management. Haestad Methods, Inc.
Figure 1. Example original and skeletonized network.
Figure 2. Flowchart for TTSP-GeoA and TTSP-SN.
Figure 3. A) Original network impact array based on the network in Figure 1 using one scenario starting at Node 1. B) Geographic map based on the original and skeletonized network in Figure 1. C) Aggregated impact array using the mean statistic.
Figure 4. Subset of Network 2 (A) before and (B) after skeletonization with a 16 inch pipe diameter threshold.
Figure 5. Relative error (RE) for BWSN, Network 1, and Network 2 after Tier 1 sensor placement (A,C) and Tier 2 sensor placement (B,D). Results using TTSP-GeoA are compared to results using TTSP-SN. The RE for various sensor numbers (5, 10, 20, and 40 sensors) is shown for each pipe diameter threshold (8, 12, and 16 inches).
Figure 6. Relative error (RE) versus memory footprint for BWSN, Network 1, and Network 2 after Tier 2 sensor placement. Results using TTSP-GeoA (black) are compared to results using TTSP-SN (gray). Each plot includes a pipe diameter threshold of 8 inches (cross sign), 12 inches (diamond), and 16 inches (circle) for each of the four sensor numbers.
Table 1. Sensor placement design parameters.
Parameter                  Values
Networks                   BWSN, Network 1, and Network 2
Pipe diameter threshold    8, 12, 16 inches
Aggregation statistic      Minimum, maximum, mean, and median (for TTSP-GeoA only)
Number of sensors          5, 10, 20, 40
Dose level                 High toxicity (0.0001 mg threshold); low toxicity (1.0 mg threshold)
Table 2. Memory footprint (MF) for sensor placement using the original network, TTSP-SN and TTSP-GeoA. MF for the two-tiered methods is based on Tier 1 sensor placement.
Network      MF (GB) Original network    Pipe diameter threshold (inch)    MF (GB) TTSP-SN    MF (GB) TTSP-GeoA
BWSN         2.6                         8                                 0.4                0.9
BWSN         2.6                         12                                0.2                0.6
BWSN         2.6                         16                                0.2                0.6
Network 1    22.6                        8                                 6.4                6.9
Network 1    22.6                        12                                4.9                5.3
Network 1    22.6                        16                                4.3                4.7
Network 2    42.2                        8                                 4.1                10.8
Network 2    42.2                        12                                1.9                6.2
Network 2    42.2                        16                                1.5                5.3
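As a worked check on the values in Table 2, the short sketch below computes the percent reduction in memory footprint relative to the original network for each method, network, and pipe diameter threshold. The numbers are copied from Table 2; the variable and function names are illustrative only.

```python
# Memory footprints (GB) copied from Table 2.
ORIGINAL_GB = {"BWSN": 2.6, "Network 1": 22.6, "Network 2": 42.2}
THRESHOLDS_IN = (8, 12, 16)  # pipe diameter thresholds, inches
TIER1_GB = {
    "TTSP-SN": {
        "BWSN": (0.4, 0.2, 0.2),
        "Network 1": (6.4, 4.9, 4.3),
        "Network 2": (4.1, 1.9, 1.5),
    },
    "TTSP-GeoA": {
        "BWSN": (0.9, 0.6, 0.6),
        "Network 1": (6.9, 5.3, 4.7),
        "Network 2": (10.8, 6.2, 5.3),
    },
}


def reduction_percent(original_gb, reduced_gb):
    """Percent reduction in memory footprint relative to the original network."""
    return 100.0 * (1.0 - reduced_gb / original_gb)


def reduction_table(method):
    """Map (network, threshold) to percent reduction for one method."""
    return {
        (network, threshold): reduction_percent(ORIGINAL_GB[network], gb)
        for network, values in TIER1_GB[method].items()
        for threshold, gb in zip(THRESHOLDS_IN, values)
    }


geoa = reduction_table("TTSP-GeoA")
best = max(geoa, key=geoa.get)
print(best, round(geoa[best], 1))
```

The largest TTSP-GeoA reduction occurs for Network 2 at the 16 inch threshold (5.3 GB versus 42.2 GB), which is consistent with the "up to 87%" figure quoted in the text.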