
Water Resour Manage (2015) 29:619–644 DOI 10.1007/s11269-014-0859-9

MSANOS: Data-Driven, Multi-Approach Software for Optimal Redesign of Environmental Monitoring Networks Emanuele Barca & Giuseppe Passarella & Michele Vurro & Alberto Morea

Received: 13 December 2013 / Accepted: 20 October 2014 / Published online: 21 November 2014 © Springer Science+Business Media Dordrecht 2014

Abstract Within the recent EU Water Framework Directive and the modifications introduced into national water-related legislation, monitoring assumes great importance in territorial management. Recently, a number of public environmental agencies have invested resources in planning improvements to existing monitoring networks. Many reasons justify having a monitoring network that is optimally arranged over the territory of interest: sparse coverage of the monitored area, or redundancy and clustering of monitoring locations, often make it impossible to provide the manager with sufficient knowledge for decision-making. These are typical cases requiring optimal redesign of the whole network. Using appropriate stochastic or deterministic methods, it is possible to rearrange the existing network by eliminating, adding, or moving monitoring locations, producing the optimal arrangement with regard to specific managerial objectives. This paper describes a new software application, MSANOS, containing spatial optimization methods selected as the most effective among those reported in the literature. It is shown that MSANOS is able to carry out a complete redesign of an existing monitoring network by either adding or removing locations. Both model-based and design-based objective functions have been embedded in the software, with the option of choosing, case by case, the most suitable one with regard to the available information and the managerial optimization objectives. Finally, two applications are presented: testing the goodness of an existing monitoring network, and the optimal reduction of an existing groundwater-level monitoring network of the Tavoliere aquifer in Apulia (South Italy), constrained to limit the information loss.

Keywords Optimal monitoring network redesign . Optimization methods . Spatial simulated annealing . Jackknife . Greedy deletion

E. Barca (*) : G. Passarella : M. Vurro CNR-IRSA, Water Research Institute—National Research Council, V.le De Blasio 5, 70132 Bari, Italy e-mail: [email protected] A. Morea Department of Physics, University of Bari, Via Amendola 173, 70126 Bari, Italy


1 Introduction

Spatial sampling is a crucial activity in environmental studies: it is the main source of knowledge about the spatial behaviour of any natural phenomenon. Adequate design and planning of monitoring activities can strongly reduce the related costs and improve the reliability of the resulting information (Winkel and Stein 1997). This justifies the research efforts made to achieve a comprehensive spatial sampling theory (Stevens Jr 2006). In addition, most of the European Environmental Directives impose strict criteria on monitoring, making studies on optimal spatial sampling and monitoring more relevant than ever. As one example, the Groundwater Directive requires the assessment of the present environmental state (characterization) and, if necessary, the definition of environmental objectives to be accomplished through proper corrective actions. In this framework, the only available tool capable of supporting decisions on the effectiveness of corrective actions is a reliable monitoring system.

Furthermore, the Groundwater Directive requires the modification of the network configuration according to changes of the monitoring objectives over time. Intuitively, areas where a good ecological status has been met do not need extensive monitoring, whereas a denser network must be made operative where the environmental status has been found to be degraded. Therefore, a valid monitoring network has to be considered a dynamic tool, and environmental managers need to be able to modify the network configuration periodically while preserving its reliability. Periodically redesigning a monitoring network for new goals is a critical stage, which involves the effectiveness and efficiency of the modified network in terms of reliability and of maintenance and adaptation costs, respectively.
The monitoring network redesign must therefore rely on valid optimization strategies in order to improve its informative capacity while reducing maintenance costs. Awareness of the critical issues tied to environmental monitoring was the trigger for the project aimed at implementing the MSANOS software. The rationale of the project is to provide a general-purpose network-redesign tool targeted toward territorial managers, applicable to any monitoring network (e.g., air, water, soil). The main target user and the required wide range of application have guided the implementation and justify some basic working choices. Firstly, all physically based methods (those using models based on partial derivatives) have been deliberately left out. Consequently, the implemented algorithms are based either on purely geographical/geometrical features (coordinates, distances) of the monitoring sites or on spatial autocorrelation models (variograms) of monitored data. This choice frees the implemented methods from any specific physical context and guarantees the required wide range of application. Other methods, whose theoretical issues are still the object of scientific discussion and not yet completely formalized within a mathematical theory, have been left out so far, even though some of them appear to be very promising.

MSANOS addresses the following three cases of Optimal spatial Monitoring Network Redesign (OMNR), which embrace a wide range of practical situations:

– Network upsizing (or creation) (UPS);
– Network downsizing (DWS);
– Network sites relocation (REL).

UPS and DWS, even though both can be considered similar optimization issues, start from opposite standpoints. When upsizing, the objective is to find those locations within a user-defined discretization of the study area which maximize the information gain, whereas when downsizing the objective is to find those existing network locations whose exclusion minimizes the information loss. As a particular case of UPS, namely the redesign of the empty network, MSANOS can design a brand new monitoring network. Finally, REL, being defined as a two-step sequence of DWS and UPS, can be reduced to the two aforementioned issues (Barca et al. 2011).

A specific managerial need uniquely defines the related optimization issue (UPS, DWS, REL); afterwards, a quantitative formulation of the monitoring network feature to be optimized (e.g. site arrangement, network size, network predictive capability, etc.) has to be specified. This quantitative formulation is referred to as the Objective Function (OF). MSANOS provides seven OFs; the availability of so many OFs increases the flexibility and the real effectiveness of the software, since it can capitalize on various kinds of information about the existing network. Given a specific OMNR issue, the redesign could aim at different objectives, such as reshaping the network topology (that is, the reciprocal arrangement of sites over the study area) to improve sites’ accessibility to each other, improving the parameter estimation of the observed population, reducing estimation uncertainties, and so on.

So far, many software applications have been proposed for OMNR (Dutta et al. 1998; Van Groenigen et al. 1999; Naoum and Tsanis 2004; Hu and Wang 2011), but none of them addresses the three issues listed above simultaneously. MSANOS has been carefully designed and implemented in order to reduce run times, with negligible effects on the efficacy of the results.
Therefore, it can be considered a fully-fledged Decision-Support System in the framework of the redesign of an existing monitoring network. The main optimization method provided in MSANOS, for both UPS and DWS issues, is Spatial Simulated Annealing (SSA), an inherently stochastic methodology whose efficacy has been recognized in many scientific papers (Metropolis et al. 1953; Kirkpatrick et al. 1983; Van Groenigen et al. 2000). For DWS issues, two deterministic methodologies are also provided, namely Jackknife and Greedy Deletion. Finally, some novel technical arrangements, such as the smart choice of the initial configuration and some suitable stop criteria, have been implemented in order to speed up convergence.

The methods implemented in MSANOS were carefully chosen with respect to predetermined goals. The practical aim was to provide environmental managers with a decision-support system applicable to a wide range of practical contexts (e.g. monitoring of air quality, water, soil, rainfall, temperature, etc.). The main target user and the required wide range of application led to a focus on data-driven algorithms, leaving out physically based methods (e.g., data assimilation and Kalman-filter-based methods). Within this class of methods, the choice fell upon SSA, whose efficacy has been largely proved by a large body of scientific literature (Henderson et al. 2003), both theoretically (Metropolis et al. 1953; Kirkpatrick et al. 1983; Tsitsiklis 1989; Christakos and Killam 1993; Deutsch and Cockerham 1994; Drosou and Pitoura 2009; Richey 2010) and practically (Pardo-Igúzquiza 1998; Van Groenigen and Stein 1998; Van Groenigen et al. 1999, 2000; Nunes et al. 2006; Nunes et al. 2007).

Finally, a case study showing an application of MSANOS to a DWS issue of the groundwater level monitoring network of the aquifer of Tavoliere in the Apulia Region (Southern Italy) is presented.


2 Materials and Methods

2.1 The MSANOS Architecture

The MSANOS architecture is made up of three main modules: (i) input module, (ii) optimization module, (iii) output module.

The input module has been designed to support the user in formalizing the practical aim, which strongly depends on the problem, the objectives, and the available data. Since the monitoring network redesign involves relevant geographic features, the spatial coordinates of existing monitoring locations and a discretization grid of the considered area are mandatory inputs for running MSANOS. The role of the grid is essential and manifold. First, thanks to the possibility of setting grid cells to be excluded from computations, the user can realistically fit the grid to the study area boundaries and define inner zones that are not involved in the optimization process. Furthermore, the discretization grid is fundamental for those OFs based on spatial estimations and, last but not least, it works as a reservoir of candidate locations in any upsizing problem. For these reasons, the choice of mesh size is critical: the finer the mesh, the more accurate the optimization result but the longer the computational time. Apart from this minimal set of mandatory information, further information could be required according to the specific OMNR project.

The optimization module is made up of two sub-modules (upsizing and downsizing) devoted to addressing the chosen issue, namely (i) removing locations from an existing monitoring network, (ii) adding new locations to a monitoring network, or (iii) moving locations as a combination of (i) and (ii). Once started, MSANOS performs a number of iterations depending on the specific optimization setting, and stops when a suitable criterion is met.
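The role of the discretization grid as a reservoir of candidate locations can be illustrated with a short sketch. The Python below is not MSANOS code: the function name `candidate_nodes`, the rectangular bounding box, and the `excluded` predicate are assumptions made purely for illustration. It builds a regular grid with a given mesh size and drops the nodes that fall inside an excluded zone.

```python
import numpy as np

def candidate_nodes(xmin, xmax, ymin, ymax, mesh, excluded=None):
    """Return an (n, 2) array of grid-node coordinates usable as candidates.

    `excluded` is a hypothetical predicate: it receives the x and y arrays
    of all nodes and returns True for nodes to be left out of the computation.
    """
    xs = np.arange(xmin, xmax + mesh / 2, mesh)   # include the upper bound
    ys = np.arange(ymin, ymax + mesh / 2, mesh)
    gx, gy = np.meshgrid(xs, ys)
    nodes = np.column_stack([gx.ravel(), gy.ravel()])
    if excluded is not None:
        nodes = nodes[~excluded(nodes[:, 0], nodes[:, 1])]
    return nodes

def in_excluded_zone(x, y):
    # Illustrative inner zone: a disc of radius 2 centred at (5, 5).
    return (x - 5.0) ** 2 + (y - 5.0) ** 2 < 2.0 ** 2

# A 10 x 10 study area with mesh size 1, with and without the inner zone,
# and the same area with a coarser mesh.
nodes = candidate_nodes(0, 10, 0, 10, 1.0, excluded=in_excluded_zone)
full = candidate_nodes(0, 10, 0, 10, 1.0)
coarse = candidate_nodes(0, 10, 0, 10, 2.0)
```

Halving the mesh size roughly quadruples the number of nodes, which is the trade-off between accuracy and computational time noted above.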
Once MSANOS has started, it shows several plots summarizing the most relevant input information, such as the study area and the existing monitoring network configuration, and some problem-dependent complementary information (e.g., constrained monitoring sites or areas, experimental and theoretical variograms, etc.). During the whole execution, MSANOS allows the user to follow the optimization evolution on a specific graphical interface (Fig. 1). Finally, once the simulation has been completed, MSANOS provides a comprehensive report of the optimization results. In particular, the optimal configuration is plotted and the monitoring site coordinates of the redesigned network are saved into a text file. Other case-specific plots are also produced to better describe the final result (e.g., the OF convergence plots and the cooling scheme).

2.2 Objective Functions

An OF is the quantitative representation of a specific monitoring-network characteristic, such as the average distance between monitoring sites, the average kriging variance evaluated over an estimation grid, and so on. In general, it can be used to compare two different network configurations and to choose the most suitable for the optimization objectives. According to the information needed to evaluate an OF, it can be classed as design-based or


Fig. 1 The MSANOS runtime graphical interface

model-based. In general, when the monitoring objective is the knowledge of some average characteristic of the monitored area, the design-based option is the most appropriate, since a design-based scheme aims at capturing some large-scale property of the study area. On the contrary, when the monitoring aims at knowing the behaviour of some phenomenon at small scale, the model-based (or geostatistical) approach is more suitable. In this case, the monitoring network configuration must be capable of capturing the spatial variability of the considered parameter.

Table 1 lists the OFs provided by MSANOS. The first two OFs, MKV and AKV, summarize the fluctuations of the kriging variance field, associated with a user-defined estimation grid, that are produced by adding or removing a location from the network (D’Agostino et al. 1997; D’Agostino et al. 1998; Barca et al. 2008). The second pair, WMSD and WASD, are criteria which, taking advantage of supplementary information represented in the form of a weight matrix, spread the monitoring sites according to a preferential scheme over the area of interest. As a particular case, when the weight matrix is unitary, the criterion equalizes as much as possible the reciprocal distances between network sites in order to get an almost regular coverage of the area (Van Groenigen et al. 1999; Van Groenigen et al. 2000). The FD is an inherently design-based OF representing the fractal dimension of the ideal graph connecting site locations to each other and allows one to make the monitoring-locations coverage uniform. This OF is applicable only to upsizing problems (Di Zio et al. 2004). The NPD, which is applicable only to downsizing problems (Rennen 2009), evaluates the locations’ reciprocal distances. It allows one to eliminate locations that are spatially redundant. Finally, the last OF, %AKV, quantifies the current relative percentage change in the AKV with respect to the given configuration (Passarella et al. 2002; Passarella et al. 2003). In the following sections, a brief description of all the MSANOS OFs is provided; the related references provide more detailed information.

Table 1 List of MSANOS OFs grouped by design- and model-based classes

Class    Objective function                               Acronym
Model    Maximum kriging variance                         MKV
Model    Average kriging variance                         AKV
Design   Weighted maximum of shortest distances           WMSD
Design   Weighted average of shortest distances           WASD
Design   Fractal dimension                                FD
Design   Network points distance                          NPD
Model    Percentage change in average kriging variance    %AKV

2.2.1 Maximum and Average Ordinary Kriging Variance (MKV-AKV)

Using these OFs allows one to assign a score to any network configuration in terms of the maximum or average kriging variance estimated at the nodes of the discretization grid (Hughes and Lettenmaier 1981; Rouhani 1985; Cressie et al. 1990). The well-known ordinary kriging variance formulation in a generic unsampled location $x_i$ is (Journel and Huijbrechts 1978; Isaaks and Srivastava 1989):

$$\sigma_R^2(x_i) = \sum_{j=1}^{N} \lambda_j(x_i)\,\gamma(x_j, x_i) - \mu(x_i) \tag{1}$$

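where $\lambda_j(x_i)$ are the kriging estimation weights, $\gamma(x_j, x_i)$ is the variogram value for the location pair $(x_j, x_i)$, and $\mu(x_i)$ is the Lagrange multiplier. Consequently, the two OFs can be written as:

$$\phi_{AKV} = \frac{1}{N} \sum_{i=1}^{N} \sigma_R^2(x_i) \tag{2}$$

$$\phi_{MKV} = \max_i \, \sigma_R^2(x_i) \tag{3}$$

In Eqs. (2) and (3) the index $i$ runs over the nodes of the discretization grid at which the kriging variance is estimated.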
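As a hedged illustration of Eqs. (1) to (3), the sketch below solves the ordinary kriging system at every grid node and summarizes the resulting variances. It is not the MSANOS implementation: the spherical variogram, its parameters, and the function names are assumptions chosen for the example, and the sign of the Lagrange multiplier is arranged so that the variance takes the form of Eq. (1).

```python
import numpy as np

def spherical_variogram(h, nugget=0.0, sill=1.0, rng=10.0):
    """Spherical variogram model (an assumed model, for illustration only)."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.5 * h / rng - 0.5 * (h / rng) ** 3)
    g = np.where(h >= rng, sill, g)
    return np.where(h == 0.0, 0.0, g)   # gamma(0) = 0 by definition

def kriging_variance(sites, nodes, gamma):
    """Ordinary kriging variance at each node, in the form of Eq. (1)."""
    n = len(sites)
    d_ss = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
    d_sn = np.linalg.norm(sites[:, None, :] - nodes[None, :, :], axis=2)
    g_sn = gamma(d_sn)                       # shape (n_sites, n_nodes)
    # Kriging system: sum_j lambda_j * gamma_jk - mu = gamma_k0, sum_j lambda_j = 1
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = gamma(d_ss)
    A[:n, n] = -1.0
    A[n, :n] = 1.0
    b = np.vstack([g_sn, np.ones((1, nodes.shape[0]))])
    sol = np.linalg.solve(A, b)              # one column per grid node
    lam, mu = sol[:n], sol[n]
    return np.sum(lam * g_sn, axis=0) - mu

def akv(sites, nodes, gamma):
    return kriging_variance(sites, nodes, gamma).mean()   # Eq. (2)

def mkv(sites, nodes, gamma):
    return kriging_variance(sites, nodes, gamma).max()    # Eq. (3)

# Hypothetical network of four sites on an 11 x 11 grid of unit mesh.
sites = np.array([[2.0, 2.0], [8.0, 2.0], [5.0, 8.0], [5.0, 5.0]])
gx, gy = np.meshgrid(np.arange(0.0, 11.0), np.arange(0.0, 11.0))
nodes = np.column_stack([gx.ravel(), gy.ravel()])
akv_full = akv(sites, nodes, spherical_variogram)
akv_reduced = akv(sites[:3], nodes, spherical_variogram)  # one site removed
```

Comparing `akv_full` with `akv_reduced` gives the kind of score change a downsizing step would evaluate: removing a site cannot lower the average kriging variance, so the smallest increase identifies the least informative site under this criterion.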

It is assumed that prior knowledge about the spatial law (variogram model) of the variable to be monitored is available. Monitoring-network optimization based on MKV and AKV tends to allocate monitoring locations in those parts of the considered area where kriging variance is high due to lack of data, or to remove locations where the monitoring information is redundant (D’Agostino et al. 1997; D’Agostino et al. 1998; Barca et al. 2008).

2.2.2 Weighted Maximum and Average of Shortest Distances (WMSD-WASD)

The WMSD and WASD OFs take advantage of auxiliary information, given in the form of a weight matrix, to capture the spatial variability of the primary variable, both when adding sites to and when removing sites from the network. These OFs are quite popular, since they are the most used when no prior information about the variable to be monitored is available (Van Groenigen and Stein 1998; Van Groenigen et al. 1999; Van Groenigen et al. 2000; Castrignanò et al. 2008). In the WASD case, a location-dependent weighting function $w(\vec{x})$ is introduced in the OF:

$$\phi_{WASD} = \int_A w(\vec{x}) \, \| \vec{x} - V_s(\vec{x}) \| \, d\vec{x} \tag{4}$$

where $\vec{x}$ denotes a two-dimensional coordinate vector and $V_s(\vec{x})$ the coordinate vector of the monitoring location nearest to $\vec{x}$. The symbol $\|\cdot\|$ indicates the length of a distance vector.
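The ingredients of Eq. (4) can be made concrete with a small numerical sketch. The Python below is illustrative only: the function names are hypothetical, and the integral is replaced by a plain average over a finite set of evaluation points (for instance the nodes of the discretization grid). For each point it finds the nearest monitoring site, weights the distance, and then takes either the mean (WASD-like score) or the maximum (WMSD-like score).

```python
import numpy as np

def weighted_shortest_distances(sites, points, weights=None):
    """Weighted distance from each evaluation point to its nearest site."""
    sites = np.asarray(sites, dtype=float)
    points = np.asarray(points, dtype=float)
    # Distance matrix: rows are evaluation points, columns are sites.
    d = np.linalg.norm(points[:, None, :] - sites[None, :, :], axis=2)
    nearest = d.min(axis=1)                  # || x - V_s(x) ||
    if weights is None:
        weights = np.ones(len(points))       # unitary weights
    return np.asarray(weights, dtype=float) * nearest

def wasd(sites, points, weights=None):
    return weighted_shortest_distances(sites, points, weights).mean()

def wmsd(sites, points, weights=None):
    return weighted_shortest_distances(sites, points, weights).max()

# Two hypothetical sites and three evaluation points.
sites = [[0.0, 0.0], [4.0, 0.0]]
points = [[0.0, 0.0], [2.0, 0.0], [4.0, 3.0]]
unweighted = (wasd(sites, points), wmsd(sites, points))
weighted = (wasd(sites, points, [1, 2, 1]), wmsd(sites, points, [1, 2, 1]))
```

Raising the weight of a point makes its distance to the network count more, so an optimizer minimizing either score is pushed to place sites closer to highly weighted zones.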


This function is estimated by:

$$\hat{\phi}_{WASD}(S) = \frac{1}{n_e} \sum_{j=1}^{n_e} w(\vec{x}_e^{\,j}) \, \| \vec{x}_e^{\,j} - V_s(\vec{x}_e^{\,j}) \| \tag{5}$$

where $S$ is the solution space, $\vec{x}_e^{\,j}$ denotes the generic node of a discretization grid of the area of interest, and $n_e$ is the number of grid nodes. In the case of WMSD, the formula of the OF becomes:

$$\hat{\phi}_{WMSD}(S) = \max_j \; w(\vec{x}_e^{\,j}) \, \| \vec{x}_e^{\,j} - V_s(\vec{x}_e^{\,j}) \| \tag{6}$$

2.2.3 Fractal Dimension (FD)

This OF applies only to upsizing cases. Di Zio et al. (2004) used a measure of fractal dimension as a criterion of spatial regularity of a monitoring network. The function $\tilde{H}(r)$ is defined as the average number of sites within radius $r$ of a site. In practice, considering monitoring-network locations uniformly distributed over a given area $A$, for any given location $x_i$, the number $H(r)$ of other locations within a circle of radius $r$ is proportional to the area of the circle (i.e., to $\pi r^2$). Since this also holds for the average number $\tilde{H}(r)$ (obtained over all the locations), it follows immediately that $\tilde{H}(r) \propto \pi r^2$. Di Zio et al. (2004) highlight that this rule represents a limit case of a more generic power law which can be expressed as:

$$\tilde{H}(r) \propto k r^{D} \tag{7}$$

where $D$ is the fractal dimension (Mandelbrot 1982), which can be considered a quantitative measure of irregularity of the monitoring network pattern. Consequently, Di Zio et al. (2004) propose to use the fractal dimension as a criterion for selecting optimal sampling schemes. The fractal dimension $D$ can be estimated as the slope of the best-fitting line produced when $\log \tilde{H}(r)$ is regressed against $\log(r)$:

$$\log \tilde{H}(r) \approx \log(k) + D \log(r) \tag{8}$$

In this case, it is expected that the more uniformly the points cover the region of interest, the closer $D$ gets to 2, which represents the topological dimension of the study area as well as the maximum theoretical value of $D$. Accordingly, the difference:

$$\hat{\phi}_{FD}(S) = 2 - D \tag{9}$$

can be used as a coverage measure that describes the level of inhomogeneity in the spatial distribution of environmental monitoring networks; it is therefore preferred when the objective is to produce a uniform coverage (Stevens 1997; 2006).
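The estimation of $D$ by log-log regression can be sketched as follows. This is a minimal illustration, not the MSANOS code: the function names and the choice of radii are assumptions, and no edge correction is applied, so sites near the border of the area see fewer neighbours and the estimate for a finite regular pattern stays somewhat below 2.

```python
import numpy as np

def fractal_dimension(sites, radii):
    """Slope of log(average neighbour count) against log(r), as in Eq. (8)."""
    sites = np.asarray(sites, dtype=float)
    radii = np.asarray(radii, dtype=float)
    n = len(sites)
    d = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)
    # Average number of *other* sites within radius r of a site
    # (the n zero self-distances are subtracted from the count).
    h_avg = np.array([((d <= r).sum() - n) / n for r in radii])
    keep = h_avg > 0                         # log is undefined for empty counts
    slope, _ = np.polyfit(np.log(radii[keep]), np.log(h_avg[keep]), 1)
    return slope

def fd_objective(sites, radii):
    return 2.0 - fractal_dimension(sites, radii)   # Eq. (9)

# Two hypothetical patterns: sites along a line and sites on a square lattice.
line = np.column_stack([np.arange(200.0), np.zeros(200)])
gx, gy = np.meshgrid(np.arange(20.0), np.arange(20.0))
lattice = np.column_stack([gx.ravel(), gy.ravel()])
d_line = fractal_dimension(line, [5, 10, 20, 40])
d_lattice = fractal_dimension(lattice, [2, 3, 4, 5])
```

The collinear pattern yields an estimate close to 1 and the lattice an estimate much closer to 2, so `fd_objective` is smaller for the pattern that covers the area more evenly.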


2.2.4 Network Points Distance (NPD)

This OF applies only to downsizing cases. Given the original monitoring network $S_N$ and the number $k$ of locations to be removed to obtain a reduced network $S_{N-k}$, Drosou and Pitoura (2009) and Rennen (2009) describe a suitable two-step OF for assessing an optimal reduced configuration on the basis of the reciprocal location distances. The first step is:

$$\hat{\phi}^{1}_{NPD}(s_i, s_j) = d(s_i, s_j) \tag{10}$$

where $d$ is the classical Euclidean distance between two generic monitoring locations $s_i$ and $s_j$. The second step is:

$$\hat{\phi}^{2}_{NPD}(s_i, S_{N-1} \setminus \{s_i\}) = \frac{1}{|S_{N-1} \setminus \{s_i\}|} \sum_{s_j \in S_{N-1} \setminus \{s_i\}} d(s_i, s_j) \tag{11}$$

where SN−1 ∖{si} is the monitoring network under reduction deprived of si and 1
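The two quantities in Eqs. (10) and (11) can be computed directly from the site coordinates. The sketch below is illustrative only, with hypothetical function names: it evaluates the pairwise Euclidean distances and, for each site, the average distance to all the other sites of the network passed in. It stops at these two scores and does not attempt a removal rule.

```python
import numpy as np

def pairwise_distances(sites):
    """Eq. (10): Euclidean distance between every pair of sites."""
    sites = np.asarray(sites, dtype=float)
    return np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=2)

def mean_distance_to_others(sites):
    """Eq. (11): for each site, the average distance to the remaining sites."""
    d = pairwise_distances(sites)
    n = d.shape[0]
    # The diagonal is zero, so the row sum already excludes the site itself.
    return d.sum(axis=1) / (n - 1)

# Three hypothetical collinear sites.
sites = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
dist = pairwise_distances(sites)
scores = mean_distance_to_others(sites)
```

In this example the middle site has the smallest average distance to the others, which is the sense in which such a score flags spatially redundant locations.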
