Biochemical Transport Modeling and Bayesian Source Estimation in ...

4 downloads 0 Views 2MB Size Report
A chemical/biological weapon designed for a .... toy example representing an urban environment. ... 1 shows a toy example illustrating an urban environment.
2520

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

Biochemical Transport Modeling and Bayesian Source Estimation in Realistic Environments Mathias Ortner, Arye Nehorai, Fellow, IEEE, and Aleksandar Jerémic

Abstract—Early detection and estimation of the spread of a biochemical contaminant are major issues in many applications, such as homeland security and pollution monitoring. We present an integrated approach combining the measurements given by an array of biochemical sensors with a physical model of the dispersion and statistical analysis to solve these problems and provide system performance measures. We approximate the dispersion model of a contaminant in a realistic environment through numerical simulations of reflected stochastic diffusions describing the microscopic transport phenomena due to wind and chemical diffusion and use the Feynmann–Kac formula. We consider arbitrary complex geometries and account for wind turbulence. Numerical examples are presented for two real-world scenarios: an urban area and an indoor ventilation duct. Localizing the dispersive sources is useful for decontamination purposes and estimation of the cloud evolution. To solve the associated inverse problem, we propose a Bayesian framework based on a random field that is particularly powerful for localizing multiple sources with small amounts of measurements. Index Terms—Biochemical dispersion, Feynman–Kac, inverse problem, random field, sensor array processing, source estimation.

I. INTRODUCTION

T

HE prospect of a terrorist biological or chemical attack is a prominent security issue. A chemical/biological weapon designed for a large-scale impact can be delivered through the air as a gas or aerosol. The severity of such an attack is related to the concentration of the substance when it reaches its target, which in turn depends on variables such as wind conditions and contaminant volatility. Release in a closed space such as subways or buildings could result in especially high doses and impact [1]–[3]. In order to limit the potential damage of a biochemical attack, it is essential to have a rapid and reliable detection system and the ability to predict the agent dispersion. Several works have been proposed. We propose a forward physical dispersion model relating the source to the measurements given by an array of biochemical sensors in realistic complex environments. Using statistical analysis, we design tools for optimally detecting a release and Manuscript received November 8, 2005; revised May 30, 2006. This work was supported by the National Science Foundation under Grant CCR-0330342. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Carlos H. Muravchik. M. Ortner and A. Nehorai are with Washington University, St. Louis, MO 63130 USA (e-mail: [email protected]; [email protected]). A. Jerémic is with Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON L8S 4K1 Canada (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSP.2006.890924

solving the associated inverse problem, in particular, localizing the original spread of contaminant substance. In our previous work, we presented detection and estimation techniques for simple scenarios where analytical solutions to the transport equations are available (see [4]–[8]). In [9], we considered numerical solutions given by finite elements methods. Refer to [10]–[15] for additional work on the subject, including ways of considering the impact of turbulence on urban diffusion. In this paper, we present a new numerical approach for the transport model in realistic scenarios using Monte Carlo simulations. The proposed approach provides a modeling framework that accounts for complex geometries and allows full use of software simulated random wind turbulence. This modeling framework is efficient for a large setup, since its computational load evolves linearly with the number of sensors and the time of diffusion, and it is highly amenable to parallel computing. Furthermore, the approach is generic, can be applied to various diffusion problems, and is particularly adapted to solving inverse problems. It also allows us to use different estimators instead of the maximum likelihood. The key point of our approach is that we decouple the “fluid simulation” part from the “transport compution.” In particular, we show simple, but realistic, examples how to incorporate numerical simulations including random effects. The approach we propose is generic enough to incorporate additional random effects, including chemical reactions, temperature effects, and radioactive decaying. The Monte Carlo approach we employ is based on a Feynman–Kac representation and, therefore, does not require solving the problem on the entire domain, but only at the sensor positions. The required computational time is thereby limited. Localization of the source is important in order to predict the cloud propagation in space and time and consequently make quick emergency decisions. We consider cases involving multiple sources and propose to use a generic Bayesian approach to solve the inverse problem based on a suitable prior model. This framework allows us to consider multiple sources with unknown initial diffusion times and release intensities. The estimator we consider is the a posteriori probability distribution of the source locations, with a major consequence: in the case of uncertainties due to a lack of measurements or the presence of several diffusive sources, the computed a posteriori distribution shows all relevant hypotheses. This paper is organized as follows. In Section II, we detail the physical dispersion model and present the measurement model. In Section III, we present the setup for numerical approximations of the dispersion through Monte Carlo simulations of the associated stochastic diffusion, assuming known arbitrary geometry. We also propose an ad hoc approach to account for stochastic wind turbulence based on the result of numerical

1053-587X/$25.00 © 2007 IEEE

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

2521

cific example, we assume six chemical sensors in known locations and two instantaneous release sources (this instantaneous hypothesis can actually be relaxed, as we explain it later in this section). We assume the wind has a main direction coming from outside the city as illustrated by the arrows from the north in Fig. 1. We also assume two side boundaries (shown as two vertical lines), which we believe to be more realistic than a fully open domain. We suppose the diffusivity coefficient of the substance as well as the sensor noise variance to be known. B. Physical Dispersion Model 1) Advective Model: We consider a bounded open domain . Let be a point in . Denote by the dispersive substance concentration at a point and time . The transport equation in the presence of a wind field is given by the following equation (see [4] and [16])

2

Fig. 1. Example of a dispersion scenario: an urban environment (135 225 m) modeled by a set of rectangles. Wind is shown to come from the north at a speed of 54 km/h. The rectangles stand for buildings. Six sensors (shown as circles) are placed in the area, and a chemical substance is instantaneously released at two points (shown as disks).

fluid computations. In Section IV, we explain the Bayesian model employed for localizing several sources and detail the Metropolis–Hastings algorithm used to sample the posterior density. The results from Sections III and IV are illustrated by a toy example representing an urban environment. In Section V, we present additional results obtained for an indoor scenario. In the conclusion summarized in Section VI, we present various possible ideas to extend our work. II. PHYSICAL AND STATISTICAL MODELS In this section, we discuss the physical model we use to describe the dispersion of a chemical substance in a fluid. We employ the framework provided by [5] and [8]. We also detail the statistical sensor measurement model. The goal is to determine a numerical relationship between the contaminant sources and sensor measurements. A. Assumptions We assume the geometry of the setup to be known, including the locations of the biochemical sensors and boundaries of the problem (the building geometries in the case of urban environments). We also assume a knowledge of the wind distribution in the considered setup. For instance, in the presented examples, we assume that the wind has a known main direction and we suppose the availability of software that is capable of computing the resulting wind distribution over the area, including vorticities for modeling turbulent effects. We also assume that we know the diffusion properties of the contaminant (diffusion coefficient) and that the sensors have been calibrated, resulting in a known noise variance. Fig. 1 shows a toy example illustrating an urban environment that is modeled by a set of rectangles with two release sources and six sensors. Note that the setup in the figure is considered 2-D for computational simplicity, although the framework described in the rest of this section is presented in 3-D. For this spe-

where is the 3 3 matrix of conduction (or diffusivity). We suppose that both and are function of the space variables. Note that when the medium is assumed to be incompressible , this equation reduces to (1) In the rest of the paper, we consider the latter advective model under the incompressible assumption. To consider realistic scenarios, we need to use a wind field that includes turbulent effects resulting in either random or time dependent wind fields . We discuss this point in more detail in Section II-B-4. Fig. 2 presents a diffusion simulation on the example presented in Fig. 1. be the boundaries of 2) Boundary Conditions: Let the domain . We assume two kinds of domain boundary conditions and divide into two disjoint subsets denoted and corresponding to the following types of by conditions [18]. • Neumann conditions: Here we assume that the domain is subject to boundary (2) is the normal vector at any point belonging to where . These Neumann boundary conditions describe the hypothesis that the substance concentration is not affected by the boundary (normal flux is null). • Dirichlet conditions: Here we assume that the domain satisfies boundary (3) We use these conditions to model the boundaries within the domain and the “outside world.” In our urban environment example, the top and bottom boundaries are subject to such Dirichlet conditions, whereas the side ones are described by the Neumann conditions. We use these side boundaries to model the impact of other blocs, similarly to a wind tunnel. This assumption is not optimal, and for real-life applications, larger domains should be considered.

2522

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

Fig. 2. Simulated diffusion of a contaminant at different time instants computed by dedicated software [17] that includes realistic turbulence computations. The gray level corresponds to the concentration of a contaminant.

3) Initial Substance Distribution and Sources: In the rest of the paper, we denote by the substance concentration at (the release time is arbitrary taken as the time origin) time (4) This formulation assumes that the sources have released a certain amount of a substance into the environment when the diffusion begins (e.g., instantaneous sources). In the case of noninstantaneous or even moving sources, the solution of the linear transport equation (1) can be obtained by superposition of the solutions under instantaneous sources, due to the linearity of can obthe assumed boundary conditions [8]. Moreover, viously be chosen to describe any scenario. For instance, in the case of a point source, can be assumed to be a Dirac function . Equation (1) is, therefore, general and potentially includes any possible cases of interest. 4) Fluid Simulations, Transport Model and Inclusion of Random Effects: A major modeling issue in (1) is the assumed knowledge of the transport term due to the wind. The wind actually is a function of both the position and time variables. In urban areas, as shown in [19], random effects generated by wind turbulence may have a huge impact on the dispersal behavior of the biochemical contaminant. Precise knowledge of this variable is impossible to obtain even if many wind sensors are available owing the chaotic nature of the wind. One way to model the chaotic nature of turbulence is to use a deterministic representation of the wind describing the ergodic effect of random turbulence. Eddy diffusivities that approximate the turbulent effect by a higher diffusion coefficient [20] are a first possible model. Another idea is to use Reynolds decomposition and consider random fluctuations of the wind around the mean [18]. We propose a different approach in Section III-C-5, one that allows us to include more realistic wind descriptions for specific scenarios. Instead of estimating an average effect of the turbulence on the diffusion, we include a random model based on the

results of fluid simulations. We show in this paper how to make full use of modern numerical capabilities and include random effects computed by software dedicated to the task of computing realistic fluid distribution. The key point of our approach is that it uses snapshot of the computed wind distribution at different time instants. Usual approaches, like FEM, hardly incorporate the fully available information in the transport inverse problem solving. A major advantage of our approach is that it accounts for the spatial correlation in the turbulent flow and avoids null averaging effects that occur when using the usual mean wind field approximation. Our approach is based on wind distribution computations. For instance, in the outdoor problem of Fig. 1, we suppose that the wind comes from a main direction. We then employ dedicated software [17] to compute the wind distribution over space and time. Fig. 3 shows some samples of the obtained wind field at different times using this software. Note that the software is restricted to incompressible fluids. For real applications, more suitable softwares may be employed, including additional effects such as micrometeorological parameters (planetary boundary layer height, convective velocity scale, friction velocity, etc). Our purpose here is to present an example to show how to incorporate fluid simulations incorporating random effects for solving the inverse problem. In our case, we use the default values provided by the Gerris flow solver. For the scenario of Fig. 1 (135 225 m) we assume that the wind is coming from the north at a speed of 54 km/h. It results in a mean wind field of 2 km/h horizontally and 62 km/h vertically. The average standard deviation obtained is 60 km/h horizontally and 80 km/h vertically. The numerical forward computation technique we propose allows making full use of these snapshots as detailed in Section III-C-5. C. Measurement Model To model the measurements, we suppose a spatially distributed array of biochemical sensors located at known . We assume that each sensor takes meapositions

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

2523

Fig. 3. Temporal snapshots of wind distribution at different times given by dedicated software [17] using the scenario presented in Fig. 1, with a wind value of 35 km/h. Top: x component, bottom: y component.

Fig. 4. (Solid line) Computed temporal measurements at each of the six sensors in the example of Fig. 1. (Dotted line) These measurements were obtained by adding Gaussian noise ( = 0:1) to the concentrations predicted by the dedicated software for wind modeling.

surements at times . Referring to [5], we adopt the following measurement model:

(5) where represents measurement noise and modeling errors. In the remainder of this paper, we assume that is known from a calibration step.

In Fig. 4, we present the sensor temporal measurements given by the simulation result shown in Fig. 2. The initial (unit in mass concentration of the two sources was percentage), and the diffusion coefficient . The wind , stream was set to 1 m/s and the noise variance to corresponding to SNR of 10 dB (the SNR is computed using the average squared signal level in space and time over the sensors).

2524

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

III. TRANSPORT MODELING USING THE MONTE CARLO APPROXIMATION In this section, we propose a numerical approach to solve the transport equation (1) in the presence of turbulence. The proposed approach is generic and uses random walks to compute the solution of a diffusion equation at a precise location and time. The interesting point is that these random walks are started from the point of interest (e.g., the sensor) and go backward in time with respect to the physical interpretation of the diffusion equation. As a consequence, the problem is only solved at the important locations (sensor positions), saving computational time. Of course, the random walks depend on the considered diffusion equation and boundary conditions. For simplicity, in the remainder of this section we will consider a 1-D setup. Extensions to 2-D or 3-D frameworks are straightforward. Note that in order to present a generic framework, we use specific notations and replace the concentration function by . A. Stochastic Diffusion We present here the general mathematical link between diffusion equations and their associated stochastic process. This link can be seen as the relationship between the macro- and the microscopic descriptions of the physical phenomena. 1) Diffusion Equation: Consider the following general diffusion operator acting on a function (6) We are interested in solving the diffusion equation (7) where is a real-valued function that might depend on the varistands for able location . In the case of the heat equation, an additional cooling or heating effect at . In our biochemical case, the variable of interest is the con. The transport term then describes the centration , whereas the diffusion term represents wind effect . In the case of the the chemical diffusion effect is a source/sink term that represents chemical diffusion, radioactive decaying, chemical reaction or incorporate a divergence term for compressible fluids. For the sake of simplicity, , although the we assume that the fluid is incompressible framework can be applied similarly for non-null . 2) Stochastic Process: We introduce the following stochastic diffusion process in the Ito sense to represent the microscopic description of (7). As we will detail, this process can be seen as the evolution equation of individual particles going backward with respect to the real physical phenomena. We consider given by

(8) where represents a Brownian motion. The position value of at time has a probability distribution that

and time . We use to depends on the initial location denote the corresponding expectation family. The relationship between this stochastic process and the diffusion problem (6), (7) arises from the property described in (9). 3) Feynman–Kac Formula: The Feynman–Kac formula relates the continuous macroscopic transport description (7) to the stochastic microscopic equation (8) through the following property. Under certain regularity conditions (mentioned in the Apand , the following function is a solution to pendix) in the diffusion problem (7):

In the case of

(incompressible flow), we obtain (9)

, replacing by a For a given initial condition (sensor position and sample time), the sensor coordinates Feynman–Kac formula (9) asserts that the value of the solution at a given location and time is given by the expec. This expectation tation of the initial condition of , the louses the probability distribution starting from after time . cation of the process The important point is that the stochastic process is started from the sensor location : the formula reflects a backward property compared with usual forward computation methods going from the source to the sensor. In our application, the trans[see (6) and (7)]. The process is port term is given by accordingly drifted by the opposite of the wind and goes backward with respect to the real physical transport phenomena. Another interesting point is that the random walk is independent of the initial condition: one can compute the solutions associated with different initial conditions by using the same set of random walk samples. We present in Fig. 5 an illustration of the Feynman–Kac property. Suppose that the initial condition is given by an indicator . The Feynman–Kac formula states function , i.e., that the solution value that at and time is given by the probability that the process starting from hits the set after time . Thus, to obtain the solution of the heat equation at a particular location and time, one can launch random walks starting from this point and compute the empirical probability of hitting the set after the considered time. 4) Boundary Conditions: The behavior of the stochastic process depends on the boundary conditions. For Dirichlet boundaries, the process can be killed, or stopped. For Neumann conditions, the process can be reflected and for mixed conditions more generic Feynman–Kac representation need to be considered [21]. We provide examples in the next section, as well as in the Appendix. 5) Approximated Stochastic Diffusion and Sensor Measurements: To implement the Feynman–Kac formula in practice we . More specifneed to consider simulations of the process ically, we need to simulate the random process starting from and then compute the empirical expectation associated

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

2525

Fig. 5. Illustration of the Feynman–Kac formula (9). Left: the initial condition (t = 0) is given by the indicator function of set A, i.e., u (x) = 1(x 2 A). The disk represents A, the circle shows the sensor location x . Middle: illustration of the analytical result of a heat equation at t = t . Right, realizations of the stochastic process starting from the sensor location x . According to the Feynman–Kac formula, the result of the diffusion in x after time t is given by the probability that the random walks of the last image hit the set A: u(x ; t ) = P(X 2 A).

with (9). After collecting samples of the process by location at time , we calculate a natural estimate of (10) This result demonstrates a direct relationship between the solution at a fixed position and a given time and the initial condition . Once a sample has been can be computed for different obtained, the solution using (10) without requiring further genfunctions eration of new samples. In the context of locating a chemical source using an array of sensors, we need to launch random walks starting from each of the sensors in order to obtain the concentration evolution at each of the sensor locations. The first advantage of the proposed approach to solve the forward problem over other classical numerical tools like finite element methods is that it allows a stochastic modeling of the wind. Second, this method is highly suitable for parallel computing: the random walks are independent and can, therefore, be generated by different computers. Another advantage is that this approach computes the solution only at specific points and time of interest and does not try to solve the entire problem. As a consequence, even when we consider a large setup (a real city, for instance), the processing load evolves linearly with the number of sensors and the considered diffusion duration. The major drawback is the needed computational time: the method is more computationally intensive in small setups than finite elor ). Howements (in practice, we use ever, as already stated, the approach is particularly useful for large setups, in terms of domain size and dimension. Another point is that the forward simulation can be done ahead of time as we show it in Section III-C-2.

problem. In this framework, we use a reflected stochastic diffusion to model the Neumann boundary conditions. Different schemes have been proposed (e.g., by Lepingle [23] or Costantini et al. [21]). The weak convergence of the approach we use has been proved only recently by Bossy, Godet, and Talay [24]. In the Appendix, we detail the simulation procedure for the case of the Neumann boundary conditions, while restricting the presentation here to the simple case of infinite domains without boundaries. Infinite Domain: In the case of an infinite domain (without any boundary conditions), the following Euler scheme is known to converge in the weak sense [22]. Define a time discretization parameter giving the associated discrete times , where is an integer. Sampling the process described in (8) is straightforward and is obtained by the following iterative procedure: (11) is simulated by a Gaussian random variwhere able with mean 0 and variance . The weak error for in a class of smooth functions can be expanded in terms of powers of , provided some regularity conditions on or some conditions on are satisfied [24] (see the Appendix). C. Application to the Transport Problem We present here the numerical method for implementing the particular case of our transport problem (1). We also present its consequences on the likelihood of the measurements. We propose a way to include turbulent flow within the computations. 1) Stochastic Transport Model: By identifying the generic diffusion Equation (7) with the transport equation (1), we obtain, in 2-D or 3-D

B. Simulation and Convergence We now focus on the simulations of the stochastic process involved in (8). The discretization of a stochastic process is a research field in itself and has been the subject of a huge body of literature, especially in the past ten years (see, for instance, [22]). Depending on the type of process to be approximated, convergence results may exist or not. In our case, the infinitesimal generator is very simple (it essentially corresponds to the heat equation), and one can expect some nice convergence results. However, the boundary conditions complicate the

where we used the square root for a definite positive matrix. Note again that the wind is reversed: the Feynman–Kac formula works in a backward mode as discussed earlier. The discrete scheme in (11) results in the following iterations: (12) is obtained by the generation of two or where three (depending on the dimension) independent and identically

2526

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

!

Fig. 6. Illustration of the p for i = 1 (bottom left sensor in Fig. 1) and several successive times t . The mapping z p corresponds to the empirical the darker the corresponding pixel. The probability density of the origin z of a particle arriving at the ith sensor after a time t . In each image, the higher p gray scale is adapted for each image, resulting in the overall darkening.

distributed normal variables, with mean 0 and variance . In practice, we consider an isotropic and homogeneous matrix and the square root correspond to the scalar one. 2) Monte Carlo Approximations: To obtain a handy version of the empirical distribution of the particles after time , we use a discrete version of the domain . Denote by a set of associated with a grid of points indexed by and sites in the number of elements in . We partition into small squares (pixel like) and For given sensor location and time , diffusion matrix , and wind distribution , the Monte Carlo simulations give final points, denoted by . Let be the : average number of such points falling in the element

For a given initial concentration value function Feynman–Kac formula (9) yields

, the (13)

is the calculated estimate of the concentration at the where location and time . We present two results of the computations of these transport probabilities on Fig. 6. For a given , bottom left on Fig. 1) we illustrate the probabilisensor ( for different times . The figure illustrates the notion ties indeed standing for the probof backward evolution, the ability that a particle arriving at the first sensor was launched from a site at a time earlier. These probabilities are 3) Unit Response: Consider a time discretization parameter and regularly spaced discrete times

. The sequence can be seen as to a unit instanthe unit response of the sensor located at taneous substance release at the site indexed by . In the and denote remainder of this article, we will consider the time index for the impulse response by [see (14)]. all the measurements 4) Likelihood: Denote by lumped into a single dimensioned vector. Let be the in every point . vector of all initial concentration By assuming independent measurements and a Gaussian noise, (13) yields the likelihood

(14) 5) Wind Turbulence Modeling: We present here an ad hoc procedure to account for wind turbulence. To illustrate the procedure, we focus on the urban environment example presented in Fig. 1. We compute a solution to the Navier Stokes equation for the wind distribution using a dedicated program called Gerris [17]. This computation is made possible because of the assumption that the wind in this example is mainly coming from the north. We then obtain several snapshots of the wind distribution such as those shown in Fig. 3. A common way to account for turbulence is to use a mean wind field [19], averaging the obtained snapshots. However, our preliminary results indicated that using a mean field would be inappropriate especially for outdoor applications. For instance, at a location where the turbulence is large, the average wind can be null if the wind direction variability is large enough. Another solution is to decompose the wind into a mean field and a Gaussian random variable [18]. However, even if the turbulence values are chaotic in time, there is a strong spatial correlation

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

2527

Fig. 7. Comparison of the effect of (top) a mean wind field and (bottom) the proposed stochastic wind approach on the transport probabilities p . We present the transport probabilities associated with the second sensor (bottom center, in Fig. 1). Top: Results obtained using a mean wind field obtained using dedicated software [17]. Bottom: Result obtained using our stochastic wind modeling which appears to increase the dispersive effect.

between neighboring points. Supposing independence between neighboring points might, therefore, be incoherent while estimating the correlation might not be a simple issue. We now propose a simple way of incorporating turbulent flow behavior in the computations of the random walks. We take advantage of the stochastic formulation of the numerical computations we proposed in this section. Let be spatial snapshots of the wind distribution over the whole setup, provided by a software. For instance, we use [17] to compute 30 of such snapshots under the constant main wind direction assumption. We propose to replace the wind drift in (12) by one snapshot randomly selected among . In order to account for the variability of the turbulent flow, we randomly change the snapshot used during the stochastic random walk (8). Each snapshot is used during a random time generated according to an exponential distribution with mean (15) and once this random duration has expired, we uniformly select a new snapshot of the wind among the possible snapshots before generating a new random duration and iterating the process. The parameter can be seen as the expected duration of a turbulence flurry, in practice we took . During a random walk of duration , we use approximatively different snapshots. The first advantage of this approach comes from the implicitly modeled spatial correlation. The second advantage is that even if turbulence has a null average at one location, we still can account for large wind values. We present a result in Fig. 7. On top, we show the transport probabilities computed using a mean wind field approximation. The results show that the main direction of the wind is taken into account, since the cloud of possible original locations goes towards the north with time. However, the bottom result obtained using our approach shows that using randomly selected snapshots of the wind increases the predicted diffusivity a result that is in line with the eddy diffusivity framework [20].

IV. LOCALIZING THE SOURCE(S) We present in this section a framework for inferring the source location from the measurements using a Bayesian approach. This task is useful for predicting the contaminant dispersion. Namely, once the source location has been estimated, applying the transport model to the estimated source(s) location(s) will predict the cloud evolution in space and time. In our context of chemical contaminant dispersion, we should be cautious with respect to the modeling assumption. In homeland security applications, it is indeed likely that several sources might be deployed. Moreover, the available measurements might be too noisy or insufficient in number to rely on a maximum likelihood estimator (MLE). In such cases, we are more interested in the probability distribution of the source in space, which incorporates all relevant hypotheses and the uncertainties, rather than in the location of the best hypothesis given by the MLE. These consideration is especially important for homeland security scenarios, where a mistake on the source evaluation could strongly affect emergency procedures. We, therefore, propose to use a random field with an adequate prior, as commonly used in image processing [25]. A. Inverse Problem and Random Field We develop in this section a setup to introduce a Bayesian regularization term for solving the inverse problem. This Bayesian term allows us to consider distributed and, therefore, multiple sources. Additionally, the unknown initial time can be considered as a model parameter and to be estimated from the measurements. 1) Distributed Sources: We consider a set of sites and we associate with each site an unknown initial concentration . The purpose of the source localization is value using a set of measurements to estimate the initial values . In the following, we assume that , the initial release time, is approximatively known from a detection

2528

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

process (our current research concerns this point). The likelihood of the measurements being given the set of values is then given by

and , where and is the normalizing constant. Again, according to Bayes formula, is the inverse of the expectation of the likelihood . The under the prior model value of is usually called the Bayesian evidence [26] and can be used for model selection. c) Estimator: For each site, we consider the following two values:

We can obtain a matrix formulation of the likelihood. We first transform the indices and over the sensors and times into a single index describing the sensor-time possible combinations. The measurements are reprethen described by and the transport probabilities sented by . Next, let , , and be matrices such that: is a by 1 matrix representing • all measurements; and • is a by matrix containing the transport probabilities; is a by 1 matrix repre• senting the initial concentrations. The log likelihood is then given by

(18)

If the number of measurements was large enough, a quadratic optimization would be achievable, resulting in maximum likelihood estimator (MLE) of . However, for a large number of considered sites and a small number of available measurements, the problem is under-determined. We, therefore, propose and compute different estimators. to add a Bayesian prior on 2) Bayesian Model: a) Prior model: We use the following mixture as a prior . We state that should be equal to 0 with model for the and uniformly distributed in a probability with a probability . The prior term can, therefore, be written as (16) The mixing parameter should be chosen according to the size of the domain and the number of sites, which we note . A way to choose is to make a prior decision about the average surface of the release. We believe that this prior is really suitable to the considered problem. This prior makes an assumption on the (average) total surface of the sources but not on their , meaning number nor their shape. In practice we took that we state that the source surface is expected to be 1% of the overall domain area. b) Posterior density: The likelihood of the measurements, the prior model, and Bayes formula result in the following a posteriori distribution:

(17)

These estimators respectively correspond to the posterior probability of having a source at the location , and the posterior conditional expectation of the source concentration, knowing that there is a source. The former provides the probability that there was a source at each location, while the latter gives the estimated intensity in that case. d) Estimating the initial time of release: Equation (17) assumes that the initial time of the transport phenomena is exactly known. However, this will hardly be the case, and we propose for estimating this unknown initial time to compute the Bayesian , in (17)] for several time hypotheses and then evidence [ consider the maximum obtained value. This approach appears to be reasonable since in our current research we develop a sequential detector that provides a rough estimate of the initial time. Therefore, we only need to consider a few time hypotheses around this rough estimate, and select the best ones. B. Algorithm 1) Metropolis Hastings: The computation of the estimators in (18) cannot be done directly. We, thus, implement a Monte Carlo Markov chain method. Since we know only the joint density over the sites and the full conditional distribution is intractable, we propose to use a Hasting Metropolis sampler [27]. This kind of sampler is especially suitable for our case, since we do not know the normalizing constant in (17). Note that is actually another value we want to compute, since it provides the Bayesian evidence. Random fields and Metropolis–Hastings samplers have been used in a variety of context for solving inverse problems (see, for instance, [28] and [29]). a) Variable at a time Metropolis update: The goal is ergodically converging to the to build a Markov Chain random field distribution given by (17). The state space of the Markov Chain consists in the configurations of initial substance concentration (19) We use a Metropolis Hastings update scheme. At time , one of the sites is randomly selected, and the associated variable is updated according to the following scheme: according to a given distribution , 1) first simulate then by with a probability 2) accept to replace

Note that acceptance ratio depends on the target distribution , on the proposition kernel and the current variables ,

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

Fig. 8. Results of the source localization estimation given by the Bayesian approach obtained using the transport probabilities of Fig. 6 and the measurements  > , the probability of having a source in each locaof Fig. 4. Left:  j > , the average source intensity tion (logarithmic scale). Right: conditioned on the presence of a source. Note that the real initial concentration (unit in mass percentage). The result on the left prolevel used was c vides for each location the probability that a source was present. The result on the right provides the estimated intensity in the case there was a source.

P

(

= 5%

0)

E

(

0)

2529

Fig. 9. Indoor example. An artificial configuration of rooms and ducts is considered. Wind is supposed to come in through four input ducts and released through the central duct. We consider (circles) two sensors and (disk) one source.

and , a fact that is essential to get the convergence of the sampler towards the target distribution. b) Proposition kernel: We use a perturbation kernel made of two parts. such that • With a probability 0.5, we change the state of , we propose a new uniformly in , if while if , we propose . , we propose , • With a probability 0.5, if being a random perturbation magnitude in . (unit in mass percentage). In practice, we used As a consequence, the proposition kernel can be written as

Fig. 10. Indoor problem. Time evolution of the substance concentration.

c) Resulting transition kernel: The resulting Markov kernel is a combination of three random steps: selection of a site, generation of a new candidate for the site concentration, and random acceptance of the proposal with a probability . It can be show (see [27], for instance) that the resulting Markov invariant, ergodic and irreducible. It means that chain is are distributed the resulting sample and verify an ergodic property that permits according to using them for empirical computations.

conditioned by the event that there is a release. Note that in the locations were the probability of having a source is high, the estimated intensity is close to the real value [we recall that we (unit in mass percentage)]. used an initial intensity For that particular example, we assumed the initial time to be . In the following section, we provide a result known using the Bayesian evidence, for selecting a relevant initial time hypothesis. These results show that the random field approach is powerful for finding several sources. V. ADDITIONAL NUMERICAL RESULTS: INDOOR APPLICATIONS

C. Results We present in Fig. 8 the result of the sampler associated with the measurements of Fig. 4. On the left, we show the posterior probability of having a source in each considered location (note that the gray-scale is logarithmic). The true locations of the sources were correctly found. On the right, we show the a posteriori expectation of the initial intensity in each location

We present in this section additional results for an indoor example. We consider the synthetic example of Fig. 9. The presented framework consists of “ducts” and “rooms,” four wind input points, and one output duct. Two sensors are placed in the central duct. We suppose a release in the top left corner. Fig. 10 shows the evolution of concentration in time, mostly due to transport effects. Again, the initial concentration was taken as

2530

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

Fig. 11. Log of Bayesian evidence with respect to different delay hypotheses. Zero delay corresponds to the real initial time, whereas a positive delay correspond to a hypothesis that the release occurred later than it really did, and vice versa.

in the disk location. The incoming wind speed was set to 1 m/s. We used SNR dB to generate the measurements.

Fig. 12. Effect of an error in the initial time estimate on the resulting source localization estimation. The images represent the posterior probability of having a source for the six best time hypotheses, according to the Bayesian evidence criterion. Note that the estimated source location is highly dependent on the initial release time hypothesis.

A. Source Localization and Initial Time We apply the framework presented in the previous section. is known only through a We suppose that the release time rough estimate and use the Metropolis–Hastings algorithm to compute the Bayesian evidence associated with different release time hypotheses. Fig. 11 shows the log of the Bayesian evidence as a function of . Null delay corresponds to the real release. A negative delay corresponds to cases where the release is supposed to have occurred sooner than the real time, while positive delay describes the opposite case. Note that the Bayesian evidence is at its highest around the real value In Fig. 12, we show the posterior probabilities associated with the six best hypotheses given by the Bayesian evidence criteria. As expected, the estimated source location is highly dependent on the supposed initial time of release. In Fig. 13, we show the average posterior presence probability over these six different hypotheses. This figure shows that our approach allows us to deal with uncertainties in the inverse problem. The final empirical probability distribution indeed contains the information of several initial release time hypothesis. VI. CONCLUSION We have presented a new way to compute chemical transport equations in realistic environments and proposed a Bayesian framework to solve the inverse problem. The results are potentially useful for array optimal design. Assuming a main wind direction for the external incoming flow and a known geometry, we developed Monte Carlo simulations of the stochastic process associated with the transport equation. The proposed method allows the inclusion of a realistic stochastic wind distribution accounting for turbulence that proved to be powerful in practice. We then integrated this method into an array signal processing setup. and presented a Bayesian framework to localize

Fig. 13. Estimated distribution of the source. The image represents the average of posterior probabilities over the six time-hypotheses presented on Fig. 12.

the releasing sources, useful for cases where the amount of measurements is too low, resulting from uncertainties concerning the source parameters. The presented framework allows us to localize several sources and to represent uncertainties in the source location. We also provided a way to get an estimate of the initial release time through the Bayesian evidence. We are currently developing a sequential detector for online detection of a release. Future work should also focus on 3-D realistic examples. In particular, we hope to be able to obtain real data to test our method in practical scenarios. Other issues of interest are the extension of the method to the case of absorbing boundaries, inclusion of other physical effects (e.g., temperature convection), and more realistic sensor modeling. We also plan to work on optimal design for optimally configuring the sensor array with respect to detection performances.

ORTNER et al.: BIOCHEMICAL TRANSPORT MODELING AND BAYESIAN SOURCE ESTIMATION

2531

REFERENCES

Fig. 14. Illustration of Neumann and Dirichlet boundaries. Left: Trajectory of a reflected process illustrating the numerical scheme used to account for Neumann boundary conditions. In gray, the originally proposed point replaced by its mirror point with respect to the boundary. Right: trajectory of an absorbed realization illustrating the effect of a Dirichlet boundary.

APPENDIX I DISCRETIZATION OF THE STOCHASTIC PROCESS WITH BOUNDARY CONDITIONS A. Neumann Conditions In the case of an open bounded domain , the scheme presented in Section III-B needs to be adapted. The weak convergence of the following algorithm has been recently provided by Bossy, Godet, and Talay [24]. This algorithm is based on symmetric reflections of the random walk against the boundaries

if otherwise.

(20)

represents the mirror point of the point with Here, taken using the direction. The respect to the boundary vector depends on the boundary condition: for . In our case , resulting in a symmetric reflection (see Fig. 14). In order to be consistent, this definition supposes that there are no ambiguities on the border on which to reflect ( , thus, needs to be small enough). When a bad case is encountered (the symmetric point lies outside of ), one can resimulate . Note that according to [24], the probability of such events goes exponentially to zero with . , , and , Through conditions on the smoothness of , Bossy, Godet, and Talay have shown the following result: for smooth enough (see the original paper) and compatible with the problem

where is uniform in and , and C(f) depends on the norm of the differential of until order 5 (see [24] sum of for details). B. Dirichlet Conditions The simplest way [30], to account for Dirichlet boundaries, is to stop the random walk when it hits such a boundary. The Feynman–Kac formula then becomes , being the stopping time associated crosses a Dirichlet boundary.” Note that with the event: “ we denote by the indicator function.

[1] DHS and National Academies Highlight Role of Media in Terrorism Response [Online]. Available: http://www.nae.edu/ [2] J. Fitch, E. Raber, and D. Imbro, “Technology challenges in responding to biological or chemical attacks in the civilian sector,” Science, vol. 32, pp. 1350–1354, Nov. 2003. [3] H. Banks and C. Castillo-Chavez, Bioterrorism: Mathematical Modeling Applications in Homeland Security. Philadelphia, PA: SIAM, 2003. [4] A. Nehorai, B. Porat, and E. Paldi, “Detection and localization of vapor-emitting sources,” IEEE Trans. Signal Process., vol. 43, no. 1, pp. 243–253, Jan. 1995. [5] B. Porat and A. Nehorai, “Localizing vapor-emitting sources by moving sensors,” IEEE Trans. Signal Process., vol. 44, no. 4, pp. 1018–1021, Apr. 1996. [6] A. Jeremic´ and A. Nehorai, “Design of chemical sensor arrays for monitoring disposal sites on the ocean floor,” IEEE J. Ocean. Eng., vol. 23, no. 5, pp. 334–343, Oct. 1998. [7] A. Jeremic´ and A. Nehorai, “Landmine detection and localization using chemical sensor array processing,” IEEE Trans. Signal Process., vol. 48, no. 5, pp. 1295–1305, May 2000. [8] T. Zhao and A. Nehorai, “Detecting and estimating biochemical dispersion of a moving source in a semi-infinite medium,” IEEE Trans. Signal Process., vol. 54, no. 6, pp. 2213–2225, Jun. 2006. [9] A. Jeremic´ and A. Nehorai, “Detection and estimation of biochemical sources in arbitrary 2D environments,” presented at the IEEE Int. Conf. Acoustics, Speech, Signal Processing, Philadelphia, PA, Mar. 2005. [10] A. Gershman and V. Turchin, “Nonwave field processing using sensor array approach,” Signal Process., vol. 44, no. 2, pp. 197–210, Jun. 199. [11] A. Pardo, S. Marco, and J. Samitier, “Nonlinear inverse dynamic models of gas sensing systems based on chemical sensor arrays for quantitative measurements,” IEEE Trans. Instrum. Meas., vol. 47, no. 3, pp. 644–651, Jun. 1997. [12] H. Ishida, T. Nakamoto, and T. Moriizumi, “Remote sensing and localization of gas/odor source and distribution using mobile sensing system,” in Proc. 45th Midwest Symp. Circuits Systems, Aug. 2002, vol. 1, pp. 52–55. [13] J. Matthes, L. Groll, and H. Keller, “Source localization based on pointwise concentration measurements,” Sens. Actuators A: Phys., vol. 115, no. 1, pp. 32–37, Sep. 2004. [14] H. Niska, M. Rantamäki, T. Hiltunen, A. Karppinen, J. Kukkonen, J. Ruuskanen, and M. Kolehmainen, “Evaluation of an integrated modelling system containing a multi-layer perceptron model and the numerical weather prediction model Hirlam for the forecasting of urban airborne pollutant concentrations,” Atmos. Environ., vol. 39, no. 35, pp. 6524–6536, 2005. [15] A. Venestanos, T. Huld, P. Adams, and J. Bartzis, “Source, dispersion and combustion modeling of an accidental release of hydrogen in an urban environment,” J. Haz. Mater., vol. 105, pp. 1–25, 2003. [16] A. S. Monin and A. M. Yaglom, Statistical Fluid Mechanics. Cambridge, MA: MIT Press, 1975. [17] Gerris Flow Solver [Online]. Available: http://gfs.sourceforge.net [18] J. Anderson, Fundamentals of Aerodynamics. New York: Mc GrawHill, 1984. [19] K. Radics, J. Bartholy, and R. Pongrácz, “Modeling studies of wind field on urban environment,” Atmos. Chem. Phys. Discussions, no. 2, pp. 1979–2001, 2002. [20] K. P. U. D. Baldocchi, T. Meyers, and K. Wilson, “Correction of eddy-covariance measurements incorporating both advective effects and density fluxes source,” Boundary-Layer Meteorol., vol. 97, no. 3, pp. 487–511, 2000. [21] C. Costantini, B. Pachiarotti, and F. Sartoretto, “Numerical approximation for functionals of reflecting diffusion processes,” SIAM J. Appl. Math., vol. 58, no. 1, pp. 73–102, 1998. [22] D. Talay, “Simulations of stochastic differential systems,” in Probabilistic Methods in Applied Physics. New York: Springer Verlag, 1995, vol. 451, Lecture Notes Phys.. [23] D. Lépingle, “Euler scheme for reflected stochastic differential equations,” Math. Comput. Simulations, vol. 38, pp. 119–126, 1995. [24] M. Bossy, E. Gobet, and D. Talay, “Symmetrized Euler scheme for an efficient approximation of reflected diffusions,” J. Appl. Probab., vol. 41, no. 3, pp. 877–889, 2004. [25] G. Winkler, Image Analysis, Random Fields and Markov Chain Monte Carlo Methods: A Mathematical Introduction. New York: SpringerVerlag, 2003. [26] J. Ruanaidh and W. Fitzgerald, Numerical Bayesian Methods Applied to Signal Processing. Statistics and Computing. New York: Springer, 1996.

2532

IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 55, NO. 6, JUNE 2007

[27] C. Robert and G. Casella, Monte Carlo Statistical Methods. New York: Springer-Verlag, 1999. [28] H. Pikkarainen, “State estimation approach to nonstationary inverse problems: Discretization error and filtering problem,” Inv. Probl., no. 22, pp. 365–379, 2006. [29] S. Roberts, “Control chart tests based on geometric moving averages,” Technometrics, vol. 1, pp. 239–250, 1959. [30] E. Gobet, “Efficient Schemes for the weak approximation of killed diffusions,” Stochastic Process. Appl., vol. 7, pp. 167–197, 2000.

Mathias Ortner received the B.Sc. degree in aeronautics and space sciences from the French School of Aeronautics and Space in 2001 and the Ph.D. degree in computer science and image processing from the University of Nice, Sophia-Antipolis, France, in 2004. From 2004 to 2006, he was a Postdoctoral Research Visitor on Prof. Nehorai’s team.

Arye Nehorai (S’80–M’83–SM’90–F’94) received the B.Sc. and M.Sc. degrees in electrical engineering from the Technion—Israel Institute of Technology, Haifa, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA. From 1985 to 1995, he was a faculty member with the Department of Electrical Engineering, Yale University, New Haven, CT. In 1995, he joined the Department of Electrical Engineering and Computer Science, University of Illinois at Chicago (UIC), as Full Professor. From 2000 to 2001, he was the

Chair of UIC’s Electrical and Computer Engineering Division, Department of Electrical Engineering and Computer Science, which then became a new department. In 2001, he was named University Scholar of the University of Illinois. In 2006, he became Chairman of the Department of Electrical and Systems Engineering, Washington University, St. Louis (WUSTL), MO. He is the inaugural holder of the Eugene and Martha Lohman Professorship of Electrical Engineering, and he has been the Director of the Center for Sensor Signal and Information Processing, WUSTL, since 2006. Dr. Nehorai was Editor-in-Chief of the IEEE TRANSACTIONS ON SIGNAL PROCESSING from 2000 to 2002. From 2003 to 2005, he was Vice President (Publications) of the IEEE Signal Processing Society, Chair of the Publications Board, member of the Board of Governors, and member of the Executive Committee of this Society. He is the founding Editor of the special columns on Leadership Reflections in the IEEE Signal Processing Magazine. He was co-recipient of the IEEE SPS 1989 Senior Award for Best Paper with P. Stoica, co-author of the 2003 Young Author Best Paper Award, and co-recipient of the 2004 Magazine Paper Award with A. Dogandzic. He was elected Distinguished Lecturer of the IEEE SPS for the term 2004 to 2005. He is the Principal Investigator of the new multidisciplinary university research initiative (MURI) project entitled Adaptive Waveform Diversity for Full Spectral Dominance. He has been a Fellow of the IEEE since 1994 and of the Royal Statistical Society since 1996.

Aleksandar Jerémic received the B.Sc. degree in electrical engineering from the University of Belgrade in 1995 and the M.Sc. and Ph.D. degrees in electrical and computer engineering from the University of Illinois at Chicago in 1997 and 2002, respectively, under the guidance of Prof. A. Nehorai. He is currently, an Assistant Professor with the Department of Electrical and Computer Engineering, McMaster University, Hamilton, ON, Canada. His research interests are in biomedical modeling, statistical signal processing, and medical visualization.

Suggest Documents