12th International Conference on Information Fusion Seattle, WA, USA, July 6-9, 2009

Radiation field estimation using a Gaussian mixture

Mark R. Morelande
Melbourne Systems Laboratory
The University of Melbourne
Parkville, Australia
[email protected]

Alex Skvortsov
HPP Division
Defence Science and Technology Organisation
Fishermen's Bend, Australia
[email protected]

Abstract – The problem of estimating the spatial distribution of radiation using measurements from a collection of spatially distributed sensors is considered. A parametric approach is adopted in which the field is modelled by a weighted sum of Gaussians, i.e., a Gaussian mixture. This is a valid approach for a large class of fields, e.g., absolutely integrable fields. Two Bayesian estimators based on progressive correction are proposed to estimate the mixture parameters. The first performs progressive correction using a Gaussian approximation while the second uses a Monte Carlo approximation. It is demonstrated, using both simulated and real data, that the Gaussian approximation is capable of accurate estimation.

Keywords: Radiological field estimation; Bayesian estimation.

1 Introduction

Accurate assessment of the spatial distribution of radiation is an important public safety issue in situations involving the unintended release of radioactive materials, e.g., reactor leaks [1, 2]. Knowledge of the spatial distribution of radiation, referred to here as the radiation field, can be obtained using measurements from sensors, such as Geiger-Müller (GM) counters, distributed throughout the area of interest. It is common to assume that the radiation field is comprised of a collection of point sources. Each point source can be characterised by a small number of parameters, for instance, the position and intensity, or release rate, for stationary sources. Radiation field estimation then involves determining the number of sources and estimating their parameters [3–7]. Although this procedure can often be performed quickly and accurately, at least for a small number of sources, it is not a completely satisfactory solution to the field estimation problem because the assumption of point sources is too restrictive for general practical applications.

For a general radiation field, the mean count for a GM counter at a given position is the convolution between a kernel function and the field. Given measurements at a number of locations, field estimation in this context is what is commonly referred to as an inverse problem [8, 9]. Inverse problems are difficult due to the smoothing applied by the convolution of the field with a low-pass kernel. Any attempt to invert this smoothing will necessarily involve a high-pass inversion filter. As a result, the field estimates resulting from the inversion are highly sensitive to statistical deviations in the measurements. This can be viewed as the predictable outcome of estimating a large number of parameters from a comparatively small number of noisy measurements. A large number of techniques, usually referred to as regularisation methods, have been proposed to alleviate the problems caused by high-pass inversion [8–10]. These methods essentially involve balancing a trade-off between estimation bias and variance.

To avoid the problems associated with inverse problems we propose using a model which can represent the field with a relatively small number of parameters. In particular, it is assumed that the radiation field can be modelled by a weighted sum of Gaussians, referred to as a Gaussian mixture. This modelling approach is motivated by the fact that a large class of functions, e.g., those that are absolutely integrable, can be approximated with arbitrary accuracy by a Gaussian mixture [11]. A similar approach is popular for plume models [1, 2], although in that case only a single plume is present so only a single Gaussian is used. Modelling the radiation field as a Gaussian mixture means that the problem of estimating the field becomes a parameter estimation problem similar to that encountered when using point source models [6]. In our case the problem is complicated by the addition of parameters which describe the spatial extent of the mixture components.

A Bayesian approach is adopted to estimate the mixture parameters. The goal of Bayesian estimation is to compute the posterior probability density function (PDF), from which the minimum mean square error estimator, the posterior mean, can be calculated. Since the posterior PDF cannot be computed in closed form, approximations are necessary. The approximations developed here are based on the notion of progressive correction [12]. Progressive correction is an iterative procedure in which the measurement correction is applied incrementally to the prior PDF. This alleviates the difficulties of approximating a posterior PDF when the likelihood has much smaller effective support than the prior, i.e., the measurement is precise compared to the prior information. This is certainly the case here as, for instance, the prior information regarding the positions of the mixture components will usually only be that they lie in the surveillance region, potentially a very large area. Two Bayesian estimators based on progressive correction are proposed. The first estimator computes a Gaussian approximation to the posterior PDF while the second is a Monte Carlo approximation similar to that used in [6] for point sources.

The paper is organised as follows. In Section 2 the statistical model for the field and sensor measurements is described. Bayesian estimation methods are developed in Section 3. The proposed estimators are trialled with simulated and real data in Section 4. Conclusions are given in Section 5.

978-0-9824438-0-4 ©2009 ISIF

2 Modelling

Let f : R² → R denote a field function such that f(x) is the strength of the radiation field at a location x. The field is to be estimated using measurements obtained from a collection of spatially distributed sensors. The sensors return the radiation dose in the form of a count. Let yj, j = 1, . . . , m denote the measurement from the jth sensor, where m is the number of sensors, and let y = [y1, . . . , ym]′ denote the measurement vector. The sensors are distributed throughout the surveillance area S ⊂ R² with the position of the jth sensor denoted as ξj ∈ S. The observed counts are independently distributed according to

    yj ∼ P(λj),  j = 1, . . . , m,    (1)

where P(λ) is the Poisson distribution with mean λ. The mean count for the jth sensor is given by

    λj = ∫ k(x − ξj) f(x) dx    (2)

where k is the kernel

    k(x) = 1/R²,   ‖x‖ < R,
           1/‖x‖², ‖x‖ ≥ R.    (3)

Note that the saturation effect present in a real sensor is accounted for in (3).

In this paper we adopt a parametric approach to field estimation based on modelling the field function by a Gaussian mixture. This approach can be justified by theoretical results concerning the approximation of functions by radial basis functions. In particular, the following result is of interest. Let a = [a1, . . . , aq]′ ∈ Rq, µ = [µ1′, . . . , µq′]′ ∈ R2q and σ = [σ1, . . . , σq]′ ∈ R+q denote vectors containing the component weights, locations and scales for the q-element mixture

    cq(x; a, µ, σ) = Σ_{i=1}^{q} ai φ((x − µi)/σi),  x ∈ R²,    (4)

where φ(x) = exp(−‖x‖²/2)/(2π). Let Z = {cq(·; a, µ, σ) : q ∈ N, a ∈ Rq, µ ∈ R2q, σ ∈ R+q} denote the set of mixture functions and let Lp(R²) denote the set of real-valued functions on R² which are pth power integrable. Then the following result holds [11, Theorem 2]:

Theorem 1. The set Z of mixture functions is dense in the set L1(R²) of absolutely integrable functions.

Similar results to Theorem 1 exist for pth power integrable functions, p ≥ 1, and for continuous functions [11, 13, 14]. Although we restrict our attention here to Gaussian mixtures, the results of [11, 13] hold for a quite general class of mixtures. Motivated by these results concerning function approximation by Gaussian mixtures, we model the field by

    f(x) = Σ_{i=1}^{q} ai N(x; µi, Σi),  x ∈ R²,    (5)

where N(x; µi, Σi) = φ(Σi^{−1/2}(x − µi))/√|Σi|, with the square root Σ^{1/2} of a positive definite matrix Σ such that Σ^{1/2}(Σ^{1/2})′ = Σ. The spreading matrix is represented by three parameters, denoted σi, τi > 0 and ρi ∈ (−1, 1) for the ith component:

    Σi = [ σi²       ρi σi τi ]
         [ ρi σi τi  τi²      ]    (6)

For a given number q of mixture components, the field estimation problem amounts to estimating the 6q-dimensional parameter vector θ = [µ′, σ′, a′]′ where a = [a1, . . . , aq]′ is the vector of amplitudes, µ = [µ1′, . . . , µq′]′, with µi = [xi, yi]′, is the vector of locations and σ = [σ1′, . . . , σq′]′, with σi = [σi, τi, ρi]′, is the vector of spreading parameters. The dependence of the field on the mixture parameters is made explicit by writing f(x; θ) for the value of the field at the point x under the mixture parameters θ.

The parametric model adopted here subsumes the commonly adopted point source model, e.g., [3, 6]. In particular, a point source model is obtained by letting σi, τi → 0, i = 1, . . . , q. In a point source model the number of parameters is reduced to 3q for q point sources, since the spreading matrix parameters do not need to be estimated. It should be noted that the components of the mixture need not be interpreted as sources. Rather, each component adds an extra degree of freedom with which to perform a function approximation.
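The mean counts implied by the model above have no closed form for the kernel (3), but they are cheap to approximate by sampling from each mixture component, as the paper later does with eq. (20). The sketch below, assuming numpy, illustrates this; the function names and sample size are illustrative, not from the paper.

```python
import numpy as np

def kernel(x, R=1.0):
    # Saturating inverse-square kernel of eq. (3): constant inside radius R.
    r = np.linalg.norm(x, axis=-1)
    return np.where(r < R, 1.0 / R**2, 1.0 / np.maximum(r, R)**2)

def mean_count(xi, a, mu, Sigma, d=200, rng=None):
    # Monte Carlo approximation to lambda_j in eqs. (2) and (9): for each
    # mixture component draw x_c ~ N(mu_i, Sigma_i) and average k(x_c - xi_j),
    # which estimates v(mu_i - xi_j, Sigma_i) in eq. (10).
    rng = np.random.default_rng() if rng is None else rng
    lam = 0.0
    for ai, mi, Si in zip(a, mu, Sigma):
        x = rng.multivariate_normal(mi, Si, size=d)
        lam += ai * kernel(x - xi).mean()
    return lam

# One unit-weight component centred on the sensor: since k <= 1/R^2
# everywhere, the estimated mean count cannot exceed 1/R^2.
lam = mean_count(np.zeros(2), a=[1.0], mu=[np.zeros(2)],
                 Sigma=[0.5 * np.eye(2)], d=5000)
```

Increasing `d` reduces the Monte Carlo variance at linear cost; the paper uses d = 200 per component in its experiments.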


3 Bayesian estimation of the field

A Bayesian approach is used to estimate the vector θ of mixture parameters. Assuming a prior PDF π0 for θ, the posterior PDF is

    π(θ) ∝ ℓ(θ; y) π0(θ)    (7)

where the likelihood of observing y given the parameter value θ is found from the measurement equation (1) as

    ℓ(θ; y) = Π_{j=1}^{m} P(yj; λj(θ))    (8)

where P(y; λ) = λ^y exp(−λ)/y! is the probability mass function of a Poisson random variable. Note that the dependence of the mean count λj on the field parameters θ is made explicit in (8). For the Gaussian mixture field, the mean count for the jth sensor is found by substituting (5) into (2) to give

    λj(θ) = Σ_{i=1}^{q} ai v(µi − ξj, Σi)    (9)

where

    v(µ, Σ) = ∫ N(x; µ, Σ) k(x) dx    (10)

An optimal estimate of θ, in the mean square error sense, can be obtained by computing the posterior mean,

    E(θ|y) = ∫ θ π(θ) dθ    (11)

Since neither the posterior PDF (7) nor the posterior mean (11) can be computed exactly, a numerical approximation is required. The process of numerical approximation is complicated by the fact that the prior will almost always be considerably more diffuse than the likelihood. Previously it has been found that the notion of progressive correction (PC), originally proposed for sequential Monte Carlo sampling in [12], can be profitably applied in such conditions [6, 15]. The idea of PC is to sequentially approximate, over a number of steps, intermediate versions of the posterior PDF which become successively closer to the true posterior PDF. The intermediate posterior PDFs are obtained as in (7) with the likelihood raised to a power less than one. To formalise this idea, let s denote the number of stages of PC. We define s correction factors γ1, . . . , γs such that 0 < γi < 1 and Σ_{i=1}^{s} γi = 1. Typically γ1 < γ2 < · · · < γs.

3.1 Gaussian approximation

For i > 0, the ith intermediate posterior PDF is calculated using Bayes' rule with the likelihood replaced by a linearised Gaussian approximation. Let λ(θ) = [λ1(θ), . . . , λm(θ)]′ denote the vector of mean counts for
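The correction factors used in the experiments of Section 4 are evenly spaced on a log scale with a fixed ratio γs/γ1. A minimal sketch of such a schedule, assuming numpy (the function name is illustrative):

```python
import numpy as np

def correction_factors(s, ratio):
    # Log-spaced progressive-correction factors: gamma_{i+1}/gamma_i is
    # constant, gamma_s/gamma_1 equals `ratio`, and the factors sum to one.
    g = np.logspace(0.0, np.log10(ratio), s)
    return g / g.sum()

# The schedule used in Section 4.1: s = 40 stages with gamma_40/gamma_1 = 1e4.
gam = correction_factors(40, 1e4)
```

Because the early factors are small, the first stages apply only a gentle correction to the diffuse prior, with the measurement information applied more aggressively as the intermediate posteriors tighten.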


the parameter vector θ. For the vector b = [b1, . . . , bk]′, we define the k × k diagonal matrix D(b) with (i, j)th element given by, for i, j = 1, . . . , k,

    Di,j(b) = bi, i = j;  0, otherwise.    (14)

The linearised Gaussian approximation to the likelihood at the ith stage is then given by, for i = 1, . . . , s,

    ℓ̂i(θ; y) = N(y; λ(θ̂i−1) + Li(θ − θ̂i−1), D(λ(θ̂i−1)))    (15)

where Li = ∇θ λ(θ)|θ=θ̂i−1 is the m × 6q gradient matrix, derived in Appendix A. The Gaussian approximation to the ith intermediate posterior PDF is then found as

    π̂i(θ) ∝ ℓ̂i(θ; y) π̂i−1(θ)
          ∝ N(y; λ(θ̂i−1) + Li(θ − θ̂i−1), D(λ(θ̂i−1))/γi) × N(θ; θ̂i−1, Pi−1)
          ∝ N(θ; θ̂i, Pi)    (16)

where the mean and covariance matrix of the Gaussian approximation to the ith intermediate posterior PDF are

    θ̂i = θ̂i−1 + Ki [y − λ(θ̂i−1)],    (17)
    Pi = Pi−1 − Ki Li Pi−1    (18)

with Si = Li Pi−1 Li′ + D(λ(θ̂i−1))/γi and Ki = Pi−1 Li′ Si^{−1}. Eq. (16) follows from the previous line by the application of a well-known rule governing products of Gaussian PDFs [17, 18]. The final estimate of θ is obtained as θ̂ = θ̂s.

The linearised Gaussian likelihood approximation (15) is at the heart of perhaps the best-known nonlinear filtering approximation, the extended Kalman filter (EKF) [19]. Although the EKF is notoriously unreliable in certain situations, the results presented in Section 4 show that, for the problems considered here, linearised approximations in a PC framework produce remarkably accurate results.

Calculation of the mean count λj(θ) for a parameter value θ and of the gradient matrix Li requires the evaluation of integrals of the form

    ∫ f(θ, x) k(x − ξj) N(x; µ, Σ) dx    (19)

where the function f depends on the quantity being computed. A Monte Carlo approximation can be obtained by drawing xc ∼ N(µ, Σ) for c = 1, . . . , d and computing

    (1/d) Σ_{c=1}^{d} f(θ, xc) k(xc − ξj)    (20)
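One stage of the update (17)-(18) is simply a Kalman-style correction with measurement noise covariance D(λ(θ̂i−1))/γi. The sketch below, assuming numpy, takes the mean-count function and its Jacobian as callables; a hypothetical one-dimensional linear toy model stands in for (9) purely to exercise the update.

```python
import numpy as np

def pcg_step(theta, P, y, lam_fn, jac_fn, gamma):
    # One PC-G stage: eqs. (17)-(18) with the tempered noise covariance
    # D(lambda)/gamma from the linearised likelihood (15).
    lam = lam_fn(theta)                       # predicted mean counts
    L = jac_fn(theta)                         # m x p gradient matrix
    S = L @ P @ L.T + np.diag(lam) / gamma    # innovation covariance S_i
    K = P @ L.T @ np.linalg.inv(S)            # gain K_i
    theta_new = theta + K @ (y - lam)         # eq. (17)
    P_new = P - K @ L @ P                     # eq. (18)
    return theta_new, P_new

# Toy linear model lam(theta) = theta + 2 (illustrative, not eq. (9)):
# with theta = 1, P = 1, y = 5, gamma = 1 we get S = 4, K = 1/4,
# so the update moves theta to 1.5 and shrinks P to 0.75.
theta, P = np.array([1.0]), np.eye(1)
lam_fn = lambda th: np.array([th[0] + 2.0])
jac_fn = lambda th: np.array([[1.0]])
th1, P1 = pcg_step(theta, P, np.array([5.0]), lam_fn, jac_fn, 1.0)
```

In the paper's setting `lam_fn` and `jac_fn` would be the Monte Carlo approximations (20) of λj(θ) and of the Appendix A gradients.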

3.2 Importance sampling approximation

A potentially more accurate posterior PDF approximation than the Gaussian approximation can be obtained by the use of Monte Carlo sampling [20]. The particular Monte Carlo approach used here is importance sampling, which involves drawing samples from an importance density and assigning a weight to each sample. In the context of PC, importance sampling has previously been used for both sequential [12] and batch estimation [15].

To begin with, a Monte Carlo representation of the prior PDF is obtained by drawing θ0^c ∼ π0 and setting w0^c = 1/n for c = 1, . . . , n. We now consider the approximate application of Bayes' rule at each stage. Assume that a collection of samples θ_{i−1}^1, . . . , θ_{i−1}^n and weights w_{i−1}^1, . . . , w_{i−1}^n representing the (i − 1)th intermediate posterior PDF πi−1 is available for i ≥ 1. The approximation to πi−1 resulting from these samples and weights can be written as

    π̂i−1(θ) = Σ_{c=1}^{n} w_{i−1}^c δ(θ − θ_{i−1}^c)    (21)

It is desired to obtain a collection of samples θi^1, . . . , θi^n and weights wi^1, . . . , wi^n representing the ith intermediate posterior PDF πi. Using Bayes' rule we obtain the following approximation to the ith intermediate posterior PDF:

    π̃i(θ) = Σ_{c=1}^{n} w̃i^c δ(θ − θ_{i−1}^c)    (22)

where the weights are given by, for c = 1, . . . , n,

    w̃i^c = B ℓ(θ_{i−1}^c; y)^{γi} w_{i−1}^c
         = B w_{i−1}^c [ Π_{j=1}^{m} P(yj; λj(θ_{i−1}^c)) ]^{γi}    (23)

with B such that w̃i^1, . . . , w̃i^n sum to one. Setting θi^c = θ_{i−1}^c and wi^c = w̃i^c, c = 1, . . . , n, would provide a Monte Carlo approximation to πi which could be propagated to later steps. However, this would be equivalent to sampling from the prior π0, since the samples would remain unchanged through each iteration, and then weighting by the likelihood. Such an approach is likely to require an extremely large sample size to provide accurate results. To obtain a representative, diverse collection of samples at the ith stage we instead draw samples from a kernel density approximation of π̃i. Let

    π̌i(θ) = π̃i ⋆ gi(θ) = Σ_{c=1}^{n} w̃i^c gi(θ − θ_{i−1}^c)    (24)

where gi is a suitably chosen kernel density. Selection of the kernel density involves a trade-off between bias and
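One importance-sampling PC stage therefore combines the tempered reweighting (23) with index selection by weight and kernel jitter from (24). A sketch under simplifying assumptions, in numpy: isotropic Gaussian jitter with a fixed bandwidth stands in for the covariance rule of [21], and all names are illustrative.

```python
import numpy as np

def pcis_step(samples, log_w, y, loglik_fn, gamma, bw, rng):
    # One PC-IS stage: temper the log-likelihood by gamma (eq. (23)),
    # select indices in proportion to the weights, then add Gaussian
    # kernel jitter as in the kernel density (24).
    lw = log_w + gamma * np.array([loglik_fn(th, y) for th in samples])
    lw -= lw.max()                      # stabilise before exponentiating
    w = np.exp(lw)
    w /= w.sum()
    n = len(samples)
    idx = rng.choice(n, size=n, p=w)    # unlikely samples are removed
    jitter = rng.normal(scale=bw, size=samples[idx].shape)
    return samples[idx] + jitter, np.full(n, 1.0 / n)

# Toy run (illustrative Gaussian log-likelihood, not the Poisson model):
# prior samples ~ N(0, I); after one full-strength stage the cloud
# shifts towards the posterior mean between the prior and y.
rng = np.random.default_rng(0)
samples = rng.normal(size=(2000, 2))
loglik = lambda th, y: -0.5 * np.sum((th - y) ** 2)
new_samples, new_w = pcis_step(samples, np.zeros(2000),
                               np.array([3.0, 3.0]), loglik, 1.0, 0.1, rng)
```

The jitter is what prevents the duplication that plain resampling would produce, at the cost of the bias/variance trade-off discussed in the text.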

variance. Here we use a Gaussian kernel with the covariance matrix suggested in [21]. Samples from the mixture (24) are obtained by drawing indices j(1), . . . , j(n) such that j(c) = k with probability w̃i^k and setting θi^c = θ_{i−1}^{j(c)} + ǫi^c where ǫi^c ∼ gi, c = 1, . . . , n. The sample weights are set to wi^c = 1/n, c = 1, . . . , n. The selection of sample indices according to the weights w̃i^1, . . . , w̃i^n ensures that unlikely samples are removed, while drawing samples from the kernel density means that duplication is avoided. The final estimate of θ is obtained as

    θ̂ = (1/n) Σ_{c=1}^{n} θs^c    (25)

4 Performance analysis

In this section the performances of the proposed estimators are analysed using both simulated and real data.

4.1 Simulation results

The two scenarios considered in the simulations are shown in Figure 1. The radiation field is a mixture with two components in the first scenario and three components in the second. The plots show the 2-sigma ellipses of the mixture components along with the sensor positions, indicated by dots. The line width of each ellipse is proportional to the peak field contribution of the corresponding component. There are 144 sensors evenly distributed throughout the 150 m × 150 m surveillance region. The second scenario is considerably more challenging than the first because, in addition to there being an extra mixture component and therefore more parameters to estimate, two of the mixture components are in close proximity. Further, these two components are quite weak.

Figure 1: Simulation scenarios: The ellipses represent the locations and extents of the mixture components. The line width of each ellipse is an indicator of the strength of the component. The black dots are the sensor positions.

The following parameters are used. Progressive correction is implemented with 40 steps for both the Gaussian and importance sampling approximations. The correction factors are evenly distributed on a log scale with γ40/γ1 = 10^4. A sample size of 200 is used for the Monte Carlo approximation (20) to the integrals required for the mean count and gradient. The prior PDF is

    π0(θ) = Π_{i=1}^{q} G(ai; 2, 1/500) US(µi) U(−1,1)(ρi) G(σi; 2, 1/10) G(τi; 2, 1/10)    (26)

where G(·; α, β) is the Gamma PDF with shape parameter α and rate parameter β and UA is the uniform distribution over the set A. In (26), S = [0, 150]² is the surveillance region. The Monte Carlo approximation (20) is used to compute the likelihood. The importance sampling approximation is implemented with a sample size of 500.

Algorithm performance is measured by the MSE computed over 200 realisations for each scenario. To conserve space, results are shown only for the position and spreading parameter estimators. Results for scenario 1 are shown in Tables 1 and 2 and results for scenario 2 are shown in Tables 3 and 4. In the tables, PC-G refers to progressive correction using a Gaussian approximation to the posterior and PC-IS refers to the use of the importance sampling approximation. Also shown is the Cramér-Rao bound (CRB), which can be derived similarly to [6] with the difference that here the mean count is given by the integral (9).

Table 1: Position estimator MSEs for scenario 1.

    Parameter  PC-G  PC-IS  CRB
    x1         1.52  2.21   1.27
    y1         1.83  3.86   1.35
    x2         2.18  4.79   2.13
    y2         3.30  7.78   2.56

The performance achieved by PC-G is far better than that achieved by PC-IS. Moreover, this improvement is achieved with a fraction of the computational expense. This is remarkable given the relatively poor performance usually achieved by linearised likelihood approximations. The results suggest that the errors incurred by linearisation can be greatly reduced by the use of progressive correction. It can be seen that the MSE of PC-G is sometimes below the CRB. This can happen because of Monte Carlo variation in the MSE estimate or because the CRB is a deterministic parameter bound which does not use the prior information assumed by the Bayesian estimators.

4.2 Experimental results

Experimental data was acquired during trials at Puckapunyal, Victoria, Australia. In the data set considered here three sources of radiation are present. The
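Drawing a parameter vector from the prior (26) is straightforward; the sketch below assumes numpy, whose gamma generator is parameterised by a scale equal to 1/rate, so the rate 1/500 in (26) becomes scale 500. Function and variable names are illustrative.

```python
import numpy as np

def sample_prior(q, rng, side=150.0):
    # Draw one set of mixture parameters from the prior of eq. (26).
    # numpy's gamma takes (shape, scale) with scale = 1/rate.
    a = rng.gamma(shape=2.0, scale=500.0, size=q)   # amplitudes a_i
    mu = rng.uniform(0.0, side, size=(q, 2))        # locations mu_i in S
    sig = rng.gamma(shape=2.0, scale=10.0, size=q)  # sigma_i
    tau = rng.gamma(shape=2.0, scale=10.0, size=q)  # tau_i
    rho = rng.uniform(-1.0, 1.0, size=q)            # correlations rho_i
    return a, mu, sig, tau, rho

a, mu, sig, tau, rho = sample_prior(3, np.random.default_rng(1))
```

Repeated calls of this kind supply the n prior samples needed to initialise the importance sampling estimator.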

radiation sources are sufficiently concentrated in space that they may be considered point sources. An overhead view of the experimental set-up is shown in Figure 2. During the trials an observer with a GM counter moved over the surveillance area, periodically stopping to acquire measurements. The yellow dots in Figure 2 indicate the path followed by the sensor, the numbered white crosses are the positions at which measurements were acquired and the green dots are the source positions. Sixty readings were taken at each position. By randomly selecting from these readings at each position it is possible to create different measurement realisations, so that algorithm performance can be assessed over a large number of real data records.

Figure 2: Experimental set-up for the three source data set: source positions are green dots and the numbered white crosses are the positions at which measurements were acquired.

Two measures of algorithm performance are considered. The first is the mean of the field estimate obtained from the mixture parameter estimates, normalised by its standard deviation. The second is the number of sources correctly detected. Sources are hypothesised to lie at the locations of peaks of the field estimate with value greater than a specified threshold. Hypothesised sources which are within 10 m of a true source position are classified as true; otherwise they are false. If multiple detected sources are true for the same true source position, all but one of these is classified as false. The mean numbers of true and false sources are compiled over a number of measurement realisations.

Only progressive correction with the Gaussian approximation is considered in this performance analysis. It is implemented with 75 correction steps. The correction factors are evenly distributed on a log scale with γ75/γ1 = 10³. The Monte Carlo approximation (20) is used with d = 200 samples. The prior PDF is

    π0(θ) = Π_{i=1}^{q} G(ai; 2, 1/500) US(µi) U(−1,1)(ρi) G(σi; 2, 1/5) G(τi; 2, 1/5)    (27)

where S ⊂ R² is the surveillance area.

The normalised mean of the PC-G field estimator is shown in Figure 3. The results were obtained by averaging over 100 measurement realisations. For all scenarios q = 5 mixture components were used to model the field; note that the number q of mixture components need not match the number of sources present. The mean number of true sources detected was 2.86 and the mean number of false sources was 0.01. As would be expected given these detection results, the normalised mean exhibits peaks in the desired locations.

Table 2: Spread estimator MSEs for scenario 1.

    Parameter  PC-G  PC-IS  CRB
    σ1         1.72  3.05   1.46
    τ1         4.49  8.48   1.55
    σ2         2.31  4.83   2.32
    τ2         4.90  14.12  2.89

Table 3: Position estimator MSEs for scenario 2.

    Parameter  PC-G  PC-IS  CRB
    x1         1.80  3.74   1.30
    y1         1.86  4.63   1.38
    x2         5.31  13.30  7.13
    y2         6.64  24.34  5.12
    x3         7.67  21.27  8.36
    y3         3.55  16.65  2.86

Table 4: Spread estimator MSEs for scenario 2.

    Parameter  PC-G  PC-IS  CRB
    σ1         2.18  5.92   1.50
    τ1         2.03  14.64  1.59
    σ2         5.21  17.51  5.64
    τ2         6.38  20.20  3.69
    σ3         4.95  19.65  5.52
    τ3         3.02  17.01  2.51
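The true/false detection rule described above can be sketched directly; the code below assumes numpy, and the example peak and source coordinates are made up for illustration.

```python
import numpy as np

def classify_detections(peaks, sources, radius=10.0):
    # Classify hypothesised source locations (field-estimate peaks above
    # threshold) as true or false. A peak is true if it lies within
    # `radius` metres of a source; additional peaks matching an
    # already-claimed source count as false, as do unmatched peaks.
    matched = set()
    n_true = n_false = 0
    for p in peaks:
        d = np.linalg.norm(sources - p, axis=1)
        i = int(np.argmin(d))
        if d[i] <= radius and i not in matched:
            matched.add(i)
            n_true += 1
        else:
            n_false += 1
    return n_true, n_false

# Hypothetical example: two sources, three peaks. The second peak sits on
# an already-matched source and the third is far from both, so both are
# classified as false.
sources = np.array([[20.0, 30.0], [80.0, 90.0]])
peaks = np.array([[22.0, 31.0], [25.0, 33.0], [140.0, 10.0]])
nt, nf = classify_detections(peaks, sources)
```

Averaging `nt` and `nf` over measurement realisations yields the mean true/false source counts reported in the text.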

5 Conclusions

A parametric approach to radiation field estimation was proposed. Assuming that the radiation field satisfies certain regularity conditions, such as absolute integrability, a Gaussian mixture was used as a model. Two


Bayesian approaches for estimating the mixture parameters were proposed. Both estimators are based on the idea of progressive correction, which involves successively approximating a sequence of distributions which approach the posterior. The first approach uses a Gaussian approximation while the second uses Monte Carlo sampling. In the numerical simulations the Gaussian approximation clearly outperformed the Monte Carlo approximation, despite having a much lighter computational load. The Gaussian approximation was also successfully applied to real data.

There are several areas for future work. The issue of model selection, i.e., selecting the number of mixture components, should be considered. The simulations presented here considered only the case where the field is actually a Gaussian mixture; it would be of interest to examine the properties of the estimators when applied to arbitrary fields. Previous work on point source estimation has looked at trajectory control of mobile sensors. This problem is also important in field estimation and should be considered in the parametric field estimation framework adopted here.

Figure 3: Normalised mean of the field estimator for the experimental scenario. The solid black discs are located at the true source positions.

A Derivation of the gradient matrix

In this appendix the gradient of λ(θ) = [λ1(θ), . . . , λm(θ)]′, with λj(θ) given in (2), with respect to θ is derived. Each element of λ(θ) can be considered separately. Recall that

    λj(θ) = Σ_{i=1}^{q} ai v(µi − ξj, Σi)    (28)

where

    v(µ, Σ) = ∫ N(x; µ, Σ) k(x) dx    (29)

The derivatives with respect to the amplitudes a1, . . . , aq are, for i = 1, . . . , q,

    ∂λj(θ)/∂ai = v(µi − ξj, Σi)    (30)

The derivatives with respect to the location vectors µ1, . . . , µq can be found as, for i = 1, . . . , q,

    ∇µi λj(θ) = ai ∫ Σi^{−1} (x − µi) N(x; µi, Σi) k(x − ξj) dx    (31)

Matrix calculus theory [22] provides an elegant approach to finding the derivatives with respect to the spreading parameters σ1, . . . , σq. In [22] the derivative of the m × n matrix Y with respect to the p × q matrix X is the mp × nq matrix given by

    ∇X Y = Y ⊗ ∇X    (32)

where ⊗ is the Kronecker product. Using the definition (32), a number of rules analogous to the well-known rules for scalar derivatives can be derived. We make use of the product and chain rules. The latter requires the notion of the star product. Consider an m × n matrix Y and an mp × nq matrix Z. Let yi,j denote the (i, j)th element of Y and Zi,j denote the (i, j)th p × q submatrix of Z for i = 1, . . . , m, j = 1, . . . , n. Then the star product of Y and Z is the p × q matrix

    Y ∗ Z = Σ_{i=1}^{m} Σ_{j=1}^{n} yi,j Zi,j    (33)

The derivative with respect to the spreading parameters is

    ∇σi λj(θ) = ai ∫ ∇σi N(x; µi − ξj, Σi) k(x) dx    (34)

We consider the Gaussian PDF in isolation. Since the spreading parameters do not appear in the mean we replace µi − ξj by µ to conserve space. Using the chain rule [22] gives

    ∇σi N(x; µ, Σi) = ∇Σi N(x; µ, Σi) ∗ ∇σi Σi    (35)

Consider the derivative of the Gaussian PDF with respect to the spreading matrix Σi. It is convenient to define G(x; µ, Σ) = exp[−(x − µ)′ Σ^{−1} (x − µ)/2]. Using the product rule [22] gives

    ∇Σi N(x; µ, Σi) = |2πΣi|^{−1/2} ∇Σi G(x; µ, Σi) + G(x; µ, Σi) ∇Σi |2πΣi|^{−1/2}    (36)

Using the chain rule and the product rule gives, after some manipulations,

    ∇Σi G(x; µ, Σi) = G(x; µ, Σi) Σi^{−1} (x − µ)(x − µ)′ Σi^{−1} / 2    (37)

The derivative of the determinant is found using the chain rule and a result given in [23, Table 1]:

    ∇Σi |2πΣi|^{−1/2} = −|2πΣi|^{−1/2} Σi^{−1} / 2    (38)

Substituting (37) and (38) into (36) and then (34) gives

    ∇σi λj(θ) = (ai/2) ∫ N(x; µi, Σi) k(x − ξj) [Σi^{−1}(x − µi)(x − µi)′ − I2] Σi^{−1} ∗ ∇σi Σi dx    (39)

where Ik is the k × k identity matrix. The derivative of Σi with respect to σi can be found from the definition of the matrix derivative (32) and the definition of Σi given in (6):

    ∇σi Σi = [ 2σi   0     0     ρiτi  ρiσi  σiτi ]′
             [ ρiτi  ρiσi  σiτi  0     2τi   0    ]
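The three partial-derivative blocks that make up ∇σiΣi can be verified numerically against finite differences of (6); the paper stacks them according to the MacRae convention (32), but the blocks themselves are as below. A sketch assuming numpy:

```python
import numpy as np

def Sigma(s, t, r):
    # Spreading matrix of eq. (6) for (sigma_i, tau_i, rho_i) = (s, t, r).
    return np.array([[s * s, r * s * t], [r * s * t, t * t]])

def dSigma(s, t, r):
    # Analytic partial derivatives of Sigma_i with respect to
    # sigma_i, tau_i and rho_i: the blocks entering grad_{sigma_i} Sigma_i.
    return [np.array([[2 * s, r * t], [r * t, 0.0]]),  # d/d sigma_i
            np.array([[0.0, r * s], [r * s, 2 * t]]),  # d/d tau_i
            np.array([[0.0, s * t], [s * t, 0.0]])]    # d/d rho_i

# Central finite-difference check of each block at an arbitrary point.
s, t, r, h = 1.3, 0.7, 0.4, 1e-6
num = [(Sigma(s + h, t, r) - Sigma(s - h, t, r)) / (2 * h),
       (Sigma(s, t + h, r) - Sigma(s, t - h, r)) / (2 * h),
       (Sigma(s, t, r + h) - Sigma(s, t, r - h)) / (2 * h)]
```

This kind of check is a cheap safeguard when implementing the gradient matrix Li used by the PC-G estimator.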

References

[1] M. Drews, B. Lauritzen, H. Madsen, and J. Smith, "Kalman filtration of radiation monitoring data from atmospheric dispersion of radioactive materials," Radiation Protection Dosimetry, vol. 111, no. 3, pp. 257–269, 2004.
[2] H. Jeong, M. Han, W. Hwang, and E. Kim, "Application of data assimilation to improve the forecasting capability of an atmospheric dispersion model for a radioactive plume," Annals of Nuclear Energy, vol. 35, no. 5, pp. 838–844, 2008.
[3] S. Brennan, A. Mielke, and D. Torney, "Radioactive source detection by sensor networks," IEEE Transactions on Nuclear Science, vol. 52, no. 3, pp. 813–819, 2005.
[4] Y. Cheng and T. Singh, "Source term estimation using convex optimization," in Proceedings of the International Conference on Information Fusion, Cologne, Germany, 2008.
[5] J. Howse, L. Ticknor, and K. Muske, "Least squares estimation techniques for position tracking of radioactive sources," Automatica, vol. 37, pp. 1727–1737, 2001.
[6] M. Morelande, B. Ristic, and A. Gunatilaka, "Detection and parameter estimation of multiple radioactive sources," in Proceedings of the International Conference on Information Fusion, Quebec, Canada, 2007.
[7] R. Nemzek, J. Dreicer, D. Torney, and T. Warnock, "Distributed sensor networks for detection of mobile radioactive sources," IEEE Transactions on Nuclear Science, vol. 51, no. 4, pp. 1693–1700, 2004.
[8] M. Banham and A. Katsaggelos, "Digital image restoration," IEEE Signal Processing Magazine, pp. 24–41, March 1997.
[9] J. Kalifa and S. Mallat, "Thresholding estimators for linear inverse problems and deconvolutions," The Annals of Statistics, vol. 31, no. 1, pp. 58–109, 2003.
[10] P. Hansen, "The truncated SVD as a method for regularisation," BIT Numerical Mathematics, vol. 27, pp. 534–553, 1987.
[11] J. Park and I. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 5, pp. 305–316, 1993.
[12] C. Musso, N. Oudjane, and F. Le Gland, "Improving regularised particle filters," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer-Verlag, 2001.
[13] J. Park and I. Sandberg, "Universal approximation using radial-basis-function networks," Neural Computation, vol. 3, pp. 246–257, 1991.
[14] I. Sandberg, "Gaussian radial basis functions and inner product spaces," Circuits, Systems and Signal Processing, vol. 20, no. 6, pp. 635–642, 2001.
[15] M. Morelande, B. Moran, and M. Brazil, "Bayesian node localisation in wireless sensor networks," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, USA, 2008.
[16] F. Daum and J. Huang, "Particle flow for nonlinear filters with log-homotopy," in Proceedings of SPIE, vol. 6969, San Diego, USA, 2008.
[17] S. Challa and D. Koks, "Bayesian and Dempster-Shafer fusion," Sadhana, vol. 29, no. 2, pp. 145–176, 2004.
[18] Y. Ho and R. Lee, "A Bayesian approach to problems in stochastic estimation and control," IEEE Transactions on Automatic Control, vol. 9, pp. 333–339, 1964.
[19] A. Jazwinski, Stochastic Processes and Filtering Theory. Academic Press, 1970.
[20] J. Liu, Monte Carlo Strategies in Scientific Computing. Springer, 2008.
[21] B. Silverman, Density Estimation for Statistics and Data Analysis. Chapman and Hall, 1986.
[22] E. MacRae, "Matrix derivatives with an application to an adaptive decision problem," The Annals of Statistics, vol. 2, no. 2, pp. 337–346, 1974.
[23] P. Dwyer, "Some applications of matrix derivatives in multivariate analysis," Journal of the American Statistical Association, vol. 62, no. 318, pp. 607–625, 1967.
