Sparse Spike Train Deconvolution using the Hunt Filter ... - CiteSeerX

1 downloads 0 Views 207KB Size Report
A new deconvolution method of sparse spike trains is presented. It is based ... restore a sparse spike train signal resulting in the so-called HT method. The paper ...
Sparse Spike Train Deconvolution using the Hunt Filter and a Thresholding Method Vincent Mazet, David Brie, and Cyrille Caironi∗†

Abstract A new deconvolution method of sparse spike trains is presented. It is based on the coupling of the Hunt filter with a thresholding. We show that a good model for the probability density function of the Hunt filter output is a Gaussian mixture, from which we derive the threshold that minimizes the probability of errors. Based on an interpretation of the method as a MAP estimator, the hyperparameters are estimated using a joint MAP approach. Simulations show that this method performs well at a very low computation time.

Keywords: sparse spike train deconvolution, Bernoulli-Gaussian, Hunt filter, thresholding.

1

Introduction

We consider the problem of sparse spike train deconvolution which is classically stated as follows1 : given some observation sequence y ∈ RN , find the sparse spike train sequence x ∈ RN such as: y = Hx + n, where H ∈ RN ×N is a known impulse response matrix, and n ∈ RN a noise term assumed to be Gaussian and i.i.d. (independent, identically distributed) whose PDF (probability density function) is p(n) = N (0, rn I N ), rn being the noise variance. To handle the sparse nature of x, it is modelled as an i.i.d. BG (Bernoulli– Gaussian) sequence [1, 2], that is: ∀ k, p(xk ) = λN (0, rx ) + (1 − λ)δ(xk ), where rx is the pulse variance, λ ≪ 1 is the Bernoulli parameter (ratio between pulse and sample number) and δ is the Dirac mass centered on zero. In the sequel, θ = {λ, rx , rn } gathers the hyperparameters of the problem. Among the many applications of sparse spike deconvolution, we would like to mention partial discharge analysis which has motivated this work [3]. The final goal was to develop algorithms that can be embedded to monitor the insulation of high-voltage AC motors. In addition, in this particular application, one has to process very large signals (typically N = 216 points). These are the reasons which have led us to develop an approach fast and simple to implement needing only low memory space. On the one hand, combinatory algorithms like SMLR [1, 2], IWM [4], or others given in [5] are wellsuited to the problem. Although they give very satisfactory estimations, they are very slow and difficult to implement on embedded systems. On the other hand, the Hunt filter [6] is very fast and simple to implement but it does not yield satisfactory results mainly because of the underlying prior which assumes that the restored signal is Gaussian. To overcome this drawback, we propose to associate the Hunt filter with a thresholding procedure to actually restore a sparse spike train signal resulting in the so-called HT method. The paper is organized as follows. Section 2 presents the HT method: first, the Hunt filter is briefly recalled; then, we derive the approximate PDF of the Hunt filter output from which we determine the threshold that minimizes the probability of errors. In section 3, based on an interpretation of the HT method as a MAP estimator, we propose to estimate the hyperparameters using a JMAP (joint MAP) approach [7]. In section 4, the performances of the HT method are evaluated and compared to those of the SMLR of [1]. Finally, section 5 gives the conclusions and perspectives of this work. ∗ V. Mazet and D. Brie are with the Centre de Recherche en Automatique de Nancy, CNRS-UMR 7039, Universit´ e Henri Poincar´ e, Nancy 1, BP 239, 54506 Vandœuvre-l` es-Nancy Cedex, France (e-mail: [email protected], [email protected]). † C. Caironi is with Alstom Moteurs, 4, rue de la Rompure, 54250 Champigneulles, France (e-mail: [email protected]). 1 In the sequel, v will represent the ith element of vector v and M i i,j the element (i, j) of the matrix M.

1

2

The HT Method

2.1

The Hunt Filter

The Hunt filter [6] results from the Phillips and Twomey criterion minimization which corresponds to the discrete time formulation of the Tikhonov criterion: JP T = (y − Hx)T (y − Hx) + αxT DT Dx, where D = I N so as to favor the restoration of zero value signal. Minimizing JP T yields a first estimation x ˜ of x: x ˜ = Gy

G = (H T H + αI N )−1 H T .

where

(1)

The matrix H being supposed circulant, its (discrete) Fourier transform is a diagonal matrix. Thus the algorithm may be efficiently implemented using the fast Fourier transform, x ˜ being then obtained by an inverse (discrete) Fourier transform. This results in the so-called Hunt filter. A Bayesian interpretation [5] shows that the Hunt filter is equivalent to a MAP approach with a Gaussian prior: it yields α = rn /rx , where rx is the variance of the Gaussian signal to restore. In the sequel, we will also take α = rn /rx with rx the variance of the pulses of the BG sequence to favor the pulse amplitude estimation.

2.2

Approximate PDF of x ˜

The PDF p(˜ x) of x ˜ = Gy = GHx + Gn is needed to calculate the threshold t (section 2.3). PN As (Gn)k = i=1 Gk,i ni , we have2 : N X

p ((Gn)k ) = p

Gk,i ni

i=1

!

=

⋆ Y

p (Gk,i ni ) .

i={1,...,N }

Since p(nk ) = N (0, rn ) and N (0, σ12 ) ⋆ N (0, σ22 ) = N (0, σ12 + σ22 ), we get: ! N X G2k,i . p ((Gn)k ) = N 0, rn

(2)

i=1

Similarly, since p(xk ) = λN (0, rx ) + (1 − λ)δ(xk ) and (GHx)k =

⋆ Y

p ((GHx)k ) =

PN

i=1 (GH)k,i xi ,

we have:

p ((GH)k,i xi )

i={1,...,N }

=

⋆ Y

i={1,...,N }



  λN 0, rx (GH)2k,i + (1 − λ)δ(xi ) .

To simplify the problem, we now need two assumptions. • Firstly we suppose that the off-diagonal terms of GH are negligible as compared to the diagonal terms. This is nothing but considering the values (GHx)k decorrelated. We note that the higher the SNR is (i.e. α decreases), the more valid this approximation is. Indeed, when α decreases, GH tends to I N . With this assumption, we have rx (GH)k,i = 0 ∀ k 6= i. The validity of this approximation depends both on the SNR and the impulse response. Typically, for the impulse response used in this paper (see section 4), the approximation is very realistic for a SNR greater than 15 dB. But if the frequency contents of the impulse response becomes more concentrated in low frequencies, the SNR should be greater; 2



Q

is introduced to improve the readibility of the paper: Y fi (x) = [f1 ⋆ . . . ⋆ fN ](x).



i={1,...,N}

2

• Secondly, we suppose that the diagonal terms of GGT and GHH T GT are constant, i.e. we neglect the boundary effects. Thus, considering that a Gaussian with zero variance is a Dirac mass, we get:  p ((GHx)k ) = λN 0, rx (GH)2k,k + (1 − λ)δ(xk ).

(3)

From (2) and (3), we obtain:

p (˜ xk ) = p ((GHx)k ) ⋆ p((Gn)k ) N X

0, rx (GH)2k,k + rn

= λN

G2k,i

i=1

+ (1 − λ)N

0, rn

N X

G2k,i

i=1

!

!

.

PN PN As rn i=1 G2k,i = rn (GGT )k,k and rx (GH)2k,k = rx i=1 (GH)2k,i = rx (GHH T GT )k,k , it follows that the PDF is a Gaussian mixture centered on zero: p (˜ xk ) = λN (0, r1 ) + (1 − λ)N (0, r0 ), with

T

T

T

r1 = rx (GHH G )k,k + rn (GG )k,k , T

r0 = rn (GG )k,k ,

r0 < r1

(4) (5) (6)

The histogram of x ˜ obtained on a 1024-sample signal (figure 1(a)) clearly shows that the Gaussian mixture is a good model for p(˜ xk ).

2.3

Thresholding

The thresholding separates pulses (term λN (0, r1 )) from noise (term (1 − λ)N (0, r0 )). The estimated signal (1 − λ)N (0, r0 )

Percentage

8% 6%

AND A

4%

AFA

2% 0

λN (0, r1 ) −2

(a)

−1

0

x ˜k

1

2

amplitude x ˜k

Areas of false alarms and non1: detections PDF of x ˜.

Histogram of x ˜ for 1024 samples.

Figure

t (b)

bk = x b k = 0 otherwise. Performing such a (hard) thresholding, one may be expressed as: x ˜k if |˜ xk | > t, x can make two kinds of errors. False alarms correspond to wrongly detected pulses while non-detections correspond to missed pulses. The threshold t is chosen to minimize the probability of error {pFA + pND } where pFA and pND are the probabilities of false alarm and non-detection. Figure 1(b) shows the positive part of the two Gaussians. The area AFA represents half the probability pFA and AND half the probability pND . It appears that AND + AFA is minimal and equal to A when t corresponds to the intersection point of the two Gaussians, which gives: s  r  2r1 r0 r0 λ t = ln . (7) 1 − λ r1 r0 − r1

3

Hyperparameter Estimation

To address the hyperparameter estimation problem, we first need to interpret the HT method in a Bayesian framework. In [8, 9], it is shown that the MAP estimator remains unchanged for data varying in a neighbourhood if and only if the prior PDF is non-smooth (i.e. its derivative is discontinuous) in this neighbourhood. This results in a thresholding effect of the MAP estimator under non-smooth priors. Our prior being BG, the MAP estimation implies a thresholding, that allows us to use the HT method as a MAP estimator. The bk , x b may be interpreted as the point-wise multiplication: ∀k, x bk = x ˜ corresponding to BG estimation x ˜k q b being an estimate of the i.i.d. Bernoulli sequence with parameter λ, controlling the Hunt filter output and q 3

the occurrence of pulses in x. So, we consider the MAP estimation of (x, q), which maximizes the following joint posterior PDF: p(x, q|y, θ) ∝ p(y|x, q, θ)p(x|q, θ)p(q|θ),

(8)

where: – p(y|x, q, θ) = p(y|x, θ) = N (Hx, rn I N ) because p(n|θ) = N (0, rn I N ) and y = Hx + n; – p(x|q, θ) = N (0, rx Q) where Q = diag{q}; – p(q|θ) = λNq (1 − λ)N −Nq , Nq being the number of non-zero samples of q. To estimate the three hyperparameters gathered into θ = {λ, rx , rn }, we consider the following joint MAP (JMAP) [7]: b = arg max p(x, q, θ|y) q , θ) (˜ x, b (x,q,θ)

= arg max p(y|x, q, θ)p(x|q, θ)p(q|θ), (x,q,θ)

where p(θ) does not appear because it is considered as uniform (no a priori on the hyperparameters : it is in fact a ML estimator). This optimisation problem is solved using an iterative procedure:  (i−1) (i−1)  b(i) ) = arg max p(x, q, λ(i−1) , rx x(i) , q , rn |y), (˜ (x,q)

b(i) , rbx (i) , rbn (i) ) = arg max p(x(i) , q (i) , λ, rx , rn |y).  (λ (λ,rx ,rn )

The first optimisation problem is approximatively solved using the HT method and the hyperparameters are estimated using equation (8), with the assumption that x is a BG signal: b = arg max p(q|λ) = Nq /N, λ λ

1 ||b xk ||2 , Nq rx 1 b ||2 . rbn = arg max p(y|x, q, rn ) = ||y − H x N rn rbx = arg max p(x|q, rx ) =

Rather than using equations (5-6) to estimate r0 and r1 , we propose to estimate them directly from x ˜ as: (rb0 , rb1 ) = arg max p(˜ x|q, r0 , r1 ). (r0 ,r1 )

From equation (4), we get for all k:

p(˜ xk |q, r0 , r1 ) = q k N (0, r1 ) + (1 − q k )N (0, r0 ) = N (0, r1 q k + r0 (1 − q k )), then, p(˜ x|q, r0 , r1 ) = N (0, r1 Q + r0 (I N − Q)) Straightforward calculations yield the following explicit expressions: N 1 X 2 bk , x ˜k q rb1 = Nq k=1

N X 1 bk ). x ˜2k (1 − q rb0 = N − Nq k=1

This approach has been chosen because it leads to a faster algorithm and gives better results than those obtained by using equations (5-6). λ, rx and rn are initialized to arbitrary values, while r0 and r1 are initialized using equations (5-6). The convergence test is made by comparing the signal and hyperparameter values from one iteration to the next. The procedure stops when they differ of no more than 1 %. Extensive simulations have shown the very satisfactory behavior of the procedure. 4

0.8

amplitudes

0.6 0.4 0.2 0 −0.2 −0.4 −0.6

2

4

6

(a)

8

10

12

14

16

samples Impulse Response (h)

amplitudes

2

1

0

−1

−2

0

50

100

(b)

150

200

250

samples Data (y)

amplitudes

2 1 0 −1

MSE = 0.0931

−2 0

50

100

(c)

150

200

250

200

250

samples SMLR Estimate

amplitudes

2 1 0 −1

MSE = 0.2641

−2 0

50

100

(d)

150

samples HT Estimate

Figure 2: SMLR and HT estimates. ◦ represents real pulses, and × estimated ones. MSE: mean square error on the detected pulses.

4

Simulations Results

Some simulations were carried out to assess the performances of the HT method and to compare them to those of the SMLR of [1]. The two methods were implemented in Matlab 6.5 on a Pentium III 800 MHz. Figure 2 shows the results achieved by the two methods on a 256-sample signal with a 16-sample impulse response and the following real hyperparameters: λ = 0.08, rx = 1.5, SNR = 15 dB. Comparing the results, it appears that the SMLR performs slightly better than the HT method. Table 1 gives the statistics obtained on 10 simulations. They confirm the superiority of the SMLR at the price of a very high computation time as compared to the HT method. We also note that the performances of the two methods become comparable as the SNR increases. Figure 3 shows the mean computation time evolution of the HT method as a function of N , confirming that the computational burden remains reasonable, even for very large signals.

time (seconds)

102 101

100

10−1 10−2

27

210

213

216

218

N

Figure 3: Mean Computation Time.

5

Table 1: Computation time, percentage of detected pulses (DP) and false alarms (FA) for several SNR. SNR = 20 dB SNR = 15 dB SNR = 10 dB SNR = 5 dB

5

SMLR HT SMLR HT SMLR HT SMLR HT

23.39 s 0.16 s 26.94 s 0.19 s 37.59 s 0.23 s 105.64 s 0.31 s

92.0 % DP 87.0 % DP 88.17 % DP 81.72 % DP 69.07 % DP 61.85 % DP 42.9 % DP 40.2 % DP

2.0 % FA 5.0 % FA 2.15 % FA 4.3 % FA 3.09 % FA 8.24 % FA 3.7 % FA 1.9 % FA

Conclusion

The HT method is a sparse spike train deconvolution consisting in coupling the Hunt filter with a thresholding. We show that, when the signal x is BG, a Gaussian mixture is a good model for the PDF of the Hunt filter output. From this result, we derive the threshold that minimizes the probability of errors. We then propose to use the HT method as a MAP estimator, and to jointly estimate the hyperparameters of the problem. Simulations show that the method performs well at a very low computation time, making this approach very well-suited to process very large signals, and to be implemented on embedded systems. Future works will be directed to find which global criterion is minimized.

References [1] F. Champagnat, Y. Goussard, and J. Idier, “Unsupervised deconvolution of sparse spike trains using stochastic approximation,” IEEE Trans. Signal Process., vol. 44, no. 12, pp. 2988–2998, December 1996. [2] J. Kormylo and J. Mendel, “Maximum likelihood detection and estimation of Bernoulli-Gaussian processes,” IEEE Trans. Inf. Theory, vol. 28, no. 3, pp. 482–488, May 1982. [3] V. Mazet, D. Brie, and C. Caironi, “D´econvolution impulsionnelle par filtre de Hunt et seuillage,” in Proceedings of GRETSI 2003, September 2003. [4] K. Kaaresen, “Deconvolution of sparse spike trains by iterated window maximization,” IEEE Trans. Signal Process., vol. 45, no. 5, pp. 1173–1183, May 1997. [5] J. Idier, Approche bay´esienne pour les probl`emes inverses.

Herm`es Science Publication, Paris, 2001.

[6] B. Hunt, “The inverse problem of radiography,” Mathematical Biosciences, vol. 8, pp. 161–179, 1970. [7] A. Mohammad-Djafari, “Joint estimation of parameters and hyperparameters in a Bayesian approach of solving inverse problems,” in Proceedings of IEEE ICASSP, April 1997, pp. 2837–2840, Munich, Germany. [8] P. Moulin and J. Liu, “Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors,” IEEE Trans. Inf. Theory, vol. 45, no. 3, pp. 909–919, April 1999. [9] M. Nikolova, “Local strong homogeneity of a regularized estimator,” SIAM J. Appl. Math., vol. 61, no. 2, pp. 633–658, 2000.

6

Suggest Documents