
A Brief Overview of Particle Filtering

Adam M. Johansen
[email protected]
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/

May 11th, 2010 Warwick University Centre for Systems Biology


Outline

- Background
  - Monte Carlo Methods
  - Hidden Markov Models / State Space Models
- Sequential Monte Carlo
  - Filtering
    - Sequential Importance Sampling
    - Sequential Importance Resampling
    - Advanced methods
  - Smoothing
  - Parameter Estimation


Background


Monte Carlo

Estimating π

- Rain is uniform.
- Circle is inscribed in square.
- A_square = 4r^2.
- A_circle = πr^2.
- p = A_circle / A_square = π/4.
- 383 of 500 "successes".
- π̂ = 4 × 383/500 ≈ 3.06.
- Also obtain confidence intervals.
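As a concrete illustration, here is a minimal Python sketch of this rain-drop experiment. The function name `estimate_pi` and the use of the square [-1, 1]^2 are my own illustrative choices, not part of the original slides.

    import numpy as np

    def estimate_pi(n_samples, seed=0):
        """Estimate pi by dropping points uniformly on the square [-1, 1]^2
        and counting the fraction landing inside the inscribed unit circle."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(-1.0, 1.0, size=n_samples)
        y = rng.uniform(-1.0, 1.0, size=n_samples)
        inside = (x**2 + y**2) <= 1.0      # the "successes"
        p_hat = inside.mean()              # estimates A_circle / A_square = pi / 4
        return 4.0 * p_hat

    print(estimate_pi(500))                # 500 drops, as on the slide

A binomial confidence interval for the success probability translates directly into a confidence interval for π.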


The Monte Carlo Method

- Given a probability density p, we want to compute

  \[ I = \int_E \varphi(x)\, p(x)\, dx. \]

- Simple Monte Carlo solution:
  - Sample X_1, ..., X_N iid from p.
  - Estimate Î = (1/N) Σ_{i=1}^N φ(X_i).
- Justified by the law of large numbers...
- ...and the central limit theorem.
- Can also be viewed as approximating π(dx) = p(x) dx with

  \[ \hat{\pi}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} \delta_{X_i}(dx). \]
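A minimal sketch of this plain Monte Carlo estimator. The particular choices below (p the standard normal, φ(x) = x^2, so the true value of I is 1) are illustrative assumptions.

    import numpy as np

    def monte_carlo(phi, sample_p, n_samples, seed=0):
        """Plain Monte Carlo: I_hat = (1/N) * sum_i phi(X_i), with X_i ~ p iid."""
        rng = np.random.default_rng(seed)
        x = sample_p(rng, n_samples)
        return np.mean(phi(x))

    # Example: p = N(0, 1) and phi(x) = x^2, so I = E[X^2] = 1.
    i_hat = monte_carlo(lambda x: x**2,
                        lambda rng, n: rng.normal(0.0, 1.0, size=n),
                        n_samples=10_000)
    print(i_hat)   # close to 1; the CLT gives an O(N^{-1/2}) error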


Importance Sampling

- Given q such that
  - p(x) > 0 ⇒ q(x) > 0, and
  - p(x)/q(x) < ∞,

  define w(x) = p(x)/q(x) and:

  \[ I = \int \varphi(x)\, p(x)\, dx = \int \varphi(x)\, w(x)\, q(x)\, dx. \]

- This suggests the importance sampling estimator:
  - Sample X_1, ..., X_N iid from q.
  - Estimate Î = (1/N) Σ_{i=1}^N w(X_i) φ(X_i).
- Can also be viewed as approximating π(dx) = p(x) dx with

  \[ \hat{\pi}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} w(X_i)\, \delta_{X_i}(dx). \]
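A sketch of the corresponding importance sampling estimator, assuming both p and q can be evaluated pointwise. The densities used below (p = N(0, 1), proposal q = N(0, 2^2), φ(x) = x^2) are illustrative, not from the slides.

    import numpy as np

    def gaussian_pdf(x, mu, sigma):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    def importance_sampling(phi, p_pdf, q_pdf, sample_q, n_samples, seed=0):
        """I_hat = (1/N) * sum_i w(X_i) phi(X_i), with X_i ~ q and w = p / q."""
        rng = np.random.default_rng(seed)
        x = sample_q(rng, n_samples)
        w = p_pdf(x) / q_pdf(x)
        return np.mean(w * phi(x))

    # p = N(0, 1), proposal q = N(0, 2^2), phi(x) = x^2 (true value 1).
    i_hat = importance_sampling(
        phi=lambda x: x**2,
        p_pdf=lambda x: gaussian_pdf(x, 0.0, 1.0),
        q_pdf=lambda x: gaussian_pdf(x, 0.0, 2.0),
        sample_q=lambda rng, n: rng.normal(0.0, 2.0, size=n),
        n_samples=10_000)
    print(i_hat)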


Variance and Importance Sampling

- Variance depends upon the proposal:

  \[
  \begin{aligned}
  \operatorname{Var}(\hat I) &= E\left[\left(\frac{1}{N}\sum_{i=1}^{N} w(X_i)\varphi(X_i) - I\right)^{2}\right] \\
  &= \frac{1}{N^2}\sum_{i=1}^{N} E\left[w(X_i)^2\varphi(X_i)^2 - I^2\right] \\
  &= \frac{1}{N}\left(E_q\left[w(X)^2\varphi(X)^2\right] - I^2\right) \\
  &= \frac{1}{N}\left(E_p\left[w(X)\varphi(X)^2\right] - I^2\right)
  \end{aligned}
  \]

- For positive φ, minimized by:

  \[ q(x) \propto p(x)\varphi(x) \;\Rightarrow\; w(x)\varphi(x) \propto 1. \]
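To see how strongly the proposal affects the variance, here is a small hedged experiment. The target quantity (the standard normal tail probability P(X > 4)) and the shifted proposal N(4, 1) are my own illustrative choices; the proposal is picked to be roughly proportional to p(x)φ(x).

    import numpy as np

    rng = np.random.default_rng(1)
    N, reps = 10_000, 100
    phi = lambda x: (x > 4.0).astype(float)            # indicator of the tail event
    pdf = lambda x, m: np.exp(-0.5 * (x - m) ** 2) / np.sqrt(2.0 * np.pi)

    naive, shifted = [], []
    for _ in range(reps):
        x = rng.normal(0.0, 1.0, N)                    # proposal = target: plain MC
        naive.append(np.mean(phi(x)))
        x = rng.normal(4.0, 1.0, N)                    # proposal concentrated where p * phi is large
        shifted.append(np.mean(pdf(x, 0.0) / pdf(x, 4.0) * phi(x)))

    print("plain MC :  mean %.2e  std %.2e" % (np.mean(naive), np.std(naive)))
    print("shifted IS: mean %.2e  std %.2e" % (np.mean(shifted), np.std(shifted)))
    # The importance sampling estimator has far smaller variance because the
    # proposal puts its mass where p(x) * phi(x) is appreciable.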


Self-Normalised Importance Sampling

- Often, p is known only up to a normalising constant.
- As E_q(Cwφ) = C E_p(φ)...
- If v(x) = C w(x), then

  \[ \frac{E_q(v\varphi)}{E_q(v)} = \frac{E_q(Cw\varphi)}{E_q(Cw)} = \frac{C\,E_p(\varphi)}{C\,E_p(1)} = E_p(\varphi). \]

- Estimate the numerator and denominator with the same sample:

  \[ \hat I = \frac{\sum_{i=1}^{N} v(X_i)\varphi(X_i)}{\sum_{i=1}^{N} v(X_i)}. \]

- Biased for finite samples, but consistent.
- Typically reduces variance.
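A sketch of the self-normalised estimator when the target is known only up to a constant. The target below (a standard normal truncated to x > 0, written without its normalising constant) and the Exponential(1) proposal are illustrative assumptions.

    import numpy as np

    def snis(phi, unnorm_p, q_pdf, sample_q, n_samples, seed=0):
        """Self-normalised IS: sum_i v(X_i) phi(X_i) / sum_i v(X_i), with v = unnorm_p / q."""
        rng = np.random.default_rng(seed)
        x = sample_q(rng, n_samples)
        v = unnorm_p(x) / q_pdf(x)
        return np.sum(v * phi(x)) / np.sum(v)

    # Target: standard normal truncated to (0, inf), unnormalised.
    unnorm_p = lambda x: np.exp(-0.5 * x**2) * (x > 0)
    # Proposal: Exponential(1), with density exp(-x) on (0, inf).
    q_pdf = lambda x: np.exp(-x)
    sample_q = lambda rng, n: rng.exponential(1.0, size=n)

    # E[X] under the truncated normal is sqrt(2/pi), roughly 0.798.
    print(snis(lambda x: x, unnorm_p, q_pdf, sample_q, 50_000))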


Hidden Markov Models

Hidden Markov Models / State Space Models

[Graphical model: hidden chain x_1 → x_2 → ... → x_6, each x_i emitting an observation y_i.]

- Unobserved Markov chain {X_n} with transition density f.
- Observed process {Y_n} with conditional density g.
- Joint density:

  \[ p(x_{1:n}, y_{1:n}) = f_1(x_1)\, g(y_1 \mid x_1) \prod_{i=2}^{n} f(x_i \mid x_{i-1})\, g(y_i \mid x_i). \]
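For concreteness, a minimal sketch that simulates from such a model. The specific linear-Gaussian choices for f and g (a Gaussian random-walk state and a noisy direct observation) are illustrative assumptions rather than part of the slides; they are reused in the filtering sketches further below.

    import numpy as np

    def simulate_hmm(n, sigma_x=1.0, sigma_y=0.5, seed=0):
        """Simulate a scalar HMM with
           X_1 ~ N(0, sigma_x^2),  X_n | X_{n-1} ~ N(X_{n-1}, sigma_x^2)   (transition f)
           Y_n | X_n ~ N(X_n, sigma_y^2)                                   (observation g)."""
        rng = np.random.default_rng(seed)
        x = np.zeros(n)
        y = np.zeros(n)
        x[0] = rng.normal(0.0, sigma_x)
        y[0] = rng.normal(x[0], sigma_y)
        for i in range(1, n):
            x[i] = rng.normal(x[i - 1], sigma_x)   # hidden Markov chain
            y[i] = rng.normal(x[i], sigma_y)       # conditionally independent observation
        return x, y

    hidden, observations = simulate_hmm(100)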


Inference in HMMs

- Given y_{1:n}:
  - What is x_{1:n}?
  - In particular, what is x_n?
  - And what about x_{n+1}?
- f and g may have unknown parameters θ.
- We wish to obtain estimates for n = 1, 2, ...


Some Terminology

Three main problems:

- Given known parameters θ, what are:
  - The filtering and prediction distributions:

    \[
    p(x_n \mid y_{1:n}) = \int p(x_{1:n} \mid y_{1:n})\, dx_{1:n-1}, \qquad
    p(x_{n+1} \mid y_{1:n}) = \int p(x_{n+1} \mid x_n)\, p(x_n \mid y_{1:n})\, dx_n
    \]

  - The smoothing distributions:

    \[ p(x_{1:n} \mid y_{1:n}) = \frac{p(x_{1:n}, y_{1:n})}{p(y_{1:n})} \]

- And how can we estimate static parameters: p(θ | y_{1:n}) and p(x_{1:n}, θ | y_{1:n})?


Formal Solutions

- Filtering and prediction recursions:

  \[
  p(x_n \mid y_{1:n}) = \frac{p(x_n \mid y_{1:n-1})\, g(y_n \mid x_n)}{\int p(x'_n \mid y_{1:n-1})\, g(y_n \mid x'_n)\, dx'_n}, \qquad
  p(x_{n+1} \mid y_{1:n}) = \int p(x_n \mid y_{1:n})\, f(x_{n+1} \mid x_n)\, dx_n
  \]

- Smoothing:

  \[
  p(x_{1:n} \mid y_{1:n}) = \frac{p(x_{1:n-1} \mid y_{1:n-1})\, f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{\int g(y_n \mid x'_n)\, f(x'_n \mid x'_{n-1})\, p(x'_{n-1} \mid y_{1:n-1})\, dx'_{n-1:n}}
  \]


Particle Filtering

Importance Sampling in This Setting

- We wish to approximate p(x_{1:n} | y_{1:n}) for n = 1, 2, ...
- We could sample from a sequence q_n(x_{1:n}) for each n.
- Or we could let q_n(x_{1:n}) = q_n(x_n | x_{1:n-1}) q_{n-1}(x_{1:n-1}) and re-use our samples.
- The importance weight decomposes:

  \[
  \begin{aligned}
  w_n(x_{1:n}) &\propto \frac{p(x_{1:n} \mid y_{1:n})}{q_n(x_n \mid x_{1:n-1})\, q_{n-1}(x_{1:n-1})} \\
  &= w_{n-1}(x_{1:n-1})\, \frac{p(x_{1:n} \mid y_{1:n})}{q_n(x_n \mid x_{1:n-1})\, p(x_{1:n-1} \mid y_{1:n-1})} \\
  &= w_{n-1}(x_{1:n-1})\, \frac{f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{q_n(x_n \mid x_{n-1})\, p(y_n \mid y_{1:n-1})}
  \end{aligned}
  \]
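The step from the second to the third line uses the smoothing recursion given in the formal solutions above; a sketch of that algebra (my own expansion, consistent with those formulas):

\[
\frac{p(x_{1:n} \mid y_{1:n})}{p(x_{1:n-1} \mid y_{1:n-1})}
  = \frac{p(x_{1:n-1} \mid y_{1:n-1})\, f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{p(y_n \mid y_{1:n-1})\, p(x_{1:n-1} \mid y_{1:n-1})}
  = \frac{f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{p(y_n \mid y_{1:n-1})},
\]

so dividing by q_n(x_n | x_{n-1}) yields the incremental weight in the last line.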


Sequential Importance Sampling – Prediction & Update

- A first "particle filter":
  - Simple default: q_n(x_n | x_{n-1}) = f(x_n | x_{n-1}).
  - Importance weighting becomes: w_n(x_{1:n}) = w_{n-1}(x_{1:n-1}) × g(y_n | x_n).
- Algorithmically, at iteration n:
  - Given {W^i_{n-1}, X^i_{1:n-1}} for i = 1, ..., N:
    - Sample X^i_n ~ f(·|X^i_{n-1}) (prediction).
    - Weight W^i_n ∝ W^i_{n-1} g(y_n | X^i_n) (update).
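A minimal sketch of this scheme for the illustrative scalar model simulated earlier (that model, and the function name, are my own assumptions; there is no resampling yet, so the weights degenerate as n grows).

    import numpy as np

    def sis_filter(y, n_particles=1000, sigma_x=1.0, sigma_y=0.5, seed=0):
        """Sequential importance sampling with the prior proposal q_n = f."""
        rng = np.random.default_rng(seed)
        x = rng.normal(0.0, sigma_x, size=n_particles)   # X_1^i ~ f_1
        logw = np.zeros(n_particles)
        filtering_means = []
        for t, yt in enumerate(y):
            if t > 0:
                x = rng.normal(x, sigma_x)               # prediction: X_n^i ~ f(.|X_{n-1}^i)
            logw += -0.5 * ((yt - x) / sigma_y) ** 2     # update: W_n^i propto W_{n-1}^i g(y_n|X_n^i)
            w = np.exp(logw - logw.max())                # normalise in the log domain for stability
            w /= w.sum()
            filtering_means.append(np.sum(w * x))        # estimate of E[X_n | y_{1:n}]
        return np.array(filtering_means)

    # e.g. run on the observations simulated in the HMM sketch above:
    # means = sis_filter(observations)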


Example: Almost Constant Velocity Model

- States: x_n = [s^x_n, u^x_n, s^y_n, u^y_n]^T.
- Dynamics: x_n = A x_{n-1} + ε_n, i.e.

  \[
  \begin{pmatrix} s^x_n \\ u^x_n \\ s^y_n \\ u^y_n \end{pmatrix}
  = \begin{pmatrix} 1 & \Delta t & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 1 \end{pmatrix}
  \begin{pmatrix} s^x_{n-1} \\ u^x_{n-1} \\ s^y_{n-1} \\ u^y_{n-1} \end{pmatrix} + \epsilon_n
  \]

- Observation: y_n = B x_n + ν_n, i.e.

  \[
  \begin{pmatrix} r^x_n \\ r^y_n \end{pmatrix}
  = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}
  \begin{pmatrix} s^x_n \\ u^x_n \\ s^y_n \\ u^y_n \end{pmatrix} + \nu_n
  \]
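A sketch of how this model might be simulated; the noise covariances for ε_n and ν_n are illustrative assumptions, since the slide leaves them unspecified.

    import numpy as np

    dt = 1.0
    A = np.array([[1.0,  dt, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0,  dt],
                  [0.0, 0.0, 0.0, 1.0]])      # almost-constant-velocity dynamics
    B = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])      # observe the two positions only

    Q = 0.1 * np.eye(4)                        # assumed covariance of epsilon_n
    R = 0.5 * np.eye(2)                        # assumed covariance of nu_n

    def simulate_cv(n_steps, seed=0):
        rng = np.random.default_rng(seed)
        x = np.zeros(4)                        # state [s_x, u_x, s_y, u_y]
        states, obs = [], []
        for _ in range(n_steps):
            x = A @ x + rng.multivariate_normal(np.zeros(4), Q)
            y = B @ x + rng.multivariate_normal(np.zeros(2), R)
            states.append(x)
            obs.append(y)
        return np.array(states), np.array(obs)

    true_states, position_obs = simulate_cv(50)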




Better Proposal Distributions

- The importance weighting is:

  \[ w_n(x_{1:n}) \propto w_{n-1}(x_{1:n-1})\, \frac{f(x_n \mid x_{n-1})\, g(y_n \mid x_n)}{q_n(x_n \mid x_{n-1})} \]

- Optimally, we'd choose:

  \[ q_n(x_n \mid x_{n-1}) \propto f(x_n \mid x_{n-1})\, g(y_n \mid x_n) \]

- Typically impossible, but good approximations are possible:
  - Can obtain much better performance.
  - But eventually the approximation variance is too large.
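In the linear-Gaussian setting of the previous example this locally optimal proposal is available in closed form. A hedged sketch using standard conjugate-Gaussian algebra (A, B, Q, R refer to the illustrative model above):

    import numpy as np

    def optimal_proposal_params(x_prev, y, A, B, Q, R):
        """If x_n | x_{n-1} ~ N(A x_{n-1}, Q) and y_n | x_n ~ N(B x_n, R), then
           q(x_n | x_{n-1}, y_n) propto f(x_n | x_{n-1}) g(y_n | x_n) is N(m, S) with
               S = (Q^{-1} + B^T R^{-1} B)^{-1}
               m = S (Q^{-1} A x_{n-1} + B^T R^{-1} y_n)."""
        Q_inv = np.linalg.inv(Q)
        R_inv = np.linalg.inv(R)
        S = np.linalg.inv(Q_inv + B.T @ R_inv @ B)
        m = S @ (Q_inv @ (A @ x_prev) + B.T @ R_inv @ y)
        return m, S

With this proposal the incremental weight becomes p(y_n | x_{n-1}) = N(y_n; B A x_{n-1}, B Q B^T + R), which no longer depends on the sampled value of x_n.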


Resampling

- Given {W^i_n, X^i_{1:n}} targeting p(x_{1:n} | y_{1:n}).
- Sample

  \[ \widetilde{X}^i_{1:n} \sim \sum_{j=1}^{N} W^j_n\, \delta_{X^j_{1:n}}. \]

- Then {1/N, X̃^i_{1:n}} also targets p(x_{1:n} | y_{1:n}).
- Such steps can be incorporated into the SIS algorithm.
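A sketch of the basic multinomial resampling step described above (function and variable names are illustrative):

    import numpy as np

    def multinomial_resample(particles, weights, rng):
        """Draw N ancestor indices j ~ Categorical(W^1, ..., W^N) and return the
        selected particles, each now carrying equal weight 1/N."""
        n = len(weights)
        idx = rng.choice(n, size=n, replace=True, p=weights)
        return particles[idx], np.full(n, 1.0 / n)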


Sequential Importance Resampling

- Algorithmically, at iteration n:
  - Given {W^i_{n-1}, X^i_{n-1}} for i = 1, ..., N:
    - Resample, obtaining {1/N, X̃^i_{n-1}}.
    - Sample X^i_n ~ q(·|X̃^i_{n-1}).
    - Weight

      \[ W^i_n \propto \frac{f(X^i_n \mid \widetilde{X}^i_{n-1})\, g(y_n \mid X^i_n)}{q(X^i_n \mid \widetilde{X}^i_{n-1})}. \]

- Actually:
  - You can resample more efficiently.
  - It's only necessary to resample some of the time.
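The two remarks above are commonly implemented as systematic resampling (lower variance than multinomial) triggered only when the effective sample size drops below a threshold. A hedged sketch of both ideas; the threshold of N/2 is a conventional choice, not something the slides specify.

    import numpy as np

    def systematic_resample(particles, weights, rng):
        """Systematic resampling: a single uniform draw and N evenly spaced points."""
        n = len(weights)
        positions = (rng.uniform() + np.arange(n)) / n
        cum = np.cumsum(weights)
        cum[-1] = 1.0                              # guard against round-off
        idx = np.searchsorted(cum, positions)
        return particles[idx], np.full(n, 1.0 / n)

    def effective_sample_size(weights):
        """ESS = 1 / sum_i (W^i)^2, between 1 and N."""
        return 1.0 / np.sum(weights ** 2)

    def maybe_resample(particles, weights, rng, threshold=0.5):
        """Resample only when the ESS falls below threshold * N."""
        if effective_sample_size(weights) < threshold * len(weights):
            return systematic_resample(particles, weights, rng)
        return particles, weights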


What else can we do?

Some common improvements:

- Incorporate MCMC (7).
- Use the next observation before resampling – auxiliary particle filters (11, 9).
- Sample new values for recent states (6).
- Explicitly approximate the marginal filter (10).


Particle Smoothing

Smoothing

- Resampling helps filtering.
- But it eliminates unique particle values.
- Approximating p(x_{1:n} | y_{1:n}) or p(x_m | y_{1:n}) fails for large n.


Smoothing Strategies

Some approaches:

- Fixed-lag smoothing:

  \[ p(x_{n-L} \mid y_{1:n}) \approx p(x_{n-L} \mid y_{1:n-1}) \]

  for large enough L.

- Particle implementation of the forward-backward decomposition:

  \[ p(x_{1:n} \mid y_{1:n}) = p(x_n \mid y_{1:n}) \prod_{m=1}^{n-1} p(x_m \mid x_{m+1}, y_{1:m}) \]

  Doucet et al. have developed a forward-only variant.

- Particle implementation of the two-filter formula:

  \[ p(x_m \mid y_{1:n}) \propto p(x_m \mid y_{1:m})\, p(y_{m+1:n} \mid x_m) \]


Parameter Estimation

Static Parameters

If f(x_n | x_{n-1}) and g(y_n | x_n) depend upon θ:

- We could set x'_n = (x_n, θ_n) and f'(x'_n | x'_{n-1}) = f(x_n | x_{n-1}) δ_{θ_{n-1}}(θ_n).
- Degeneracy causes difficulties:
  - Resampling eliminates values.
  - We never propose new values.
  - MCMC transitions don't mix well.


Approaches to Parameter Estimation

- Artificial dynamics: θ_n = θ_{n-1} + ε_n.
- Iterated filtering.
- Sufficient statistics.
- Various online likelihood approaches.
- Practical filtering.
- ...


Summary


SMCTC: C++ Template Class for SMC Algorithms

- Implementing SMC algorithms in C/C++ isn't hard.
- Software for implementing general SMC algorithms.
- C++ element largely confined to the library.
- Available (under a GPL-3 license) from www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/smctc/ or type "smctc" into Google.
- Example code includes a simple particle filter.


In Conclusion (with references)

- Sequential Monte Carlo methods ("Particle Filters") can approximate filtering and smoothing distributions (5).
- Proposal distributions matter.
- Resampling:
  - is necessary (2),
  - but not every iteration,
  - and need not be multinomial (4).
- Smoothing & parameter estimation are harder.
- Software for easy implementation exists:
  - PFLib (1)
  - SMCTC (8)
- Actually, similar techniques apply elsewhere (3).


References

[1] L. Chen, C. Lee, A. Budhiraja, and R. K. Mehra. PFLib: An object-oriented MATLAB toolbox for particle filtering. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition XVI, volume 6567, 2007.
[2] P. Del Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and Its Applications. Springer Verlag, New York, 2004.
[3] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[4] R. Douc and O. Cappé. Comparison of resampling schemes for particle filters. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, volume I, pages 64–69. IEEE, 2005.
[5] A. Doucet and A. M. Johansen. A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2010. To appear.
[6] A. Doucet, M. Briers, and S. Sénécal. Efficient block sampling strategies for sequential Monte Carlo methods. Journal of Computational and Graphical Statistics, 15(3):693–711, 2006.
[7] W. R. Gilks and C. Berzuini. Following a moving target – Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society B, 63:127–146, 2001.
[8] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1–41, April 2009.
[9] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 78(12):1498–1504, September 2008. URL http://dx.doi.org/10.1016/j.spl.2008.01.032.
[10] M. Klaas, N. de Freitas, and A. Doucet. Towards practical N² Monte Carlo: The marginal particle filter. In Proceedings of Uncertainty in Artificial Intelligence, 2005.
[11] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.


SMC Sampler Outline

- Given a sample {X^{(i)}_{1:n-1}}_{i=1}^N targeting η̃_{n-1},
- sample X^{(i)}_n ~ K_n(X^{(i)}_{n-1}, ·),
- calculate

  \[ W_n(X^{(i)}_{1:n}) = \frac{\eta_n(X^{(i)}_n)\, L_{n-1}(X^{(i)}_n, X^{(i)}_{n-1})}{\eta_{n-1}(X^{(i)}_{n-1})\, K_n(X^{(i)}_{n-1}, X^{(i)}_n)}. \]

- Resample, yielding {X^{(i)}_{1:n}}_{i=1}^N targeting η̃_n.
- Hints that we'd like to use

  \[ L_{n-1}(x_n, x_{n-1}) = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}{\int \eta_{n-1}(x'_{n-1})\, K_n(x'_{n-1}, x_n)\, dx'_{n-1}}. \]

ηn−1 (xn−1 )Kn (xn−1 , xn ) Ln−1 (xn , xn−1 ) = R . ηn−1 (x0n−1 )Kn (x0n−1 , xn ) 28