A Brief Overview of Particle Filtering

Adam M. Johansen
[email protected]
www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/talks/

May 11th, 2010
Warwick University Centre for Systems Biology
Outline
- Background
  - Monte Carlo Methods
  - Hidden Markov Models / State Space Models
- Sequential Monte Carlo
  - Filtering
    - Sequential Importance Sampling
    - Sequential Importance Resampling
    - Advanced methods
  - Smoothing
  - Parameter Estimation
Background
Monte Carlo
Estimating π
- Rain is uniform.
- Circle is inscribed in square.
- $A_{\text{square}} = 4r^2$.
- $A_{\text{circle}} = \pi r^2$.
- $p = A_{\text{circle}} / A_{\text{square}} = \pi/4$.
- 383 of 500 "successes".
- $\hat{\pi} = 4 \times \frac{383}{500} = 3.06$.
- Also obtain confidence intervals.
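A minimal sketch of this rain experiment in Python; the radius, random seed, and sample size of 500 are illustrative choices, not taken from the talk.

```python
# Monte Carlo estimate of pi: drop "raindrops" uniformly on the square
# [-r, r]^2 and count how many land inside the inscribed circle.
import numpy as np

rng = np.random.default_rng(0)
N = 500                                    # number of raindrops (illustrative)
r = 1.0
x = rng.uniform(-r, r, size=(N, 2))        # uniform "rain" on the square
inside = (x ** 2).sum(axis=1) <= r ** 2    # success = landed in the circle
p_hat = inside.mean()                      # estimates A_circle / A_square = pi/4
pi_hat = 4.0 * p_hat

# Normal-approximation confidence interval for the success probability p.
se = np.sqrt(p_hat * (1.0 - p_hat) / N)
print(pi_hat, 4.0 * (p_hat - 1.96 * se), 4.0 * (p_hat + 1.96 * se))
```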
The Monte Carlo Method
- Given a probability density $p$,
  $$I = \int_E \varphi(x)\, p(x)\, dx.$$
- Simple Monte Carlo solution:
  - Sample $X_1, \dots, X_N \overset{\mathrm{iid}}{\sim} p$.
  - Estimate $\hat{I} = \frac{1}{N} \sum_{i=1}^{N} \varphi(X_i)$.
- Justified by the law of large numbers. . .
- . . . and the central limit theorem.
- Can also be viewed as approximating $\pi(dx) = p(x)\,dx$ with
  $$\hat{\pi}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} \delta_{X_i}(dx).$$
Importance Sampling
- Given $q$, such that
  - $p(x) > 0 \Rightarrow q(x) > 0$ and $p(x)/q(x) < \infty$,
  define $w(x) = p(x)/q(x)$ and:
  $$I = \int \varphi(x)\, p(x)\, dx = \int \varphi(x)\, w(x)\, q(x)\, dx.$$
- This suggests the importance sampling estimator:
  - Sample $X_1, \dots, X_N \overset{\mathrm{iid}}{\sim} q$.
  - Estimate $\hat{I} = \frac{1}{N} \sum_{i=1}^{N} w(X_i)\, \varphi(X_i)$.
- Can also be viewed as approximating $\pi(dx) = p(x)\,dx$ with
  $$\hat{\pi}^N(dx) = \frac{1}{N} \sum_{i=1}^{N} w(X_i)\, \delta_{X_i}(dx).$$
Variance and Importance Sampling
- Variance depends upon the proposal:
  $$\begin{aligned}
  \mathrm{Var}(\hat{I}) &= E\left[\Big(\tfrac{1}{N}\textstyle\sum_{i=1}^{N} w(X_i)\,\varphi(X_i) - I\Big)^{2}\right]
    = \frac{1}{N^2}\sum_{i=1}^{N} E\big[w(X_i)^2\varphi(X_i)^2 - I^2\big] \\
  &= \frac{1}{N}\Big(E_q\big[w(X)^2\varphi(X)^2\big] - I^2\Big)
    = \frac{1}{N}\Big(E_p\big[w(X)\varphi(X)^2\big] - I^2\Big)
  \end{aligned}$$
- For positive $\varphi$, minimized by:
  $$q(x) \propto p(x)\,\varphi(x) \;\Rightarrow\; w(x)\,\varphi(x) \propto 1.$$
Self-Normalised Importance Sampling
- Often, $p$ is known only up to a normalising constant.
- As $E_q(Cw\varphi) = C\,E_p(\varphi)$. . .
- If $v(x) = Cw(x)$, then
  $$\frac{E_q(v\varphi)}{E_q(v \cdot 1)} = \frac{E_q(Cw\varphi)}{E_q(Cw \cdot 1)} = \frac{C\,E_p(\varphi)}{C\,E_p(1)} = E_p(\varphi).$$
- Estimate the numerator and denominator with the same sample:
  $$\hat{I} = \frac{\sum_{i=1}^{N} v(X_i)\,\varphi(X_i)}{\sum_{i=1}^{N} v(X_i)}.$$
- Biased for finite samples, but consistent.
- Typically reduces variance.
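A small sketch of the self-normalised estimator; the unnormalised target $p(x) \propto \exp(-x^4)$, the standard normal proposal, and $\varphi(x) = x^2$ are illustrative assumptions, not from the talk.

```python
# Self-normalised importance sampling: estimate E_p[phi(X)] when the target
# density is known only up to a normalising constant.
import numpy as np

rng = np.random.default_rng(1)

def log_p_unnorm(x):            # unnormalised target: p(x) proportional to exp(-x^4)
    return -x ** 4

def log_q(x):                   # proposal: standard normal density
    return -0.5 * x ** 2 - 0.5 * np.log(2 * np.pi)

def phi(x):
    return x ** 2

N = 10_000
x = rng.normal(size=N)                      # X_i ~ q, i.i.d.
log_v = log_p_unnorm(x) - log_q(x)          # v(X_i) = C w(X_i), with C unknown
v = np.exp(log_v - log_v.max())             # stabilise before normalising
w_norm = v / v.sum()                        # self-normalised weights
I_hat = np.sum(w_norm * phi(x))             # ratio estimator of E_p[phi]
print(I_hat)
```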
Hidden Markov Models / State Space Models

[Graphical model: hidden chain x_1 → x_2 → ... → x_6, with each state x_i emitting an observation y_i.]

- Unobserved Markov chain $\{X_n\}$ with transition density $f$.
- Observed process $\{Y_n\}$ with conditional density $g$.
- Density:
  $$p(x_{1:n}, y_{1:n}) = f_1(x_1)\, g(y_1 | x_1) \prod_{i=2}^{n} f(x_i | x_{i-1})\, g(y_i | x_i).$$
Inference in HMMs
- Given $y_{1:n}$:
  - What is $x_{1:n}$?
  - In particular, what is $x_n$?
  - And what about $x_{n+1}$?
- $f$ and $g$ may have unknown parameters $\theta$.
- We wish to obtain estimates for $n = 1, 2, \dots$.
Some Terminology

Three main problems:
- Given known parameters $\theta$, what are:
  - The filtering and prediction distributions:
    $$p(x_n | y_{1:n}) = \int p(x_{1:n} | y_{1:n})\, dx_{1:n-1}$$
    $$p(x_{n+1} | y_{1:n}) = \int p(x_{n+1} | x_n)\, p(x_n | y_{1:n})\, dx_n$$
  - The smoothing distributions:
    $$p(x_{1:n} | y_{1:n}) = \frac{p(x_{1:n}, y_{1:n})}{p(y_{1:n})}$$
- And how can we estimate static parameters: $p(\theta | y_{1:n})$ and $p(x_{1:n}, \theta | y_{1:n})$?
Formal Solutions
- Filtering and Prediction Recursions:
  $$p(x_n | y_{1:n}) = \frac{p(x_n | y_{1:n-1})\, g(y_n | x_n)}{\int p(x'_n | y_{1:n-1})\, g(y_n | x'_n)\, dx'_n}$$
  $$p(x_{n+1} | y_{1:n}) = \int p(x_n | y_{1:n})\, f(x_{n+1} | x_n)\, dx_n$$
- Smoothing:
  $$p(x_{1:n} | y_{1:n}) = \frac{p(x_{1:n-1} | y_{1:n-1})\, f(x_n | x_{n-1})\, g(y_n | x_n)}{\int g(y_n | x'_n)\, f(x'_n | x'_{n-1})\, p(x'_{n-1} | y_{1:n-1})\, dx'_{n-1:n}}$$
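These recursions can be evaluated exactly when the state space is finite, since the integrals become sums. A minimal sketch, assuming a two-state chain and Gaussian observation densities (both illustrative choices, not from the talk):

```python
# Exact filtering recursion for a finite-state HMM: the prediction integral
# becomes a matrix-vector product and the update is a pointwise reweighting.
import numpy as np

F = np.array([[0.9, 0.1],          # f(x_n | x_{n-1}); rows index the previous state
              [0.2, 0.8]])
means = np.array([0.0, 3.0])       # g(y | x) = N(y; means[x], 1), illustrative

def g_all(y):
    return np.exp(-0.5 * (y - means) ** 2) / np.sqrt(2 * np.pi)

def filter_hmm(ys, init=np.array([0.5, 0.5])):
    """Return p(x_n | y_{1:n}) for n = 1, ..., len(ys); `init` is f_1(x_1)."""
    pred = init.copy()                        # p(x_1) before any data
    history = []
    for y in ys:
        unnorm = pred * g_all(y)              # update: multiply by g(y_n | x_n)
        filt = unnorm / unnorm.sum()          # normalise -> p(x_n | y_{1:n})
        history.append(filt)
        pred = F.T @ filt                     # predict -> p(x_{n+1} | y_{1:n})
    return np.array(history)

print(filter_hmm(np.array([0.1, 2.9, 3.2, -0.4])))
```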
Particle Filtering
Importance Sampling in This Setting
- Given $p(x_{1:n} | y_{1:n})$ for $n = 1, 2, \dots$.
- We could sample from a sequence $q_n(x_{1:n})$ for each $n$.
- Or we could let $q_n(x_{1:n}) = q_n(x_n | x_{1:n-1})\, q_{n-1}(x_{1:n-1})$ and re-use our samples.
- The importance weight decomposes:
  $$\begin{aligned}
  w_n(x_{1:n}) &\propto \frac{p(x_{1:n} | y_{1:n})}{q_n(x_n | x_{1:n-1})\, q_{n-1}(x_{1:n-1})}
    = w_{n-1}(x_{1:n-1})\, \frac{p(x_{1:n} | y_{1:n})}{q_n(x_n | x_{1:n-1})\, p(x_{1:n-1} | y_{1:n-1})} \\
  &= w_{n-1}(x_{1:n-1})\, \frac{f(x_n | x_{n-1})\, g(y_n | x_n)}{q_n(x_n | x_{n-1})\, p(y_n | y_{1:n-1})}
  \end{aligned}$$
Sequential Importance Sampling – Prediction & Update
- A first "particle filter":
  - Simple default: $q_n(x_n | x_{n-1}) = f(x_n | x_{n-1})$.
  - Importance weighting becomes: $w_n(x_{1:n}) = w_{n-1}(x_{1:n-1}) \times g(y_n | x_n)$.
- Algorithmically, at iteration $n$:
  - Given $\{W_{n-1}^i, X_{1:n-1}^i\}$ for $i = 1, \dots, N$:
    - Sample $X_n^i \sim f(\cdot | X_{n-1}^i)$ (prediction).
    - Weight $W_n^i \propto W_{n-1}^i\, g(y_n | X_n^i)$ (update).
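A sketch of one such iteration with the prior proposal, keeping weights in log space for stability. The helper names `sample_f` and `log_g` stand for user-supplied model functions, and the scalar random-walk model used to exercise the step is an illustrative assumption.

```python
# One sequential-importance-sampling iteration with q_n = f: propagate each
# particle through the dynamics, then multiply its weight by the likelihood.
import numpy as np

def sis_step(particles, log_weights, y, sample_f, log_g, rng):
    particles = sample_f(particles, rng)              # prediction: X_n^i ~ f(.|X_{n-1}^i)
    log_weights = log_weights + log_g(y, particles)   # update: W_n^i prop. to W_{n-1}^i g(y_n|X_n^i)
    log_weights -= np.logaddexp.reduce(log_weights)   # normalise in log space
    return particles, log_weights

# Illustrative scalar model: X_n = X_{n-1} + N(0,1), Y_n = X_n + N(0,1).
rng = np.random.default_rng(2)
sample_f = lambda x, rng: x + rng.normal(size=x.shape)
log_g = lambda y, x: -0.5 * (y - x) ** 2
particles = rng.normal(size=200)
log_w = np.full(200, -np.log(200))
particles, log_w = sis_step(particles, log_w, y=0.7,
                            sample_f=sample_f, log_g=log_g, rng=rng)
```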
Example: Almost Constant Velocity Model
- States: $x_n = [s^x_n\ u^x_n\ s^y_n\ u^y_n]^T$
- Dynamics: $x_n = A x_{n-1} + \epsilon_n$
  $$\begin{bmatrix} s^x_n \\ u^x_n \\ s^y_n \\ u^y_n \end{bmatrix}
    = \begin{bmatrix} 1 & \Delta t & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 1 \end{bmatrix}
      \begin{bmatrix} s^x_{n-1} \\ u^x_{n-1} \\ s^y_{n-1} \\ u^y_{n-1} \end{bmatrix} + \epsilon_n$$
- Observation: $y_n = B x_n + \nu_n$
  $$\begin{bmatrix} r^x_n \\ r^y_n \end{bmatrix}
    = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
      \begin{bmatrix} s^x_n \\ u^x_n \\ s^y_n \\ u^y_n \end{bmatrix} + \nu_n$$
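A sketch simulating this model; the time step and the isotropic, i.i.d. noise scales are illustrative simplifications rather than the covariances used in the talk's example.

```python
# Simulate the almost constant velocity model: positions and velocities in
# two dimensions, with only the two positions observed in noise.
import numpy as np

dt = 1.0
A = np.array([[1, dt, 0, 0],
              [0, 1,  0, 0],
              [0, 0,  1, dt],
              [0, 0,  0, 1]], dtype=float)
B = np.array([[1, 0, 0, 0],
              [0, 0, 1, 0]], dtype=float)
sigma_eps, sigma_nu = 0.1, 0.5            # illustrative noise scales

def simulate(T, rng):
    x = np.zeros(4)                        # [s^x, u^x, s^y, u^y]
    xs, ys = [], []
    for _ in range(T):
        x = A @ x + sigma_eps * rng.normal(size=4)    # dynamics
        y = B @ x + sigma_nu * rng.normal(size=2)     # observation
        xs.append(x.copy())
        ys.append(y)
    return np.array(xs), np.array(ys)

states, observations = simulate(50, np.random.default_rng(3))
```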
Better Proposal Distributions
- The importance weighting is:
  $$w_n(x_{1:n}) \propto w_{n-1}(x_{1:n-1})\, \frac{f(x_n | x_{n-1})\, g(y_n | x_n)}{q_n(x_n | x_{n-1})}$$
- Optimally, we'd choose:
  $$q_n(x_n | x_{n-1}) \propto f(x_n | x_{n-1})\, g(y_n | x_n)$$
- Typically impossible; but good approximation is possible.
  - Can obtain much better performance.
  - But, eventually the approximation variance is too large.
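One case where the optimal proposal is tractable: a scalar model with Gaussian dynamics and a Gaussian observation of the state, where $f(x_n|x_{n-1})\,g(y_n|x_n)$ is itself Gaussian and the incremental weight $p(y_n|x_{n-1})$ has closed form. The parameter values below are illustrative assumptions.

```python
# For X_n | x_{n-1} ~ N(a x_{n-1}, s_f^2) and Y_n | x_n ~ N(x_n, s_g^2),
# the optimal proposal is Gaussian and the incremental weight is
# p(y_n | x_{n-1}) = N(y_n; a x_{n-1}, s_f^2 + s_g^2).
import numpy as np

a, s_f, s_g = 0.9, 1.0, 0.5                  # illustrative model parameters

def optimal_proposal_step(particles, log_weights, y, rng):
    prec = 1.0 / s_f ** 2 + 1.0 / s_g ** 2
    var = 1.0 / prec
    mean = var * (a * particles / s_f ** 2 + y / s_g ** 2)
    new = mean + np.sqrt(var) * rng.normal(size=particles.shape)
    # incremental weight depends only on the old particle, not the new one
    log_inc = -0.5 * (y - a * particles) ** 2 / (s_f ** 2 + s_g ** 2)
    log_weights = log_weights + log_inc
    log_weights -= np.logaddexp.reduce(log_weights)
    return new, log_weights
```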
Resampling
- Given $\{W_n^i, X_{1:n}^i\}$ targeting $p(x_{1:n} | y_{1:n})$.
- Sample
  $$\widetilde{X}_{1:n}^i \sim \sum_{j=1}^{N} W_n^j\, \delta_{X_{1:n}^j}.$$
- And $\{1/N, \widetilde{X}_{1:n}^i\}$ also targets $p(x_{1:n} | y_{1:n})$.
- Such steps can be incorporated into the SIS algorithm.
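A minimal multinomial resampling step; lower-variance alternatives (stratified, systematic) exist, as the comparison in reference (4) discusses.

```python
# Multinomial resampling: draw N ancestor indices according to the
# normalised weights and return equally weighted copies of those particles.
import numpy as np

def multinomial_resample(particles, log_weights, rng):
    w = np.exp(log_weights - np.logaddexp.reduce(log_weights))
    idx = rng.choice(len(w), size=len(w), p=w)               # ancestor indices
    return particles[idx], np.full(len(w), -np.log(len(w)))  # uniform weights
```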
Sequential Importance Resampling
- Algorithmically, at iteration $n$:
  - Given $\{W_{n-1}^i, X_{n-1}^i\}$ for $i = 1, \dots, N$:
    - Resample, obtaining $\{1/N, \widetilde{X}_{n-1}^i\}$.
    - Sample $X_n^i \sim q(\cdot | \widetilde{X}_{n-1}^i)$.
    - Weight
      $$W_n^i \propto \frac{f(X_n^i | \widetilde{X}_{n-1}^i)\, g(y_n | X_n^i)}{q(X_n^i | \widetilde{X}_{n-1}^i)}.$$
- Actually:
  - You can resample more efficiently.
  - It's only necessary to resample some of the time.
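A sketch of a full SIR loop that resamples only when the effective sample size drops below $N/2$; the ESS threshold, the scalar random-walk model, and the synthetic observations are illustrative assumptions.

```python
# Sequential importance resampling with the prior proposal and
# ESS-triggered multinomial resampling.
import numpy as np

def ess(log_w):
    w = np.exp(log_w - np.logaddexp.reduce(log_w))
    return 1.0 / np.sum(w ** 2)

def sir_filter(ys, N, sample_f, log_g, sample_init, rng):
    particles = sample_init(N, rng)
    log_w = np.full(N, -np.log(N))
    means = []
    for y in ys:
        if ess(log_w) < N / 2:                           # resample only sometimes
            w = np.exp(log_w - np.logaddexp.reduce(log_w))
            particles = particles[rng.choice(N, size=N, p=w)]
            log_w = np.full(N, -np.log(N))
        particles = sample_f(particles, rng)             # propose with q = f
        log_w = log_w + log_g(y, particles)              # reweight
        log_w -= np.logaddexp.reduce(log_w)
        means.append(np.sum(np.exp(log_w) * particles))  # filtering mean estimate
    return np.array(means)

rng = np.random.default_rng(4)
obs = rng.normal(size=30).cumsum()                       # synthetic data, illustrative
est = sir_filter(obs, N=500,
                 sample_f=lambda x, r: x + r.normal(size=x.shape),
                 log_g=lambda y, x: -0.5 * (y - x) ** 2,
                 sample_init=lambda n, r: r.normal(size=n),
                 rng=rng)
```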
What else can we do?

Some common improvements:
- Incorporate MCMC (7).
- Use the next observation before resampling – auxiliary particle filters (11, 9).
- Sample new values for recent states (6).
- Explicitly approximate the marginal filter (10).
Particle Smoothing
Smoothing:
- Resampling helps filtering.
- But it eliminates unique particle values.
- Approximating $p(x_{1:n} | y_{1:n})$ or $p(x_m | y_{1:n})$ fails for large $n$.
Smoothing Strategies

Some approaches:
- Fixed-lag smoothing:
  $$p(x_{n-L} | y_{1:n}) \approx p(x_{n-L} | y_{1:n-1})$$
  for large enough $L$.
- Particle implementation of the forward-backward algorithm:
  $$p(x_{1:n} | y_{1:n}) = p(x_n | y_{1:n}) \prod_{m=1}^{n-1} p(x_m | x_{m+1}, y_{1:n})$$
  Doucet et al. have developed a forward-only variant.
- Particle implementation of the two-filter formula:
  $$p(x_m | y_{1:n}) \propto p(x_m | y_{1:m})\, p(y_{m+1:n} | x_m)$$
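A rough sketch of fixed-lag smoothing with a bootstrap filter that stores trajectories and reports the weighted estimate of $x_{n-L}$ once $n > L$; the lag, the scalar random-walk model, and resampling at every step are illustrative assumptions.

```python
# Fixed-lag smoothing via stored trajectories: at time n, read off the
# particle approximation of x_{n-L}; observations beyond lag L are assumed
# to change that marginal little.
import numpy as np

def fixed_lag_means(ys, N, L, rng):
    log_w = np.full(N, -np.log(N))
    paths = None                                      # rows are particle paths
    lagged = []
    for n, y in enumerate(ys, start=1):
        if paths is None:
            paths = rng.normal(size=(N, 1))           # x_1 from the (assumed) prior
        else:
            w = np.exp(log_w - np.logaddexp.reduce(log_w))
            paths = paths[rng.choice(N, size=N, p=w)]  # resample whole paths
            paths = np.column_stack([paths, paths[:, -1] + rng.normal(size=N)])
        log_w = -0.5 * (y - paths[:, -1]) ** 2         # bootstrap weights after resampling
        log_w -= np.logaddexp.reduce(log_w)
        if n > L:                                      # weighted estimate of x_{n-L}
            w = np.exp(log_w)
            lagged.append(np.sum(w * paths[:, n - L - 1]))
    return np.array(lagged)

print(fixed_lag_means(np.linspace(0, 5, 30), N=500, L=3,
                      rng=np.random.default_rng(5))[:5])
```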
Parameter Estimation
Static Parameters

If $f(x_n | x_{n-1})$ and $g(y_n | x_n)$ depend upon $\theta$:
- We could set $x'_n = (x_n, \theta_n)$ and
  $$f'(x'_n | x'_{n-1}) = f(x_n | x_{n-1})\, \delta_{\theta_{n-1}}(\theta_n).$$
- Degeneracy causes difficulties:
  - Resampling eliminates values.
  - We never propose new values.
  - MCMC transitions don't mix well.
Approaches to Parameter Estimation
- Artificial dynamics: $\theta_n = \theta_{n-1} + \epsilon_n$.
- Iterated filtering.
- Sufficient statistics.
- Various online likelihood approaches.
- Practical filtering.
- ...
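A sketch of the artificial-dynamics idea for an assumed AR(1) state model: the parameter is carried as part of each particle and jittered slightly at every step so that resampling does not freeze it at a single value. The jitter scale and the model are illustrative assumptions, not from the talk.

```python
# One bootstrap-filter step with the state augmented by a jittered parameter.
import numpy as np

def jitter_step(x_particles, theta_particles, y, rng, jitter=0.02):
    # artificial dynamics: theta_n = theta_{n-1} + small Gaussian noise
    theta_particles = theta_particles + jitter * rng.normal(size=theta_particles.shape)
    # propagate the state using each particle's own parameter (AR(1) example)
    x_particles = theta_particles * x_particles + rng.normal(size=x_particles.shape)
    log_w = -0.5 * (y - x_particles) ** 2             # bootstrap weights
    log_w -= np.logaddexp.reduce(log_w)
    return x_particles, theta_particles, log_w
```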
Summary
SMCTC: C++ Template Class for SMC Algorithms
- Implementing SMC algorithms in C/C++ isn't hard.
- Software for implementing general SMC algorithms.
- C++ element largely confined to the library.
- Available (under a GPL-3 licence) from www2.warwick.ac.uk/fac/sci/statistics/staff/academic/johansen/smctc/ or type "smctc" into Google.
- Example code includes a simple particle filter.
In Conclusion (with references)
- Sequential Monte Carlo methods ("Particle Filters") can approximate filtering and smoothing distributions (5).
- Proposal distributions matter.
- Resampling:
  - is necessary (2),
  - but not every iteration,
  - and need not be multinomial (4).
- Smoothing & parameter estimation are harder.
- Software for easy implementation exists:
  - PFLib (1)
  - SMCTC (8)
- Actually, similar techniques apply elsewhere (3).
References

[1] L. Chen, C. Lee, A. Budhiraja, and R. K. Mehra. PFlib: An object oriented MATLAB toolbox for particle filtering. In Proceedings of SPIE Signal Processing, Sensor Fusion and Target Recognition XVI, volume 6567, 2007.
[2] P. Del Moral. Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Probability and Its Applications. Springer Verlag, New York, 2004.
[3] P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society B, 68(3):411–436, 2006.
[4] R. Douc and O. Cappé. Comparison of resampling schemes for particle filters. In Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, volume I, pages 64–69. IEEE, 2005.
[5] A. Doucet and A. M. Johansen. A tutorial on particle filtering and smoothing: Fifteen years later. In D. Crisan and B. Rozovsky, editors, The Oxford Handbook of Nonlinear Filtering. Oxford University Press, 2010. To appear.
[6] A. Doucet, M. Briers, and S. Sénécal. Efficient block sampling strategies for sequential Monte Carlo methods. Journal of Computational and Graphical Statistics, 15(3):693–711, 2006.
[7] W. R. Gilks and C. Berzuini. Following a moving target – Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society B, 63:127–146, 2001.
[8] A. M. Johansen. SMCTC: Sequential Monte Carlo in C++. Journal of Statistical Software, 30(6):1–41, April 2009.
[9] A. M. Johansen and A. Doucet. A note on the auxiliary particle filter. Statistics and Probability Letters, 78(12):1498–1504, September 2008. URL http://dx.doi.org/10.1016/j.spl.2008.01.032.
[10] M. Klass, N. de Freitas, and A. Doucet. Towards practical n² Monte Carlo: The marginal particle filter. In Proceedings of Uncertainty in Artificial Intelligence, 2005.
[11] M. K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journal of the American Statistical Association, 94(446):590–599, 1999.
SMC Sampler Outline
- Given a sample $\{X_{1:n-1}^{(i)}\}_{i=1}^{N}$ targeting $\tilde\eta_{n-1}$,
- sample $X_n^{(i)} \sim K_n(X_{n-1}^{(i)}, \cdot)$,
- calculate
  $$W_n(X_{1:n}^{(i)}) = \frac{\eta_n(X_n^{(i)})\, L_{n-1}(X_n^{(i)}, X_{n-1}^{(i)})}{\eta_{n-1}(X_{n-1}^{(i)})\, K_n(X_{n-1}^{(i)}, X_n^{(i)})}.$$
- Resample, yielding $\{X_{1:n}^{(i)}\}_{i=1}^{N}$ targeting $\tilde\eta_n$.
- Hints that we'd like to use
  $$L_{n-1}(x_n, x_{n-1}) = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_{n-1}, x_n)}{\int \eta_{n-1}(x'_{n-1})\, K_n(x'_{n-1}, x_n)\, dx'_{n-1}}.$$
ηn−1 (xn−1 )Kn (xn−1 , xn ) Ln−1 (xn , xn−1 ) = R . ηn−1 (x0n−1 )Kn (x0n−1 , xn ) 28