NONLINEAR SYSTEM IDENTIFICATION USING PARTICLE FILTERS

Thomas B. Schön
Department of Information Technology, Uppsala University, Sweden
E-mail: [email protected]

ABSTRACT

Particle filters are computational methods that enable systematic inference in nonlinear/non-Gaussian state space models. They are the most popular sequential Monte Carlo (SMC) methods. This is a relatively recent development and the aim here is to provide a brief exposition of these SMC methods and of how they act as key enabling algorithms in solving nonlinear system identification problems. Particle filters are important for both frequentist (maximum likelihood) and Bayesian nonlinear system identification.

Index Terms: Particle filter, particle smoother, sequential Monte Carlo, maximum likelihood, Bayesian, MCMC, particle MCMC and backward simulation.

1. INTRODUCTION

The state space model (SSM) offers a general tool for modeling and analyzing dynamical phenomena. The SSM consists of two stochastic processes: the states $\{\mathbf{x}_t\}_{t \geq 1}$ and the measurements $\{\mathbf{y}_t\}_{t \geq 1}$, which are related according to

\begin{align}
\mathbf{x}_{t+1} \mid (\mathbf{x}_t = x_t) &\sim f_\theta(x_{t+1} \mid x_t, u_t), \tag{1a} \\
\mathbf{y}_t \mid (\mathbf{x}_t = x_t) &\sim h_\theta(y_t \mid x_t, u_t), \tag{1b}
\end{align}
and the initial state $\mathbf{x}_1 \sim \mu_\theta(x_1)$. We use bold face for random variables and $\sim$ means “distributed according to”. The notation $\mathbf{x}_{t+1} \mid (\mathbf{x}_t = x_t)$ stands for the conditional probability of $\mathbf{x}_{t+1}$ given $\mathbf{x}_t = x_t$. The state process $\{\mathbf{x}_t\}_{t \geq 1}$ is a Markov process, implying that we only need to condition on the most recent state $x_t$, since it contains all information about the past. Furthermore, $\theta$ denotes the parameters, and $f_\theta(\cdot)$ and $h_\theta(\cdot)$ are probability density functions encoding the dynamic and the measurement models, respectively. In the interest of a compact notation we will suppress the input $u_t$ throughout the text.

The SSM introduced in (1) is general in that it allows for nonlinear and non-Gaussian relationships. Furthermore, it includes both black-box and gray-box models on state space form. Nonlinear black-box and gray-box models are covered by link to link to > Qinghua Zhang >>.

The off-line nonlinear system identification problem can (slightly simplified) be expressed as recovering information about the parameters $\theta$ based on the information in the $T$ measured inputs $u_{1:T} \triangleq \{u_1, \dots, u_T\}$ and outputs $y_{1:T}$. For a thorough exposition of the system identification problem we refer to link to >.

Nonlinear system identification has a long history, and a common assumption in the past has been that of linearity and Gaussianity. This assumption is very restrictive, and we have now witnessed well over half a century of research devoted to finding useful approximate algorithms that allow it to be weakened. This development has intensified significantly during the past two decades of research on sequential Monte Carlo (SMC) methods (including particle filters and particle smoothers). However, the use of SMC for nonlinear system identification is more recent than that. The aim here is to introduce the key ideas enabling the use of SMC methods in solving nonlinear system identification problems and, as we will see, this is not a matter of straightforward application. The development of SMC-based identification follows two clear trends (that are indeed more general):

1. The problems we are working with are analytically intractable and hence the mindset has to shift from searching for closed form solutions to the use of computational methods.
2. The new algorithms have basic building blocks that are themselves algorithms.

Both these trends call for new developments. Before the SMC methods are introduced in Section 2, the need for them is explained by formulating both the Bayesian and the maximum likelihood identification problems in Sections 1.1 and 1.2, respectively. Solutions to these problems are then provided in Sections 3 and 4, respectively. Finally, we give some intuition for online (recursive) solutions in Section 5, and in Section 6 we conclude with a summary and directions for future research.

1.1. Bayesian Problem Formulation

In formulating the Bayesian problem the parameters $\theta$ are modelled as unknown stochastic variables, i.e. the model (1)
\[
\theta_t = \theta_{t-1} + \gamma_t \nabla_\theta \log p_\theta(y_t \mid y_{1:t-1}),
\]

where $\{\gamma_t\}$ is the sequence of step-sizes. Fisher’s identity (12) opens up for the use of SMC in approximating $\nabla_\theta \log p_\theta(y_t \mid y_{1:t-1})$. However, this leads to a rapidly increasing variance, which can be dealt with using the so-called “marginal” Fisher identity; see Poyiadjis et al. (2011) for details. An interesting alternative is provided by an online EM algorithm; see e.g. Cappé (2011) for a solid introduction. The online EM approach relies on the additive properties of the Q-function. The area of online solutions via SMC is likely to grow in the future, as there is a clear need motivated by constantly growing data sets and there are also clear theoretical opportunities.
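The recursion above is driven by the predictive density $p_\theta(y_t \mid y_{1:t-1})$, which is intractable in general. As a minimal sketch of the SMC building block involved, the following bootstrap particle filter estimates $\log p_\theta(y_t \mid y_{1:t-1})$ for each $t$. Everything specific here is an assumption made for illustration and is not taken from the text: the toy scalar model $x_{t+1} = \theta x_t + v_t$, $y_t = x_t + e_t$ with unit-variance Gaussian noise, the initial density $N(0, 1)$, multinomial resampling at every step, and the function names `simulate` and `bootstrap_pf_loglik`. The sketch does not compute the gradient itself.

```python
import numpy as np


def simulate(theta, T, rng):
    """Simulate the assumed toy model: x_{t+1} = theta*x_t + v_t, y_t = x_t + e_t."""
    x = np.empty(T)
    y = np.empty(T)
    x[0] = rng.normal()                      # x_1 ~ N(0, 1), the assumed initial density
    for t in range(T):
        y[t] = x[t] + rng.normal()           # measurement model, unit noise variance
        if t + 1 < T:
            x[t + 1] = theta * x[t] + rng.normal()  # dynamic model, unit noise variance
    return x, y


def bootstrap_pf_loglik(theta, y, N, rng):
    """Estimate log p_theta(y_t | y_{1:t-1}) for t = 1..T with N particles."""
    T = len(y)
    increments = np.empty(T)
    particles = rng.normal(size=N)           # particles drawn from the initial density
    for t in range(T):
        # Log measurement density N(y_t; x_t^i, 1) evaluated at every particle.
        logw = -0.5 * np.log(2.0 * np.pi) - 0.5 * (y[t] - particles) ** 2
        m = logw.max()                       # subtract the max for numerical stability
        w = np.exp(logw - m)
        # The average unnormalised weight estimates p_theta(y_t | y_{1:t-1}).
        increments[t] = m + np.log(w.mean())
        w /= w.sum()
        idx = rng.choice(N, size=N, p=w)     # multinomial resampling
        # Propagate the resampled particles through the dynamic model.
        particles = theta * particles[idx] + rng.normal(size=N)
    return increments


rng = np.random.default_rng(0)
_, y = simulate(theta=0.7, T=50, rng=rng)
increments = bootstrap_pf_loglik(theta=0.7, y=y, N=2000, rng=rng)
print("estimated log-likelihood:", increments.sum())
```

Summing the increments gives an estimate of the log-likelihood $\log p_\theta(y_{1:T})$. Turning these estimates into the gradient needed by the recursion requires additional machinery, such as the Fisher identity mentioned above, which this sketch leaves out.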
8. RECOMMENDED READING

An overview of SMC methods for system identification is provided by Kantas et al. (2009) and a thorough introduction to SMC is provided by Doucet and Johansen (2011). The forthcoming monograph by Schön and Lindsten (2014) provides a textbook introduction to particle filters/smoothers (SMC), MCMC, PMCMC and their use in solving problems in nonlinear system identification and nonlinear state inference. A self-contained introduction to particle smoothers and the backward simulation idea is provided by Lindsten and Schön (2013). The work by Cappé et al. (2005) also contains a lot of very relevant material in this respect.
References

Andrieu, C., Doucet, A., and Holenstein, R. (2010). Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B, 72(2):1–33.

Cappé, O. (2011). Online EM algorithm for hidden Markov models. Journal of Computational and Graphical Statistics, 20(3):728–749.

Cappé, O., Moulines, E., and Rydén, T. (2005). Inference in Hidden Markov Models. Springer, New York, USA.

Dempster, A., Laird, N., and Rubin, D. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1–38.

Doucet, A. and Johansen, A. M. (2011). A tutorial on particle filtering and smoothing: Fifteen years later. In Crisan, D. and Rozovsky, B., editors, Nonlinear Filtering Handbook. Oxford University Press.

Giri, F. and Bai, E.-W., editors (2010). Block-oriented Nonlinear System Identification, volume 404 of Lecture Notes in Control and Information Sciences. Springer.

Gordon, N. J., Salmond, D. J., and Smith, A. F. M. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In IEE Proceedings on Radar and Signal Processing, volume 140, pages 107–113.

Hjort, N., Holmes, C., Müller, P., and Walker, S., editors (2010). Bayesian Nonparametrics. Cambridge University Press.

Kantas, N., Doucet, A., Singh, S., and Maciejowski, J. (2009). An overview of sequential Monte Carlo methods for parameter estimation in general state-space models. In Proceedings of the 15th IFAC Symposium on System Identification, pages 774–785, Saint-Malo, France.

Lindsten, F. and Schön, T. B. (2013). Backward simulation methods for Monte Carlo statistical inference. Foundations and Trends in Machine Learning, 6(1):1–143.

Lindsten, F., Schön, T. B., and Jordan, M. I. (2013). Bayesian semiparametric Wiener system identification. Automatica, 49(7):2053–2063.

Poyiadjis, G., Doucet, A., and Singh, S. S. (2011). Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika, 98(1):65–80.

Rasmussen, C. E. and Williams, C. K. I. (2006). Gaussian Processes for Machine Learning. MIT Press.

Robert, C. P. and Casella, G. (2004). Monte Carlo Statistical Methods. Springer, New York, USA, second edition.

Schön, T. B. and Lindsten, F. (2014). Learning of dynamical systems – Particle filters and Markov chain methods. Forthcoming book, see user.it.uu.se/~thosc112/lds.

Schön, T. B., Wills, A., and Ninness, B. (2011). System identification of nonlinear state-space models. Automatica, 47(1):39–49.