Noname manuscript No. (will be inserted by the editor)

A state estimation approach based on stochastic expansions R. H. Lopez · J. E. Souza Cursi · A. G. Carlon

Received: date / Accepted: date

Abstract This paper presents a new approach for state estimation problems. It is based on the representation of random variables using stochastic functions. Its main idea is to expand the state variables in terms of the noise variables of the system and, then, to estimate the unnoisy value of the state variables by taking the mean value of the stochastic expansion. Moreover, we show that, in some situations, the proposed approach may be adapted to the determination of the probability distribution of the state noise. For the determination of the coefficients of the expansions, we present three approaches: moment matching (MM), collocation (COL) and variational (VAR). In the numerical analysis section, three examples are analyzed: a discrete linear system, the propagation of influenza in a boarding school, and the state estimation problem in the Hodgkin-Huxley model. In all these examples, the proposed approach was able to estimate the values of the state variables with precision, i.e., with very low RMS errors.

Keywords uncertainty quantification · state estimation · polynomial chaos

R. H. Lopez
Center for Optimization and Reliability in Engineering (CORE), Department of Civil Engineering, UFSC, Rua Joao Pio Duarte, s/n, Florianopolis, SC
E-mail: [email protected]

J. E. Souza Cursi
Institut National des Sciences Appliquées (INSA) de Rouen, 76801 Saint Etienne du Rouvray CEDEX, France
E-mail: [email protected]

A. G. Carlon
Center for Optimization and Reliability in Engineering (CORE), Department of Civil Engineering, UFSC, Rua Joao Pio Duarte, s/n, Florianopolis, SC
E-mail: [email protected]

1 Introduction

State estimation of discrete/continuous time systems is a fundamental problem in the engineering sciences. Indeed, many models of physical systems use mathematical descriptions involving a finite number of variables, usually referred to as state


variables, collected in a state vector s. The evolution of the system is modeled as changes in the vector s and may be described either by a function of time t → s(t) (continuous time) or by a sequence of discrete values s^(0), s^(1), s^(2), ..., where s^(p) corresponds to time t^(p) (discrete time). Evolution may be caused by internal changes or by external actions u, defined by a function t → u(t) (continuous time) or by discrete values u^(0), u^(1), u^(2), ..., where u^(p) corresponds to time t^(p) (discrete time). Continuous time models usually describe the evolution of the system by differential equations, which have to be discretized in order to determine numerical values. Thus, in practice, we must determine a sequence of discrete values of s, defined by algebraic equations connecting s^(p+1) to the preceding ones s^(p), ..., s^(p−k) and to the external action u^(p) which modifies the system at time t^(p). With these notations, we may collect the state variables in a vector x^(p) = (s^(p), ..., s^(p−k)) and connect it to x^(p+1) = (s^(p+1), ..., s^(p−k+1)) by an iteration function x^(p+1) = φ(x^(p), u^(p)). The aim of state estimation is thus the determination of the vector x^(p+1) using these equations and some measured data. The main purposes are generally related to control, safety assessment, diagnosis and/or maintenance. Some examples are: one may want to ensure that s^(p+1) will follow a prescribed behavior by defining the external actions u^(p) in a convenient way; identify the state s^(p+1) in order to evaluate damage, prevent failures or define maintenance operations; or determine a future state of the system s^(p+1) in order to furnish a prediction of the state of the system at a future date.
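The stacked state vector x^(p) and the iteration function φ described above can be sketched as follows. This is a minimal illustration only: the scalar state with memory k = 1 and the linear dynamics are hypothetical, not taken from the paper.

```python
import numpy as np

def phi(x, u):
    """Hypothetical iteration function for a scalar state with memory k = 1:
    x^(p) = (s^(p), s^(p-1)) and x^(p+1) = (s^(p+1), s^(p))."""
    s_p, s_prev = x
    s_next = 0.5 * s_p + 0.2 * s_prev + u   # illustrative linear dynamics
    return np.array([s_next, s_p])

x = np.array([1.0, 0.0])        # x^(0) = (s^(0), s^(-1))
for p in range(50):
    x = phi(x, u=0.1)           # x^(p+1) = phi(x^(p), u^(p))
# The state approaches the fixed point s* = 0.1 / (1 - 0.5 - 0.2) = 1/3
```

The stacking of s^(p) and s^(p−1) into one vector turns a second-order recurrence into a first-order iteration, which is the form assumed throughout the paper.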
State estimation is significant in engineering systems since accurate estimates allow: (i) better and more adapted actions in order to keep the desired behavior, (ii) the avoidance of undesired states, faults and damage, and (iii) realistic prognostics of the future. Unfortunately, measured data is generally affected by uncertainties and errors, so that we must modify the iteration equation in order to include a new quantity w representing uncertainty: x^(p+1) = f(x^(p), u^(p), w^(p)) instead of x^(p+1) = φ(x^(p), u^(p)). In the framework of state estimation, w is usually referred to as the state noise vector, f defines the state equation, and the external action u is called a control. In state estimation problems, u is known, while in control problems it must be determined in order to satisfy conditions on the state of the system. When statistical data concerning w is available, it is common to model it as a random vector having a given probability distribution. In such a situation, x may also be modeled as a random vector. Then, we must estimate its probability distribution in order to: (i) compute estimates allowing decisions which take uncertainty into account and (ii) eliminate errors from the estimation. Namely, probabilistic information about x is useful to estimate the reliability of the system by taking into account low-probability but catastrophic events. State estimation is often carried out by Bayesian filters, which look for the minimization of a statistical error (see, for instance, Sorenson (1970); Maybeck (1979)). One of the most popular is the Kalman filter, which was developed to deal with additive Gaussian noise and linear situations, and does not perform well when the noise is non-Gaussian (see, for instance, Kalman (1960); Kalman & Bucy (1961); Harlim et al. (2014)). In order to overcome these limitations, modifications


of the original Kalman filter have been introduced, such as Ljung (1979); Gordon et al. (1993); Evensen (1994, 2009); Grooms et al. (2014); Zhen & Harlim (2015), as well as particle filters (see, for instance, Liu & Chen (1998); Carpenter et al. (2004); Kotecha & Djuric (2003); Andrieu et al. (2004); Varon et al. (2015)). However, particle filters may face an expensive computational cost when a large number of samples is required for the estimation process, especially for high-dimensional problems and complex forward models (Madankan et al., 2013). An alternative approach that has recently been investigated to alleviate this computational burden is the use of polynomial chaos (PC) representations (Wiener, 1938; Xiu & Karniadakis, 2002; Ghanem & Spanos, 2003; Nouy & Maitre, 2009) in association with Bayesian filters to solve state estimation problems (Marzouk et al., 2007; Saad, 2007; Saad & Ghanem, 2009; Blanchard et al., 2010; Zeng & Zhang, 2010; Madankan et al., 2013; Nagel & Sudret, 2016). It offers methods for the representation of the random variable x as a stochastic function of a user-chosen random vector ξ (Wiener, 1938; Xiu & Karniadakis, 2002). One of the significant features of stochastic-expansion-based approaches is that the approximated distribution furnished by the representation depends only upon the distribution of ξ: therefore, ξ may be replaced by any variable having the same distribution. Moreover, PC expansions allow the full determination of the probability distribution of x (Lopez et al., 2014; Lopez & Avila da Silva Jr., 2015), which makes possible the evaluation of any of its statistical moments and/or desired probabilities. This fact was pointed out by Madankan et al. (2013) in the context of state estimation problems.
In the literature, several works presenting approaches for state estimation that employ PC representations in association with Bayesian filters may be found. A few examples are: (i) Marzouk et al. (2007) employed a PC expansion and a Markov chain Monte Carlo approach to find a maximum a posteriori estimate for an uncertain source parameter; (ii) Blanchard et al. (2010) proposed an approach that made use of an extended Kalman filter to recalculate the PC expansions of the uncertain parameters and states when new measured data become available; (iii) Madankan et al. (2013) employed the PC representation for the system parameters and states, and proposed two different methods to update the PC expansions: one based on Bayes' formula and the other on the minimum variance technique; (iv) Nagel and Sudret (2016) presented a spectral approach to Bayesian inference that focuses on surrogate modeling of the posterior density. Within this context, this paper proposes an approach for state estimation problems entirely based on the representation of random variables using stochastic expansions. The main idea of the proposed approach was presented in previous works of the authors (Lopez et al., 2011, 2013), in which the random variables to be approximated are expanded as functions of the input random parameters of the problem. In the context of the state estimation problem presented in the first paragraphs of this section, this means that the state variables x are expanded in terms of the noise vector w. Then, the solution of the state estimation problem becomes the determination of the deterministic coefficients of this expansion. For this purpose, three approaches are employed: moment matching (MM), collocation (COL) and the solution of a sequence of variational equations (VAR).
It is important to mention that the main assumption of the proposed method is that there are no systematic errors, i.e., w has a null mean.


The rest of the paper is organized as follows: Section 2 recalls the general framework of state estimation and fixes the notation employed in this text. Section 3 presents the general framework of the proposed state estimation approach by uncertainty quantification. The three methods for the numerical determination of the coefficients of the PC expansion are detailed in Section 4, together with some simple examples. In order to establish that the proposed method is effective, it is applied in Section 5 to the state estimation of: (i) a discrete model, (ii) a model for the propagation of influenza in a boarding school, and (iii) the Hodgkin-Huxley model. Finally, Section 6 presents the main conclusions drawn from this work.

2 The model situation

As presented in the introduction, let us consider a model for the evolution of the vector x given by

x^(p+1) = f(x^(p), u^(p), w^(p)),   x^(0) given.   (1)

In this equation, (p) indexes a time instant t^(p) in a dynamic problem, f is an iteration function defining the state equation, x ∈ R^n collects the state variables to be dynamically estimated, u is a given external action and w ∈ R^n is the state noise vector. As observed, such a model takes into account the discretization of differential equations and external actions on the system. We are interested here in three aspects of the problem:
– the estimation of the probability distribution of x^(p);
– the estimation of the unnoisy values of x^(p) when it is assumed that there are no systematic errors, i.e., in eliminating the effect of w^(p) when its mean is zero;
– the estimation of the distribution of the state noise vector w.
In the sequel, we present the proposed state estimation approach. As previously introduced, the basic idea consists in representing x as a function of a user-chosen random vector ξ, such as, for instance, the state noise vector itself (ξ = w): x^(p+1) = x^(p+1)(ξ). Then, x^(p+1)(ξ) is approximated by using an expansion in a convenient family of basis functions. The determination of this expansion furnishes an approximation of the probability density of x^(p+1) and leads to an estimate of the unnoisy value of x^(p+1) by taking the mean value associated to its probability distribution. It is worth noting that such a procedure assumes that there are no systematic errors.
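A minimal simulation of the model of Eq. (1) can be sketched as below. The linear state equation, the coefficients, and the uniform noise are assumed for illustration; they are not the examples of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x, u, w):
    # Eq. (1): an assumed linear state equation with additive noise w
    return 0.8 * x + u + w

x = np.zeros(2)                              # x^(0), n = 2 state variables
for p in range(500):
    w = rng.uniform(-0.1, 0.1, size=2)       # zero-mean state noise w^(p)
    x = f(x, u=0.5, w=w)
# With E[w] = 0, the unnoisy steady state is u / (1 - 0.8) = 2.5;
# the simulated state fluctuates around it.
```

Because the noise has zero mean, the trajectory oscillates around the unnoisy evolution, which is exactly the value the proposed approach aims to recover by taking expectations.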

3 State estimation approach based on stochastic expansions

One of the classical approaches for uncertainty quantification consists in determining an expansion of a random variable x, as a function of a convenient random vector ξ, on a Hilbert basis F, such as, for instance, a polynomial chaos expansion (Wiener, 1938; Xiu & Karniadakis, 2002).


Let E denote the expected value operator, associating to the random variable Z its mean E[Z]. We say that a random vector Z taking its values on R^n is square summable if and only if E[||Z||^2] < +∞ or, equivalently, E[Z_i^2] < +∞ for 1 ≤ i ≤ n. The set of the square summable random vectors taking their values on R^n is a Hilbert space for the scalar product (Y, Z) = E[Y^t Z] (see, for instance, Bobrowski (2005)). Assuming that x is a square summable random variable, its representation as a stochastic expansion may be written as

x_j = Σ_{k=1}^{+∞} C_jk φ_k(ξ),

where ξ is a convenient random vector taking its values on Ω ⊂ R^d, and F = {φ_k : k ∈ N* (i.e., k ≥ 1)} is a total family, or Hilbert basis, of the functional space L^2(Ω) (or, in general, of L^2(Ω, μ), where μ is a probability measure having a density). In practice, the series is truncated to a finite sum:

x_j ≈ (Px)_j = Σ_{k=1}^{NX} C_jk φ_k(ξ),   (2)

which corresponds to the use of a finite family F_NX = {φ_k(ξ) : 1 ≤ k ≤ NX}. Here, NX ∈ N* is the number of elements of the family. In matrix notation, we have

x ≈ Px = CΦ,   (3)

where

Φ = (φ_1(ξ), ..., φ_NX(ξ))^t,   (4)

and C ∈ M(n, NX), where M(p, q) denotes the set of real p × q matrices. In order to obtain the representation of x shown in Eq. (3), the coefficients contained in C must be computed. Three different approaches for this purpose are detailed in the next section.

In order to apply the stochastic expansion of Eq. (3) to state estimation problems, we consider a sequence of states {x^(p) : p ≥ 0} defined by Eq. (1) and we look for an approximation of the form

Px^(p) = C^(p) Φ.   (5)

Once C^(p) is obtained, we may estimate the unnoisy value of x^(p) as

x_est^(p) = E[Px^(p)] = E[C^(p) Φ].   (6)

Let η be a random variable having the same distribution as ξ and Ψ = (φ_1(η), ..., φ_NX(η))^t. Then, we have E[C^(p) Ψ] = E[C^(p) Φ], so that E[C^(p) Ψ] furnishes the same estimate x_est^(p).

Based on the description given above, to determine a finite expansion Px, we must:


– choose ξ and F_NX (i.e., the functions φ_k);
– determine the matrix C^(p) = (C_ij^(p)) ∈ M(n, NX).
The choice of the family F_NX is guided by approximation theory considerations: as previously observed, a total family or Hilbert basis of a functional space may be chosen, for instance a polynomial or trigonometric basis. For simplicity, orthogonal families may be chosen. The choice of ξ depends on the available information and on the method of approximation used (as shown in the following section). As already mentioned, the uncertainty quantification problem reduces to evaluating the coefficients of the expansion C^(p). Hence, three methods for this purpose are presented in the next section.
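Once ξ, F_NX and C are fixed, the estimate of Eq. (6) is simply the mean of CΦ. The sketch below assumes, for illustration, a monomial family φ_k(ξ) = ξ^(k−1) with NX = 4, ξ ~ N(0, 1), and a hypothetical coefficient matrix C; it also checks that a second variable η with the same distribution furnishes the same estimate.

```python
import numpy as np

rng = np.random.default_rng(1)

def Phi(xi):
    # Assumed basis F_NX: monomials phi_k(xi) = xi^(k-1), NX = 4 (Eq. (4))
    return np.stack([np.ones_like(xi), xi, xi**2, xi**3])

C = np.array([[1.0, 0.5, -0.2, 0.1]])     # hypothetical C in M(1, NX)

xi = rng.standard_normal(200_000)         # samples of xi ~ N(0, 1)
x_est = (C @ Phi(xi)).mean(axis=1)        # Eq. (6): E[C Phi]
# Exact value: C @ (1, E[xi], E[xi^2], E[xi^3])^t = 1 + 0 - 0.2 + 0 = 0.8

eta = rng.standard_normal(200_000)        # eta has the same distribution as xi,
x_est_eta = (C @ Phi(eta)).mean(axis=1)   # so it furnishes the same estimate
```

With an orthonormal basis the mean would be read directly from the first coefficient; the Monte Carlo average above works for any basis.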

4 Numerical methods for the determination of the coefficients

In this section, we present two classes of methods for the evaluation of the coefficients C:
– Non-intrusive methods: the original state estimation problem, e.g., the model, is not modified. The MM and COL approaches fall into this category;
– Intrusive methods: the original problem is modified by the introduction of the expansion, and the coefficients are calculated by a variational approach (VAR).
In the next subsections, the MM, COL and VAR approaches are presented. Together with each of these approaches, some simple examples are given in order to make clear to the reader how to apply them, as well as how to choose the approximating random vector ξ. It is worth pointing out that we present the results of this section as the cumulative distribution function (CDF) of the state variables x^(p). From the CDF, one may obtain any desired statistical information about the state variables, such as the mean value, the variance or any quantile of interest. Before proceeding to the presentation of these approaches, one remark must be made: a potential drawback of the use of polynomial representations is that the required computational effort grows very fast with the number of random variables and the order of the approximation, an issue known in the literature as the curse of dimensionality. For this reason, the proposed approach may be computationally demanding in nonlinear problems in which the dimension of ξ is high, unless some measure is taken to avoid this difficulty (e.g., sparse integration schemes, Nobile et al. (2008); Long et al. (2013)).
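The curse of dimensionality mentioned above can be quantified for total-degree polynomial bases, for which the number of basis functions in d variables up to order o is the binomial coefficient (d + o choose o):

```python
from math import comb

def n_pc_terms(d, o):
    # Number of polynomial basis functions of total degree <= o in d variables
    return comb(d + o, o)

sizes = [n_pc_terms(d, 3) for d in (1, 2, 5, 10, 20)]
# -> [4, 10, 56, 286, 1771]: the basis grows rapidly with d at fixed order o = 3
```

This growth is what motivates sparse integration schemes when the dimension of ξ is high.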

4.1 Approach by moment matching (MM)

In general, many choices are possible for ξ if our objective is the determination of the distribution of x. As an example, let us consider the simple situation where Eq. (1) is given by a sequence of independent variables of the same distribution, i.e., x^(p+1) = w^(p). In this case, we may use ξ = w^(k), for any k ≥ 0. Indeed, w^(p) and w^(k) have the same distribution and the approximation Px^(p) = ξ = w^(k) furnishes the exact distribution of w^(p), for any k ≥ 0. For instance, if each w^(p) is uniformly distributed on (−1, 1), this is also the case of w^(k), and so the approximation furnishes the exact distribution of x^(p).
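This observation is easy to verify by sampling: two independent copies of the uniform noise share the same CDF, so replacing w^(p) by an independent copy ξ = w^(k) reproduces the distribution exactly. A small sketch (sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
w_p = rng.uniform(-1, 1, 100_000)    # x^(p+1) = w^(p)
xi  = rng.uniform(-1, 1, 100_000)    # xi = w^(k), an independent copy

# Both empirical CDFs match the exact CDF F(t) = (t + 1) / 2 on (-1, 1),
# so Px^(p) = xi reproduces the distribution of x^(p+1).
cdf_err_w  = max(abs((w_p <= t).mean() - (t + 1) / 2) for t in np.linspace(-0.9, 0.9, 19))
cdf_err_xi = max(abs((xi  <= t).mean() - (t + 1) / 2) for t in np.linspace(-0.9, 0.9, 19))
```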


When enough statistical information about x is available and an adapted method is used, the approximation may involve a variable ξ having a different distribution and independent from w. An example of an adapted method of approximation is moment matching (see Lopez et al. (2011, 2013); de Cursi & Sampaio (2015)), which consists in determining the coefficients C^(p) as follows: let M(Z) = (M_1(Z), ..., M_m(Z))^t ∈ R^m, with M_k(Z) = E[Z^k], denote the vector of the first m moments of Z. We may look for the coefficients

c_i^(p) = (C_i1^(p), ..., C_iNX^(p))  such that  M_k(c_i^(p) Φ) = M_k(x_i^(p)) for 1 ≤ k ≤ m.   (7)

When m = NX, these equations form a nonlinear system of algebraic equations, which may be solved by appropriate methods. As an alternative, we may look for

c_i^(p) = arg min { F_i(c_i^(p)) = dist(M(c_i^(p) Φ), M(x_i^(p))) : c_i^(p) ∈ M(1, NX) },   (8)

where dist : R^m × R^m → R is a pseudo-distance: dist(a, b) measures the distance between a, b ∈ R^m. For instance, we may use dist(a, b) = ||a − b|| or dist(a, b) = ||a − b|| / ||b||. Eq. (8) may be used in situations where m > NX.

As an example, let us consider w^(p) uniformly distributed on (−1, 1), ξ ~ N(0, 1) and Px^(p) = c_1 + c_2 ξ + c_3 ξ^2 + c_4 ξ^3, and look for the coefficients such that the moments of Px^(p) match those of x^(p). Recalling that, on the one hand, the first six moments of a uniform distribution on (−1, 1) are 0, 1/3, 0, 1/5, 0 and 1/7 and, on the other hand, the first six moments of N(0, 1) are 0, 1, 0, 3, 0 and 15, we may consider m = 6. An approximated solution is obtained by using dist(a, b) = ||a − b|| and leads to c_1 = 0, c_2 = −0.716408, c_3 = 0, c_4 = 0.0553735. The cumulative distribution function associated to Px^(p) is compared to the exact one in Figure 1: as we see, the approximation has a good quality.

In order to apply the moment matching approach, we need to produce estimates of the vectors of moments M(c_i^(p) Φ) and M(x_i^(p)). In the example, the knowledge of the distributions furnishes such information. In practice, the information is often carried by samples X^(p) = {x_1^(p), ..., x_ns^(p)}, Ξ^(p) = {ξ_1, ..., ξ_ns}. Notice that X^(p) may be numerically generated by using Eq. (1). With the samples, the pseudo-distance F_i(c) is estimated by using the approximations

M_k(c_i^(p) Φ(ξ)) ≈ (1/ns) Σ_{r=1}^{ns} (c_i^(p) Φ_r)^k,   M_k(x_i^(p)) ≈ (1/ns) Σ_{r=1}^{ns} (x_i,r^(p))^k;   Φ_r = (φ_1(ξ_r), ..., φ_NX(ξ_r))^t.   (9)

As an example, let us consider again w^(p) uniformly distributed on (−1, 1), ξ ~ N(0, 1) and Px^(p) = c_1 + c_2 ξ + c_3 ξ^2 + c_4 ξ^3. We generate

[Figure: CDF of X^(p), using different distributions for ξ and w.]
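As a numerical check of the moment-matching example above, the sketch below evaluates the sample moments of Px^(p) = c_2 ξ + c_4 ξ^3 via the estimators of Eq. (9), using the coefficients reported in the text, and compares them with the exact moments 0, 1/3, 0, 1/5 of the uniform distribution on (−1, 1). The match is approximate, as expected from a least-squares fit of six moments with four coefficients.

```python
import numpy as np

rng = np.random.default_rng(3)
ns = 1_000_000

c = np.array([0.0, -0.716408, 0.0, 0.0553735])   # coefficients from the text

xi = rng.standard_normal(ns)                     # samples of xi ~ N(0, 1)
Phi = np.stack([np.ones(ns), xi, xi**2, xi**3])  # Phi_r, as in Eq. (9)
px = c @ Phi                                     # samples of Px^(p)

sample_moments = np.array([np.mean(px**k) for k in (1, 2, 3, 4)])
uniform_moments = np.array([0.0, 1/3, 0.0, 1/5]) # exact moments of U(-1, 1)
# The first four moments agree to within a few percent.
```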