Nonlinear Mapping of Uncertainties in Celestial Mechanics M. Valli1 , R. Armellin2 , P. Di Lizia3 and M.R. Lavagna4 Politecnico di Milano, via La Masa 34, 21056 Milano, Italy
The problem of nonlinear uncertainty propagation represents a crucial issue in celestial mechanics. In this paper a method for nonlinear propagation of uncertainty based on differential algebra is presented. Working in the differential algebra framework enables a general approach to nonlinear uncertainty propagation that can provide highly accurate estimate with low computational cost. The nonlinear mapping of the statistics is shown here adopting the two-body problem as working framework, including coordinate system transformations. The general feature of the proposed method is also demonstrated by presenting long-term integrations in complex dynamical systems, such as the n-body problem or the simplified general perturbation (SGP4) model.
I.
Introduction
A method for nonlinear propagation of uncertainties based on differential algebra is presented. The problem of nonlinear uncertainty propagation represents a crucial issue in celestial mechanics since all practical systems - from vehicle navigation to orbit determination or target tracking - involve nonlinearities of one kind or another. One topic of recent interest, for example, is the accurate representation of spacecraft uncertainty under nonlinear orbital dynamics. Also space surveillance requires methods to propagate the initial condition uncertainty of resident space objects in order to identify and track them. On the other side, many problems in celestial mechanics require conversions
1 2 3 4
Ph.D. Candidate, Aerospace Engineering Department,
[email protected]. Postdoc Fellow, Aerospace Engineering Department,
[email protected]. Postdoc Fellow, Aerospace Engineering Department,
[email protected]. Associate Professor, Aerospace Engineering Department,
[email protected].
1
between different coordinate systems (e.g. the conversion from polar to Cartesian coordinates that forms the foundation for the observation models of many sensors). Such transformations are mostly nonlinear and therefore entail the problem of applying nonlinear transformations to the estimated statistics. Uncertainty propagation and estimation in nonlinear systems is extremely difficult. The optimal Bayesian solution to the problem requires the propagation of the description of the full probability density function (pdf). Because the form of the pdf is not restricted, it cannot be described using a finite number of parameters and any practical estimator must use an approximation of some kind [1]. Hence, the main goal of uncertainty propagation is to obtain an accurate finite-dimensional representation of the predicted state pdf. Several approximated techniques exist in the literature to approximate the initial condition uncertainty evolution. Present-day approaches mainly refer to linearized propagation models [2–4] or full nonlinear Monte Carlo simulations [5]. The linear assumption simplifies the problem significantly, but the accuracy of the solution drops off in case of highly nonlinear systems and/or long time propagations. On the other hand, Monte Carlo simulations provide true trajectory statistics, but are computationally very expensive. Other present-day approaches are Gaussian closure [6], Equivalent Linearization [7], and Stochastic Averaging [8, 9]. All of these algorithms have several features in common and involve the linearization of the underlying dynamic system. As a consequence, they are considered unsuitable when the system is strongly nonlinear, because the effect of the neglected higher order terms can lead to significant errors. The unscented transformation (UT) was developed to address the deficiencies of linearization: a set of samples are deterministically chosen which match the mean and covariance of a (non necessarily Gaussian-distributed) probability distribution. The UT is founded on the idea that it is easier to approximate a pdf than it is to approximate an arbitrary nonlinear function or transformation [1, 10]. Another approach for long-term propagation of arbitrary distributions through nonlinear transformations is represented by Gaussian mixtures [11–13]. The use of Gaussian sum filters is well-suited for problems in space surveillance where, in many cases, Gaussian sum filters in conjunction with a judicious choice of the coordinate system leads to consistent long-term propagation of the state pdf. The well-known
2
Fokker-Plank-Kolmogorov equation (FPKE) [14] captures the exact time evolution of the state pdf of stochastic continuous dynamical systems. Park et al. [15] proved the integral invariance of the pdf via solutions of the FPKE for diffusionless systems and used this property to develop an analytic expression that maps the statistics by approximating the flow of the dynamics in higher order Taylor series. They demonstrated that this approach provides good agreement with Monte Carlo simulations. However, the need to derive the dynamics of the so-called high order tensors makes it in many cases - especially for a sophisticated, high fidelity system model - inadequate due to computational complexity. Up to now limited work has been done to automate and speed up the derivation of the state transition tensors [16, 17]. In this paper we explore the use of differential algebraic (DA) techniques for accurate nonlinear uncertainty propagation. Differential algebra supplies the tools to compute the derivatives of functions within a computerized environment [18–21]. More specifically, by substituting the classical implementation of real algebra with the implementation of a new algebra of Taylor polynomials, any function f of n variables is expanded into its Taylor polynomial up to an arbitrary order k. The availability of such order expansions can be exploited when nonlinear uncertainty propagation is performed. For instance, this has a strong impact when the numerical integration of an ordinary differential equation (ODE) is performed by means of an arbitrary integration scheme. Any integration scheme is based on algebraic operations, involving the evaluation of the ODE right hand side at several integration points. Therefore, starting from the DA representation of the initial conditions and carrying out all the evaluations in the DA framework, the flow of an ODE is obtained at each step as its Taylor expansion in the initial conditions. This Taylor expansion can then be used in order to efficiently compute the statistics of the propagated state, as described in this paper. The manuscript is organized as follows. First a brief introduction to differential algebra is given. Some hints on how to use it to expand the solution of parametric implicit equations and obtain high order expansions of the flow are presented. Then differential algebra and the proposed techniques are applied to obtain the nonlinear mapping of the statistics. Finally, the effectiveness of the method is demonstrated through several numerical examples.
3
II.
Notes on Differential Algebra
DA techniques, exploited here to obtain k-th order Taylor expansions of the flow of a set of ODE’s with respect to initial condition, were devised to attempt solving analytical problems through an algebraic approach [19]. Historically, the treatment of functions in numerics has been based on the treatment of numbers, and the classical numerical algorithms are based on the mere evaluation of functions at specific points. DA techniques rely on the observation that it is possible to extract more information on a function rather than its mere values. The basic idea is to bring the treatment of functions and the operations on them to computer environment in a similar manner as the treatment of real numbers. Referring to Fig. 1, consider two real numbers a and b. Their transformation into the floating point representation, a and b respectively, is performed to operate on them in a computer environment. Then, given any operation ∗ in the set of real numbers, an adjoint operation ! is defined in the set of floating point (FP) numbers so that the diagram in Fig. 1 commutes. (The diagram commutes approximately in practice due to truncation errors.) Consequently, transforming the real numbers a and b into their FP representation and operating on them in the set of FP numbers returns the same result as carrying out the operation in the set of real numbers and then transforming the achieved result in its FP representation. In a similar way, let us suppose two k differentiable functions f and g in n variables are given. In the framework of differential algebra, the computer operates on them using their k-th order Taylor expansions, F and G respectively. Therefore, the transformation of real numbers in their FP representation is now substituted by the extraction of the k-th order Taylor expansions of f and g. For each operation in the space of k differentiable functions, an adjoint operation in the space of Taylor polynomials is defined so that the corresponding diagram commutes; i.e., extracting the Taylor expansions of f and g and operating on them in the space of Taylor polynomials (labeled as k Dn ) returns the same result as operating on f and g in the original space and then extracting the Taylor expansion of the resulting function. The straightforward implementation of differential algebra in a computer allows computation of the Taylor coefficients of a function up to a specified order k, along with the function evaluation, with a fixed amount of effort. The Taylor coefficients of order n for sums and products of functions, as well as scalar products with reals, can be computed from those of
4
Fig. 1 Analogy between the floating point representation of real numbers in a computer environment (left figure) and the introduction of the algebra of Taylor polynomials in the differential algebraic framework (right figure).
summands and factors; therefore, the set of equivalence classes of functions can be endowed with well-defined operations, leading to the so-called truncated power series algebra [22, 23]. Similarly to the algorithms for floating point arithmetic, the algorithms for functions followed, including methods to perform composition of functions, to invert them, to solve nonlinear systems explicitly, and to treat common elementary functions [18, 19]. In addition to these algebraic operations, the DA framework is endowed with differentiation and integration operators, therefore finalizing the definition of the DA structure.
A.
The Minimal Differential Algebra
Consider all ordered pairs (q0 , q1 ), with q0 and q1 real numbers. Define addition, scalar multiplication, and vector multiplication as follows: (q0 , q1 ) + (r0 , r1 ) = (q0 + r0 , q1 + r1 ) t · (q0 , q1 ) = (t · q0 , t · q1 )
(1)
(q0 , q1 ) · (r0 , r1 ) = (q0 · r0 , q0 · r1 + q1 · r0 ). The ordered pairs with the above arithmetic are called 1 D1 . The multiplication of vectors is seen to have (1, 0) as the unity element. The multiplication is commutative, associative, and distributive with respect to addition. Together, the three operations defined in Eq. (1) form an algebra. Furthermore, they form an extension of real numbers, as (r, 0) + (s, 0) = (r + s, 0) and (r, 0) · (s, 0) = (r · s, 0), so that the reals are included. 5
The multiplicative inverse of the pair (q0 , q1 ) in 1 D1 is (q0 , q1 )−1 =
!
1 q1 ,− 2 q0 q0
"
,
(2)
which is defined for any q0 #= 0. One important property of this algebra is that it has an order compatible with its algebraic operations. Given two elements (q0 , q1 ) and (r0 , r1 ) in 1 D1 , it is defined (q0 , q1 ) < (r0 , r1 ) if
q0 < r0
(q0 , q1 ) > (r0 , r1 ) if
(r0 , r1 ) < (q0 , q1 )
(q0 , q1 ) = (r0 , r1 ) if
q0 = r0 and q1 = r1 .
or (q0 = r0 and q1 < r1 ) (3)
As for any two elements (q0 , q1 ) and (r0 , r1 ) only one of the three relation holds, 1 D1 is said totally ordered. The order is compatible with the addition and multiplication; for all (q0 , q1 ), (r0 , r1 ), (s0 , s1 ) ∈ 1 D1 ,
it follows (q0 , q1 ) < (r0 , r1 ) ⇒ (q0 , q1 ) + (s0 , s1 ) < (r0 , r1 ) + (s0 , s1 ); and (s0 , s1 ) > (0, 0) = 0
⇒ (q0 , q1 ) · (s0 , s1 ) < (r0 , r1 ) · (s0 , s1 ). The number d = (0, 1) has the interesting property of being positive but smaller than any positive real number; indeed (0, 0) < (0, 1) < (r, 0) = r. For this reason d is called an infinitesimal or a differential. In fact, d is so small that its square vanishes. Since for any (q0 , q1 ) ∈ 1 D1 (q0 , q1 ) = (q0 , 0) + (0, q1 ) = q0 + d · q1 ,
(4)
the first component is called the real part and the second component the differential part. The algebra in 1 D1 becomes a differential algebra by introducing a map ∂ from 1 D1 to itself, and proving that the map is a derivation. Define ∂ : 1 D1 → 1 D1 by ∂(q0 , q1 ) = (0, q1 ).
(5)
Note that ∂{(q0 , q1 ) + (r0 , r1 )} = ∂(q0 + r0 , q1 + r1 ) = (0, q1 + r1 ) = (0, q1 ) + (0, r1 ) = ∂(q0 , q1 ) + ∂(r0 , r1 )
6
(6)
and ∂{(q0 , q1 ) · (r0 , r1 )} = ∂(q0 · r0 , q0 · r1 + r0 · q1 ) = (0, q0 · r1 + r0 · q1 ) = (0, q1 ) · (r0 , r1 ) + (0, r1 ) · (q0 , q1 )
(7)
= ∂{(q0 , q1 )} · (r0 , r1 ) + (q0 , q1 ) · ∂{(r0 , r1 )} This holds for all (q0 , q1 ), (r0 , r1 ) ∈ 1 D1 . Therefore ∂ is a derivation and (1 D1 , ∂) is a differential algebra. The most important aspect of 1 D1 is that it allows the automatic computation of derivatives. Assume to have two functions f and g and to put their values and their derivatives at the origin in the form (f (0), f " (0)) and (g(0), g " (0)) as two vectors in 1 D1 . If the derivative of the product f · g is of interest, it has just to be looked at the second component of the product (f (0), f " (0))·(g(0), g " (0)); whereas the first component gives the value of the product of the functions. Therefore, if two vectors contain the values and the derivatives of two functions, their product contains the values and the derivatives of the product function. Defining the operator [ ] from the space of differential functions to 1 D1 via [f ] = (f (0), f " (0)),
(8)
it holds [f + g] = [f ] + [g] (9) [f · g] = [f ] · [g] and [1/g] = [1]/[g] = 1/[g]
(10)
by using (2). This observation can be used to compute derivatives of many kinds of functions algebraically by merely applying arithmetic rules on 1 D1 , beginning from the value and the derivative of the identity function [x] = (x, 1) = x + δx. Consider the example 1 x + (1/x)
(11)
(1/x2 ) − 1 . (x + (1/x))2
(12)
f (x) = and its derivative f " (x) =
7
The function value and its derivative at the point x = 3 are f (3) =
3 , 10
f " (3) = −
2 . 25
(13)
Evaluating the function (11) in the DA framework at (3, 1) = 3 + δx yields f ((3, 1)) =
=
1 1 = (3, 1) + 1/(3, 1) (3, 1) + (1/3, −1/9) 1 = (10/3, 8/9)
!
3 8 # 100 ,− 10 9 9
"
=
!
3 2 ,− 10 25
"
(14) .
Thus, the real part of the result is the value of the function at x = 3, whereas the differential part is the value of the derivative of the function at x = 3. This is simply justified by applying the relations (9) and (10) [f (x)] =
=
$
% 1 1 = x + 1/x [x + 1/x]
1 1 = [x] + [1/x] [x] + 1/[x]
(15)
= f ([x]).
B.
The Differential Algebra k Dn
The algebra described in this section was introduced to compute the derivatives up to an order k of functions in n variables. Similarly as before, it is based on taking the space C k (Rn ), the collections of k times continuously differentiable functions on Rn . On this space an equivalence relation is introduced. For f and g ∈ C k (Rn ), f =k g if and only if f (0) = g(0) and all the partial derivatives of f and g agree at 0 up to order k. The relation =k satisfies f =k f for all f ∈ C k (Rn ), f =k g ⇒ g =k f for all f, g ∈ C k (Rn ),
and
(16)
f =k g and g =k h ⇒ f =k h for all f, g, h, ∈ C k (Rn ). Thus, =k is an equivalence relation. All the elements that are related to f can be grouped together in one set, the equivalence class [f ] of the function f . The resulting equivalence classes are often referred to as DA vectors or DA numbers. Intuitively, each of these classes is then specified by a particular collection of partial derivatives in all n variables up to order k. This class is called k Dn . 8
If the values and the derivatives of two functions f and g are known, the corresponding values and derivatives of f + g and f · g can be inferred. Therefore, the arithmetics on the classes in k Dn can be introduced via [f ] + [g] = [f + g] t · [g] = [t · f ]
(17)
[f ] · [g] = [f · g] Under this operations, k Dn becomes an algebra. For each v ∈ 1, . . . , n, define the map ∂v from k Dn to k Dn for f via $ % ∂f ∂v [f ] = pv · , ∂xv
(18)
pv (x1 , . . . , xn ) = xv
(19)
where
projects out the v-th component of the identity function. It’s easy to show that for all v = 1, . . . , n and for all [f ], [g] ∈ k Dn ∂v ([f ] + [g]) = ∂v [f ] + ∂v [g]
(20)
∂v ([f ] · [g]) = [f ] · (∂v [g]) + (∂v [f ]) · [g].
(21)
Therefore, ∂v is a derivation for all v, and hence ( k Dn , ∂1 , . . . , ∂v ) is a differential algebra. Observe that f lies in the same class as its Taylor polynomial Tf of order k around the origin; they have the same function values and derivatives up to order n. Therefore, [f ] = [Tf ]
(22)
and the introduced differential algebra is referred to as Taylor polynomial algebra. Similar to the structure 1 D1 , composition of functions and elementary functions, i.e. exp, sin, and log, are introduced in k Dn . Consequently, the derivatives of any function f belonging to C k (Rn ) can be computed up to order k in fixed amount of effort. The DA sketched in this section was implemented by M. Berz and K. Makino in the software COSY INFINITY [20]. 9
C.
Solution of parametric implicit equations
Well-established numerical techniques (e.g., Newton’s method) exist, which can effectively identify the solution of a classical implicit equation f (x) = 0.
(23)
Suppose an explicit dependence on a vector of parameters p can be highlighted in the previous function f , which leads to the parametric implicit equation f (x, p) = 0.
(24)
Suppose the previous equation is to be solved, whose solution is represented by the function x(p) returning the value of x solving (24) for any value of p. Thus, the dependence of the solution of the implicit equation on p is of interest. DA techniques can effectively handle the previous problem by identifying the function x(p) in terms of its Taylor expansion with respect to p. The DA-based algorithm is presented in the following for the solution of the scalar parametric implicit Eq. (24); the generalization to a system of parametric implicit equations is straightforward. The solution of (24) is sought, where sufficient regularity is assumed to characterize the function f ; i.e., f ∈ C k . This means that x(p) satisfying f (x(p), p) = 0
(25)
is to be identified. The first step is to consider a reference value of p and to compute the solution x by means of a classical numerical method; e.g., Newton’s method. The variable x and the parameter p are then initialized as k-th order DA variables, i.e., [x] = x + δx
(26)
[p] = p + δp. A DA-based evaluation of the function f in (24) delivers the k-th order expansion of f with respect to x and p: δf = Mf (δx, δp),
(27)
where Mf denotes the Taylor map for f . Note that the map (27) is origin-preserving as x is the solution of the implicit equation for the nominal value of p; thus, δf represents the deviation of f 10
from its reference value. The map (27) is then augmented by introducing the map corresponding to the identity function on p (i.e., δp = Ip (δp)) ending up with
δf Mf δx = . Ip δp δp
(28)
The k-th order map (28) is inverted using COSY-Infinity built-in tools (based on fixed point iterations), obtaining
−1
δx Mf = Ip δp
δf . δp
(29)
As the goal is to compute the k-th order Taylor expansion of the solution manifold x(p) of (24), the map (29) is evaluated for δf = 0 The first row of map (30)
−1
δx Mf = Ip δp
0 . δp
δx = Mx (δp),
(30)
(31)
expresses how a variation of the parameter δp affects the solution of the implicit equation as a k-th order Taylor polynomial. In particular, by plugging map (30) in first of (26) we obtain
[x] = x + Mx (δp),
(32)
which is the k-th order Taylor expansion of the solution of the implicit equation. For every value of δp, the approximate solution of f (x, p) = 0 can be easily computed by evaluating the Taylor polynomial (32). Apparently, the solution obtained by means of map (32) is a Taylor approximation of the exact solution of Eq. (24). The accuracy of the approximation depends on both the order of the Taylor expansion and the displacement δp from the reference value of the parameter. Within this work, the algorithm just introduced is utilized for expanding the solution of Kepler’s equations that appear in the two-body problem and in SGP4 model. In this section the classical form of Kepler’s equation is considered ,
µ t = E − e sin E, a3 11
(33)
in which µ is the gravitational parameter of the attractor, a the semi-major axis, t the time since pericenter passage, e the eccentricity, and E the unknown eccentric anomaly. Let us work in adimensional variables, such that µ = 1, a = 1, and thus the orbital period is T = 2π. For the sake of illustrative purposes, we expand the solution of Kepler’s equation in Taylor series with respect to p = (a, e) around the nominal condition p = (1, 0.5). We also fix t = π, so that E = π is the nominal solution. The first step to approximate the solution manifold E = E(a, e) in Taylor series is to initialize E, a, and e as DA variables [E] = π + δE [a] = 1 + δa
(34)
[e] = 0.5 + δe. The algorithm described by Eq.s (27)–(31) is then applied to the implicit equation f (E(a, e), a, e) =
,
µ t − E + e sin E = 0. a3
(35)
The result is [E] = π + ME (δa, δe), which is a Taylor polynomial of arbitrary order k. Figure 2(a) shows some contour lines of [E] for a ∈ [0.9, 1.1] and e ∈ [0.4, 0.6] for a sixth order computation. The Taylor expansion is capable of accurately representing the solution E = π for a = 1. In Figure 2(b) the accuracy of the expansion is studied in logarithmic scale. Taylor polynomial evaluations are compared with the solutions of Kepler’s equation obtained by applying point-wise Newton’s iterations on a fine grid in the a-e plane. It is apparent that the expansion error decreases significantly close to the reference point and, in addition, the polynomial approximation is exact for the entire line a = 1. The computational time for the computation of the sixth order expansion is 0.45 × 10−3 s on an Intel Core 2 Duo @2.4GHz, running on a Mac OS X 10.6.8. (All the computations presented in the manuscript are performed on this machine.) As a last comment, note that both the two-body problem and SGP4 model (in Sec. IV A and IV C) involve different forms of Kepler’s equation. Furthermore, in those cases, the solutions are expanded in Taylor series with respect to uncertainties in the entire initial state (i.e., we have Taylor polynomials of six variables).
12
0.6
e [ïï]
0.55
3.142
0.5
3.202
3.082
3.262
3.022 2.962
3.322
0.45
3.382
2.901
3.442
0.4 0.9
0.95
1 a [ïï]
1.05
1.1
(a) Contour lines of E obtained by evaluating the Taylor map. The solution is E = π for a = 1.
0.6
ï6
0.55
ï7 ï8
e [ïï]
ï9
0.5
ï10
ï12 ïinf
0.45
0.4 0.9
0.95
1 a [ïï]
1.05
1.1
(b) Contour lines of the logarithm of the expansion error. The expansion error is zero for a = 1.
Fig. 2 Kepler’s equation: sixth order Taylor expansion of E(a, e).
D.
High Order Expansion of the Flow
The differential algebra allows the derivatives of any function f of n variables to be computed up to an arbitrary order k, along with the function evaluation. This has an important consequence when the numerical integration of an ODE is performed by means of an arbitrary integration scheme. Any integration scheme is based on algebraic operations, involving the evaluation of the ODE right hand side at several integration points. Therefore, carrying out all the evaluations in the DA framework
13
allows differential algebra to compute the arbitrary order expansion of the flow of a general ODE with respect to the initial condition. Without loss of generality, consider the scalar initial value problem x˙ = f (x, t) x(t0 ) = x0
(36)
and the associated phase flow ϕ(t; x0 ). We now want to show that, starting from the DA representation of the initial condition x0 , differential algebra allows us to propagate the Taylor expansion of the flow in x0 forward in time, up to the final time tf . Replace the point initial condition x0 by the DA representative of its identity function up to order k, which is a (k + 1)-tuple of Taylor coefficients. (Note that x0 is the flow evaluated at the initial time; i.e, x0 = ϕ(t0 ; x0 ).) As for the identity function only the first two coefficients, corresponding to the constant part and the first derivative respectively, are non zeros, we can write [x0 ] as x0 + δx0 , in which x0 is the reference point for the expansion. If all the operations of the numerical integration scheme are carried out in the framework of differential algebra, the phase flow ϕ(t; x0 ) is approximated, at each fixed time step ti , as a Taylor expansion in x0 . For the sake of clarity, consider the forward Euler’s scheme xi = xi−1 + f (xi−1 )∆t
(37)
and substitute the initial value with the DA identity [x0 ] = x0 + δx0 . At the first time step we have [x1 ] = [x0 ] + f ([x0 ]) · ∆t.
(38)
If the function f is evaluated in the DA framework, the output of the first step, [x1 ], is the k-th order Taylor expansion of the flow ϕ(t; x0 ) in x0 for t = t1 . Note that, as a result of the DA evaluation of f ([x0 ]), the (k + 1)-tuple [x1 ] may include several non zeros coefficients corresponding to high order terms in δx0 . The previous procedure can be inferred through the subsequent steps. The result of the final step is the k-th order Taylor expansion of ϕ(t; x0 ) in x0 at the final time tf . Thus, the flow of a dynamical system can be approximated, at each time step ti , as a k-th order Taylor expansion in x0 in a fixed amount of effort.
14
The conversion of standard integration schemes to their DA counterparts is straightforward both for explicit and implicit solvers. This is essentially based on the substitution of the operations between real numbers with those on DA numbers. In addition, whenever the integration scheme involves iterations (e.g. iterations required in implicit and predictor-corrector methods), step size control, and order selection, a measure of the accuracy of the Taylor expansion of the flow needs to be included. For the numerical integrations presented in this paper a DA version of the Dormand and Prince (8-th order solution for propagation, 7-th order solution for step size control) implementation of Runge Kutta integrator is used. In this case, a weighted norm of the coefficients of the Taylor expansion is used in the step size control procedure. Integration schemes based on DA pave the way to the nonlinear mapping of uncertainties investigated in this paper. A first example is presented hereafter about the propagation of uncertainties on initial conditions. The Taylor polynomials resulting from the use of DA-based numerical integrators expand the solution of the initial value problem presented in Eq. (36) with respect to the initial condition. Thus, the dependence of the solution x(t) with respect to the initial condition is available, at a time ti , in terms of a k-th order polynomial map Mx0 (δx0 ), where δx0 represents the displacement from the reference initial condition. The evaluation of the map Mx0 (δx0 ) for a selected value of δx0 supplies the k-th order Taylor approximation of the solution x(t) at ti corresponding to the displaced initial condition. The accuracy of the result depends on the expansion order k and the value of the displacement δx0 . The main advantage of the DA-based approach is that there is no need to write and integrate variational equations in order to obtain high order expansions of the flow. This result is basically obtained by the substitution of operations between real numbers with those on DA numbers, and therefore the method is ODE independent. Furthermore, the efficient implementation of the differential algebra in COSY-Infinity allows us to obtain high order expansions with limited computational time.
15
III.
Uncertainty Propagation
Before focusing on the problem of uncertainty propagation through nonlinear dynamics, let us consider the more general case of an arbitrary nonlinear transformation. Depending on the study case, in fact, the considered nonlinear transformation can have different meanings; i.e., a coordinate transformation, the nonlinear dynamical evolution of a physical system, and so on. However, independently of these single cases, the techniques presented in this paper can be used and remain valid whenever the problem of propagating the statistics of a system through an arbitrary nonlinear function must be solved. Moreover, in many cases of practical interest - and in particular in celestial mechanics and orbit estimation - the Gaussian assumption can not provide a sufficiently accurate statistical approximation of the transformed pdf. Hence, the problem of uncertainty propagation through nonlinear transformations becomes one of computing not only the estimate of the transformed mean and covariance, but also of the higher order cumulants.
A.
Nonlinear Mapping of the Estimate Statistics
Consider random variable x ∈ (n with probability density function p(x) and a second random variable y related to x through the nonlinear transformation y = f (x).
(39)
The problem is to calculate a consistent estimate of the main cumulants of the transformed probability density function p(y). As said before, since f is a generic nonlinear function this formulation includes a wide range of problems involving uncertainty propagation (uncertainty propagation through nonlinear dynamics, uncertainty propagation through nonlinear coordinate transformations, etc.). Define independent variable x as a DA variable ¯ + δx, [x] = x
(40)
¯ is the initial mean, and implement the nonlinear transformation in k Dn (see Sec. II). The where x result is the Taylor expansion of the final solution with respect to deviations δx of the independent variable ¯ + My (δx) = [y] = f ([x]) = y
1
p1 +···+pn ≤k
16
cp1 ...pn · δxp11 · · · δxpnn ,
(41)
¯ is the zeroth order term of the expansion map, and cp1 ...pn are the Taylor coefficients of where y the resulting Taylor polynomial cp1 ...pn =
1 ∂ p1 +···+pn f . · p1 ! · · · pn ! ∂xp11 · · · ∂xpnn
(42)
If an ODE system in the form (36) is considered, Eq. (41) remains unchanged and cp1 ...pn become the terms that relate deviations in the initial conditions to the state at some future time. Park and Scheeres [15] called this operator of higher-order partials of the state the state transition tensor (STT). Note that the case k = 1 corresponds to the ordinary first-order statistics propagation (i.e., the approximation corresponding to a linearized model), where cp1 ...pn are elements of the well-known state transition matrix. The evaluation of (41) for a selected value of δx supplies the k-th order Taylor approximation of the solution y corresponding to the displaced independent variable. Of course, the accuracy of the expansion map is function of the expansion order and can be controlled by tuning it. As already said, working in k Dn allows the computation of derivatives up to order k of functions in n variables (i.e., the coefficients cp1 ...pn of the Taylor series) and the approximation of the original function in the space of Taylor polynomials. So, contrary to other methods, such as the UT, that are founded on the idea that it is easier to approximate a pdf than it is to approximate an arbitrary nonlinear function, the starting point and foundation of our approach is the straightforward approximation of the function itself. The Taylor series in the form (41) can be used to efficiently compute the propagated statistics through two main methods. The first method consists in analytically describing the statistics of the solution by computing the l-th moment of the trasformed pdf using a proper form of the l-th power of the solution map (41). This numerical procedure will be referred to as DAHO-k method, where DAHO stands for “DA-based higher-order” and k indicates the expansion order. On the other hand, the second method consists in running Monte Carlo simulations on map (41) and then calculating the resulting statistics. This numerical procedure will be referred to as DAMC-k throughout this paper, where DAMC stands for “DA-based Monte Carlo” and k indicates the expansion order. The DAHO-k and DAMC-k methods can both be applied to the computation of any moment l of the transformed pdf.
17
B.
DAHO-k method
For a generic scalar random variable z with pdf g(z) the first four moments can be written as µ = E{z} P = E{(z − µ)2 } (43) E{(z − µ)3 } γ= σ3 E{(z − µ)4 } κ= − 3, σ4
where µ is the mean value, P is the covariance, γ and κ are the skewness and the kurtosis, respectively [24], and the expectation value of z is defined as E{z} =
2
+∞
zg(z)dz.
(44)
−∞
So, the moments of the transformed pdf for problem (39) can be computed by applying the multivariate form of Eq. (43) to the Taylor expansion (41). The result for the first two moments becomes 1 µi = E{[yi ]} = ci,p1 ...pn E{δxp11 · · · δxpnn } p1 +···+pn ≤k 1 ci,p1 ...pn cj,q1 ...qn E{δxp11 +q1 · · · δxpnn +qn }, P ij = E{([y i ] − µi )([y j ] − µj )} = p1 +···+pn ≤k,
(45)
q1 +···+qn ≤k
where ci,p1 ...pn are the Taylor coefficients of the Taylor polynomial describing the i-th component of
[y]. Note that in the covariance matrix formula the coefficients ci,p1 ...pn and cj,q1 ...qn already include the subtraction of the mean terms. The coefficients of the higher order moments are computed by implementing the required operations (e.g. ([y i ] − µi )([y j ] − µj ) for the second order moment) on Taylor polynomials in the DA framework. The expectation values on the right side of Eq. (45) are function of p(x). It follows that if the initial distribution is known, all of the moments of the transformed pdf p(y) can be calculated. The number of monomials for which it is necessary to compute the expectation increases with the order of the Taylor expansion and, of course, with the order of the moment we want to compute. Note that, at this time, no hypothesis on the initial pdf has been made. Thus, the method can be applied independently of the considered variable distribution. However, the Gaussian hypothesis is widely used in astrodynamics applications to represent the initial distribution of the state variable. So, for the purpose of illustration and without loss 18
of generality, we will now restrict to the case where x is a Gaussian random variable (GRV), x ∼ N (µ, P ), in which µ is the mean vector and P the covariance matrix. (However, no assumption on the transformed pdf is made throughout the manuscript.) An important property of Gaussian distributions is that the statistics of a GRV can be completely described by the first two moments. In case of zero mean, the expression for computing higher-order moments in terms of the covariance matrix is due to Isserlis [25]. In physics literature, Isserlis’s formula is known as the Wick’s formula. Let s1 to sn be nonnegative integers, and s = s1 + s2 + · · · + sn . Then the Wick’s formula suggests that
E{xs11 xs22 . . . xsnn } =
0,
if s is odd
Haf (P ), if s is even
(46)
where Haf (P ) is the hafnian of P = (σij ), which is defined as s
Haf (P ) =
2 1 3
p∈
and
4
s
Q
s
σp2i−1 ,p2i ,
(47)
i=1
is the set of all permutations p of {1, 2, . . . , s} satisfying the property p1 < p3 < p5 < . . .