PARAMETER ESTIMATION USING INTERVAL COMPUTATIONS

LAURENT GRANVILLIERS∗, JORGE CRUZ†, AND PEDRO BARAHONA‡

∗ Laboratoire d'Informatique de Nantes Atlantique, Université de Nantes, B.P. 92208, F-44322 Nantes Cedex 3, France, [email protected]. This work has been partially supported by the RNTL project CO2 funded by the French Ministry of Industry and a collaborative Procope project between France and Germany.
† Dep. de Informática, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal, [email protected]
‡ Dep. de Informática, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal, [email protected]

Abstract. Parameter estimation is the problem of finding values of the unknowns of a mathematical model for simulating a complex system. A model is generally given by differential equations or systems of equations or inequalities. Interval computations are numerical computations over sets of real numbers. In this paper intervals are used to model uncertainty in parameter estimation problems, for instance noise associated with measured data. Interval-based algorithms using consistency techniques and local search are developed. The goal is to reliably approximate the set of consistent values of the parameters by inner and outer intervals. Such computations allow one to take all possible decisions. Applications from pharmacokinetics, biology and census are described. Finally a set of experimental results is discussed.

Key words. mathematical modeling, parameter estimation, interval arithmetic, ordinary differential equation, constraint solving.

1. Introduction. The simulation of real-life phenomena through mathematical models is important for many reasons: analysis of system behavior, optimization, simulation of extreme situations. For instance dynamical systems (systems that are functions of time) are often modeled by differential equations, e.g., an ODE

\[ \frac{dy}{dt} = g(y, x, t) \tag{1.1} \]

where $y$ is the output of the system, $t$ is the time and $x \in \mathbb{R}^n$ is the vector of parameters. In this equation $g$ is given while $x$ is unknown. If possible, the integration of the ODE gives an equation

\[ y(t) = f(x, t). \tag{1.2} \]

In most cases observations of the system are given. Suppose that a set of data $\{(t_j, \tilde{y}_j)\}_j$ is known, where each $\tilde{y}_j$ is expected to be approximately $y(t_j)$. The model driven inverse problem consists in finding $x$ such that

\[ \forall j : \tilde{y}_j = f(x, t_j). \tag{1.3} \]

The problem of finding $x$ that satisfies the above equations is generally unsolvable and has to be relaxed, which leads to numerical data fitting. In the literature [11, 1, 37] the fitting of $y$ to experimental data is often implemented by iterative methods for nonlinear regression analysis, which compute "best-fit" shapes. For instance, the least squares method corresponds to the minimization of the expression

\[ \sum_{j=1}^{m} \left( y(x, t_j) - \tilde{y}_j \right)^2. \tag{1.4} \]

However, this approach may be weak for the following reasons. First of all, no noise is taken into account: if the system is very noisy then the resulting curve may be far from any interesting model. If the objective function has many local minima, the optimization process is very sensitive to the initial values. Standard least squares algorithms compute local minima, whereas one may be interested in a global minimum, and thus the entire feasible set has to be examined. A possible approach to overcoming ill-posedness lies in regularization methods [40], yielding smooth solutions. For nonlinear inverse problems these techniques are based on global optimization. However, it is in general difficult to provide indicators of reliability for solutions. A possible approach is described in the following paragraph.

The data driven inverse problem considers some error (e.g., of measurement), which leads to the expression

\[ \forall j : \tilde{y}_j = f(x, t_j) + e_j. \tag{1.5} \]

Now suppose that reliable bounds on $e_j$ are known, namely $a_j \le e_j \le b_j$. Then the problem is to find $x$ such that

\[ \forall j : \tilde{y}_j - b_j \le f(x, t_j) \le \tilde{y}_j - a_j. \tag{1.6} \]

In other words $f(x, t_j)$ ranges over a continuous bounded set of real numbers, namely an interval (see Fig. 1.1). Interval algorithms were developed in the context of nonlinear bounded-error estimation [24] and refined using consistency techniques [23, 13, 17]. In this paper new algorithms are developed for this problem using consistency techniques and local search methods. The goal is to compute a reliable approximation of the set of consistent values of the parameters that may help in decision making. Using extra knowledge the decision maker might have, the set of acceptable solutions may subsequently be focused into some region of interest defined by means of additional constraints. In this article applications from pharmacokinetics and census (see below) are presented. At the end of the paper a set of experimental results is discussed.


Fig. 1.1. Model Driven Versus Data Driven Inverse Problem.

Drug Concentration. Pharmacokinetics [16] is the quantitative study of the time course of drug concentrations in body fluids. It is a means of predicting conditions not experimentally tested or biological levels in tissues not sampled, of comparing animals within a species, and of analyzing biological variability. Moreover, the pharmacokinetic properties of a particular drug should be physiologically unique for a patient. As a consequence the results can be reused when the same drug is administered at a different time.

Drug distribution and elimination are assumed to be first-order in concentration, i.e., a constant percentage per time unit, so the distribution or elimination rate decreases as the drug concentration decreases. Pharmacokinetic models are built out of networks of compartments. For instance, the two-compartment model is shown in Fig. 1.2 ($k_{el}$ is the elimination rate and $k_{cp}$, $k_{pc}$ are distribution rates).

Fig. 1.2. Two-Compartment Model in Pharmacokinetics: the central compartment (blood) and the peripheral compartment (tissue) exchange at distribution rates $k_{cp}$ and $k_{pc}$; elimination occurs from the central compartment at rate $k_{el}$.

Each distribution and elimination process is governed by first-order kinetics. As a consequence, the mathematical model is the ODE system

\[ \frac{dy}{dt} = -k_{el}\, y - k_{cp}\, y + k_{pc}\, z \qquad \frac{dz}{dt} = k_{cp}\, y - k_{pc}\, z \tag{1.7} \]

where $y$ is the concentration in the central compartment, namely the concentration that can be measured, and $z$ is the concentration in the peripheral compartment. Generally the peripheral compartment is not accessible to direct measurement and is not a site of drug elimination. The solution of this equation is a sum of exponentials

\[ y(t) = a \exp(-\alpha t) + b \exp(-\beta t) \tag{1.8} \]

depending on four parameters $a, b, \alpha, \beta$, which can be used to express the rates as follows:

\[ k_{pc} = \frac{a\beta + b\alpha}{a + b} \qquad k_{el} = \frac{\alpha\beta}{k_{pc}} \qquad k_{cp} = \alpha + \beta - k_{pc} - k_{el} \tag{1.9} \]
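As a side note, (1.9) transcribes directly into code. The sketch below is a plain transcription (the function name is ours); it assumes positive parameters expressed in consistent units:

def rates(a, b, alpha, beta):
    # direct transcription of (1.9); assumes a, b, alpha, beta > 0
    k_pc = (a * beta + b * alpha) / (a + b)
    k_el = alpha * beta / k_pc
    k_cp = alpha + beta - k_pc - k_el
    return k_pc, k_el, k_cp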

The drug concentration profile is biphasic: the first exponential term models the distribution phase and the second one represents the elimination phase. Clinical studies measure the behavior of the drug as a series of time-concentration data points $(t_i, \tilde{y}_i)$. The aim is to compute values of the rate constants $k$ for the fitting of $y(t)$ to experimental data. The resulting shape $y(t)$ is used to calculate clinically useful parameters, e.g.: the half-life, which is the period of time required for the concentration to be reduced by one-half; the apparent volume of distribution, which is the volume of fluid that the drug would occupy if it were distributed at the concentration measured in the central compartment (for instance, a large volume implies wide distribution); the clearance, which is the volume of fluid from which the drug is completely removed by transformation or excretion, per unit time; the bioavailability, which is the percentage of the dose entering the systemic circulation after administration. Given an initial dose $D$, the volume of distribution $V_c$ in the central compartment and the clearance $C$ are obtained as follows:

\[ V_c = \frac{D}{a + b} \qquad C = D \times \left( \frac{a}{\alpha} + \frac{b}{\beta} \right)^{-1} \tag{1.10} \]

Census. The census problem models the time variation of a population with limited growth, taking into account overcrowding and depletion of resources. It assumes that the relative growth rate is not constant: the growth rate decreases as the population approaches some fixed upper bound called the carrying capacity. The mathematical model is the logistic equation

\[ \frac{1}{y} \frac{dy}{dt} = k (L - y) \tag{1.11} \]

where $y$ is the population and $L$ is the carrying capacity. The solution of this equation, obtained through the so-called separation of variables method, is

\[ y(t) = \frac{L y_0}{y_0 + (L - y_0) \exp(-kLt)} \tag{1.12} \]

where y0 is the initial population. Given census data over a time period a main use of the model is to estimate the carrying capacity L. This information also leads to determine an inflection point in the logistic curve, which appears when the population is equal to L/2. Epidemics. Epidemics is the subject of many mathematical models that have been proved useful for the understanding and control of infectious diseases. One such model, the SIR model [30], divides a population into three classes of individuals and is based of the following parametric ODE system: dS = −rSI dt

dI = rSI − aI dt

dR = aI dt

(1.13)

where $S$ are the susceptibles (individuals who can catch the disease), $I$ are the infectives (individuals who have the disease and can transmit it), $R$ are the removed (individuals who had the disease and are immune or died), and $r$ and $a$ are positive parameters. The model assumes a constant population $N = S(t) + I(t) + R(t)$ and a negligible incubation period. Parameter $r$ accounts for the efficiency of the disease transmission (proportional to the frequency of contacts between susceptibles and infectives). Parameter $a$ measures the recovery rate from the infection.

Frequently, there is information available about the spread of a disease in a particular population. This is usually gathered as time series of the number of infectives $(t_i, I_i)$ or the number of removed $(t_i, R_i)$, together with the values $(t_0, S_0)$, $(t_0, I_0)$ or $(t_0, R_0)$ that initiated the epidemic in the population. An important problem is to predict the behavior of a similar disease (with similar parameter values) occurring in a different environment, namely with a different population size or a different number of initial infectives. Therefore, the aim is to compute values of the parameters $r$ and $a$ for the fitting of the model to the available data, in order to answer important questions about similar epidemic situations: for example, whether the infection will spread or not, what the maximum number of infectives will be, when it will start to decline, when it will end, and how many people will catch the disease.

2. Interval Computations. Interval computations, introduced by Moore in the early sixties [29], can be used in many problems:
• Modeling of uncertainty: in many engineering problems data are not precisely known, due to inaccuracy of measurements, approximations in modeling, or phenomena that are not precisely handled. However, reliable bounds can be provided in many situations, leading to interval data.

• Rounding errors of machine arithmetic [25, 21]: every real number can be enclosed by an interval, and rounding errors can be accumulated during computations.
• Set computations over the real numbers: rather than approximate computations over the real numbers, intervals support reasoning over sets.
• Reliable numerical algorithms, like integration, equation solving [34], differentiation or global optimization [19].

2.1. Numbers. Let $\mathbb{R}$ be the set of real numbers. Let $\mathbb{F} \subset \mathbb{R}$ be a finite subset of reals corresponding to binary floating-point numbers in a given format [21]. Let $\mathbb{F}_\infty$ be the set $\mathbb{F}$ compactified with $\{-\infty, +\infty\}$.

Definition 2.1. An interval is a connected set of real numbers bounded by floating-point numbers. Let $\mathbb{I}$ be the set of intervals. Given $a \in \mathbb{F} \cup \{-\infty\}$ and $b \in \mathbb{F} \cup \{+\infty\}$, the interval $\mathbf{x}$ bounded by $a$ and $b$ is defined by

\[ \mathbf{x} = [\underline{x}, \overline{x}] = [a, b] = \{ r \in \mathbb{R} \mid a \le r \le b \}. \tag{2.1} \]

$\mathbb{F}_\infty$ is an ordered set: for every $a \in \mathbb{F}_\infty$, let $a^-$ be the greatest element in $\mathbb{F}_\infty$ smaller than $a$ and $a^+$ be the smallest element in $\mathbb{F}_\infty$ greater than $a$ (with the conventions $(+\infty)^+ = +\infty$, $(-\infty)^- = -\infty$, $(+\infty)^- = \max(\mathbb{F})$, $(-\infty)^+ = \min(\mathbb{F})$). A nonempty interval $[a, b]$ is said to be canonical if $b \le a^+$. The width of $[a, b]$ is the quantity $w([a, b]) = b - a$. Given an integer $n \ge 1$, an $n$-ary box $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_n$ is a Cartesian product of intervals, namely an axis-aligned $n$-dimensional rectangle. Given a set $\rho \subset \mathbb{R}$, the convex hull of $\rho$ is the smallest interval enclosing $\rho$. In practice, the hull is computed using the outward rounding mode of floating-point computations. For every $r \in \mathbb{R}$, let $r\!\downarrow$ be the greatest element of $\mathbb{F}_\infty$ smaller than or equal to $r$ (downward rounding), and let $r\!\uparrow$ be the smallest element of $\mathbb{F}_\infty$ greater than or equal to $r$ (upward rounding). The hull of $\rho$ is then defined by

\[ \text{hull}(\rho) = [\inf(\rho)\!\downarrow, \sup(\rho)\!\uparrow]. \tag{2.2} \]

Given a set $\rho \subset \mathbb{R}^n$ and an integer $i \in \{1, \ldots, n\}$, the $i$-th projection of $\rho$ is the set

\[ \rho_i = \{ a_i \in \mathbb{R} \mid \exists (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_n) \in \mathbb{R}^{n-1} : (a_1, \ldots, a_n) \in \rho \}. \tag{2.3} \]

The hull of $\rho \subset \mathbb{R}^n$ is the box

\[ \text{hull}(\rho) = \text{hull}(\rho_1) \times \cdots \times \text{hull}(\rho_n). \tag{2.4} \]

As a consequence, every subset of $\mathbb{R}^n$ can be enclosed by a box or, more generally, a set of boxes.

2.2. Operations. Each interval operation [20] is defined as a set-theoretic extension of the corresponding real operation. For every pair of intervals $\mathbf{x}, \mathbf{y} \in \mathbb{I}$ and every operation $\diamond \in \{+, -, \times, /\}$, we have

\[ \mathbf{x} \diamond \mathbf{y} = \text{hull}(\{ u \diamond v \mid (u, v) \in \mathbf{x} \times \mathbf{y} \}). \tag{2.5} \]

Due to monotonicity properties, these operations can be implemented by floating-point computations over the bounds of intervals:

\[
\begin{cases}
[a, b] + [c, d] = [(a + c)\!\downarrow,\ (b + d)\!\uparrow] \\
[a, b] - [c, d] = [(a - d)\!\downarrow,\ (b - c)\!\uparrow] \\
[a, b] \times [c, d] = [\min(ac, ad, bc, bd)\!\downarrow,\ \max(ac, ad, bc, bd)\!\uparrow]
\end{cases}
\tag{2.6}
\]
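As an illustration, the operations of (2.6) can be sketched in a few lines of Python. The class below is ours and only illustrative: it simulates the directed roundings ↓ and ↑ with math.nextafter (Python 3.9+), which is coarser than switching the hardware rounding mode but still conservative; division and the case analysis for products involving zero and infinity are omitted.

import math

def down(r):
    # r↓, simulated: next float below r (exact results are widened, safely)
    return r if math.isinf(r) else math.nextafter(r, -math.inf)

def up(r):
    # r↑, simulated: next float above r
    return r if math.isinf(r) else math.nextafter(r, math.inf)

class Interval:
    # A closed interval [lo, hi] with floating-point bounds.
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        return Interval(down(self.lo + other.lo), up(self.hi + other.hi))
    def __sub__(self, other):
        return Interval(down(self.lo - other.hi), up(self.hi - other.lo))
    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(down(min(p)), up(max(p)))
    def __repr__(self):
        return "[%g, %g]" % (self.lo, self.hi)

print(Interval(1.0, 2.0) + Interval(3.0, 4.0))   # slightly wider than [4, 6]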

Elementary functions are defined in the same way. Given $\phi : \mathbb{R} \to \mathbb{R}$ with domain $D_\phi$, for every interval $\mathbf{x} \in \mathbb{I}$ we have

\[ \phi(\mathbf{x}) = \text{hull}(\{ \phi(x) \mid x \in \mathbf{x} \cap D_\phi \}). \tag{2.7} \]

Many numerical algorithms have been extended to intervals, using the notion of interval extension of a real function.

Definition 2.2 (interval extension of a function). A function $\mathbf{f} : \mathbb{I}^n \to \mathbb{I}$ is an interval extension of $f : \mathbb{R}^n \to \mathbb{R}$ if for all $n$-ary boxes $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_n$,

\[ \{ f(r_1, \ldots, r_n) \mid (r_1, \ldots, r_n) \in \mathbf{X} \} \subseteq \mathbf{f}(\mathbf{x}_1, \ldots, \mathbf{x}_n). \tag{2.8} \]

In other words, the range of $f$ over $\mathbf{X}$ is included in the evaluation of $\mathbf{f}$ over $\mathbf{X}$. The natural interval extension of a function $f$ corresponds to a componentwise extension of a given expression of $f$, where each real number $r$ is replaced with $\text{hull}(\{r\})$, each variable is replaced with an interval variable, and each operation is replaced with the corresponding interval operation.

Example 1. Consider the natural interval extension $\mathbf{f}$ of the function $f(x_1, x_2) = \log(x_1) + x_2^2 - 1$ and the box $\mathbf{X} = [-\infty, 1] \times [-2, 1]$. Then we have

\[ \{ f(r_1, r_2) \mid (r_1, r_2) \in \mathbf{X} \} \subseteq \mathbf{f}(\mathbf{x}_1, \mathbf{x}_2) = \log([-\infty, 1]) + [-2, 1]^2 - 1 = [-\infty, 3] \tag{2.9} \]
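The natural extension of Example 1 can be evaluated mechanically by composing interval operations. The sketch below reuses the Interval class of Section 2.2 and adds hedged extensions of the logarithm and of the square (a non-monotonic operation, hence the case analysis); the helper names are ours:

import math

def ilog(x):
    # natural extension of log; assumes x.hi > 0 so that x ∩ Dlog is nonempty
    lo = -math.inf if x.lo <= 0.0 else down(math.log(x.lo))
    return Interval(lo, up(math.log(x.hi)))

def isqr(x):
    # natural extension of the square; non-monotonic around 0
    if x.lo <= 0.0 <= x.hi:
        m = max(-x.lo, x.hi)
        return Interval(0.0, up(m * m))
    m = min(abs(x.lo), abs(x.hi)); M = max(abs(x.lo), abs(x.hi))
    return Interval(down(m * m), up(M * M))

x1 = Interval(-math.inf, 1.0)
x2 = Interval(-2.0, 1.0)
print(ilog(x1) + isqr(x2) - Interval(1.0, 1.0))   # ≈ [-inf, 3], as in (2.9)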

2.3. Constraints. Let a constraint $c : f(x_1, \ldots, x_n) \mathbin{\Diamond} g(x_1, \ldots, x_n)$ be an equality or an inequality over the real numbers. Exact constraint solving is generally intractable due to the limitations of machine arithmetic. However, two kinds of proofs can be obtained using interval reasoning. Given a relation symbol $\Diamond \in \{=, \le, \ge\}$ and two intervals $\mathbf{x}, \mathbf{y}$, define the following interpretations:

1. Possibly interpretation: $\exists (x, y) \in \mathbf{x} \times \mathbf{y} : x \Diamond y \iff \mathbf{x} \mathbin{\Diamond_p} \mathbf{y}$. The interval relation is true if there exists a pair of real numbers verifying the relation over $\mathbb{R}$. The computational expressions are as follows:

\[
\begin{cases}
\mathbf{x} =_p \mathbf{y} \iff \mathbf{x} \cap \mathbf{y} \ne \emptyset \\
\mathbf{x} \ge_p \mathbf{y} \iff \overline{x} \ge \underline{y} \\
\mathbf{x} \le_p \mathbf{y} \iff \underline{x} \le \overline{y}
\end{cases}
\tag{2.10}
\]

2. Certainly interpretation: $\forall (x, y) \in \mathbf{x} \times \mathbf{y} : x \Diamond y \iff \mathbf{x} \mathbin{\Diamond_c} \mathbf{y}$. The interval relation is true if the relation is verified for all pairs of real numbers. The computational expressions are as follows:

\[
\begin{cases}
\mathbf{x} =_c \mathbf{y} \iff \exists a \in \mathbb{R} : a = \underline{x} = \overline{x} = \underline{y} = \overline{y} \\
\mathbf{x} \ge_c \mathbf{y} \iff \underline{x} \ge \overline{y} \\
\mathbf{x} \le_c \mathbf{y} \iff \overline{x} \le \underline{y}
\end{cases}
\tag{2.11}
\]

Now, let $\mathbf{f}$ be an interval extension of $f$ and $\mathbf{g}$ be an interval extension of $g$. Given an $n$-ary box $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_n$, the interval reasonings are as follows:
1. if $\mathbf{f}(\mathbf{x}_1, \ldots, \mathbf{x}_n) \mathbin{\Diamond_p} \mathbf{g}(\mathbf{x}_1, \ldots, \mathbf{x}_n)$ is false then $c$ has no solution in $\mathbf{X}$.
2. if $\mathbf{f}(\mathbf{x}_1, \ldots, \mathbf{x}_n) \mathbin{\Diamond_c} \mathbf{g}(\mathbf{x}_1, \ldots, \mathbf{x}_n)$ is true then each point of $\mathbf{X}$ is a solution of $c$.

These interval tests give a characterization of a box w.r.t. a constraint: in the first case the box contains no solution, and in the second case the box is entirely included in the solution set. The resulting properties are very strong, as shown in the following example.

Example 2. Consider the natural interval extension $\mathbf{f}$ of the function $f(x_1, x_2) = \log(x_1) + x_2^2 - 1$ and the box $\mathbf{X} = [-\infty, 1] \times [-2, 1]$. Recall that $\mathbf{f}(\mathbf{x}_1, \mathbf{x}_2) = [-\infty, 3]$. Consider the following constraints and apply the interval tests:
• $c_1 : f(x_1, x_2) \ge 4$: since $[-\infty, 3] \ge_p [4, 4]$ is false, we conclude that $c_1$ has no solution in $\mathbf{X}$.
• $c_2 : f(x_1, x_2) \le 4$: since $[-\infty, 3] \le_c [4, 4]$ is true, we conclude that each point of $\mathbf{X}$ is a solution of $c_2$. Such a box is called an inner approximation.
• $c_3 : f(x_1, x_2) \le 2$: no conclusion can be drawn using the interval tests. In this case $\mathbf{X}$ can be bisected into several parts and the tests applied to each smaller box.
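The tests of (2.10) and (2.11) reduce to comparisons of bounds. A small sketch, reusing the Interval class of Section 2.2:

import math

def possibly_geq(x, y):    # x ≥p y  ⇔  sup(x) ≥ inf(y)
    return x.hi >= y.lo

def certainly_leq(x, y):   # x ≤c y  ⇔  sup(x) ≤ inf(y)
    return x.hi <= y.lo

f_X = Interval(-math.inf, 3.0)       # f over X, from Example 2
four = Interval(4.0, 4.0)
print(possibly_geq(f_X, four))       # False: c1 has no solution in X
print(certainly_leq(f_X, four))      # True:  X is an inner box for c2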

Fig. 2.1. Box approximations of a constraint system solution set.

These interval tests can be applied to constraint systems as follows. If at least one constraint is false under the possibly interpretation then the system has no solution. If all constraints are satisfied under the certainly interpretation then the box is an inner approximation of the whole system. The principle of interval bisection algorithms is to process a constraint system using box bisection and the aforementioned interval tests. The result is a set of boxes characterized by the following properties:
• Each box is either inner to the solution set or small enough w.r.t. a given precision, i.e., the width of the box in each dimension is smaller than a real number $\varepsilon > 0$;
• Each solution is contained in at least one box (reliability property);
• If the set of boxes is empty then the solution set is empty;
• If an inner box is computed then the solution set is not empty.
Fig. 2.1 illustrates these notions, where the gray surface is the solution set. The union of boxes contains the solution set. Specific algorithms processing constraint systems will be presented in the following sections.

3. Constraint Satisfaction. Consider the model

\[ y(t) = f(x, t) \tag{3.1} \]

where $x \in \mathbb{R}^n$, and suppose that a set of data $\{(t_j, \tilde{y}_j)\}_{j=1}^m$ is known. We consider the following hypotheses.

• Each parameter $x_i$ belongs a priori to some known interval representing the region of interest fixed by the user. Let $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_n$ denote the initial box for $x$.
• Each $\tilde{y}_j$ is subject to an error $\pm e_j$, $e_j \ge 0$. As a consequence, the exact value $y_j$ lies in the domain $[(\tilde{y}_j - e_j)\!\downarrow, (\tilde{y}_j + e_j)\!\uparrow]$. Let $\mathbf{Y} = \mathbf{y}_1 \times \cdots \times \mathbf{y}_m$ denote the initial box for the data.

Let $\mathbf{f}(x)$ denote the vector of functions $(f(x, t_1), \ldots, f(x, t_m))$. The data driven inverse problem aims at characterizing the set

\[ S = \{ x \in \mathbf{X} \mid \mathbf{f}(x) \in \mathbf{Y} \}. \tag{3.2} \]

Moreover, an auxiliary problem consists in removing from $\mathbf{Y}$ the values that do not correspond to images of $f$ over $\mathbf{X}$.

3.1. Constraint-based Model. The set $S$ defined in (3.2) can be computed by interval computations. Each data point $(t_j, \tilde{y}_j)$ leads to the membership relation

\[ f(x, t_j) \in \mathbf{y}_j \tag{3.3} \]

which is equivalent to the inequalities

\[ \underline{y}_j \le f(x, t_j) \le \overline{y}_j. \tag{3.4} \]

The interval tests presented in Section 2 can be directly applied to process (3.4). However, two problems remain. First, bisection algorithms have a worst-case time complexity that is exponential, so it is important to accelerate computations using tractable algorithms. Two techniques can be combined: consistency techniques to reduce boxes and local search to find inner boxes. Second, the box $\mathbf{Y}$ cannot be reduced using the aforementioned constraints (3.4): the variable $y_j$ has to be made explicit. Formula (3.4) is then transformed into the existentially quantified equation

\[ \exists y_j \in \mathbf{y}_j \, (y_j = f(x, t_j)) \tag{3.5} \]

where $y_j$ is a constrained variable. A solution to (3.5) is a point $x \in \mathbf{X}$ such that the formula is verified. The resulting constraint system is a conjunction of existentially quantified equations associated with all data:

\[ (\exists y_1 \in \mathbf{y}_1 \, (y_1 = f(x, t_1))) \wedge \cdots \wedge (\exists y_m \in \mathbf{y}_m \, (y_m = f(x, t_m))). \tag{3.6} \]
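To make the construction concrete, the following sketch (names are ours) builds the interval right-hand sides of (3.4)–(3.6) from the data; each constraint simply pairs a time point with the interval of admissible values of $f(x, t_j)$, and the system is their conjunction. It reuses the Interval class and roundings of Section 2.2:

def make_constraints(samples, errors):
    # each (t_j, y~_j) with error bound e_j yields the interval
    # [ (y~_j - e_j)↓ , (y~_j + e_j)↑ ] that f(x, t_j) must hit
    return [(t, Interval(down(y - e), up(y + e)))
            for (t, y), e in zip(samples, errors)]

data = [(0.1, 16.1), (0.25, 14.3)]         # first two points of Table 5.1
cons = make_constraints(data, [0.8, 0.7])  # roughly 5% errors, illustrative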

3.2. Outer Computations. Consistency techniques can be used to process the system (3.6). The first goal is to reduce boxes by removing inconsistent values, i.e., values that do not satisfy the constraints. The so-called outer computations are based on constraint projections, i.e., projections of the constraint solution set. Consider a constraint $c(x_1, \ldots, x_k)$ and a box $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_k$. Suppose that a superset $P_i$ of the projection of $c$ over $\mathbf{x}_i$ is known. Then the values in the set $\mathbf{x}_i \setminus P_i$ are inconsistent, i.e., they do not participate in any solution to $c$. The domain of $x_i$ can be reduced by the operation

\[ \mathbf{x}_i := \text{hull}(\mathbf{x}_i \cap P_i) \tag{3.7} \]

while preserving the constraint solution set. The hull is needed since the set $P_i$ may be non convex. Two consistency techniques have been proposed for computing supersets of projections.

Hull consistency [8, 5] is based on constraint inversion. Consider an equation $f(x_1, \ldots, x_k) = 0$, a box $\mathbf{X}$ and a variable $x_i$, and suppose that an equivalent constraint $x_i = g(x_1, \ldots, x_k)$ is available¹. Given an interval extension $\mathbf{g}$ of $g$, the domain of $x_i$ can be reduced as follows:

\[ \mathbf{x}_i := \text{hull}(\mathbf{x}_i \cap \mathbf{g}(\mathbf{x}_1, \ldots, \mathbf{x}_k)) \tag{3.8} \]

It suffices to note that $\mathbf{g}(\mathbf{x}_1, \ldots, \mathbf{x}_k)$ is a superset of the projection of $c$ over $\mathbf{x}_i$. The inversion algorithms are described in the cited references.

Example 3. Consider the constraint $x_2 = x_1^2$ and the box $[-2, 0] \times [1, 9]$. The domain of $x_2$ is reduced as follows:

\[ \mathbf{x}_2 := \text{hull}(\mathbf{x}_2 \cap \mathbf{x}_1^2) = \text{hull}([1, 9] \cap [-2, 0]^2) = \text{hull}([1, 9] \cap [0, 4]) = [1, 4] \tag{3.9} \]

The reduction of $\mathbf{x}_1$ requires the constraint to be inverted. Let $h$ denote the inverse of the square (a non convex operation that uses the square root). Then we have

\[ \mathbf{x}_1 := \text{hull}(\mathbf{x}_1 \cap h(\mathbf{x}_2)) = \text{hull}([-2, 0] \cap h([1, 9])) = \text{hull}([-2, 0] \cap ([-3, -1] \cup [1, 3])) = [-2, -1] \tag{3.10} \]
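The two narrowing steps of Example 3 can be sketched as follows, reusing the Interval class and the isqr extension of Section 2.2; this is only an illustration of (3.8), not the inversion algorithms of the cited references:

import math

def narrow_square(x1, x2):
    # One hull-consistency step for the constraint x2 = x1**2.
    # Forward narrowing: x2 := hull(x2 ∩ x1**2), as in (3.9).
    s = isqr(x1)
    x2 = Interval(max(x2.lo, s.lo), min(x2.hi, s.hi))
    # Backward narrowing: the preimage of x2 under the square has the two
    # branches [-r_hi, -r_lo] and [r_lo, r_hi]; intersect each with x1,
    # then take the hull, as in (3.10).
    r_lo = down(math.sqrt(max(x2.lo, 0.0)))
    r_hi = up(math.sqrt(x2.hi))
    pieces = [(max(x1.lo, lo), min(x1.hi, hi))
              for lo, hi in ((-r_hi, -r_lo), (r_lo, r_hi))]
    pieces = [p for p in pieces if p[0] <= p[1]]   # drop empty intersections
    # assumes the constraint is satisfiable here, so pieces is nonempty
    x1 = Interval(min(p[0] for p in pieces), max(p[1] for p in pieces))
    return x1, x2

x1, x2 = narrow_square(Interval(-2.0, 0.0), Interval(1.0, 9.0))
print(x1, x2)    # approximately [-2, -1] and [1, 4], as in Example 3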

Hull consistency can be directly used to reduce the box $\mathbf{Y}$ w.r.t. constraint (3.5). If an interval extension $\mathbf{f}$ of $f$ is known, the domain $\mathbf{y}_j$ is reduced by the operation

\[ \mathbf{y}_j := \text{hull}(\mathbf{y}_j \cap \mathbf{f}(\mathbf{X}, t_j)). \tag{3.11} \]

The reduction of $\mathbf{X}$ requires the constraint to be inverted.

Box consistency [6] combines box bisection and the interval test based on the possibly interpretation. The algorithm implements a dichotomous search used to eliminate sub-domains.

Example 4. Consider the constraint $f(x_1) = 0$ where $f(x_1) = x_1^2 - 5x_1 + 4$ and the interval $[0, 100]$. Let $\mathbf{f}$ be the natural interval extension of $f$. First of all, note that the interval test $\mathbf{f}(\mathbf{x}_1) =_p 0$ succeeds. Now bisect $\mathbf{x}_1$ into $[0, 50] \cup [50, 100]$. Interestingly, the interval test over $[50, 100]$ fails:

\[ \mathbf{f}([50, 100]) = [2004, 9754] \quad \text{and} \quad [2004, 9754] \cap [0, 0] = \emptyset \tag{3.12} \]

As a consequence the interval $[50, 100]$ can be eliminated and the process iterated over the remaining interval $[0, 50]$. Given a constraint $c(x_1, \ldots, x_k)$, a box $\mathbf{X}$ and a variable $x_i$, box consistency over $\mathbf{x}_i$ computes either an empty set or the greatest interval $[a, b]$ included in the initial domain such that neither of the intervals $[a, a^+]$ and $[b^-, b]$ can be removed using the interval test. In this case the precision of interval arithmetic does not permit eliminating the canonical intervals at the domain bounds. Box consistency is more efficient than hull consistency when variables occur more than once in constraints [5].

¹ Practical inversion algorithms compute $g$ numerically. The inversion is always possible, even for non continuous and non monotonic functions.
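The dichotomous search underlying box consistency can be sketched for the left bound as follows (the right bound is symmetric). The code reuses the Interval class and isqr of Section 2.2; it is a simplified illustration of the technique, not the algorithm of [6]:

def reject(F, x):
    # interval test: True when F(x) =p 0 is false, i.e. 0 is not in F(x)
    y = F(x)
    return y.lo > 0.0 or y.hi < 0.0

def shrink_left(F, x, eps=1e-9):
    # dichotomous search for the leftmost sub-interval that cannot be rejected
    if reject(F, x):
        return None                         # the whole domain is inconsistent
    while x.hi - x.lo > eps:
        mid = 0.5 * (x.lo + x.hi)
        left = Interval(x.lo, mid)
        if reject(F, left):
            x = Interval(mid, x.hi)         # discard the left half
        else:
            x = left                        # keep searching inside it
    return x.lo

F = lambda x: isqr(x) - Interval(5.0, 5.0) * x + Interval(4.0, 4.0)
print(shrink_left(F, Interval(0.0, 100.0)))    # ≈ 1, the leftmost zero of f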

3.3. Inner Computations. The computation of inner boxes is very important for decision making since all computed points are known to be solutions of the constraint system. This approach makes sense for systems of inequalities since the volume of the solution set needs to be nonzero. To this end, recall that the equations in (3.6) can be replaced with inequalities. Two kinds of algorithms are described in the following.

The first algorithm [4] implements consistency techniques to process constraint negations. For instance, consider the constraint $c : f(x_1, \ldots, x_k) \ge 0$ and the box $\mathbf{X}$. Suppose that a point $(a_1, \ldots, a_k) \in \mathbf{X}$ violates the negation $f(x_1, \ldots, x_k) < 0$ of $c$. Then we have $f(a_1, \ldots, a_k) \ge 0$, and we conclude that the point is a solution to $c$. Hence it suffices to enforce some consistency technique over the negation of $c$: every eliminated box $\mathbf{X}' \subset \mathbf{X}$ is proved to be an inner box.

Example 5. Consider the constraint $c : x_1^2 + x_2^2 \le 4$ and the box $[0, 2] \times [0, 1]$. The box $[0, \sqrt{3}] \times [0, 1]$ is easily eliminated using box consistency over $x_1^2 + x_2^2 > 4$. As a consequence $[0, \sqrt{3}] \times [0, 1]$ is an inner box for $c$.

The second approach uses the interval test based on the certainly interpretation. The main problem here is that the initial box is not necessarily an inner box, so a search procedure has to be executed. The simplest approach is to bisect boxes and use the interval test over every generated box to process its center. Once an inner point is found, a box enclosing this point can be extended using the method of [9].

3.4. Constraint Solving Algorithm. When implementing numerical algorithms, a main question arises: what kind of approximation is needed by the user? We believe that two important motivations are to preserve the solution set of the constraint system in order to allow all possible decisions, and to control the approximation size, for efficiency reasons and easy representation in the user's mind. In the following we propose the skeleton of an algorithm that computes two boxes: a box $B_o$ enclosing the solution set and a box $B_i$ each facet of which crosses the solution set. Note that this idea was proposed in [23]. However, in [23] the algorithms do not account for rounding errors and the computed results may be wrong. Fig. 3.1 illustrates these notions, where the gray surface is the solution set $S$. Note that $B_i \subset B_o$ and $S \subset B_o$. Moreover each facet of $B_i$ crosses $S$. We have the relation

\[ B_i \subset \text{hull}(S) \subset B_o. \tag{3.13} \]

Define the distance between $B_o = \mathbf{t}_1 \times \cdots \times \mathbf{t}_k$ and $B_i = \mathbf{u}_1 \times \cdots \times \mathbf{u}_k$ as the width of $B_o$ if $B_i$ is empty, and otherwise as the real number

\[ \max\left( \max_{1 \le i \le k} |\inf(\mathbf{t}_i) - \inf(\mathbf{u}_i)|\!\uparrow,\ \max_{1 \le i \le k} |\sup(\mathbf{t}_i) - \sup(\mathbf{u}_i)|\!\uparrow \right). \tag{3.14} \]
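In code, (3.14) is a componentwise comparison of bounds; a sketch, with boxes represented as lists of the Interval of Section 2.2 and the empty inner box encoded as None:

def distance(Bo, Bi):
    # distance (3.14) between the outer box Bo and the inner box Bi
    if Bi is None:
        return max(t.hi - t.lo for t in Bo)      # width of Bo
    return max(max(up(abs(t.lo - u.lo)) for t, u in zip(Bo, Bi)),
               max(up(abs(t.hi - u.hi)) for t, u in zip(Bo, Bi)))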

This number provides useful information about the precision of the computations around the frontier of the solution set $S$. The algorithm presented in Table 3.1 combines the aforementioned techniques. Three data are maintained: a list $L$ of boxes to be processed and the boxes $B_o$ and $B_i$. During the while loop the current box is processed in several steps:
• The box is reduced by the reduce algorithm, which implements consistency techniques (e.g., a combination of hull and box consistency [5]).

Fig. 3.1. Box Approximations of a Constraint System Solution Set: the inner box $B_i$ and the outer box $B_o$ around the solution set $S$.

• If it is precise enough, the new box $\mathbf{Y}$ is added to $B_o$, since the precision of interval arithmetic does not allow deducing additional information. In this case $\mathbf{Y}$ may contain a solution. In practice a precision $\varepsilon \in \mathbb{R}^+$ is fixed a priori; a box is precise enough if its componentwise width is smaller than or equal to $\varepsilon$. The value $\varepsilon = 0$ corresponds to canonical boxes.
• Otherwise the infer algorithm is used to compute a box $\mathbf{Z}$, each facet of which crosses the solution set. Inner computation techniques are implemented as described above. If such a box is obtained, i.e., $\mathbf{Z}$ is not empty, then it is added to $B_o$ and $B_i$. The remaining boxes in $\mathbf{Y} \setminus \mathbf{Z}$ are added to $L$ for further search. Note that a complete description of efficient inner algorithms is out of the scope of this paper.
• The last case corresponds to a failure of all interval reasonings. As a consequence the current box is bisected into two parts, which are added to $L$ for further search.

We have the following proposition stating properties of Algorithm Solve.

Proposition 3.1. Given a constraint system $(C, \mathbf{X})$ and two boxes $B_o$ and $B_i$ such that $(B_o, B_i) := \text{Solve}(C, \mathbf{X})$, the following properties hold:
1. If $B_o$ is empty then the system has no solution.
2. If $B_i$ is not empty then the solution set is not empty.
3. The algorithm terminates in finite time.

Proof. For (1), we prove that every solution of the system is either in $B_o$ or in $L$. Since $L$ is empty at the end of the loop, every solution is then in $B_o$; if $B_o$ is empty, so is the solution set. Suppose that the box $\mathbf{X}$ at line 7 contains a solution $s$ that does not belong to $B_i$. After the application of reduce, $s$ is in $\mathbf{Y}$ since no solution is lost by reduce. If $\mathbf{Y}$ is precise enough then $\mathbf{Y}$ is added to $B_o$, and so is $s$. Otherwise the infer algorithm is executed. There are three cases: if $\mathbf{Z}$ is not empty and $s$ belongs to $\mathbf{Z}$ then $s$ is added to $B_o$; if $\mathbf{Z}$ is not empty and $s$ does not belong to $\mathbf{Z}$ then $s$ is added to $L$; if $\mathbf{Z}$ is empty then $s$ is either in $\mathbf{Y}_1$ or in $\mathbf{Y}_2$ due to the properties of the bisection operation, and then it is added to $L$. The proof of (2) follows from the properties of the algorithm infer. The proof of (3) uses a loop invariant on $L$: at each step either the cardinality of $L$ decreases or the boxes pushed onto the list ($\mathbf{Y} \setminus \mathbf{Z}$, or $\mathbf{Y}_1$ and $\mathbf{Y}_2$) are strictly smaller than $\mathbf{X}$. Each precise enough box is removed from $L$ and processed at line 13. So the end condition of the while loop is necessarily verified after a finite number of iterations.

Finally, the Solve algorithm may be integrated into interactive solving processes.

Table 3.1
Constraint Solving Algorithm.

 1  Solve (C: set of constraints, X: box): couple of boxes
 2  begin
 3     L := {X}
 4     Bo := ∅
 5     Bi := ∅
 6     while L is not empty do
 7        X := pop (L)                      % a box to be processed
 8        if X is not included in Bi then   % otherwise X useless
 9           X := hull(X \ Bi)              % search in Bi useless
10           Y := reduce (C, X)             % outer computation
11           if Y is not empty then
12              if Y is precise enough then
13                 Bo := hull(Bo ∪ Y)       % too small consistent box
14              else
15                 Z := infer (C, Y)        % inner computation
16                 if Z is not empty then
17                    Bo := hull(Bo ∪ Z)
18                    Bi := hull(Bi ∪ Z)
19                    L := push (L, Y \ Z)  % remaining part of Y
20                 else
21                    (Y1, Y2) := bisect (Y)   % bisection of current box
22                    L := push (L, Y1)
23                    L := push (L, Y2)
24                 fi
25              fi
26           fi
27        fi
28     od
29     return (Bo, Bi)
30  end
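For concreteness, the control flow of Table 3.1 can be transcribed as follows. The helpers are ours, reduce_fn and infer_fn stand for the reduce and infer algorithms, boxes are lists of the Interval of Section 2.2, and the optimization of lines 8–9 (skipping the part of a box already inside Bi) is omitted:

def box_hull(A, B):
    # hull of two boxes; None encodes the empty box
    if A is None:
        return B
    return [Interval(min(a.lo, b.lo), max(a.hi, b.hi)) for a, b in zip(A, B)]

def bisect(Y):
    # split Y along its widest dimension
    i = max(range(len(Y)), key=lambda j: Y[j].hi - Y[j].lo)
    m = 0.5 * (Y[i].lo + Y[i].hi)
    Y1, Y2 = list(Y), list(Y)
    Y1[i] = Interval(Y[i].lo, m)
    Y2[i] = Interval(m, Y[i].hi)
    return [Y1, Y2]

def box_minus(Y, Z):
    # cover Y \ Z with at most 2*len(Y) boxes by peeling slabs facet by facet
    out, core = [], list(Y)
    for i in range(len(Y)):
        if core[i].lo < Z[i].lo:
            slab = list(core); slab[i] = Interval(core[i].lo, Z[i].lo)
            out.append(slab)
        if Z[i].hi < core[i].hi:
            slab = list(core); slab[i] = Interval(Z[i].hi, core[i].hi)
            out.append(slab)
        core[i] = Z[i]
    return out

def solve(C, X, reduce_fn, infer_fn, eps=1e-5):
    # skeleton of Algorithm Solve; reduce_fn returns a reduced box or None,
    # infer_fn returns an inner box or None
    todo, Bo, Bi = [X], None, None
    while todo:
        Y = reduce_fn(C, todo.pop())        # outer computation
        if Y is None:
            continue
        if all(y.hi - y.lo <= eps for y in Y):
            Bo = box_hull(Bo, Y)            # too small consistent box
        else:
            Z = infer_fn(C, Y)              # inner computation
            if Z is not None:
                Bo, Bi = box_hull(Bo, Z), box_hull(Bi, Z)
                todo.extend(box_minus(Y, Z))
            else:
                todo.extend(bisect(Y))      # all interval reasonings failed
    return Bo, Bi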

Given a result $(B_o, B_i)$ computed from an initial box $\mathbf{X}$, the user may (1) focus on some region of interest in $B_o$ or $B_i$, (2) extend the initial box $\mathbf{X}$, or (3) focus on another region. In each case it suffices to run the algorithm another time. Furthermore, the algorithm could be optimized by using an input box to initialize $B_i$ at the beginning of the algorithm: for case (1) it can be equal to the intersection of $B_i$ and the new region; for case (2) it is simply equal to $B_i$.

4. Differential Equations. Parametric differential equations are a general and expressive mathematical means to model system dynamics. Notwithstanding their expressive power, reasoning with such models may be quite difficult. Analytical solutions are available only for the simplest models. Alternative numerical simulations require precise numerical values for the parameters involved, often impossible to gather given the uncertainty of available data. This may be an important drawback, since small differences in input values may cause important differences in the output produced. To overcome this limitation, Monte Carlo methods rely on a large number of simulations to estimate the likelihood of the different options under study. However, they cannot provide safe conclusions regarding these options, given the various sources

of errors accumulated in the simulations (both input and rounding errors). In contrast, interval methods [29, 27, 32] for solving differential equations with initial conditions do verify the existence of unique solutions and produce guaranteed error bounds for the solution trajectory along an interval of time $T$. They use interval arithmetic to compute safe enclosures for the trajectory, explicitly keeping the error term within safe interval bounds. The application of constraint techniques provided competitive results both in the precision of the trajectory enclosure bounds and in the efficiency of the computations [14, 22]. However, these methods do not allow full constraint reasoning on general differential problems, since they do not explicitly model the solution functions. Hence, it is not possible to declaratively handle constraints such as "the solution function should not exceed a certain value" or "should not exceed a value for more than a certain period of time", nor is it possible to derive safe ranges for the parameters that satisfy such constraints. Data driven inverse problems cannot be easily handled by such approaches.

The full integration of differential equations into the constraint framework was only recently proposed [12]. In this approach (used here for solving the epidemic problem), an ODE system together with related information is called a Constraint Satisfaction Differential Problem (CSDP) and is integrated into the framework as a new kind of constraint. The procedure proposed for solving CSDPs is used as a safe procedure for pruning the variable domains according to the specification of the differential equations and an associated set of restrictions. Initial and boundary conditions are represented by an appropriate set of constraints called value restrictions, which associate variables with the values of trajectory components at particular time points. Besides initial and boundary conditions, and viewing an ODE solution as a continuous vector function (and each of its components as a continuous real function), several other conditions of interest may be imposed in the CSDP framework, namely maximum, minimum, time, area, first and last restrictions. For instance, a maximum restriction $\text{maximum}_{j,\tau}(x)$ associates $x$ with the maximum value of trajectory component $j$ within a time interval $\tau$, and the area restriction $\text{area}_{j,\tau,>\theta}(x)$ associates $x$ with the area of trajectory component $j$, within time period $\tau$, above threshold $\theta$. The next subsection extends the data driven inverse problem to models defined by differential equations, and subsection 4.2 presents the respective constraint-based model represented as a CSDP.

4.1. Data Driven Inverse Problem. Consider the differential model

\[ \frac{dy}{dt} = g(y, x, t) \tag{4.1} \]

where $x \in \mathbb{R}^n$ and $y$ is a function from $T \subset \mathbb{R}$ to $\mathbb{R}^k$. Suppose that a set of data $\{(t_j, \tilde{y}_{1,j}, \ldots, \tilde{y}_{k,j})\}_{j=1}^m$ is known, where each $\tilde{y}_{i,j}$ is expected to be approximately the $i$th component of the vector $y(t_j)$, with $t_j \in T$. We consider the following hypotheses.
• Each parameter $x_i$ belongs a priori to some known interval representing the region of interest fixed by the user. Let $\mathbf{X} = \mathbf{x}_1 \times \cdots \times \mathbf{x}_n$ denote the initial box for $x$.
• Each $\tilde{y}_{i,j}$ is subject to an error $\pm e_{i,j}$, with $e_{i,j} \ge 0$. In particular, $e_{i,j} = +\infty$ if the $i$th component of the vector $y(t_j)$ is unknown. Hence, the exact value of the $i$th component of $y(t_j)$ lies in the domain $\mathbf{y}_{i,j} = [(\tilde{y}_{i,j} - e_{i,j})\!\downarrow, (\tilde{y}_{i,j} + e_{i,j})\!\uparrow]$. Let $\mathbf{Y}_j = \mathbf{y}_{1,j} \times \cdots \times \mathbf{y}_{k,j}$ denote the initial box for the known data at $t_j$.

The data driven inverse problem aims at characterizing the set

\[ S = \left\{ x \in \mathbf{X} \;\middle|\; y(t_1) \in \mathbf{Y}_1 \wedge \cdots \wedge y(t_m) \in \mathbf{Y}_m \wedge \forall t \in T : \frac{dy}{dt} = g(y, x, t) \right\}. \tag{4.2} \]

Moreover, an auxiliary problem is removing from each $\mathbf{Y}_j$ the values that do not correspond to images of $y(t_j)$ when $x$ ranges over $S$.

4.2. Constraint-based Model. The previous data driven inverse problem is represented in the extended constraint framework as a CSDP associated, for every $t$ within the interval $T$, with the following $(k + n)$-ary ODE system

\[ \frac{dz}{dt} = f(z, t) \tag{4.3} \]

where $f_i(z, t) = g_i([z_1, \ldots, z_k], [z_{k+1}, \ldots, z_{k+n}], t)$ if $i \le k$, and zero otherwise. The last $n$ components of the ODE system represent the $n$ parameters $x$ of the original data driven differential model. Hence, for each $t_j$ corresponding to a known data point ($1 \le j \le m$), $n$ value restrictions are added:

\[ z_{k+1}(t_j) = x_1 \wedge \cdots \wedge z_{k+n}(t_j) = x_n \tag{4.4} \]

Additionally, other value restrictions are included for associating constrained variables with the first $k$ components of the ODE system. For each $t_j$ a set of $k$ value restrictions is added:

\[ z_1(t_j) = y_{1,j} \wedge \cdots \wedge z_k(t_j) = y_{k,j} \tag{4.5} \]

To sum up, the aim is to find vectors $x \in \mathbf{X}$ such that the following formula is satisfied:

\[
\exists z \in F^T \;
\exists y_{1,1} \in \mathbf{y}_{1,1} \cdots \exists y_{k,m} \in \mathbf{y}_{k,m} :
\left( \forall t \in T : \frac{dz}{dt} = f(z, t) \right) \text{ and }
\forall 1 \le j \le m \left( \forall 1 \le i \le n : z_{k+i}(t_j) = x_i \text{ and } \forall 1 \le i \le k : z_i(t_j) = y_{i,j} \right)
\tag{4.6}
\]

where $F^T$ is the set of all functions from $T$ to $\mathbb{R}^{k+n}$. Note that, as in the case of the non differential model, the $y$ variables are existentially quantified.

4.3. Solving CSDPs. The solving procedure for CSDPs maintains a safe enclosure for the set of functions that are solutions of the ODE system and satisfy all the restrictions. Such a trajectory enclosure is based on an Interval Taylor Series method [29, 32] for solving ODE problems with initial conditions. A sequence of discrete time points $t_0, t_1, \ldots, t_i$ is considered within the whole interval of time $T$, and enclosures are computed not only for the trajectory values at these time points but also for every time gap between two consecutive points. An enclosure at a time point is obtained from a Taylor series expansion around an adjacent time point, with the error term bounded as a result of an a priori enclosure computed for the time gap between the two points. An a priori enclosure between two consecutive points is computed based on the interval Picard operator (see Appendix A for details).

Consider the following binary ODE system defined for $t \in [0, 6]$:

\[ \frac{dy_1}{dt} = -0.7\, y_1 \qquad \frac{dy_2}{dt} = 0.7\, y_1 - \frac{\ln(2)}{5}\, y_2 \tag{4.7} \]

and the initial condition $y_1(0) = 1.25$ and $y_2(0) \in [0.4, 0.8]$. Fig. 4.1 shows the set of real functions that satisfy the ODE system with the required initial condition. To keep the illustration in two dimensions, each component ($y_1$ and $y_2$) of each real function $y$ is represented in a different graphic sharing the same time axis. A single line represents the first component $y_1$ and the grey area the second component $y_2$. A possible trajectory enclosure computed by the successive application of the Interval Taylor Series method is also presented in Fig. 4.1. The ODE trajectory is defined through a sequence of seven time points and the time gaps in between. For each component, the interval enclosures associated with each time point and time gap are represented, respectively, as a vertical line and a dashed rectangle. Such a trajectory enclosure includes all functions whose components are continuous functions enclosed by the rectangles and crossing all the vertical lines; thus, it is a safe enclosure for the set of real functions that satisfy the ODE system and the initial condition.


Fig. 4.1. A trajectory enclosure for the set of real functions that satisfy the ODE system (4.7) and the initial condition $y_1(0) = 1.25$ and $y_2(0) \in [0.4, 0.8]$.

The improvement of the quality of a trajectory enclosure is combined with the enforcement of the ODE restrictions through constraint propagation on a set of domain reduction functions associated with the CSDP. Some are responsible for reducing the domain of a restriction variable according to the current trajectory enclosure. Others are responsible for reducing the uncertainty of the trajectory enclosure according to the domain of a restriction variable. Finally, there are domain reduction functions responsible for reducing the uncertainty of the trajectory enclosure by the successive application of the validated method between consecutive time points.

In the previous example, with extra knowledge about the possible values of the second component at particular (observed) time points, a better trajectory enclosure could be computed. For instance, observing $\tilde{y}_2(3) = 1.1$ and $\tilde{y}_2(6) = 1.1$ and admitting an error of $\pm 0.2$ for both observations, the current enclosures for $y_2(3)$ and $y_2(6)$ could be updated by intersecting their current values with the interval $[0.9, 1.3]$. Such local updates of the trajectory should then be propagated along the whole trajectory enclosure through the reapplication of the Interval Taylor Series method. The new trajectory enclosure computed for the second component $y_2$ is illustrated in Fig. 4.2. This enclosure is safe, as can be easily checked in the figure, where the

grey area represents the set of real functions for y2 that satisfy the ODE system, the initial condition and the additional value restrictions. Hence, the enclosures for the restriction variables representing the values of y2 at time points 0, 3 and 6 may be safely narrowed by intersecting their current domains with y2 (0), y2 (3) and y2 (6), respectively.


Fig. 4.2. A new trajectory enclosure for y2 with the additional restrictions y2 (3) ∈ [0.9, 1.3] and y2 (6) ∈ [0.9, 1.3].

Since each restriction variable represents a particular property of a real function (some component of a solution of the ODE system), the safety of its domain pruning may be guaranteed by identifying the functions within the current trajectory enclosure that maximize and minimize its value. Whereas in the case of value restrictions the new enclosures may be obtained directly from the respective time point enclosures, for other types of restrictions the new safe bounds must be computed from the current trajectory enclosure. For example, the domain of a restriction variable representing the maximum trajectory value during some period of time $\tau$ may be safely upper bounded by the maximum of the upper bounds of the trajectory enclosures within $\tau$, and safely lower bounded by the maximum of the lower bounds of the enclosures at the time points within $\tau$. Conversely, if the domain of such a restriction variable is upper bounded by some known value, then the trajectory enclosures within the time period $\tau$ should also be upper bounded by the same value.

5. Experimental Results. The experiments have been conducted on a PC Linux/Celeron 733MHz using RealPaver [18] for constraint satisfaction problems and the algorithms from [12] for solving differential equations. The precision of Algorithm Solve is fixed to $10^{-5}$.

5.1. Drug Concentration. Consider the problem of a bolus intravenous injection of 800mg of a drug to a patient. Data for the concentration in the central compartment over a period of time are given in Table 5.1.

Table 5.1
Drug concentration data.

t    0.1   0.25  0.5   0.75  1     1.5   2     2.5   3     4     6     8     10    12
ỹ    16.1  14.3  12.0  10.3  9.0   7.2   6.1   5.2   4.6   3.7   2.5   1.7   1.18  0.81

The model is the exponential sum

\[ y(t) = a \exp(-\alpha t) + b \exp(-\beta t). \tag{5.1} \]

Note that there is a symmetry between the two exponential terms. The symmetry can be broken if we consider that the amplitude of the elimination phenomenon (modeled by the second term) is smaller than that of the distribution. As a consequence we fix the domain of $\alpha$ to $[0, 10]$ and the domain of $\beta$ to $[0, 1]$. Parameters $a$ and $b$ belong to the interval $[1, 100]$. For each data point $\tilde{y}_j$ a relative error of 5% is taken into account. The Solve algorithm computes the following box approximation for $(a, \alpha, b, \beta)$ in 90s:

\[
\begin{aligned}
B_i &= [9.3607864, 9.7183469] \times [1.2386054, 1.3435076] \times [7.6097993, 7.8688821] \times [0.18695175, 0.19030738] \\
B_o &= [9.360784, 9.7183502] \times [1.234975, 1.3435096] \times [7.5986183, 7.868885] \times [0.18684669, 0.19030741]
\end{aligned}
\tag{5.2}
\]

We deduce a reliable enclosure of the volume of distribution in the central compartment as the interval evaluation of $V_c$ over $B_o$:

\[ V_c \in \frac{800}{[9.360784, 9.7183502] + [7.5986183, 7.868885]} = [45.487536, 47.171474] \tag{5.3} \]

The clearance is obtained in the same way:

\[ C \in 800 \times \left( \frac{[9.360784, 9.7183502]}{[1.234975, 1.3435096]} + \frac{[7.5986183, 7.868885]}{[0.18684669, 0.19030741]} \right)^{-1} \tag{5.4} \]
\[ = [16.005316, 17.059192] \text{ l/hr} \tag{5.5} \]
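These enclosures can be re-derived with a few lines of arithmetic on the bounds of $B_o$. Since all quantities involved are positive, the sums and quotients are monotone in the bounds; outward rounding is omitted in this check:

a  = (9.360784, 9.7183502)
al = (1.234975, 1.3435096)
b  = (7.5986183, 7.868885)
be = (0.18684669, 0.19030741)

D = 800.0
Vc = (D / (a[1] + b[1]), D / (a[0] + b[0]))     # about [45.49, 47.17]
hi = a[1] / al[0] + b[1] / be[0]                # largest value of a/α + b/β
lo = a[0] / al[1] + b[0] / be[1]                # smallest value
C = (D / hi, D / lo)                            # about [16.01, 17.06] l/hr
print(Vc, C)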

5.2. Census. The US Census over the years 1790 to 1900, with years normalized to start at 0, is given in Table 5.2. The error $e_j$ for each data point is equal to 1.

Table 5.2
US Census data.

t    0      10     20     30     40      50      60      70      80      90      100     110     120
ỹ    3.929  5.308  7.239  9.638  12.866  17.069  23.191  31.433  39.818  50.155  62.947  75.994  91.972

Recall that the model is

\[ y(t) = \frac{L y_0}{y_0 + (L - y_0) \exp(-kLt)}. \tag{5.6} \]

The unknowns are $L$, $r$ (defined as $kL$) and $y_0$, since the initial population is not exactly known. The initial box for $(L, r, y_0)$ is given by $[1, 1000] \times [0.001, 0.1] \times [0, 100]$. Algorithm Solve computes the following box approximation in 174s:

\[
\begin{aligned}
B_i &= [166.3818, 260.30636] \times [0.028686055, 0.033687279] \times [3.4503374, 4.5458635] \\
B_o &= [166.37682, 260.31398] \times [0.028685855, 0.033687887] \times [3.4502058, 4.5459041]
\end{aligned}
\tag{5.7}
\]

In particular we see that the carrying capacity is bounded by 260, which (in millions) was the population of the US in the mid 90's. Actually the population did not level off. In this case we may conclude that the model is too simplistic to be used for predictions on human populations.

5.3. Epidemics. In the British Medical Journal (4th March 1978), the following data was reported from an influenza epidemic that occurred in an English boarding school (taken from [30]): a single boy (from a total population of 763) initiated the epidemic, and the evolution of the number of infectives, available daily from day 3 to day 14, is shown in Table 5.3.

Table 5.3
Infectives reported during an epidemic in an English boarding school.

t    0   3    4    5     6     7     8     9     10    11   12   13   14
Ĩj   1   22   78   222   300   256   233   189   128   72   28   11   6

The goal of our study is to characterize an epidemic disease similar to the one reported in the boarding school, that is, to determine values or ranges for its parameters $r$ and $a$. Solving the model driven inverse problem using the least squares method, the parameter values $r = 0.218$ and $a = 0.440$ are computed (in the model equations the $r$ parameter is multiplied by 0.01, re-scaling it to the interval $[0, 1]$). However, generating a single value for each parameter does not capture the essence of the problem, which is not to determine the most similar disease but rather to reason with a set of similar enough diseases. Moreover, such an approach does not provide any sensitivity analysis of the quality of the data fitting, namely of the effects of small changes in the parameter values.

Alternatively, the data driven inverse problem considers acceptable errors for the observed data, and the goal is to compute ranges for the parameters such that the distance between the model predictions and the observed data does not exceed these errors. Since the epidemic model has no analytical solution form, the classical constraint approaches cannot be applied for obtaining such ranges. However, the problem can be handled by the extended constraint framework presented in Section 4. A CSDP constraint with value restrictions at each observed data point $t_j$ is considered for obtaining the value of the infectives trajectory at those points. The parameter ranges are then obtained by enforcing some consistency requirement. For example, by considering an acceptable data error within $[-30, +30]$ and enforcing global hull-consistency [13], the initial box $[0, 1] \times [0, 1]$ for the parameter ranges $(r, a)$ is narrowed into the following box approximation:

\[
\begin{aligned}
B_i &= [0.214293, 0.221591] \times [0.425274, 0.465323] \\
B_o &= [0.214268, 0.221617] \times [0.425240, 0.465354]
\end{aligned}
\tag{5.8}
\]

Once the parameter ranges that may be considered acceptable to characterize epidemic diseases similar to the one observed have been obtained, they can be used for making predictions in new environment contexts. Again, the expressive power of the extended constraint framework may be used for representing the relevant epidemic properties. A maximum restriction may represent the maximum number of infectives, and a first restriction may represent the time of that maximum. A last restriction may represent the duration of the epidemic, as the last time that the number of infectives exceeds 1. Finally, a value restriction may represent the number of susceptibles at a time safely after the end of the epidemic.

6. Conclusion. We have shown that interval-based algorithms can reliably handle parameter estimation problems. The main improvement compared to more classical numerical approaches is that uncertainty in modeling is quantified and propagated using interval algorithms. ODE or constraint systems are handled by consistency techniques to approximate the consistent values of the parameters. The experimental results show that this approach is useful.

This framework can be extended to tackle two specific problems. First, it is assumed that uncertainty can be bounded in a reliable way; however, this task may be difficult in itself. If the assumptions on the uncertainty are too broad, the solution set may be too large and therefore useless. We propose to develop a "Russian doll" algorithm for refining the bounds of uncertainty, from an initial overestimated value, according to the size of the solution set (which is estimated using inner boxes). Second, a single erroneous data point may lead to the conclusion that no reliable model exists (empty solution set). In this case two approaches may be useful: constraints may be relaxed, i.e., hard constraints changed into soft constraints [7], or erroneous data may be extracted using explanation-based techniques for constraint programming [35]. Further work should address the scalability of interval-based techniques, to process, e.g., problems with more than 10 variables and hundreds of ODEs.

Acknowledgments. We want to thank F. Benhamou and F. Goualard for interesting discussions on these topics, and L. Jaulin, who shared with us his experience with parameter estimation problems.

REFERENCES

[1] G. Ammar, W. Dayawansa, and C. Martin. Exponential Interpolation: Theory and Numerical Algorithms. Applied Mathematics and Computation, 41:189–232, 1991.
[2] C. Bendsten and O. Stauning. FADBAD, a flexible C++ package for automatic differentiation using the forward and backward methods. Technical Report 1996-x5-94, Department of Mathematical Modelling, Technical University of Denmark, Lyngby, Denmark, 1996.
[3] C. Bendsten and O. Stauning. TADIFF, a flexible C++ package for automatic differentiation using Taylor series. Technical Report 1997-x5-94, Department of Mathematical Modelling, Technical University of Denmark, Lyngby, Denmark, 1997.
[4] F. Benhamou and F. Goualard. Universally Quantified Interval Constraints. In R. Dechter, editor, Proceedings of CP'2000, International Conference on Principles and Practice of Constraint Programming, volume 1894 of LNCS, pages 67–82. Springer, 2000.
[5] F. Benhamou, F. Goualard, L. Granvilliers, and J.-F. Puget. Revising Hull and Box Consistency. In D. De Schreye, editor, Proceedings of ICLP'99, International Conference on Logic Programming, pages 230–244. The MIT Press, 1999.
[6] F. Benhamou, D. McAllester, and P. Van Hentenryck. CLP(Intervals) Revisited. In M. Bruynooghe, editor, Proceedings of ILPS'94, International Logic Programming Symposium, pages 124–138. MIT Press, 1994.
[7] M. Ceberio. Soft Constraints over Continuous Domains. PhD thesis, University of Nantes, 2003. Submitted.
[8] J. G. Cleary. Logical arithmetic. Future Computing Systems, 2(2):125–149, 1987.
[9] H. Collavizza, F. Delobel, and M. Rueher. Extending consistent domains of numeric CSPs. In T. Dean, editor, Proceedings of IJCAI'99, International Joint Conference on Artificial Intelligence, pages 406–413, 1999.
[10] G. F. Corliss and R. Rihm. Validating an a priori enclosure using high-order Taylor series. In Götz Alefeld, Andreas Frommer, and Bruno Lang, editors, Scientific Computing and Validated Numerics: Proceedings of the International Symposium on Scientific Computing, Computer Arithmetic and Validated Numerics – SCAN '95, pages 228–238. Akademie Verlag, Berlin, 1996.
[11] R. G. Cornell. A Method for Fitting Linear Combinations of Exponentials. Biometrics, pages 104–113, 1962.
[12] J. Cruz. Constraint Reasoning for Differential Models. PhD thesis, New University of Lisbon, 2003.
[13] J. Cruz and P. Barahona. Global Hull Consistency with Local Search for Continuous Constraint Solving. In P. Brazdil and A. Jorge, editors, Proceedings of EPIA'2001, Portuguese Conference on Artificial Intelligence, volume 2258 of LNCS, pages 349–362, 2001.
[14] Y. Deville, M. Jansen, and P. Van Hentenryck. Consistency techniques in ordinary differential equations. In Proceedings of CP'98, Principles and Practice of Constraint Programming, volume 1520 of LNCS, pages 162–176, 1998.
[15] P. Eijgenraam. The solution of initial value problems using interval arithmetic. Technical Report 144, Math. Centre Tracts, Amsterdam, 1981.
[16] M. Gibaldi and D. Perrier. Pharmacokinetics. Marcel Dekker, Inc., New York, 1982.
[17] L. Granvilliers. From Interval Arithmetic to Interval Constraints for Parameter Set Estimation. In Proceedings of SCAN'2002, International Conference on Scientific Computing and Validated Numerics, Paris, France, 2002.
[18] L. Granvilliers. RealPaver: User's Manual. IRIN, University of Nantes, 0.1 edition, February 2002.
[19] E. R. Hansen. Global Optimization using Interval Analysis. Marcel Dekker, 1992.
[20] T. J. Hickey, Q. Ju, and M. H. van Emden. Interval arithmetic: From principles to implementation. JACM, 48(5):1038–1068, 2001.
[21] IEEE. IEEE Standard for Binary Floating-Point Arithmetic. Technical Report IEEE Std 754-1985, 1985. Reaffirmed 1990.
[22] M. Jansen, P. Van Hentenryck, and Y. Deville. Optimal Pruning in Parametric Differential Equations. In Proceedings of CP'2001, Principles and Practice of Constraint Programming, volume 2239 of LNCS, pages 539–553, 2001.
[23] L. Jaulin. Interval Constraint Propagation with Application to Bounded-Error Estimation. Automatica, 36:1547–1552, 2000.
[24] L. Jaulin and E. Walter. Set Inversion via Interval Analysis for Nonlinear Bounded-Error Estimation. Automatica, 29(4):1053–1064, 1993.
[25] W. M. Kahan. A more complete interval arithmetic. Technical report, University of Toronto, 1968.
[26] F. Krückeberg. Ordinary differential equations. In E. Hansen, editor, Topics in Interval Analysis, pages 91–97. Clarendon Press, Oxford, 1969.
[27] R. J. Lohner. Einschließung der Lösung gewöhnlicher Anfangs- und Randwertaufgaben und Anwendungen. PhD thesis, University of Karlsruhe, 1988.
[28] R. J. Lohner. Step size and order control in the verified solution of IVP with ODEs. In SciCADE'95, International Conference on Scientific Computation and Differential Equations, Stanford, Calif., 1995.
[29] R. E. Moore. Interval Analysis. Prentice-Hall, Englewood Cliffs, NJ, 1966.
[30] J. D. Murray. Mathematical Biology. Springer, 1991.
[31] N. S. Nedialkov, K. R. Jackson, and J. D. Pryce. An effective high-order interval method for validating existence and uniqueness of the solution of an IVP for an ODE. Reliable Computing, 7(6):449–465, 2001.
[32] N. S. Nedialkov. Computing Rigorous Bounds on the Solution of an Initial Value Problem for an Ordinary Differential Equation. PhD thesis, University of Toronto, 1999.
[33] N. S. Nedialkov, K. R. Jackson, and G. F. Corliss. Validated solutions of initial value problems for ordinary differential equations. Appl. Math. & Comp., 105(1):21–68, 1999.
[34] A. Neumaier. Interval Methods for Systems of Equations. Cambridge University Press, 1990.
[35] S. Ouis, N. Jussien, and P. Boizumault. K-relevant explanations for constraint programming. In I. Russell and S. Haller, editors, Proceedings of FLAIRS'2003, Florida Artificial Intelligence Research Society Conference. AAAI Press, 2003.
[36] R. Rihm. Interval methods for initial value problems in ODEs. In J. Herzberger, editor, Topics in Validated Computations: Proceedings of the IMACS-GAMM International Workshop on Validated Computations. Elsevier, 1994.
[37] K. Schittkowski. Numerical Data Fitting in Dynamical Systems. Kluwer Academic Publishers, 2002.
[38] O. Stauning. Enclosing solutions of ordinary differential equations. Technical Report IMM-REP-1996-18, Department of Mathematical Modelling, Technical University of Denmark, Lyngby, Denmark, 1996.
[39] O. Stauning. Automatic Validation of Numerical Solutions. Technical University of Denmark, Lyngby, Denmark, 1997.
[40] C. Vogel. Computational Methods for Inverse Problems. SIAM, 2002.


Appendix A. Interval Taylor Series Methods. Interval Taylor Series (ITS) methods are based on the Taylor series expansion of the solution function $s(t)$ of an ODE system around a point $t_i$. From Taylor's theorem, if $s(t)$ is $p$ times continuously differentiable on the closed interval $[t_i, t_{i+1}]$ and $p+1$ times differentiable on the open interval $(t_i, t_{i+1})$ then, with $h = t_{i+1} - t_i$ and some $\xi \in [t_i, t_{i+1}]$:

\[ s(t_{i+1}) = s(t_i) + \sum_{k=1}^{p} \frac{h^k}{k!}\, s^{(k)}(t_i) + \frac{h^{p+1}}{(p+1)!}\, s^{(p+1)}(\xi) \tag{A.1} \]

Instead of neglecting the error term, as done in traditional Taylor Series methods, ITS methods use interval arithmetic to obtain reliable enclosures not only for the error term but also for every term of the series, allowing the computation of a reliable enclosure of the solution function at point $t_{i+1}$. Usually, and without loss of generality, ITS methods assume that the ODE system is autonomous and rewrite the above equation as:

\[ s(t_{i+1}) = s(t_i) + \sum_{k=1}^{p} h^k f^{[k]}(s(t_i)) + h^{p+1} f^{[p+1]}(s(\xi)) \tag{A.2} \]

where $f^{[k]}(s(t_i))$ denotes the $k$th Taylor coefficient of the function $s$ at the point $t_i$:

\[ f^{[k]}(s(t_i)) = \frac{1}{k!}\, s^{(k)}(t_i) \tag{A.3} \]

In [29], Moore proposed a simple procedure for the reliable computation of the Taylor coefficients up to some intended order. An efficient implementation of this method is available in the public domain software package TADIFF [3] (implemented in C++). With reliable enclosures for the Taylor coefficients, interval extensions of the Taylor series expansion of ODE solution functions may be computed. This is extensively used in ITS methods not only for enclosing the value, at point $t_{i+1}$, of a single solution function $s(t)$ with initial condition $s(t_i) = s_i$, but also for enclosing such values for the set of solution functions whose values at the point $t_i$ are within a box $S_i$.

Usually the validation and enclosure of solutions of an ODE system between two discrete points $t_i$ and $t_{i+1}$ is based on the Banach fixed-point theorem and the application of the Picard-Lindelöf operator (see [33, 38] for details). The following theorem (proved in [15, 27]) may be used for defining a first order enclosure method based on the (first-order) interval Picard operator.

Theorem A.1 (Interval Picard Operator). Let $O$ be an autonomous ODE system of $n$ equations $\frac{dy}{dt} = f(y)$. Let $f$ be continuous with first order partial derivatives over $t \in [t_i, t_{i+1}]$. Let $S_i \subseteq S$ be two $n$-ary boxes, $F$ an interval extension of $f$, and $h = t_{i+1} - t_i$. The interval Picard operator $\Phi$ is the vector interval function:

\[ \Phi(S) = S_i + [0, h]\, F(S) \tag{A.4} \]

If $\Phi(S) \subseteq S$ then for every $s_i \in S_i$, the IVP defined by $O$ and the initial condition $y(t_i) = s_i$ has a unique solution $s$ and $\forall t \in [t_i, t_{i+1}] : s(t) \in \Phi(S)$.

Based on the interval Picard operator, algorithms to obtain an enclosure for the set of solution functions whose values at $t_i$ are within the box $S_i$ may be generally described as follows. First, an adequate step size $h$ is chosen together with an

initial guess $S^0$ for the enclosure (with $S_i \subseteq S^0$). Then the interval Picard operator is applied to obtain the box $S = \Phi(S^0)$. If $S \subseteq S^0$ then, by Theorem A.1, $S$ is an enclosure for the set of solution functions between $t_i$ and $t_i + h$. Otherwise, two different strategies may be applied recursively: either the initial guess $S^0$ is inflated to enclose more solutions of the ODE for the same step size, or the step size is reduced to satisfy $\Phi(S^0) \subseteq S^0$ (note that for a small enough step $h$ this property can always be satisfied). The final result of such algorithms is a box $S_{[i,i+1]}$ and a step size $h$ (not necessarily the initial one) for which the box is an enclosure of the set of solution functions whose values at $t_i$ are within the box $S_i$.

Several ITS proposals [26, 15, 27, 39] rely on the use of a first order enclosure method for the validation and enclosure of ODE solutions in this first phase. The major drawback of these approaches is that the step size restriction imposed by the (first-order) interval Picard operator is often much more severe than the limitations imposed in the second phase, which is based on higher order Taylor series expansions. Alternative higher order enclosure methods [28, 10, 32, 31] were also proposed for this first phase, allowing larger step sizes more compatible with the second phase algorithms.

Once an enclosure box $S_{[i,i+1]}$ for the set of solutions between two points $t_i$ and $t_{i+1}$ has been obtained, a straightforward ITS method for computing a tight enclosure at $t_{i+1}$ is directly based on the interval extension of (A.2):

\[ S_{i+1} = S_i + \sum_{k=1}^{p} h^k F^{[k]}(S_i) + h^{p+1} F^{[p+1]}(S_{[i,i+1]}) \tag{A.5} \]

where $S_i$ and $S_{i+1}$ are enclosing boxes at points $t_i$ and $t_{i+1}$ respectively, and $F^{[k]}(S)$ is a reliable enclosure of the $k$th Taylor coefficient of the solution function at any point within the box $S$. However, the above method usually leads to large overestimations of the enclosing box at point $t_{i+1}$. A better approach is to use a Mean Value interval extension of the Taylor series with respect to the box $S_i$. In this case, a method known as the ITS direct method is obtained:

\[ S_{i+1} = c + \sum_{k=1}^{p} h^k F^{[k]}(c) + h^{p+1} F^{[p+1]}(S_{[i,i+1]}) + \left[ I + \sum_{k=1}^{p} h^k J(f^{[k]}, S_i) \right] \times (S_i - c) \tag{A.6} \]

where $c$ is the midpoint of the box $S_i$ and $J(f^{[k]}, S_i)$ is the Jacobian of $f^{[k]}$ evaluated at the box $S_i$. The Jacobian may be obtained by automatic differentiation of the Taylor coefficients [2, 3]. The above form possesses a quadratic approximation property, quite advantageous when the boxes are small. However, the overestimation of enclosing boxes at consecutive points may accumulate as the integration proceeds (a phenomenon known as the wrapping effect) and lead to unreasonable results. Several strategies have been proposed for reducing the overestimation and, in particular, for handling the wrapping effect [29, 26, 15, 27, 36]. The most successful enclosing methods are based on changes of the coordinate system at each step of the integration process, aiming at reducing as much as possible the overestimation of the box representation of the domains.
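As a minimal illustration of the first phase, the following sketch searches for a validated a priori enclosure with the interval Picard operator (A.4), applied to the scalar first component of (4.7). It reuses the Interval class of Section 2.2; the inflation strategy and all names are ours, not the algorithms of the cited references:

def picard_step(F, Si, h, max_tries=30):
    # seek a box S0 and a step h with Phi(S0) = Si + [0,h]*F(S0) ⊆ S0
    # (Theorem A.1); S0 then encloses the trajectory over the whole gap
    S0 = Si
    for _ in range(max_tries):
        Phi = Si + Interval(0.0, h) * F(S0)
        if S0.lo <= Phi.lo and Phi.hi <= S0.hi:
            return S0, h                    # validated a priori enclosure
        # inflate the guess around Phi and try again
        pad = 0.1 * (Phi.hi - Phi.lo) + 1e-9
        S0 = Interval(min(S0.lo, Phi.lo) - pad, max(S0.hi, Phi.hi) + pad)
    return picard_step(F, Si, 0.5 * h, max_tries)   # reduce the step size

F = lambda y: Interval(-0.7, -0.7) * y      # dy1/dt = -0.7 y1, from (4.7)
S, h = picard_step(F, Interval(1.25, 1.25), 0.5)
print(S, h)    # enclosure of y1 over [t_i, t_i + h]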