Interval Arithmetic: from Principles to Implementation T. Hickey, Q. Ju
Department of Computer Science, Brandeis University, USA
M.H. van Emden
Department of Computer Science, University of Victoria, Canada
Abstract
We start with a mathematical definition of a real interval as a closed, connected set of reals. Interval arithmetic operations (addition, subtraction, multiplication and division) are likewise defined mathematically and we provide algorithms for computing these operations assuming exact real arithmetic. Next, we define interval arithmetic operations on intervals with IEEE 754 floating-point endpoints to be sound and optimal approximations of the real interval operations and we show that the IEEE standard's specification of operations involving the signed infinities, signed zeros, and the exact/inexact flag is such as to make a sound and optimal implementation more efficient. From the resulting theorems we derive data that are sufficiently detailed to convert directly to a program for efficiently implementing the interval operations. Finally we extend these results to the case of general intervals, which are defined as connected sets of reals that are not necessarily closed.
1 Introduction
The original motivation of interval arithmetic was to have forward error analysis performed automatically, concurrently with the computation being analyzed. In this way the interval obtained was the estimated value of the result together with a forward error analysis. With the advent of IEEE standard arithmetic it became possible to control the direction of rounding. When this is done correctly, the computed interval is an inclusion for the real-valued result of the computation. This inclusion property introduces a qualitative improvement in numerical computation: although rounding errors reduce the amount of information the computation gives about the result, the truth of the statement that the result is contained in the computed interval is no longer affected by the limitations of the floating-point system of arithmetic as actually implemented. When such interval arithmetic is used to solve equations, the computed interval may be empty: in that case a computer-generated proof has been obtained of the absence of solutions.
Though so far interval arithmetic has not been widely used, it is likely to be more important in the future, for a number of reasons:
1. The prevalence of memory hierarchies. The extra effort required in interval arithmetic is local and therefore usually references only the lowest level(s) of the memory hierarchy. In this way, there is less of a performance penalty for the use of interval arithmetic than in the era when architectures had only a single level of memory. It was during this earlier era that opinions about the feasibility of interval methods were formed.
2. An increased awareness of safety-critical computations. Safety-critical applications are often exclusively identified with real-time kernels for operating systems, which are not affected by rounding errors. But floating-point arithmetic is used in software that controls physical systems, and these are often safety-critical.
3. An increased awareness that sacrificing speed can give worthwhile advantages. Speed used to be all-important, taking precedence over security. Perhaps Java is not just an anomaly, but a harbinger of change. In this changed climate, interval arithmetic may be more readily accepted than before.
4. The rise of interval constraints. A recent application of interval arithmetic is interval constraints [2, 9]. The Numerica package [9] shows that in applications such as nonlinear systems of algebraic equations and non-convex global optimization, interval constraints is easier to use and gives higher performance than algorithms that are programmed directly in interval arithmetic. Many benchmark results are given in [9]. An additional example is the confirmation of the uniqueness of the previously known solution to the Ebers-Moll transistor model [3]. Such a confirmation has not been possible without interval methods of some kind. Interval constraints reduced the required computation time from over a thousand hours to less than one [17, 16].
5. Near-universal acceptance of IEEE-standard arithmetic. This makes it possible to implement an interval arithmetic system that is
sound: The resulting intervals contain the real number that is the theoretically correct value.
closed: The result of an interval operation is always defined, while being sound in the above sense. Specifically, this means that computation never needs to be halted because of overflow, division by zero, or production of a NaN.
efficient: The features of IEEE 754 floating-point arithmetic can be exploited to minimize the tests needed to achieve the soundness and closedness.
In this paper we provide algorithms and correctness proofs for interval arithmetic operations defined over several different classes of intervals. We begin with the class of all closed intervals and we provide algorithms for computing the arithmetic operations on such intervals using exact real arithmetic. We also derive algorithms for computing the sound, optimal approximations of these operations on the subset of intervals whose endpoints are IEEE floating-point numbers, including $\pm\infty$. Finally, we extend our results to the case of intervals that are not necessarily closed and to their sound and optimal IEEE approximations. We do not restrict our results to bounded intervals. This makes a sound, complete, and optimal treatment of interval division possible.
1.1 Conventional methodology
Interval and non-interval numericists agree that the purpose of numerical computation is to use computers to make quantitative predictions about natural or man-made phenomena that are represented by a mathematical model $M$ expressed in terms of real variables. Computers are restricted to representing the floating-point subset of the reals. This subset is not closed under the arithmetic operations. How should this discrepancy be handled? Here interval and non-interval numericists disagree. The conventional method is to make a model $M'$ of the model $M$. $M'$ is obtained, via a programming language, by translating the reals in $M$ to floating-point numbers. Each real is mapped to a single floating-point number. Each operation on reals is mapped to the corresponding operation on floating-point numbers, as far as these are defined. This introduces discrepancies between the models. Via stability analysis or backward error analysis one may gain confidence that solutions obtained from $M'$ are sufficiently good approximations to those of $M$. The danger of omitting such analyses was demonstrated early on by Forsythe [4].
1.2 Development of interval arithmetic
One of the striking features of floating-point arithmetic is that evaluating a seemingly innocuous expression can give rise to an enormous error. In the early stages, interval arithmetic was preoccupied with the effect of rounding errors on the accuracy of expression evaluation. Only later was it realized that interval arithmetic has the potential of going beyond expression evaluation and on to solving problems that are inaccessible to conventional approaches. We briefly review these two stages in the development of interval arithmetic.
Interval arithmetic for expression evaluation. The less ambitious use of interval arithmetic is to replace in a conventional algorithm all operations on floating-point numbers by the corresponding interval operations. Though limited in scope, this approach yields important benefits. It is no longer as important to do an a priori error analysis. Instability or poor condition manifest themselves automatically by output intervals that are too wide to be useful. Error analysis can predict these. But one may not need a prediction if one can afford, in case of intervals that are too wide, to try again with another method. In this way one can often get the same confidence without error analysis. However, this less ambitious use only covers errors from a single source: rounding. By using a conventional algorithm, results are, in nonlinear problems, still subject to discretization or truncation errors. It is possible to be more ambitious and to cover errors from all sources in nonlinear problems. Although the potential in this direction has probably hardly been touched, there is an example of an algorithm that does achieve this goal: Interval Newton.
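The expression-evaluation use described above can be illustrated with a minimal Python sketch. It is not the implementation developed in this paper: the function name `add` and the tuple representation are hypothetical, only bounded intervals are handled, and outward rounding is imitated by stepping one float outward with `math.nextafter` (Python 3.9 or later) rather than by switching the IEEE rounding direction.

```python
import math

def add(x, y):
    """Enclose {a + b : a in x, b in y}; intervals are (lo, hi) tuples of finite floats."""
    lo, hi = x[0] + y[0], x[1] + y[1]
    # Round-to-nearest may land inside the exact bound, so step one float outward.
    return (math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

# The real number 1/10 is not a float; enclose it between the two neighbours of 0.1.
tenth = (math.nextafter(0.1, -math.inf), math.nextafter(0.1, math.inf))

naive = 0.0          # conventional floating-point accumulation
total = (0.0, 0.0)   # interval accumulation
for _ in range(10):
    naive += 0.1
    total = add(total, tenth)

print(naive == 1.0)                  # False: the rounding error goes unnoticed
print(total[0] <= 1.0 <= total[1])   # True: the interval contains the real result
```

Stepping a full float outward is slightly wider than true directed rounding would be, so this sketch is sound but not optimal in the sense used later in the paper.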
Interval arithmetic for solving equations. In Newton's method for finding a zero of a nonlinear function $f$, one incurs rounding errors in evaluating the function and its derivative. In addition, there is a truncation error. This error is called thus because in an analytic justification of the method one truncates a Taylor expansion of $f$ beyond the linear term. In a geometric justification of Newton's method, this truncation corresponds to locally replacing $f$ by its tangent. The Interval Newton method (due to R.E. Moore [13], see also [7]), sketched in this paragraph, captures in its interval the errors from all sources. Given any interval $X$, Moore used the idea of Newton's method to define by means of interval arithmetic an operator $N$ such that $N(X)$ contains all zeros of $f$ that may exist in $X$.
The Interval Newton method relies on the fact that if $f$ is a function which is continuously differentiable on an interval $X$ and if $x$ and $t$ are in $X$, then there is a $\xi$ between $x$ and $t$ such that
$$f(x) = f(t) + f'(\xi)(x - t).$$
Let $t \in X$ be any point with $f(t) \neq 0$. If there is a point $x \in X$ for which $f(x) = 0$, then such a point will satisfy
$$0 = f(t) + f'(\xi)(x - t),$$
or equivalently
$$x = t - f(t)/f'(\xi)$$
for some $\xi$ in $X$. In other words, if we let $f'(X)$ denote an interval containing all $f'(\xi)$ for $\xi$ in $X$, then we must have $x \in N(X)$ where
$$N(X) = \{t\} - (\{f(t)\}/f'(X)).$$
Here, $A - B$ is defined pointwise (as the set of all differences of elements in $A$ by elements in $B$) and $A/B$ is defined, not as the pointwise quotient of elements in $A$ by elements in $B$, but rather as the set of all solutions $x$ to $a = b * x$ with $a \in A$ and $b \in B$. This is an important point that we will come back to when we formally define the arithmetic operations on intervals. Also, we have left unspecified the dependence of $t$ on $X$; usually one takes $t$ as the midpoint of $X$. Different choices for $t$ yield different operators $N$. Thus, Moore's version of the Newton method successively replaces the interval $X$ of interest by $X \cap N(X)$.
When $X$ is sufficiently narrow and contains a simple zero, then $N(X) \subseteq X$ and the width of $N(X)$ is about the square of the width of $X$. Moreover, $N(X) \subseteq X$ is a sufficient condition for the existence of a zero in $N(X)$. See [13, 6]. If $X \cap N(X)$ is empty, then $X$ contains no zeroes of $f$. Of course this assertion is conditional upon $N(X)$ indeed containing all zeroes that may exist in $X$. The rounding errors made in computing the bounds of the intervals used in the computation of $N(X)$ now assume a crucial importance. One of the motivations for including the control of direction of rounding in IEEE standard 754 has been that this speeds up the outward rounding required in interval arithmetic. An equally important contribution of the standard is that rounding sacrifices as little accuracy as possible.
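As an illustration of the step $X \mapsto X \cap N(X)$, the following Python sketch applies it to $f(x) = x^2 - 2$. It rests on several assumptions that are not part of the paper: the names `newton_step`, `f`, and `df_enclosure` are hypothetical, the derivative enclosure is required to exclude zero (so no extended division is needed), and bounds are computed in ordinary round-to-nearest arithmetic, so the result is not a rigorous enclosure.

```python
import math

def newton_step(f, df_enclosure, X):
    """Return X ∩ N(X) with N(X) = {t} - {f(t)}/f'(X), or None if the result is empty.

    Sketch only: requires 0 not in f'(X) and ignores outward rounding.
    """
    lo, hi = X
    t = (lo + hi) / 2.0                  # midpoint choice for t
    ft = f(t)
    dlo, dhi = df_enclosure(X)
    if dlo <= 0.0 <= dhi:
        raise ValueError("this sketch does not divide by intervals containing zero")
    q = (ft / dlo, ft / dhi)             # endpoints of {f(t)}/f'(X)
    nlo, nhi = t - max(q), t - min(q)    # N(X)
    new_lo, new_hi = max(lo, nlo), min(hi, nhi)
    return None if new_lo > new_hi else (new_lo, new_hi)

f = lambda x: x * x - 2.0
df_enclosure = lambda X: (2.0 * X[0], 2.0 * X[1])   # encloses f'(x) = 2x when X[0] > 0

X = (1.0, 2.0)
for _ in range(2):
    X = newton_step(f, df_enclosure, X)
print(X, X[0] <= math.sqrt(2.0) <= X[1])

# On [2, 3] the intersection is empty, which indicates that f has no zero there.
print(newton_step(f, df_enclosure, (2.0, 3.0)))
```

Two steps shrink the width from 1 to roughly $3.6 \times 10^{-4}$, in line with the near-quadratic contraction mentioned above. Further steps would need outward rounding to remain trustworthy.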
1.3 Incompatibilities between existing approaches to interval arithmetic
Perhaps because the distinction between the two stages in the development of interval arithmetic has not been clearly recognized, there are a number of incompatibilities between existing approaches to interval arithmetic, as evidenced by the varied answers they provide to the following questions. Can intervals be unbounded? Can intervals contain infinity? Is division by an interval containing zero defined? Can an interval operation yield more than one disjoint interval as result? The questions are interrelated in that the answer to one may constrain the answers to the other ones. We believe that the method followed in this paper can answer all these questions in such a way that favorable answers to the other ones are not compromised.
Can intervals be unbounded? As for the first question, Alefeld and Herzberger [1] answer this in the negative. According to their definition a real interval generalizes a real in that it is a pair of reals, rather than a single one. If one does not take into account the need to implement intervals, their definition may well be satisfactory: for every finite set of reals there is a bounded interval containing it. However, this would only be satisfactory because there is no largest real. As there is a largest floating-point number, it seems preferable to allow real intervals to be unbounded, as unbounded intervals can be represented according to the IEEE floating-point standard. In that way, however coarse-grained an approximation the implemented intervals are, we can make sure that for every set of reals there is an implemented interval containing it.
Can intervals contain infinity? A natural representation of an unbounded interval is as a pair of floating-point numbers at least one of which is infinity. The question arises: does the set denoted by such a pair only contain reals, or does it also contain an infinity? For Alefeld and Herzberger [1] as well as Hansen [7], "interval", unqualified, is a bounded set of reals. Kahan [10] and Walster [20] recognize the importance of unbounded intervals. They achieve this by allowing the sets denoted by intervals to contain an infinity, so that intervals are no longer sets of reals. We believe that this is a commitment not to be lightly undertaken. The purpose of numerical computation is to make predictions about mathematical models of continuous natural phenomena. It has taken a century for the calculus to evolve a satisfactory model of continuous change. By the end of the nineteenth century this evolution settled down to the standard concept of real number. All reals are finite, although there is no largest one. We feel that any modification of the classical concept of the reals should be a last resort. As we will show in this paper, it is feasible to efficiently operate in IEEE standard floating-point arithmetic on intervals that are possibly unbounded sets consisting of reals only.
Is division by an interval containing zero defined? The question of infinity of course only comes up when considering division by an interval containing zero. From the point of view of Interval Newton, such a division is perfectly natural, and occurs for example when the midpoint of the interval is near a positive, local maximum of the function. In such a case, any solutions in the interval must be to the left or right of the maximum, so $N(X)$ is the union of two disjoint intervals. If only one of these intervals intersects $X$, then Interval Newton will have eliminated over half of the points in $X$. However, according to the definitions of normal, i.e. not extended, interval arithmetic, as given in [1, 7], division by an interval containing zero is undefined. This restriction complicates programming. Conventional non-interval programming with floating-point numbers would be considerably complicated if every division had to be preceded by code to handle the case that the divisor is too close to zero. Because that case can be assumed exceptional, we do not clutter the code this way and rely on the overflow exception. It is reasonable to assume that it is a rare occurrence for a single floating-point divisor to come too close to zero. It is not, however, reasonable to assume that it is rare for an interval divisor to contain zero. Intervals can be very wide. So in interval arithmetic one should not have to rely on exceptions or else meticulously precede every division by an appropriate safeguard. The need for an extended interval arithmetic where divisors are allowed to contain zero is acknowledged by most texts. Hansen [7] attributes extended interval arithmetic to Kahan [10] and, independently, to Hanson [8]. It has been described by Kearfott [11], who cites modifications to extended interval arithmetic due to M. Novoa [14] and D. Ratz [18]. Kearfott refers to the result as the Kahan-Novoa-Ratz formulas. Although these formulas do allow the denominator to contain zero, which can result in unbounded intervals, they require the two intervals to be bounded.
This incompleteness is not a problem in Ratz's model of interval arithmetic because these "extended" intervals are only envisaged as appearing temporarily inside an Interval Newton intersection, which will always yield one or two closed bounded intervals anyway. The paper by Walster [20] is also interesting in that it does not treat the possibility to divide by an interval containing zero as a secondary issue. Instead, he describes one single interval arithmetic where divisor intervals can contain zero. In this paper, we extend the Kahan-Novoa-Ratz formulas to allow the intervals to be unbounded.
Can operations yield disjoint sets? The major obstacle to adoption of interval arithmetic has always been the cost in performance. This additional cost is moderate in the situation when intervals are represented by a single pair of floating-point numbers. Allowing the result of an operation to be sometimes one, sometimes more, intervals implies a considerable further performance penalty. Thus there is a strong motivation to replace a pair of disjoint intervals by the smallest single interval containing their union. But then one loses information about the value of the variable associated with the result of the operation.
Older [15] describes an ingenious compromise. He observes that for most purposes, the pair of disjoint intervals is only required for intersection with a single interval. This suggests that in addition to pure interval division $X/Y$ we adopt a three-argument operation on intervals $X$, $Y$, and $Z$ yielding the smallest floating-point interval containing $(X/Y) \cap Z$, where, internally to the function implementing the operation, provision is made allowing $X/Y$ to be a pair of disjoint intervals. For example, $X$, $Y$, and $Z$ could be $[1, 1]$, $[-1, 1]$, and $[0, 0.9]$, respectively. On the one hand, we need to get back to a single interval without too much loss of performance. Replacing $X/Y$ immediately by the least interval containing it has the disadvantage of not detecting the situation that $Z$ fits inside the gap in $X/Y$, in which case $(X/Y) \cap Z$ is empty, a result one does not want to miss. With Older's method one indeed does not miss that valuable result, yet performance is minimally impacted. The Kahan-Novoa-Ratz formulas give in some cases disjoint intervals as result of interval division. These are really the exterior intervals of Kahan and need not entail extra representational costs.
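The three-argument operation can be sketched in Python for one special case. This is an illustration under stated assumptions, not Older's algorithm as published: the name `div_intersect` is hypothetical, only the case $X = [a, b]$ with $a > 0$ and $Y = [c, d]$ with $c < 0 < d$ is covered (where $X/Y$ is the union of $\{z \mid z \le a/c\}$ and $\{z \mid z \ge a/d\}$), and outward rounding of the two quotients is omitted.

```python
import math

def div_intersect(X, Y, Z):
    """Smallest single interval containing (X/Y) ∩ Z, or None if that set is empty.

    Sketch only: assumes 0 < a <= b for X = (a, b) and c < 0 < d for Y = (c, d).
    """
    a, b = X
    c, d = Y
    if not (a > 0 and c < 0 < d):
        raise ValueError("sketch covers only 0 < a <= b and c < 0 < d")
    # X/Y splits into two unbounded pieces separated by a gap around zero.
    pieces = [(-math.inf, a / c), (a / d, math.inf)]
    hits = []
    for lo, hi in pieces:
        lo2, hi2 = max(lo, Z[0]), min(hi, Z[1])   # intersect each piece with Z
        if lo2 <= hi2:
            hits.append((lo2, hi2))
    if not hits:
        return None                                # Z fits inside the gap
    return (min(h[0] for h in hits), max(h[1] for h in hits))

print(div_intersect((1.0, 1.0), (-1.0, 1.0), (0.0, 0.9)))   # None
print(div_intersect((1.0, 1.0), (-1.0, 1.0), (0.0, 5.0)))   # (1.0, 5.0)
```

The first call reproduces the example above: taking the hull of $X/Y$ first would give all of $\mathbb{R}$ and the empty intersection would be missed.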
2 Method
The discrepancies noted in the previous section show the need to explicitly formulate a method, which we do in this section. The Interval Newton method raises the stakes to be won by interval methods: not only can they give warning against instability of conventional algorithms, but they can, if properly implemented, function as computer-generated proofs that the real-valued unknowns defined in the problem statement are contained in the result intervals. These considerations suggest that interval arithmetic should adhere to the following principles.
2.1 Principles
I: Variables in mathematical models range over the reals. Numerical computation is often the solving of equalities or inequalities between real-valued expressions. As all continuous-variable mathematical models are in terms of reals, the unknowns that occur in the problems to be solved are reals also. The infinities only come in as useful for representing unbounded intervals, but such intervals, bounded or not, are sets containing reals only. Conventional numerical computation assumes that it is necessary to replace the mathematical model in which variables are reals by a model of the mathematical model in which the variables are floating-point numbers. Such an assumption is unnecessary in interval methods. The extra layer of modeling incurred in conventional numerical computation gives rise to a richly documented set of problems, e.g. [4]. Interval methods work directly on the mathematical model, not on the floating-point model of the mathematical model.
II: Computation is in terms of sets of reals. Interval methods can work directly with the mathematical model by associating variables in the model with sets of reals. Interval methods can, if implemented correctly, be sound in the sense that each unknown is a member of its associated set at each stage of the computation. In this way, the elementary operations of computation are operations on sets of reals yielding sets of reals.
III: Sets of reals are restricted to intervals. Given that computation consists of operations on sets of reals, it has to be decided which finite set of sets of reals is to be represented on the computer. The choice is determined by the criteria of close approximation and efficiency. Approximation now takes the form of replacing a non-representable set by a representable superset. Good approximation means that, relative to the magnitude of the numbers involved, the approximating superset is not much larger than the approximand. Efficiency takes into account storage and time. Restricting to closed intervals is one reasonable choice since such intervals require just two double precision numbers to store. Nevertheless, there are some applications in which it may be useful to allow open or half-open intervals, which requires an extra two bits (one for each endpoint) and additional processing time for the interval operations. We provide algorithms for computing the interval operations for both closed intervals and non-closed intervals. This principle suggests that an interval be regarded as a set of reals rather than as a new entity in need of an independent axiomatic definition. Typical of this latter approach is [1].
IV: Intervals have floating-point numbers as bounds. The desirability of representing unbounded sets implies that the sets should be represented by pairs of extended reals. Fortunately, IEEE standard 754 represents not only reals but also the two infinities. In this way the standard allows implementation of sets of reals with this closure property. The decision to represent a finite set of reals by a pair of floating-point numbers does not leave open the option of whether or not to include the bounds. Such an option would require two additional bits, say, a byte. As most architectures are 32-bit or 64-bit, there is an overwhelming advantage in the use of only a pair of IEEE standard floating-point numbers for the representation of sets of reals.
2.2 Applying the principles
If one considers the meaning of an interval (namely a set of reals) separately from its eventual representation in computer memory, then it is clear how to resolve the incompatibilities between the various approaches to interval arithmetic discussed in the previous section. We refer to this distinction between meaning and representation in the usual way as semantics versus syntax. At the semantic level, we only concentrate on defining a set of reals. Here conventional set-theoretical notation is perfectly adequate; for example, $\{x \in \mathbb{R} \mid a \le x\}$, where $a \in \mathbb{R}$. At the semantic level intervals play no special role: many types of non-interval sets are as easy to describe as intervals. At the syntactic level, we consider economy of representation in computer memory and economy of execution of operations on sets of reals. At the syntactic level, pairs of floating-point numbers are especially attractive as representation. Any approach to interval arithmetic has to decide what meaning to associate with any given such pair.
The role of the infinities and interval bounds. We have argued that numerical computation is about mathematical models defined in terms of reals. All reals are finite. What, then, is the role of the infinities? Different authors have given different answers. Our answer to this conundrum is to compute on sets of reals, which can be unbounded. For example, syntax requires a suitable representation in computer memory of sets such as $\{x \in \mathbb{R} \mid a \le x\}$, where $a \in \mathbb{R}$. Most processors conform to the IEEE standard for floating-point arithmetic, which is modeled on the extended reals. This suggests representing sets that do not have an upper bound by pairs which have a second element equal to $+\infty$, and similarly for sets unbounded in the other direction. In syntax we work in two stages. In the first stage we postpone considering many of the details of the IEEE standard by representing intervals as pairs of extended reals. This allows the main features of interval arithmetic to be worked out. By ensuring that operations on extended reals are always defined, we ensure that later NaNs never arise. Only in the final stage do we consider the additional details of the IEEE standard. For example, here we need to ensure that a zero bound has the right sign.
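The two IEEE details mentioned above, NaNs and signed zeros, can be observed directly. The following Python lines are only an illustration of standard IEEE 754 behaviour as exposed by Python floats; they are not part of the algorithms of this paper. Division by a zero float is deliberately avoided because Python raises `ZeroDivisionError` there instead of returning an infinity.

```python
import math

inf = math.inf

# Operations on extended reals that IEEE 754 leaves undefined yield NaN.
print(math.isnan(inf - inf))    # True
print(math.isnan(0.0 * inf))    # True
print(math.isnan(inf / inf))    # True

# The two zeros compare equal, yet they carry different signs.
neg_zero = -0.0
print(neg_zero == 0.0)                  # True
print(math.copysign(1.0, neg_zero))     # -1.0
print(math.copysign(1.0, 0.0))          # 1.0
```

An interval implementation therefore has to arrange that combinations such as $\infty - \infty$ or $0 \cdot \infty$ are never evaluated on bounds, or are given an explicitly defined result.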
Defining interval division. Separation of syntax from semantics is especially useful in the investigation of interval division. Semantically, we give a complete description of $[a, b]/[c, d]$ as a set of reals. It turns out that there are more special cases than one might like, but that this number is manageable. These can be classified according to whether there are one or two disjoint connected sets, and according to whether a zero bound is open or not.
3 Semantical treatment of real intervals and their operations
Theorem 1 Let $a$ and $b$ be reals. The following are closed connected sets of reals: $\{x \in \mathbb{R} \mid a \le x \le b\}$, $\{x \in \mathbb{R} \mid x \le b\}$, $\{x \in \mathbb{R} \mid a \le x\}$, and $\mathbb{R}$. There are no other closed connected sets of reals.
This theorem is a well-known result in topology; see for example [12]. Note that the theorem also identifies $\emptyset$ as a closed, connected set. In the sequel we will assume that $a \le b$ in $[a, b]$ and use a non-interval notation for the empty set.
Definition 1 A real interval is a closed connected set of reals.
3.1 Arithmetic operations on real intervals
The literature is unanimous in defining the interval operations $X + Y$, $X - Y$, and $X * Y$ by
Definition 2 Let $X$ and $Y$ be real intervals and let $\circ$ be one of the operators $+$, $-$, or $*$.
$$X \circ Y = \{x \circ y \mid x \in X \land y \in Y\} \qquad (1)$$
The right-hand side is the conventional pointwise extension of a function of type $S \to T$ to a function of type $2^S \to 2^T$. There is, however, no consensus on interval division. This is puzzling, as equation 1 is well-defined in case $\circ$ is $/$:
$$X/Y = \{x/y \mid x \in X \land y \in Y\} \qquad (2)$$
That this is so can be seen from the definition of $x/y$, which is the real $z$, if any, such that $x = z * y$. Of course, in case $x \neq 0$ and $y = 0$, no such $z$ exists, hence $x/y$ does not exist. Hence in equation 2 no such $x/y$ is included in $X/Y$. In case one worries that equation 2 suggests an oversight, it is perhaps advisable to write
$$X/Y = \{x/y \mid x \in X \land y \in Y \land x/y \text{ exists}\} \qquad (3)$$
In most expositions, interval division is only defined under the condition that 0 not be contained in $Y$. Such a restriction is acknowledged by several authors to be unacceptable (it makes Interval Newton awkward) and unnecessary. However, the obvious remedy in equation 2 has for a long time been considered an exotic variant of interval arithmetic with as sole references two inaccessible publications [10, 8]. Only recently [14, 18, 20] has interval division received the attention it needed.
We have seen, however, that the most useful definition of division to use in the Interval Newton method is to define the quotient in terms of solutions to a linear equation, since that quotient arises from solving the equation $(x - t) * f'(\xi) = -f(t)$ for $x - t$. This suggests letting $X \oslash Y$ denote the set of all solutions $z$ to $x = y * z$ with $x \in X$ and $y \in Y$.
$$X \oslash Y = \{z \mid \exists x \in X, y \in Y.\ x = y * z\} \qquad (4)$$
In particular, if $f(t) = f'(\xi) = 0$, then any $x - t$ will solve this equation, so this would suggest that we should have $\{0\} \oslash \{0\} = [-\infty, \infty]$, which is indeed the case.
To keep our discussion as widely applicable as possible, we will consider both forms of division, as put forth in the following definition.
Definition 3 Let $X$ and $Y$ be real intervals, then
a) the functional quotient of $X$ and $Y$ is defined by:
$$X/Y = \{z \mid \exists x \in X, y \in Y.\ y \neq 0,\ z = x/y\} \qquad (5)$$
b) the relational quotient of $X$ and $Y$ is defined by:
$$X \oslash Y = \{z \mid \exists x \in X, y \in Y.\ x = y * z\} \qquad (6)$$
The functional quotient should be used in those cases where the quotient represents a function defined only where the denominator is non-zero, e.g. when evaluating $(X * Y)/(X^2 + Y^2)$, which is defined and continuous everywhere except at $(0, 0)$. The relational quotient should be used when the expression arises as the solution to a linear equation, as in the Interval Newton method. The safest approach is to always use the relational quotient since it is more conservative than the functional, as shown in the following:
Theorem 2 Let $X$ and $Y$ be real intervals, then $X/Y \subseteq X \oslash Y$ and
$$X \oslash Y = \begin{cases} X/Y & \text{if } 0 \notin X \cap Y \\ \mathbb{R} & \text{otherwise} \end{cases} \qquad (7)$$
Proof. First observe that if $x/y = z$, then $x = y * z$, so $X/Y \subseteq X \oslash Y$. The converse is true if 0 is not in both $X$ and $Y$. Indeed, if $0 \notin Y$, then $x = y * z$ implies that $y \neq 0$ and $z = x/y$. Similarly, if $0 \notin X$, then $x = y * z$ implies $y \neq 0$ and so $z = x/y$. On the other hand, if 0 is contained in both $X$ and $Y$, then $X \oslash Y = \mathbb{R}$ as $0 = 0 * z$ holds for all real $z$. $\Box$
This theorem shows that $X \oslash Y$ can be easily computed from $X/Y$ by simply checking whether $X$ and $Y$ both contain 0, and either returning $X/Y$ or returning $\mathbb{R}$, based on the result.
It is well known that the interval arithmetic operations on bounded intervals are closed (provided one disallows division by intervals containing zero). We state and prove this theorem below for completeness. In the next section we will compute explicit formulae for these operations on unbounded intervals (possibly containing zero) and will in this way prove that the addition, subtraction, and multiplication operations are closed on possibly unbounded intervals, whereas division is not.
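The reduction of the relational quotient to the functional one stated in Theorem 2 can be sketched in Python as follows. The names `relational_quotient` and `bounded_functional_quotient` are hypothetical; the latter is only a stand-in that handles bounded intervals with $0 \notin Y$ and ignores rounding, whereas the complete case analysis for the functional quotient is developed later in the paper.

```python
import math

def contains_zero(I):
    """True if the closed interval I = (lo, hi) contains 0."""
    return I[0] <= 0.0 <= I[1]

def bounded_functional_quotient(X, Y):
    """X/Y for bounded X and bounded Y with 0 not in Y (stand-in, no outward rounding)."""
    if contains_zero(Y):
        raise ValueError("stand-in does not cover divisors containing zero")
    q = [x / y for x in X for y in Y]     # quotients of all endpoint pairs
    return (min(q), max(q))

def relational_quotient(X, Y, functional=bounded_functional_quotient):
    """Theorem 2: the relational quotient is R if 0 is in both X and Y, otherwise X/Y."""
    if contains_zero(X) and contains_zero(Y):
        return (-math.inf, math.inf)      # 0 = 0 * z holds for every real z
    return functional(X, Y)

print(relational_quotient((0.0, 0.0), (0.0, 0.0)))   # (-inf, inf)
print(relational_quotient((1.0, 2.0), (2.0, 4.0)))   # (0.25, 1.0)
```

Passing the functional quotient in as a parameter mirrors the remark above that only one extra test on the operands is needed.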
Theorem 3 If $S$ and $T$ are non-empty, bounded, real intervals, then so are $S + T$, $S - T$, and $S * T$. If, in addition, $T$ does not contain zero, then $S/T = S \oslash T$ is a non-empty, bounded, real interval as well. More generally, $f(S, T)$ will be a bounded, real interval, provided $f$ is continuous on a domain containing $S \times T$.
Proof. Note that according to Definition 1, "real interval" means closed connected set. This theorem is a consequence of some of the general properties of continuous functions, namely, that continuous functions map connected sets into connected sets, and map compact sets into compact sets. Since the compact sets of $\mathbb{R}$ are just the closed and bounded subsets, the theorem follows from the fact that the first three operators are continuous on $\mathbb{R} \times \mathbb{R}$ and the functional division operator is continuous on $\mathbb{R} \times (\mathbb{R} \setminus \{0\})$. Since we assume $0 \notin T$, the relational and functional division definitions agree. $\Box$
In the case where $T$ contains zero or is unbounded, the quotient $S/T$ may not be an interval. Indeed, even though $S = \{1\}$ and $T = \{x \in \mathbb{R} \mid -1 \le x \le 1\}$ are intervals and $S/T = \{x \mid (-1 \ge x) \lor (1 \le x)\}$ is defined as a set of reals, $S/T$ is not a connected set, hence not an interval. Similarly, although $S = \{1\}$ and $T = \{x \mid x \ge 1\}$ are intervals, $S/T = \{x \mid 0 < x \le 1\}$ is not closed, and hence not an interval.
4 Syntactical treatment of real intervals and their operations
Although our semantics has defined real intervals and their operations in a mathematically rigorous way, so far we could only use cumbersome set-comprehension expressions such as
$$\{x \in \mathbb{R} \mid a \le x \le b\}.$$
What we need in addition are concise expressions for real intervals. We also need rules for computing the operations of Definitions 2 and 3 on the basis of such expressions. These expressions and their manipulation we regard as the syntactical aspect of interval arithmetic. We adopt the notation used by Hansen [7].
4.1 Expressions for real intervals
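Definition 4 Let $a$ and $b$ be reals such that $a \le b$.
$$[a, b] \stackrel{\mathrm{def}}{=} \{x \in \mathbb{R} \mid a \le x \le b\}$$
$$[-\infty, b] \stackrel{\mathrm{def}}{=} \{x \in \mathbb{R} \mid x \le b\}$$
$$[a, +\infty] \stackrel{\mathrm{def}}{=} \{x \in \mathbb{R} \mid a \le x\}$$
$$[-\infty, +\infty] \stackrel{\mathrm{def}}{=} \mathbb{R}$$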
The definition gives an expression for each of the types of nonempty real interval that exist according to Theorem 1. To take full advantage of the notation, we regard each expression abstractly as a pair. The first (second) element of a pair is called the lower (upper) bound of the interval denoted by the pair. Thus we can summarize all expressions of the definition by $[a, b]$ where $a$ and $b$ belong to the set $\mathbb{R} \cup \{-\infty, +\infty\}$ of extended reals and $a \le b$. The above notations do not cover the empty interval. We have not found it urgent to find a special notation for it and will use $\emptyset$. Below, we summarize the properties of the extended reals. We note that

Corollary 1 If $[a, b]$ is a non-empty real interval, then $a \ne +\infty$ and $b \ne -\infty$.
Proof. See Definition 4 and Theorem 1, which says that there are no non-empty intervals other than the ones covered by Definition 4. □

The notation $[a, b]$ has been used by Hansen [7]. It is borrowed from mathematics and coincides with conventional mathematical usage if $a$ and $b$ are reals. However, mathematical usage implies that $a \in [a, b]$ and $b \in [a, b]$. Hansen's notation does not have this property: for example $-\infty \notin [-\infty, b]$ because $-\infty \notin \mathbb{R}$ and $[-\infty, b] \subseteq \mathbb{R}$. Mathematical usage would require one to write $(-\infty, b]$ rather than $[-\infty, b]$. We agree that it is regrettable that Definition 4 departs from mathematical usage. But we think that the departure is justified by the resulting flexibility: in this notation $[a, b]$ can denote any nonempty, closed, connected set of reals, bounded or not.

Another potential terminological problem: in $[a, b]$, we call e.g. $a$ the lower bound of $[a, b]$. "Lower bound" in this sense is not to be confused with "greatest lower bound" (often abbreviated to glb). If $a \in \mathbb{R}$, then $a$ happens to be the glb of $[a, b]$. But if $a = -\infty$, then the lower bound of $[a, b]$ is not the glb of $[a, b]$ because $[a, b]$ does not have a glb in this case.
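The pair notation of Definition 4 can be sketched in code. The following is only an illustration under stated assumptions: Python floats (including `math.inf`) stand in for the extended reals, and the class name `ClosedInterval` is hypothetical, not something defined in this paper. It shows the point made above that an infinite bound is a bound but never a member.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosedInterval:
    """The pair [lo, hi] of Definition 4 (floats stand in for extended reals)."""
    lo: float
    hi: float

    def __post_init__(self):
        if math.isnan(self.lo) or math.isnan(self.hi):
            raise ValueError("bounds must be extended reals, not NaN")
        # Corollary 1: a non-empty interval has lo != +inf and hi != -inf.
        if self.lo > self.hi or self.lo == math.inf or self.hi == -math.inf:
            raise ValueError("not a non-empty real interval")

    def __contains__(self, x: float) -> bool:
        # Only reals are members; the infinities are bounds, never elements.
        return math.isfinite(x) and self.lo <= x <= self.hi


left_unbounded = ClosedInterval(-math.inf, 3.0)
print(2.5 in left_unbounded)        # a real below the upper bound
print(-math.inf in left_unbounded)  # the lower bound itself is not a member
```

The empty interval is deliberately not representable here, mirroring the remark that the notation $[a, b]$ does not cover it.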
4.2 Examples
Now that we have a convenient notation for intervals, let us illustrate by means of examples some of the consequences of our semantic definitions of the interval operations.
1. $[2, 2] \cdot [\pi, \pi] = [2\pi, 2\pi]$. In other approaches this holds because intervals are generalized reals. In our approach this is true because in Definition 2 all reals in the set $\{2\}$ combine with all reals in the set $\{\pi\}$ to produce $\{2\pi\}$.
2. $[0, 0] \cdot [-\infty, \infty] = [0, 0]$. This is easily verified with Definition 2.
3. $[0, 0]/[0, 0] = \emptyset$. This follows from Definition 3.
4. $[0, 0] \oslash [0, 0] = [-\infty, \infty]$. This follows from Definition 3 as $0 = 0 \cdot z$ for all real numbers $z$.
5. $[0, 1]/[0, 1] = [0, \infty]$. This holds as any non-negative number can be expressed as $x/y$ with $x, y \in [0, 1]$.
6. $[0, 1] \oslash [0, 1] = [-\infty, \infty]$. This follows since $0 = 0 \cdot z$ holds for any $z$.
7. $[1, 1]/[0, 0] = [1, 1] \oslash [0, 0] = \emptyset$, which is a real interval.
8. $[1, 1]/[-\infty, +\infty] = [1, 1] \oslash [-\infty, +\infty] = \mathbb{R} \setminus \{0\}$, which is not a real interval. In fact, it is not even connected.
Class | at least one negative | at least one positive | Signs of bounds
$M$ | yes | yes | $a < 0 < b$
$Z$ | no | no | $a = 0 \wedge b = 0$
$P$ | no | yes | $a \ge 0 \wedge b > 0$
$P_0$ | no | yes | $a = 0 \wedge b > 0$
$P_1$ | no | yes | $a > 0 \wedge b > 0$
$N$ | yes | no | [...]

[...]
$$
[a, b] \oslash [c, d] =
\begin{cases}
[a, b] \cdot [1/d, 1/c] & \text{if } 0 \notin [c, d] \\
[-\infty, \infty] & \text{if } 0 \in [a, b] \wedge 0 \in [c, d] \\
[b/c, \infty] & \text{if } b < 0 \wedge c < d = 0 \\
[-\infty, b/d] \cup [b/c, \infty] & \text{if } b < 0 \wedge c < 0 < d \\
[-\infty, b/d] & \text{if } b < 0 \wedge 0 = c < d \\
[-\infty, a/c] & \text{if } 0 < a \wedge c < d = 0 \\
[-\infty, a/c] \cup [a/d, \infty] & \text{if } 0 < a \wedge c < 0 < d \\
[a/d, \infty] & \text{if } 0 < a \wedge 0 = c < d \\
\emptyset & \text{if } 0 \notin [a, b] \wedge c = d = 0
\end{cases}
$$
Ratz proves this by considering each of these cases and deriving the result by a series of direct transformations from the definition of $X \oslash Y$. There are several reasons for extending Ratz's theorem.
1. It is only defined for operations on bounded intervals. The theorem shows that the quotient of two bounded intervals is either empty, or a bounded interval, or an unbounded interval, or a union of two unbounded intervals. Although the result can be an unbounded interval (or a union of two unbounded intervals), the Ratz formula does not allow the arguments to be unbounded intervals. Of course, Ratz intended this extended interval division only to be used in the context of the Interval Newton method, where the possibly unbounded set would be intersected with the original bounded interval, to give zero, one, or two bounded intervals. It turns out to be hardly more complex to define a general-purpose interval division.
2. Theorem 8 relies on the multiplication formulas by converting many of the quotients into products $[a, b] \cdot [1/d, 1/c]$. This can be inefficient and can also introduce additional roundoff errors (as, for example, $a/d$ will in general be more precise than $a \cdot (1/d)$ when evaluated in floating point arithmetic).
3. It only computes the result of the relational division $X \oslash Y$. Although this is the safest division operation, there may be times when functional division is more appropriate. For example, if we evaluate $xy/(x + y)$ on the interval $x = y = [0, 1]$ using functional division we obtain $[0, 1]/[0, 2] = [0, \infty]$, which shows that the function is non-negative on that interval, whereas using relational division yields $[0, 1] \oslash [0, 2] = [-\infty, \infty]$, which conveys no information.
Of course, if one allows division by unbounded intervals one admits a complication that Ratz did not have to handle: the resulting interval is no longer guaranteed to be a closed set.
For example, $[1, 1]/[1, \infty] = \{x \in \mathbb{R} \mid 0 < x \le 1\}$ is a connected set that does not contain its greatest lower bound, and hence is not a closed set. The following theorem shows that this complication is conveniently handled by our classification of intervals.

Class of $[a, b]$ | Class of $[c, d]$ | general formula | unless | exception case | proof
$N_1$ | $N$ | $[b/c, a/d] \setminus \{0\}$ | $d = 0$ | $[b/c, \infty] \setminus \{0\}$ | $S_2$
$N_0$ | $N$ | $[0, a/d]$ | $d = 0$ | $[0, \infty]$ | $S_2$
$M$ | $N$ | $[b/d, a/d]$ | $d = 0$ | $[-\infty, \infty]$ | $S_1$
$P_0$ | $N$ | $[b/d, 0]$ | $d = 0$ | $[-\infty, 0]$ | $S_1$
$P_1$ | $N$ | $[b/d, a/c] \setminus \{0\}$ | $d = 0$ | $[-\infty, a/c] \setminus \{0\}$ | $S_1$
$N_1$ | $M$ | $([-\infty, b/d] \cup [b/c, \infty]) \setminus \{0\}$ | | | $S_2$
$N_0$ | $M$ | $[-\infty, +\infty]$ | | | $S_2$
$M$ | $M$ | $[-\infty, +\infty]$ | | | D
$P_0$ | $M$ | $[-\infty, +\infty]$ | | | D
$P_1$ | $M$ | $([-\infty, a/c] \cup [a/d, \infty]) \setminus \{0\}$ | | | D
$N_1$ | $P$ | $[a/c, b/d] \setminus \{0\}$ | $c = 0$ | $[-\infty, b/d] \setminus \{0\}$ | $S_2$
$N_0$ | $P$ | $[a/c, 0]$ | $c = 0$ | $[-\infty, 0]$ | $S_2$
$M$ | $P$ | $[a/c, b/c]$ | $c = 0$ | $[-\infty, \infty]$ | D
$P_0$ | $P$ | $[0, b/c]$ | $c = 0$ | $[0, \infty]$ | D
$P_1$ | $P$ | $[a/d, b/c] \setminus \{0\}$ | $c = 0$ | $[a/d, \infty] \setminus \{0\}$ | D

Figure 4: Case analysis for functional division of real intervals, $[a, b]/[c, d]$, when $a \le b$, $c \le d$, and neither interval is $[0, 0]$. The last column refers to how the formula has been proved ("D" for a direct proof; "$S_1$" and "$S_2$" refer to a symmetry used to reduce it to an earlier case).

Theorem 9 If $[a, b]$ and $[c, d]$ are nonempty, real intervals, then their functional quotient can be computed as follows. If either is $[0, 0]$, then
$$
[a, b]/[0, 0] = \emptyset, \qquad
[0, 0]/[c, d] =
\begin{cases}
[0, 0] & \text{if } [c, d] \ne [0, 0] \\
\emptyset & \text{if } [c, d] = [0, 0]
\end{cases}
$$
If neither is equal to $[0, 0]$, then $[a, b]/[c, d]$ is given as in the "general formula" column of the table in Figure 4, unless the specified condition in column 4 holds, in which case the quotient is given by the exception case formula in column 5.

Proof. Before beginning the proof, we make the observation that the exception cases in the table all arise because the general formula contains a quotient of the form $u/v$ with $u \ne 0$, and in the exception case $v = 0$. We will see in the next section that the IEEE signed zero properties can be used to entirely eliminate the exception column, thereby greatly simplifying the computation of interval quotients.

According to the table in Figure 4, we prove directly the cases $MM$ (i.e. $[a, b] \in M$ and $[c, d] \in M$), $P_0M$, $P_0P$, $P_1P$, $MP$, and $P_1M$. These six directly proved cases are indicated by a D in the last column. Then we use the fact that $N$ is the symmetrical counterpart of $P$ according to the symmetry $x/y = -(x/(-y))$ (indicated by $S_1$ in column six). This gives $MN$ from $MP$, $P_0N$ from $P_0P$, and $P_1N$ from $P_1P$. Finally, we obtain all six cases where $[a, b] \in N_1$ or $[a, b] \in N_0$ from those where $[a, b] \in P_1$ or $[a, b] \in P_0$ by using the symmetry $x/y = -((-x)/y)$ (indicated by $S_2$). Thus it remains to prove table entries $MM$, $P_0M$, $P_0P$, $P_1P$, $MP$, $P_1M$.

In cases $MM$ and $P_0M$ we have $[0, +\epsilon] \subseteq [a, b]$ and $[-\epsilon, +\epsilon] \subseteq [c, d]$ for some $\epsilon > 0$. This ensures that all reals occur in $[a, b]/[c, d]$, so that the quotient interval is $[-\infty, +\infty]$.

Case $P_0P$: $a = 0 < b$ and $0 \le c$.
If $c = 0$, then the quotient contains $[0, \epsilon]/[0, \epsilon] = [0, \infty]$ and contains no negative values, so the exception case must be $[0, \infty]$. If $c \ne 0$, then $c > 0$ and so $[0, b]/[c, d] = [0, b/c]$. This holds also if $b$ or $d$ is $+\infty$.

Case $P_1P$: $0 < a$ and $0 \le c$. Note that $a$ and $c$ are finite. Suppose first that $c \ne 0$; then $[a, b]$ and $[c, d]$ are single-signed and non-zero, so $[a, b]/[c, d] = [a/d, b/c]$, provided that $b$ and $d$ are finite. The formula holds when $b = +\infty$ because the quotient contains arbitrarily large numbers, so the upper bound should be $\infty$. If $d$ is infinite, then $a/d = a/\infty = 0$ by the rules of extended arithmetic, but since $0 \notin [a, b]/[c, d]$ we include the possibility of unbounded $[c, d]$ by excluding 0: $[a, b]/[c, d] = [a/d, b/c] \setminus \{0\}$. If $c = 0$, then the quotient contains arbitrarily large positive values, so the exception case is $[a/d, \infty] \setminus \{0\}$.

Let us next consider case $MP$: $a < 0 < b$ and $0 \le c$. If $c$ is zero, then the quotient $[a, b]/[c, d]$ contains $[-\epsilon, \epsilon]/[0, \epsilon]$ for some $\epsilon > 0$, so the result should be $[-\infty, \infty]$ in this exception case. Otherwise, $c > 0$, so splitting into single-signed components and using our formulas for $P_0/P$ we get
$$[a, b]/[c, d] = [a, 0]/[c, d] \cup [0, b]/[c, d] = [a/c, 0] \cup [0, b/c] = [a/c, b/c].$$
This formula also holds for $a = -\infty$ and/or $b = +\infty$, as $c$ is finite and so $\infty/c = \infty$.

Case $P_1M$: $0 < a$ and $c < 0 < d$. Note that $a$ is finite. Splitting into single-signed components, and using our formula for $P_1/P$, we get
$$[a, b]/[c, d] = [a, b]/[c, 0] \cup [a, b]/[0, d] = ([-\infty, a/c] \setminus \{0\}) \cup ([a/d, +\infty] \setminus \{0\}) = ([-\infty, a/c] \cup [a/d, +\infty]) \setminus \{0\}.$$
The right-hand side is a union of connected sets that are closed except when one of the bounds is zero. □
Corollary 2 If $[a, b]$ and $[c, d]$ are nonempty, real intervals, then their relational quotient can be computed as follows. If either is $[0, 0]$, then
$$
[a, b] \oslash [0, 0] =
\begin{cases}
\emptyset & \text{if } 0 \notin [a, b] \\
\mathbb{R} & \text{if } 0 \in [a, b]
\end{cases}
\qquad
[0, 0] \oslash [c, d] =
\begin{cases}
[0, 0] & \text{if } 0 \notin [c, d] \\
\mathbb{R} & \text{if } 0 \in [c, d]
\end{cases}
$$
If neither is $[0, 0]$ then $[a, b] \oslash [c, d]$ is given by the functional division table in Figure 4, except that the results in the exception column for the four cases $N_0N$, $N_0P$, $P_0N$, $P_0P$ should be replaced by $[-\infty, \infty]$.

Proof. By Theorem 2, we know that $X/Y$ and $X \oslash Y$ are equal except possibly in the case where both numerator and denominator contain zero, in which case $X \oslash Y$ is $[-\infty, \infty]$. Thus, the cases of division by $[0, 0]$ and of $[0, 0]$ being divided must be modified to check for the case when the other interval contains zero. Also one must potentially modify all other cases where 0 can be in both numerator and denominator. These consist of the four cases mentioned above, $N_0N$, $N_0P$, $P_0N$, $P_0P$, as well as $N_0M$, $MN_0$, $P_0M$, $MP_0$, $MM$. These last five yield $[-\infty, \infty]$ for functional division also, so the only places where the table must be changed are the four cases specified in the corollary. □
5 Exploiting the IEEE-standard for interval arithmetic
So far our considerations have been entirely in the realm of conventional mathematics, only taking into account the properties of the reals and the extended reals. Our method for representing sets of reals by pairs of extended reals has the property that bounds of computed intervals are defined according to the arithmetic of extended reals even when infinity is involved. In the next section we first give an outline of the IEEE 754 standard for floating point arithmetic. Then we show how it can be used to ensure that bound computation always yields a defined, correct, and optimal result. In other words, from the point of view of interval arithmetic, the standard extends the extended reals in just the right way.
5.1 Overview of the IEEE 754 standard
In this section we review the IEEE-standard floating-point number system as far as needed for this paper. The standard specifies several formats differing only in the sizes of certain fields. In this paper we are only concerned with features of the standard common to all formats. For any particular format, the set of possible bit patterns is partitioned into the following categories:
1. non-zero reals
2. $-0$, $+0$, $-\infty$, and $+\infty$
3. bit patterns that do not represent reals and are called NaN (Not a Number)
The IEEE standard orders the non-NaN floating-point numbers of a given format as follows: $-\infty$, the negative real floating-point numbers in increasing order, $-0$, $+0$, the positive real floating-point numbers in increasing order, $+\infty$.

In this paper we consider the operations of addition, subtraction, multiplication, and division. The standard specifies a resulting floating-point number for each operation on each of the bit patterns, whether or not a corresponding mathematical definition exists. There is a mathematical definition, according to the field of reals, only if both operands are reals (therefore not if either is $-0$, $+0$, $-\infty$, or $+\infty$). If a mathematical definition applies, then the resulting real may not be a floating-point number. In such cases the standard specifies that the result is one of the bounds of the least interval of reals with non-NaNs as bounds that contains the result according to the field of reals. For the purpose of this sentence, $-\infty < x < -0$ for all negative real $x$, $+\infty > x > +0$ for all positive real $x$, and $-0 = 0 = +0$. Which of the two bounds is selected as result depends on the rounding mode selected in the operation of the floating-point number system. In this paper we consider the mode of downward rounding, where the lesser bound is selected, and the mode of upward rounding, where the greater bound is selected.

In cases where the field of reals does not provide a result, the standard specifies as result the outcome of a limiting process, if one is unambiguously suggested by the operands. In other cases, $+0 + (-0)$ and $+0 - (+0)$, the result is arbitrarily defined. In the remaining cases, the result is a NaN. These considerations are summarized in the tables of Figure 5.

In this paper we are only concerned with some of the operations specified in the standard. For each of these we will need the result rounded in a specific direction. Thus we rely on combinations of the operations $+$, $-$, $\cdot$, $/$ with a rounding direction.

Definition 5 The operations of addition, subtraction, multiplication, and division of the IEEE standard with rounding towards $-\infty$ are denoted $+_{lo}$, $-_{lo}$, $\cdot_{lo}$, and $/_{lo}$ respectively. The same operations with rounding towards $+\infty$ are denoted $+_{hi}$, $-_{hi}$, $\cdot_{hi}$, and $/_{hi}$ respectively.

The standard also requires each operation to set a boolean flag, which we call exact, that is true if and only if the computed result is equal to the mathematically defined result. (In the standard's own terms, an operation raises the inexact exception flag precisely when rounding occurs; exact is its negation.) In this case no rounding takes place, so the rounding mode is irrelevant.
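The directed operations of Definition 5 and the exact flag can be emulated without touching the hardware rounding mode. The sketch below is an illustration only, not the mechanism the standard prescribes: it assumes finite operands, computes the exact rational result with Python's `fractions.Fraction`, and then picks the neighbouring float on the requested side. The names `round_directed`, `add_lo` and `add_hi` are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math
import sys
from fractions import Fraction


def round_directed(q: Fraction, upward: bool) -> tuple[float, bool]:
    """Round the exact rational q to a float; return (result, exact flag)."""
    try:
        f = float(q)                      # correctly rounded to nearest
    except OverflowError:                 # |q| is beyond the largest finite float
        f = math.inf if q > 0 else -math.inf
    if math.isinf(f):
        big = sys.float_info.max
        if f > 0:
            return (math.inf if upward else big), False
        return (-big if upward else -math.inf), False
    if Fraction(f) == q:
        return f, True                    # no rounding took place
    if upward and Fraction(f) < q:
        f = math.nextafter(f, math.inf)   # step up to the next float
    elif not upward and Fraction(f) > q:
        f = math.nextafter(f, -math.inf)  # step down to the previous float
    return f, False


def add_lo(x: float, y: float) -> tuple[float, bool]:
    """Emulates x +lo y for finite x and y."""
    return round_directed(Fraction(x) + Fraction(y), upward=False)


def add_hi(x: float, y: float) -> tuple[float, bool]:
    """Emulates x +hi y for finite x and y."""
    return round_directed(Fraction(x) + Fraction(y), upward=True)


lo, exact_lo = add_lo(0.1, 0.2)
hi, exact_hi = add_hi(0.1, 0.2)
print(lo, hi, exact_lo)    # two adjacent floats that bracket the exact sum
print(add_lo(0.5, 0.25))   # an exactly representable sum sets the flag
```

A production implementation would instead switch the processor's rounding mode and read its inexact flag; the rational detour is used here only because it is easy to check by hand.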
[Figure 5 consists of four tables, for $x + y$, $x - y$, $x \cdot y$, and $x/y$, each indexed by the operand classes $-\infty$, NR, $-0$, $+0$, PR, and $+\infty$.]

Figure 5: The arithmetic operations on the IEEE Standard floating-point numbers. The FR, FR0 entries denote a result obtained according to the mathematical definition of the field of reals and then rounded according to the selected or default rounding mode. Such rounding results in a non-NaN floating-point number. The result in some cases is definitely finite (FR); in others it may possibly be infinite (FR0). In the addition/subtraction tables, $\pm 0$ is $+0$ in all rounding modes except when the rounding mode is towards $-\infty$, and then it is $-0$.
5.2 The signed zero convention
In this section we show how signed zeroes can be used to simplify the formulas for interval addition and multiplication. In particular, we use the fact that if we let $-0$ and $+0$ denote the signed zeroes of IEEE arithmetic, then division by signed zero, $x/(-0)$ and $x/(+0)$, is a non-NaN floating point number for all non-zero $x$. If we introduce signed zeroes into our extended real arithmetic, all of the exception formulas that arose in Figure 4 are properly handled by the general formulas provided we adopt the convention that all zero endpoints are signed zeroes and $+0$ is used for lower bounds, while $-0$ is used for upper bounds. For example, consider the first line of that figure, which gives the formula when $b < 0$ and $d \le 0$:
$$[a, b]/[c, d] = [b/c, a/d] \setminus \{0\}$$
unless $d = 0$, in which case
$$[a, b]/[c, d] = [b/c, \infty] \setminus \{0\}.$$
If we adopt the convention that $-0$ is used when $d$ is zero, then since $a$ is negative and non-zero, the IEEE specification on division of signed zeroes implies that $a/d = +\infty$, and so the exception case is not needed. The following definition of IEEE intervals adopts this convention.

Definition 6 An IEEE-standard interval is a real interval whose endpoints are represented by IEEE floating point numbers. We further require that $-0$ can only appear as an upper bound, and $+0$ can only appear as a lower bound.
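The effect of the signed zero convention on the first line of Figure 4 can be checked with a few lines of code. This is a sketch under stated assumptions: Python raises `ZeroDivisionError` for float division by zero, so the hypothetical helper `ieee_div` spells out the IEEE results for a zero divisor, and rounding is ignored because the sample quotients are exact. The function name `div_n1_by_n` is likewise only illustrative.

```python
import math


def ieee_div(x: float, y: float) -> float:
    """IEEE 754 division; the zero-divisor rows are written out by hand
    because Python raises ZeroDivisionError instead of returning them."""
    if y == 0.0:
        if x == 0.0 or math.isnan(x):
            return math.nan
        negative = math.copysign(1.0, x) != math.copysign(1.0, y)
        return -math.inf if negative else math.inf
    return x / y


def div_n1_by_n(a: float, b: float, c: float, d: float) -> tuple[float, float]:
    """General formula [b/c, a/d] for b < 0 and d <= 0 (rounding ignored)."""
    return ieee_div(b, c), ieee_div(a, d)


# Upper bound stored as -0: a/d becomes +inf, so no exception case is needed.
print(div_n1_by_n(-2.0, -1.0, -4.0, -0.0))
# Ordinary case with a non-zero upper bound.
print(div_n1_by_n(-2.0, -1.0, -4.0, -2.0))
```

Storing the zero upper bound as `+0.0` instead would give `a/d = -inf`, which is why Definition 6 fixes the sign of zero endpoints.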
With this convention, we find that the IEEE standard facilitates interval arithmetic to a remarkable extent. However, we realize that the beauty of the standard is in the eye of the beholder: other researchers in interval arithmetic [19] criticize the signed zeros as follows:

    While it is possible to concoct examples where this feature saves an instruction or two, in the vast majority of applications this value is an annoying distraction and a source of subtle bugs.
5.3 Optimal IEEE Approximations of Interval Arithmetic
In interval arithmetic, rounding need not lead to error. By rounding outward, soundness is maintained and rounding only has the effect of including some values that would have been left out were the result exact. The following is an obvious fact. The reason for presenting it as a theorem is its great importance.

Theorem 10 For every set of reals there is a unique least IEEE-standard floating-point interval containing it.

Proof. Included among the IEEE-standard floating-point numbers are $-\infty$ and $+\infty$. Hence there exists such an interval containing the given set of reals. As the set of such intervals is finite and closed under intersection, there is a least interval containing the given set. □

One reason for the great importance of this theorem is that the uniqueness of the least containing interval compels the definition of the following function.
Definition 7 For any set $S$ of reals, $\Gamma(S)$ is the least floating-point interval containing it.

Definition 8 We call a set $T$ a sound approximation of a set $S$ of reals if $S \subseteq T$. We call the IEEE interval $\Gamma(S)$ the optimal IEEE approximation of $S$.

For addition, subtraction and multiplication it is a simple matter to obtain sound and optimal approximations:

Theorem 11 Let $X = [a, b]$ and $Y = [c, d]$ be non-empty IEEE-standard intervals. Then
$$\Gamma([a, b] + [c, d]) = [a +_{lo} c,\ b +_{hi} d]$$
and
$$\Gamma([a, b] - [c, d]) = [a -_{lo} d,\ b -_{hi} c].$$
The formulas in Figures 6 and 7 give sound approximations to $X \cdot Y$ and $X/Y$ respectively. The former give the optimal IEEE approximation of $X \cdot Y$. The latter is contained in an optimal IEEE approximation of $X/Y$, but is more informative.

Proof. First note that $X/Y$ may not be connected, and hence, when it contains two components, we compute the optimal approximation of each component. Moreover, $X/Y$ may not be closed and our tables also indicate this occurrence by using the set difference operator, as in $A \setminus \{0\}$. The formulas in the Theorem and in Figures 6 and 7 are obtained from the corresponding formulas in the case of real intervals (Figures 3 and 4) by using outward rounding, i.e., rounding upper bounds toward positive infinity and lower bounds toward negative infinity. It is clear that this results in a sound approximation. Optimality follows from the fact that the upward rounded arithmetic operations are required by the IEEE standard to return the smallest floating point number which is not smaller than the true result, and similarly for downward rounded operations. We are also using the signed zero convention, which eliminates the exception conditions of Theorem 9. □
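The addition and subtraction formulas of Theorem 11 translate directly into code once outward rounding is available. The following is a minimal sketch, assuming finite bounds whose sums and differences stay within the finite float range; it emulates outward rounding through exact rationals rather than through the hardware rounding mode, and it does not normalise the sign of zero bounds as Definition 6 requires. The names `directed`, `interval_add` and `interval_sub` are hypothetical.

```python
import math
from fractions import Fraction


def directed(q: Fraction, upward: bool) -> float:
    """Return the float nearest to the exact value q on the requested side."""
    f = float(q)                               # round to nearest first
    if upward and Fraction(f) < q:
        return math.nextafter(f, math.inf)     # nearest float above q
    if not upward and Fraction(f) > q:
        return math.nextafter(f, -math.inf)    # nearest float below q
    return f


def interval_add(x, y):
    """[a, b] + [c, d] -> [a +lo c, b +hi d]."""
    (a, b), (c, d) = x, y
    return (directed(Fraction(a) + Fraction(c), upward=False),
            directed(Fraction(b) + Fraction(d), upward=True))


def interval_sub(x, y):
    """[a, b] - [c, d] -> [a -lo d, b -hi c]."""
    (a, b), (c, d) = x, y
    return (directed(Fraction(a) - Fraction(d), upward=False),
            directed(Fraction(b) - Fraction(c), upward=True))


# The exact sum of the floats 0.1 and 0.2 is not a float, so the result
# is the pair of adjacent floats that brackets it.
print(interval_add((0.1, 0.1), (0.2, 0.2)))
print(interval_sub((1.0, 2.0), (0.25, 0.5)))
```

Note that subtraction pairs the lower bound of the first interval with the upper bound of the second, which is the detail most easily got wrong when transcribing the formula.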
6 Arithmetic on connected subsets of the reals
The results of the previous sections extend to the more general class of connected subsets of $\mathbb{R}$.

Definition 9 A general real interval is a connected set of reals.

To represent such an interval $X$ syntactically, we must provide both its endpoints $[a, b]$ and two bits of information $\alpha$ and $\beta$, with $\alpha$ (resp. $\beta$) specifying whether $a$ (resp. $b$) belongs to $X$. We formalize this in the following definition:

Definition 10 Let $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$ denote the extended reals. For any $u, v \in \overline{\mathbb{R}}$ and any boolean values $\alpha, \beta \in B = \{t, f\}$, let $[u, v]_{\alpha,\beta}$ denote the set $X$ of all real values between $u$ and $v$, where $\alpha$ (resp. $\beta$) is true iff $u$ (resp. $v$) is contained in $X$. Thus,
$$
\begin{aligned}
[u, v]_{t,t} &= \{x \in \mathbb{R} \mid u \le x \le v\} \\
[u, v]_{t,f} &= \{x \in \mathbb{R} \mid u \le x < v\} \\
[u, v]_{f,t} &= \{x \in \mathbb{R} \mid u < x \le v\} \\
[u, v]_{f,f} &= \{x \in \mathbb{R} \mid u < x < v\}
\end{aligned}
$$
Note that this only defines subsets of the reals. A consequence of this definition is that, e.g., $[u, v]_{t,f} = [u, v]_{t,t}$ if $v = +\infty$. Of course, $[u, v]_{t,f} \ne [u, v]_{t,t}$ whenever $u, v \in \mathbb{R}$ and $u < v$. In our interval arithmetic formulas for general real intervals, we will use intersections and unions of general intervals. We include the relevant formulas below.
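Class of $[a, b]$ | Class of $[c, d]$ | $\Gamma([a, b] \cdot [c, d])$, general formula
$N$ | $N$ | $[b \cdot_{lo} d,\ a \cdot_{hi} c]$
$M$ | $N$ | $[b \cdot_{lo} c,\ a \cdot_{hi} c]$
$P$ | $N$ | $[b \cdot_{lo} c,\ a \cdot_{hi} d]$
$N$ | $M$ | $[a \cdot_{lo} d,\ a \cdot_{hi} c]$
$M$ | $M$ | $[\min(a \cdot_{lo} d,\ b \cdot_{lo} c),\ \max(b \cdot_{hi} d,\ a \cdot_{hi} c)]$
$P$ | $M$ | $[b \cdot_{lo} c,\ b \cdot_{hi} d]$
$N$ | $P$ | $[a \cdot_{lo} d,\ b \cdot_{hi} c]$
$M$ | $P$ | $[a \cdot_{lo} d,\ b \cdot_{hi} d]$
$P$ | $P$ | $[a \cdot_{lo} c,\ b \cdot_{hi} d]$

Figure 6: Multiplication of IEEE intervals when neither interval is $[0, 0]$.

Class of $[a, b]$ | Class of $[c, d]$ | a sound approximation of $[a, b]/[c, d]$
$N_1$ | $N$ | $[b /_{lo} c,\ a /_{hi} d] \setminus \{0\}$
$N_0$ | $N$ | $[0,\ a /_{hi} d]$
$M$ | $N$ | $[b /_{lo} d,\ a /_{hi} d]$
$P_0$ | $N$ | $[b /_{lo} d,\ 0]$
$P_1$ | $N$ | $[b /_{lo} d,\ a /_{hi} c] \setminus \{0\}$
$N_1$ | $M$ | $([-\infty,\ b /_{hi} d] \cup [b /_{lo} c,\ \infty]) \setminus \{0\}$
$N_0$ | $M$ | $[-\infty, +\infty]$
$M$ | $M$ | $[-\infty, +\infty]$
$P_0$ | $M$ | $[-\infty, +\infty]$
$P_1$ | $M$ | $([-\infty,\ a /_{hi} c] \cup [a /_{lo} d,\ \infty]) \setminus \{0\}$
$N_1$ | $P$ | $[a /_{lo} c,\ b /_{hi} d] \setminus \{0\}$
$N_0$ | $P$ | $[a /_{lo} c,\ 0]$
$M$ | $P$ | $[a /_{lo} c,\ b /_{hi} c]$
$P_0$ | $P$ | $[0,\ b /_{hi} c]$
$P_1$ | $P$ | $[a /_{lo} d,\ b /_{hi} c] \setminus \{0\}$

Figure 7: Functional division of IEEE intervals when neither interval is $[0, 0]$.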
Theorem 12 Let $X = [a, b]_{\alpha,\beta}$ and $Y = [c, d]_{\gamma,\delta}$ be general real intervals and suppose $X \cap Y \ne \emptyset$. Then,
$$[a, b]_{\alpha,\beta} \cap [c, d]_{\gamma,\delta} = [\max(a, c),\ \min(b, d)]_{\phi_\wedge(a, c, \gamma, \alpha),\ \phi_\wedge(b, d, \beta, \delta)}$$
$$[a, b]_{\alpha,\beta} \cup [c, d]_{\gamma,\delta} = [\min(a, c),\ \max(b, d)]_{\phi_\vee(a, c, \alpha, \gamma),\ \phi_\vee(b, d, \delta, \beta)}$$
where
$$\phi_\wedge(x, y, \sigma, \tau) = ((x < y) \wedge \sigma) \vee ((x = y) \wedge (\sigma \wedge \tau)) \vee ((x > y) \wedge \tau)$$
$$\phi_\vee(x, y, \sigma, \tau) = ((x < y) \wedge \sigma) \vee ((x = y) \wedge (\sigma \vee \tau)) \vee ((x > y) \wedge \tau)$$

Proof. The only subtle point here is to determine whether or not each endpoint is contained in the result interval. For an intersection, the lower bound $L = \max(a, c)$ is either $a$ or $c$, whichever is larger. If $a$ is larger, then $L$ is contained in the intersection if and only if $a$ is in $X$. Similarly, if $c$ is the larger, $L$ is in the intersection if and only if $c$ is in $Y$. If $a$ and $c$ are equal, then $L$ is in the intersection if and only if both $a \in X$ and $c \in Y$. Similarly with the upper bound. For the union of two general intervals, similar arguments apply except that when $a = c$, $L$ is in the union if and only if $a \in X$ or $c \in Y$. □
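The endpoint bookkeeping in the proof of Theorem 12 can be sketched as follows. This is an illustration under assumptions, not code from the paper: floats stand in for the extended reals, the two intervals are assumed to overlap as the theorem requires, and the names `GeneralInterval`, `intersect`, `union` and `_bound` are hypothetical.

```python
import math
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneralInterval:
    lo: float
    hi: float
    lo_in: bool   # alpha: does the lower bound belong to the set?
    hi_in: bool   # beta: does the upper bound belong to the set?

    def __contains__(self, x: float) -> bool:
        if not math.isfinite(x):
            return False
        above = self.lo < x or (self.lo == x and self.lo_in)
        below = x < self.hi or (x == self.hi and self.hi_in)
        return above and below


def _bound(u, u_in, v, v_in, take_larger, combine):
    """Choose one of two candidate bounds together with its membership tag."""
    if u == v:
        return u, combine(u_in, v_in)     # tie: both tags matter
    if (u > v) == take_larger:
        return u, u_in
    return v, v_in


def intersect(x: GeneralInterval, y: GeneralInterval) -> GeneralInterval:
    lo, lo_in = _bound(x.lo, x.lo_in, y.lo, y.lo_in, True, operator.and_)
    hi, hi_in = _bound(x.hi, x.hi_in, y.hi, y.hi_in, False, operator.and_)
    return GeneralInterval(lo, hi, lo_in, hi_in)


def union(x: GeneralInterval, y: GeneralInterval) -> GeneralInterval:
    lo, lo_in = _bound(x.lo, x.lo_in, y.lo, y.lo_in, False, operator.or_)
    hi, hi_in = _bound(x.hi, x.hi_in, y.hi, y.hi_in, True, operator.or_)
    return GeneralInterval(lo, hi, lo_in, hi_in)


x = GeneralInterval(0.0, 2.0, True, False)   # [0, 2) in conventional notation
y = GeneralInterval(1.0, 2.0, False, True)   # (1, 2]
print(intersect(x, y))   # the open interval (1, 2)
print(union(x, y))       # the closed interval [0, 2]
```

On a tie the intersection keeps an endpoint only if both operands contain it, while the union keeps it if either does, which is the only difference between the two tag formulas.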
We now provide the formulas for interval arithmetic on general real intervals.

Theorem 13 Let $X = [a, b]_{\alpha,\beta}$ and $Y = [c, d]_{\gamma,\delta}$ be general real intervals. Then,
$$[a, b]_{\alpha,\beta} + [c, d]_{\gamma,\delta} = [a + c,\ b + d]_{\alpha\wedge\gamma,\ \beta\wedge\delta}$$
$$[a, b]_{\alpha,\beta} - [c, d]_{\gamma,\delta} = [a - d,\ b - c]_{\alpha\wedge\delta,\ \beta\wedge\gamma}$$
and $X \cdot Y$ and $X/Y$ are given by the formulas in the tables in Figures 8 and 9, which contain all relevant cases modulo the following symmetry operations: $x \cdot y = y \cdot x$, $x \cdot y = -(x \cdot (-y))$, $x \cdot y = -((-x) \cdot y)$, $x/y = -(x/(-y))$, and $x/y = -((-x)/y)$.

Proof. With one exception, the formulas are identical to the formulas in Figures 3 and 4 for the multiplication and division of closed intervals, except that the endpoints of the result interval in the closed case may or may not be contained in the set in this more general case. The one exception is $P_0/M$, where if $0 \notin X$, then $0 \notin X/Y$, and so $X/Y$ is either $[-\infty, \infty]$ if $0 \in X$ or is $[-\infty, \infty] \setminus \{0\}$ otherwise. For the case of non-zero endpoints, it is easy to see that the endpoint ($u \cdot v$ or $u/v$) is contained in the set if and only if the corresponding endpoints ($u$ and $v$) appearing in the formula for that endpoint are contained in their sets. On the other hand, zero endpoints only appear in the result of an interval multiplication if one or both of the argument intervals has a zero endpoint, and clearly the result contains zero precisely if zero is contained in either of the argument intervals. For interval division, zero appears as an element of the resulting interval only if the numerator contains zero. These observations, when applied to the formulas from the closed, real interval cases, result in the formula tables of Figures 8 and 9. □

To obtain optimal approximations for the multiplication and division of (non-closed) intervals, we need to define optimal approximation in the context of general intervals.
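Class of $[a, b]_{\alpha,\beta}$ | Class of $[c, d]_{\gamma,\delta}$ | $[a, b]_{\alpha,\beta} \cdot [c, d]_{\gamma,\delta}$, general formula
$P_1$ | $P_1$ | $[a \cdot c,\ b \cdot d]_{\alpha\wedge\gamma,\ \beta\wedge\delta}$
$P_0$ | $P_1$ | $[0,\ b \cdot d]_{\alpha,\ \beta\wedge\delta}$
$P_0$ | $P_0$ | $[0,\ b \cdot d]_{\alpha\vee\gamma,\ \beta\wedge\delta}$
$P$ | $M$ | $[b \cdot c,\ b \cdot d]_{\beta\wedge\gamma,\ \beta\wedge\delta}$
$M$ | $M$ | $[a \cdot d,\ b \cdot d]_{\alpha\wedge\delta,\ \beta\wedge\delta} \cup [b \cdot c,\ a \cdot c]_{\beta\wedge\gamma,\ \alpha\wedge\gamma}$

Figure 8: Multiplication of general non-empty real intervals when neither interval is $[0, 0]$.

Class of $[a, b]_{\alpha,\beta}$ | Class of $[c, d]_{\gamma,\delta}$ | $[a, b]_{\alpha,\beta} / [c, d]_{\gamma,\delta}$, general formula
$P_1$ | $P_1$ | $[a/d,\ b/c]_{\alpha\wedge\delta,\ \beta\wedge\gamma}$
$P_0$ | $P_1$ | $[0,\ b/c]_{\alpha,\ \beta\wedge\gamma}$
$M$ | $P_1$ | $[a/c,\ b/c]_{\alpha\wedge\gamma,\ \beta\wedge\gamma}$
$P_0$ | $P_0$ | $[0,\ \infty]_{\alpha,\ f}$
$P_1$ | $P_0$ | $[a/d,\ \infty]_{\alpha\wedge\delta,\ f}$
$M$ | $P_0$ | $[-\infty, +\infty]_{f,f}$
$P_1$ | $M$ | $[-\infty,\ a/c]_{f,\ \alpha\wedge\gamma} \cup [a/d,\ \infty]_{\alpha\wedge\delta,\ f}$
$P_0$ | $M$ | $[-\infty,\ 0]_{f,\ \alpha} \cup [0,\ \infty]_{\alpha,\ f}$
$M$ | $M$ | $[-\infty, +\infty]_{f,f}$

Figure 9: Functional division of general non-empty, real intervals when neither interval is $[0, 0]$.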
Definition 11 The optimal general IEEE approximation of a set $U$ is a set $S$ satisfying $U \subseteq S$, where $S$ is a finite union of general intervals with IEEE endpoints, and no smaller such $S$ exists.

We denote $S$, if it exists, by $\Gamma(U)$. Observe that not every set has an optimal general IEEE approximation, but any set with finitely many connected components will have one. To be able to compute optimal general IEEE approximations, we need to introduce the following boolean function, $\chi$.

Definition 12 Let $F$ denote the set of floating point numbers and let $\chi(x)$ be the boolean function on the extended reals $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, \infty\}$ which is true if $x \in F \cap \mathbb{R}$ and false otherwise. That is, $\chi$ is the characteristic function of $F \cap \mathbb{R}$.

Observe that $\chi(x \cdot y)$ and $\chi(x/y)$ for floating point numbers $x$ and $y$ can efficiently be computed using the IEEE standard, by checking for the inexact exception, which the hardware signals whenever an operation requires rounding. This information is precisely what we need to obtain an optimal approximation of the interval arithmetic operations, as the following theorem shows.

Theorem 14 Let $X = [a, b]_{\alpha,\beta}$ and $Y = [c, d]_{\gamma,\delta}$ be general IEEE-standard intervals. Then
$$\Gamma([a, b]_{\alpha,\beta} + [c, d]_{\gamma,\delta}) = [a +_{lo} c,\ b +_{hi} d]_{\alpha\wedge\gamma\wedge\chi(a+c),\ \beta\wedge\delta\wedge\chi(b+d)}$$
$$\Gamma([a, b]_{\alpha,\beta} - [c, d]_{\gamma,\delta}) = [a -_{lo} d,\ b -_{hi} c]_{\alpha\wedge\delta\wedge\chi(a-d),\ \beta\wedge\gamma\wedge\chi(b-c)}$$
and the optimal general IEEE approximations of $X \cdot Y$ and $X/Y$ are given by the formula tables in Figures 10 and 11, where we have omitted the cases that can be obtained from the given ones by symmetry in the multiplication table.

Proof. The formulas in these tables are obtained from the corresponding formulas for the case of general real intervals by using outward rounding to guarantee that the resulting interval is an IEEE-standard interval which provides a sound approximation to the result, and observing that if outward rounding is necessary in computing a given endpoint, then the optimal approximation is obtained by not including that endpoint in the interval. This is indicated by conjoining the boolean formula for the endpoint $E$ with $\chi(E)$. Also, in the division table, there is no longer any need to use the set subtraction notation to explicitly remove $\{0\}$ from some of the answer sets. Indeed, this is handled easily using the boolean endpoint tags. For example, in the case $P_1/P$, the formula in the table is
$$[a /_{lo} d,\ b /_{hi} c]_{\alpha\wedge\delta\wedge\chi(a/d),\ \beta\wedge\gamma\wedge\chi(b/c)}.$$
In the case where $d = \infty$, we observe that $\delta = f$, and so the result is
$$[0,\ b /_{hi} c]_{f,\ \beta\wedge\gamma\wedge\chi(b/c)}$$
and 0 is excluded from the solution set, as it should be. □
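Class of $[a, b]_{\alpha,\beta}$ | Class of $[c, d]_{\gamma,\delta}$ | $\Gamma([a, b]_{\alpha,\beta} \cdot [c, d]_{\gamma,\delta})$, general formula
$P_0$ | $P_0$ | $[0,\ b \cdot_{hi} d]_{\alpha\vee\gamma,\ \beta\wedge\delta\wedge\chi(b \cdot d)}$
$P_0$ | $P_1$ | $[0,\ b \cdot_{hi} d]_{\alpha,\ \beta\wedge\delta\wedge\chi(b \cdot d)}$
$P_1$ | $P_1$ | $[a \cdot_{lo} c,\ b \cdot_{hi} d]_{\alpha\wedge\gamma\wedge\chi(a \cdot c),\ \beta\wedge\delta\wedge\chi(b \cdot d)}$
$P$ | $M$ | $[b \cdot_{lo} c,\ b \cdot_{hi} d]_{\beta\wedge\gamma\wedge\chi(b \cdot c),\ \beta\wedge\delta\wedge\chi(b \cdot d)}$
$M$ | $M$ | $[a \cdot_{lo} d,\ b \cdot_{hi} d]_{\alpha\wedge\delta\wedge\chi(a \cdot d),\ \beta\wedge\delta\wedge\chi(b \cdot d)} \cup [b \cdot_{lo} c,\ a \cdot_{hi} c]_{\beta\wedge\gamma\wedge\chi(b \cdot c),\ \alpha\wedge\gamma\wedge\chi(a \cdot c)}$

Figure 10: Multiplication of general non-empty IEEE intervals when neither is $[0, 0]$.

Class of $[a, b]_{\alpha,\beta}$ | Class of $[c, d]_{\gamma,\delta}$ | $\Gamma([a, b]_{\alpha,\beta} / [c, d]_{\gamma,\delta})$, general formula
$N_1$ | $N$ | $[b /_{lo} c,\ a /_{hi} d]_{\beta\wedge\gamma\wedge\chi(b/c),\ \alpha\wedge\delta\wedge\chi(a/d)}$
$N_0$ | $N$ | $[0,\ a /_{hi} d]_{\beta,\ \alpha\wedge\delta\wedge\chi(a/d)}$
$M$ | $N$ | $[b /_{lo} d,\ a /_{hi} d]_{\beta\wedge\delta\wedge\chi(b/d),\ \alpha\wedge\delta\wedge\chi(a/d)}$
$P_0$ | $N$ | $[b /_{lo} d,\ 0]_{\beta\wedge\delta\wedge\chi(b/d),\ \alpha}$
$P_1$ | $N$ | $[b /_{lo} d,\ a /_{hi} c]_{\beta\wedge\delta\wedge\chi(b/d),\ \alpha\wedge\gamma\wedge\chi(a/c)}$
$N_1$ | $M$ | $[-\infty,\ b /_{hi} d]_{f,\ \beta\wedge\delta\wedge\chi(b/d)} \cup [b /_{lo} c,\ \infty]_{\beta\wedge\gamma\wedge\chi(b/c),\ f}$
$N_0$ | $M$ | $[-\infty,\ 0]_{f,\ \beta} \cup [0,\ \infty]_{\beta,\ f}$
$M$ | $M$ | $[-\infty, +\infty]_{f,f}$
$P_0$ | $M$ | $[-\infty,\ 0]_{f,\ \alpha} \cup [0,\ \infty]_{\alpha,\ f}$
$P_1$ | $M$ | $[-\infty,\ a /_{hi} c]_{f,\ \alpha\wedge\gamma\wedge\chi(a/c)} \cup [a /_{lo} d,\ \infty]_{\alpha\wedge\delta\wedge\chi(a/d),\ f}$
$N_1$ | $P$ | $[a /_{lo} c,\ b /_{hi} d]_{\alpha\wedge\gamma\wedge\chi(a/c),\ \beta\wedge\delta\wedge\chi(b/d)}$
$N_0$ | $P$ | $[a /_{lo} c,\ 0]_{\alpha\wedge\gamma\wedge\chi(a/c),\ \beta}$
$M$ | $P$ | $[a /_{lo} c,\ b /_{hi} c]_{\alpha\wedge\gamma\wedge\chi(a/c),\ \beta\wedge\gamma\wedge\chi(b/c)}$
$P_0$ | $P$ | $[0,\ b /_{hi} c]_{\alpha,\ \beta\wedge\gamma\wedge\chi(b/c)}$
$P_1$ | $P$ | $[a /_{lo} d,\ b /_{hi} c]_{\alpha\wedge\delta\wedge\chi(a/d),\ \beta\wedge\gamma\wedge\chi(b/c)}$

Figure 11: Functional division of general IEEE intervals, when neither interval is $[0, 0]$. Note: This table shows the optimal approximation $\Gamma(X/Y)$, i.e., the smallest union of general intervals with IEEE endpoints which contains $X/Y$.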
7 Related work
Contributions to division by an interval containing zero span a long period. The initial contribution by Kahan [10] was important if only for pointing out that such an operation can be usefully defined. Kahan capitalized on the fact that, whether division results in a connected set or not, only a single pair of reals is needed to specify the result. Such a pair is, according to Kahan, an interior interval (a connected set) or an exterior interval (a union of two connected sets).

Expressions for the bounds of the result of interval division can be found in Novoa [14], Hansen [7], Hammer/Hocks/Kulisch/Ratz [5], Ratz [18], and Walster [20]. The book [7] gives some unnecessarily wide intervals; [5] gives the formulas in the form of Pascal-XSC code and is an improvement on the later [18] in that it does not let division depend on the inverse operation.

The work on BNR Prolog was in many ways the most advanced when its Unix version came out in 1994. The semantics of the language suggest the soundness properties that interval arithmetic can have. Like Pascal-XSC, BNR Prolog optimally determines $[-0.5, 0.5] \cap ([1, 1]/[-1, 1])$ to be empty. Unlike the Pascal-XSC code in [5], BNR Prolog optimally determines $[-1, 0] \cap ([1, 1]/[1, \infty])$ to be empty because the quotient does not contain its greatest lower bound. Unfortunately, apart from [15, 2] nothing seems to have been published about the arithmetic of BNR Prolog.
8 Conclusions We should emphasize that we do not attempt to decree that the full detail of interval division, as revealed in this paper, be implemented. In any implementation, eciency and simplicity of code have to be weighed against minimizing departures from optimality. Our work can be interpreted as saying to implementers: \Here are all the cases in which topologically distinguishable results appear | there is no more detail. Preserving soundness, simplify as much as you need to." Ideally, interval arithmetic should have the following properties. In this paper we have shown that the ideal is realizable. 1. Faithfulness to the mathematical model: Much of numerical computation is based on a mathematical model in which the variables range over the reals. In conventional computation oating-point numbers are substituted for the reals, with all the well-documented dire consequences; see Forsythe [4] for an early warning. The current state of the art makes it possible to regard interval computations as computergenerated proofs that certain reals (real reals) belong to certain small sets of reals (intervals with oating-point bounds that are not much wider than the limits imposed by the processor's precision). This breakthrough has taken place by piecemeal improvements through the long history of interval arithmetic.
33 2. Soundness: the interval resulting from an operation contains all values that can result from the arithmetic operations on the reals contained in the argument intervals. As a result actual hardware can be used to prove nonexistence of solutions. 3. Optimality: the result is the smallest interval that produces a sound result. We have proved the optimality of our algorithms for computing the basic arithmetic operators for all classes of intervals considered in the paper (real and oating point, closed and non-closed). 4. Closure: unde ned results are to be avoided. Even in the ideal case, that of the real numbers, there is an unde ned operation: division by zero. The IEEE standard has reduced this to just the case of 0=0. To obtain closure, the in nities had to be added, and these introduced more unde ned cases than they removed. It is still an important goal to avoid such unde ned results as much as possible. In this paper we show that the features of the IEEE standard farsightedly conspire to make it possible to avoid all unde ned cases . 5. Eciency: to use the operations of the standard as much as possible rather than use tests to single out as special cases divisions by zero or operations involving +1 or ?1. Numerical analysis grew up with the rule of thumb that tests are cheap while divisions are expensive. No longer: a test can empty the instruction pipeline, resulting in many more lost cycles than required by a division. For the sake of eciency it is therefore important to avoid tests as much as possible and rely on the results de ned by the IEEE standard for operations involving the in nities. 1
Long experience has shown that reals are hard to represent and operate on. Paradoxically, sets of reals are more tractable. Undefined results have given trouble in various forms, such as overflow and division by zero. We have shown that these are avoidable.

What is widely perceived as a problem of interval arithmetic, that result intervals are unnecessarily wide, is only a problem in pure expression evaluation. In practice, expression evaluation is subordinate to a solving algorithm, such as Interval Newton. This algorithm forces intervals to become as narrow as required by the user, while conserving soundness. Interval Newton is the paradigm on which other solving algorithms, as well as optimization, are modeled [7, 9].

What most people know about interval arithmetic is that it is a safe alternative for expression evaluation. The practitioner rightly suspects that the usual examples, where expression evaluation goes wildly wrong, are contrived. What is not widely known is that interval arithmetic is a powerful method for extending numerical computation into areas such as nonlinear equation solving and non-convex global optimization [7, 9], where noninterval methods experience serious difficulties.

¹ For us, "NaN" stands for "Never a NaN!".
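The narrowing behaviour attributed to Interval Newton above can be sketched on a small example. The code below is an illustration under stated assumptions, not an algorithm taken from this paper: it fixes the function f(x) = x² − 2 on a strictly positive starting interval, and it uses exact rational arithmetic (`fractions.Fraction`) in place of outward-rounded floating-point endpoints, so that soundness holds trivially and the sketch stays short. The names `newton_step` and `solve` are hypothetical.

```python
from fractions import Fraction


def newton_step(lo, hi):
    """One interval Newton step for f(x) = x*x - 2 on [lo, hi], 0 < lo <= hi.

    Returns the narrowed interval, or None if it is empty (no root in [lo, hi]).
    """
    m = (lo + hi) / 2          # midpoint of the current interval
    fm = m * m - 2             # f(m), exact
    # f'(X) = 2*X = [2*lo, 2*hi] is strictly positive, so fm / f'(X) is an
    # ordinary interval whose endpoints are the two quotients below.
    q_lo, q_hi = sorted((fm / (2 * lo), fm / (2 * hi)))
    n_lo, n_hi = m - q_hi, m - q_lo        # N(X) = m - f(m) / f'(X)
    new_lo, new_hi = max(lo, n_lo), min(hi, n_hi)   # X intersected with N(X)
    if new_lo > new_hi:
        return None
    return new_lo, new_hi


def solve(lo, hi, tol, max_steps=100):
    """Narrow [lo, hi] until its width is at most tol or it becomes empty."""
    box = (Fraction(lo), Fraction(hi))
    for _ in range(max_steps):
        if box is None or box[1] - box[0] <= tol:
            break
        box = newton_step(*box)
    return box
```

Starting from [1, 2], `solve(1, 2, Fraction(1, 10**12))` returns an interval of width at most 10⁻¹² that still contains √2. Starting from [2, 3], the first step yields an empty intersection and `solve` returns `None`, which is the kind of computer-generated proof of the absence of solutions mentioned under soundness.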
Acknowledgment The authors wish to thank Huan Wu for valuable discussions and comments.
References

[1] Götz Alefeld and Jürgen Herzberger. Introduction to Interval Computations. Academic Press, 1983.
[2] Frédéric Benhamou and William J. Older. Applying interval arithmetic to real, integer, and Boolean constraints. Journal of Logic Programming, 32:1–24, 1997.
[3] J.J. Ebers and J.L. Moll. Large-scale behaviour of junction transistors. IEE Proceedings, 42:1761–1771, 1954.
[4] George E. Forsythe. Pitfalls of computation, or why a math book isn't enough. Amer. Math. Monthly, 77:931–956, 1970.
[5] R. Hammer, M. Hocks, U. Kulisch, and D. Ratz. Numerical Toolbox for Verified Computing I. Springer-Verlag, 1993.
[6] Eldon Hansen. Topics in Interval Analysis. Oxford University Press, 1969.
[7] Eldon Hansen. Global Optimization Using Interval Analysis. Marcel Dekker, 1992.
[8] R.J. Hanson. Interval arithmetic as a closed arithmetic system on a computer. Technical Report 197, Jet Propulsion Laboratory, 1968.
[9] Pascal Van Hentenryck, Laurent Michel, and Yves Deville. Numerica: A Modeling Language for Global Optimization. MIT Press, 1997.
[10] W.M. Kahan. A more complete interval arithmetic. Technical report, University of Toronto, Canada, 1968.
[11] R. Baker Kearfott. Rigorous Global Search: Continuous Problems. Kluwer Academic Publishers, 1996. Nonconvex Optimization and Its Applications.
[12] Seymour Lipschutz. General Topology. Schaum's Outline Series, 1965.
[13] Ramon E. Moore. Interval Analysis. Prentice-Hall, 1966.
[14] Manuel Novoa. Theory of preconditioners for the interval Gauss-Seidel method and existence/uniqueness theory with interval Newton methods. Department of Mathematics, University of Southwestern Louisiana, 1993.
[15] W.J. Older. Interval arithmetic specification. Technical report, Bell-Northern Research Computing Research Laboratory, 1989.
[16] Jean-François Puget and Pascal Van Hentenryck. A constraint satisfaction approach to a circuit design problem. Journal of Global Optimization, 13(1):410–423, 1998.
[17] H. Ratschek and J. Rokne. Experiments using interval analysis for solving a circuit design problem. Journal of Global Optimization, 3:501–518, 1993.
[18] D. Ratz. On extended interval arithmetic and inclusion isotonicity. Institut für Angewandte Mathematik, Universität Karlsruhe, 1996.
[19] J. Stolfi and L. de Figueiredo. Self-validated numerical methods and applications, 1997.
[20] G. William Walster. The extended real interval system. Available on the internet, 1998. http://www.mscs.mu.edu/ globsol/readings.html.