Mathematics: Examples, Methods and. Implications. David H. Bailey and
Jonathan M. Borwein. 502. NOTICES OF THE AMS. VOLUME 52, NUMBER 5.
Experimental Mathematics: Examples, Methods and Implications David H. Bailey and Jonathan M. Borwein
The object of mathematical rigor is to sanction and legitimize the conquests of intuition, and there was never any other object for it. —Jacques Hadamard1 If mathematics describes an objective world just like physics, there is no reason why inductive methods should not be applied in mathematics just the same as in physics. —Kurt Gödel2
Introduction Recent years have seen the flowering of “experimental” mathematics, namely the utilization of modern computer technology as an active tool in mathematical research. This development is not David H. Bailey is at the Lawrence Berkeley National Laboratory, Berkeley, CA 94720. His email address is
[email protected]. This work was supported by the Director, Office of Computational and Technology Research, Division of Mathematical, Information, and Computational Sciences of the U.S. Department of Energy, under contract number DE-AC03-76SF00098. Jonathan M. Borwein is Canada Research Chair in Collaborative Technology and Professor of Computer Science and of Mathematics at Dalhousie University, Halifax, NS, B3H 2W5, Canada. His email address is
[email protected]. This work was supported in part by NSERC and the Canada Research Chair Programme. 1 Quoted at length in E. Borel, Leçons sur la theorie des
fonctions, 1928. 2 Kurt Gödel, Collected Works, Vol. III, 1951.
502
NOTICES
OF THE
limited to a handful of researchers nor to a handful of universities, nor is it limited to one particular field of mathematics. Instead, it involves hundreds of individuals, at many different institutions, who have turned to the remarkable new computational tools now available to assist in their research, whether it be in number theory, algebra, analysis, geometry, or even topology. These tools are being used to work out specific examples, generate plots, perform various algebraic and calculus manipulations, test conjectures, and explore routes to formal proof. Using computer tools to test conjectures is by itself a major timesaver for mathematicians, as it permits them to quickly rule out false notions. Clearly one of the major factors here is the development of robust symbolic mathematics software. Leading the way are the Maple and Mathematica products, which in the latest editions are far more expansive, robust, and user-friendly than when they first appeared twenty to twenty-five years ago. But numerous other tools, some of which emerged only in the past few years, are also playing key roles. These include: (1) the Magma computational algebra package, developed at the University of Sydney in Australia; (2) Neil Sloane’s online integer sequence recognition tool, available at http://www.research.att.com/njas/ sequences; (3) the inverse symbolic calculator (an online numeric constant recognition facility), available at http://www.cecm.sfu.ca/projects/ISC; (4) the electronic geometry site at http://www. eg-models.de; and numerous others. See
AMS
VOLUME 52, NUMBER 5
http://www.experimentalmath.info for a more complete list, with links to their respective websites. We must of course also give credit to the computer industry. In 1965 Gordon Moore, before he served as CEO of Intel, observed: The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. . . . Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. [29] Nearly forty years later, we observe a record of sustained exponential progress that has no peer in the history of technology. Hardware progress alone has transformed mathematical computations that were once impossible into simple operations that can be done on any laptop. Many papers have now been published in the experimental mathematics arena, and a full-fledged journal, appropriately titled Experimental Mathematics, has been in operation for twelve years. Even older is the AMS journal Mathematics of Computation, which has been publishing articles in the general area of computational mathematics since 1960 (since 1943 if you count its predecessor). Just as significant are the hundreds of other recent articles that mention computations but which otherwise are considered entirely mainstream work. All of this represents a major shift from when the present authors began their research careers, when the view that “real mathematicians don’t compute” was widely held in the field. In this article, we will summarize some of the discoveries and research results of recent years, by ourselves and by others, together with a brief description of some of the key methods employed. We will then attempt to ascertain at a more fundamental level what these developments mean for the larger world of mathematical research.
Integer Relation Detection One of the key techniques used in experimental mathematics is integer relation detection, which in effect searches for linear relationships satisfied by a set of numerical values. To be precise, given a real or complex vector (x1 , x2 , · · · , xn ) , an integer relation algorithm is a computational scheme that either finds the n integers (ai ) , not all zero, such that a1 x1 + a2 x2 + · · · an xn = 0 (to within available numerical accuracy) or else establishes that there is no such integer vector within a ball of radius A about the origin, where the metric is the Euclidean norm: A = (a12 + a22 + · · · + an2 )1/2 . Integer relation computations require very high MAY 2005
precision in the input vector x to obtain numerically meaningful results—at least dn-digit precision, where d = log10 A . This is the principal reason for the interest in very high-precision arithmetic in experimental mathematics. In one recent integer relation detection computation, 50,000-digit arithmetic was required to obtain the result [9]. At the present time, the best-known integer relation algorithm is the PSLQ algorithm [26] of mathematician-sculptor Helaman Ferguson, who, together with his wife, Claire, received the 2002 Communications Award of the Joint Policy Board for Mathematics (AMS-MAA-SIAM). Simple formulations of the PSLQ algorithm and several variants are given in [10]. The PSLQ algorithm, together with related lattice reduction schemes such as LLL, was recently named one of ten “algorithms of the century” by the publication Computing in Science and Engineering [4]. PSLQ or a variant is implemented in current releases of most computer algebra systems.
Arbitrary Digit Calculation Formulas The best-known application of PSLQ in experimental mathematics is the 1995 discovery, by means of a PSLQ computation, of the “BBP” formula for π : (1) π=
∞ k=0
1 16k
2 1 1 4 − − − . 8k + 1 8k + 4 8k + 5 8k + 6
This formula permits one to directly calculate binary or hexadecimal digits beginning at the n-th digit, without needing to calculate any of the first n − 1 digits [8], using a simple scheme that requires very little memory and no multiple-precision arithmetic software. It is easiest to see how this individual digitcalculating scheme works by illustrating it for a similar formula, known at least since Euler, for log 2 : ∞ 1 log 2 = . n n2 n=1 Note that the binary expansion of log 2 beginning after the first d binary digits is simply {2d log 2} , where by {·} we mean fractional part. We can write ∞ d ∞ 2d−n 2d−n 2d−n = + n n n=d+1 n n=1 n=1 d ∞ 2d−n mod n 2d−n + , = n n n=1 n=d+1
{2d log 2} = (2)
where we insert “mod n ” in the numerator of the first term of (2), since we are interested only in the fractional part after division by n. Now the expression 2d−n mod n may be evaluated very rapidly by means of the binary algorithm for exponentiation, where each multiplication is reduced
NOTICES
OF THE
AMS
503
modulo n . The entire scheme indicated by formula (2) can be implemented on a computer using ordinary 64-bit or 128-bit arithmetic; high-precision arithmetic software is not required. The resulting floating-point value, when expressed in binary format, gives the first few digits of the binary expansion of log 2 beginning at position d + 1 . Similar calculations applied to each of the four terms in formula (1) yield a similar result for π . The largest computation of this type to date is binary digits of π beginning at Figure 1. Ferguson’s “Figure the quadrillionth (1015 -th) binary Eight Knot Complement” digit, performed by an internasculpture. tional network of computers organized by Colin Percival. The BBP formula for π has even found a practical application: it is now employed in the g95 Fortran compiler as part of transcendental function evaluation software. Since 1995 numerous other formulas of this type have been found and proven using a similar experimental approach. Several examples include: (3) ∞ 9 1 16 8 2 1 π 3= − − − , 32 k=0 64k 6k + 1 6k + 2 6k + 4 6k + 5 (4) π2 =
∞ 144 1 1 216 72 54 9 − − − + , 8 k=0 64k (6k + 1)2 (6k + 2)2 (6k + 3)2 (6k + 4)2 (6k + 5)2
π2 =
∞ 2 1 405 81 27 243 − − − 27 k=0 729k (12k + 1)2 (12k + 2)2 (12k + 4)2 (12k + 5)2
(5) −
72 9 9 5 1 − − − + (12k + 6)2 (12k + 7)2 (12k + 8)2 (12k + 10)2 (12k + 11)2
(6)
3 arctan
,
√ ∞ 3 1 3 1 , = + k 7 27 3k + 1 3k + 2 k=0
(7)
√ √5 ∞ 25 781 57 − 5 5 1 1 5 = √ log + . 2 256 57 + 5 5 55k 5k + 2 5k + 3 k=0
Formulas (3) and (4) permit arbitrary-position √ binary digits to be calculated for π 3 and π 2 . Formulas (5) and (6) permit the same for ternary √ √ (base-3) expansions of π 2 and 3 arctan( 3/7) . Formula (7) permits the same for the base-5 expansion of the curious constant shown. A compendium of known BBP-type formulas, with references, is available at [5]. One interesting twist here is that the hyperbolic volume of one of Ferguson’s sculptures (the 504
NOTICES
OF THE
“Figure Eight Knot Complement”;3 see Figure 1), which is given by ∞ V =2 3
2n−1 1 1 2n k n=1 n k=n n
= 2.029883212819307250042405108549. . . , has been identified in terms of a BBP-type formula by application of Ferguson’s own PSLQ algorithm. In particular, British physicist David Broadhurst found in 1998, using a PSLQ program, that √ ∞ 3 (−1)n 9 n=0 27n 18 18 24 6 2 . × − − − + (6n + 1)2 (6n + 2)2 (6n + 3)2 (6n + 4)2 (6n + 5)2
V =
This result is proven in [15, Chap. 2, Prob. 34].
Does Pi Have a Nonbinary BBP Formula? Since the discovery of the BBP formula for π in 1995, numerous researchers have investigated, by means of computational searches, whether there is a similar formula for calculating arbitrary digits of π in other number bases (such as base 10). Alas, these searches have not been fruitful. Recently, one of the present authors (JMB), together with David Borwein (Jon’s father) and William Galway, established that there is no degree-1 BBPtype formula for π for bases other than powers of two (although this does not rule out some other scheme for calculating individual digits). We will sketch this result here. Full details and some related results can be found in [20]. In the following, (z) and (z) denote the real and imaginary parts of z , respectively. The integer b > 1 is not a proper power if it cannot be written as c m for any integers c and m > 1 . We will use the notation ordp (z) to denote the p-adic order of the rational z ∈ Q . In particular, ordp (p) = 1 for prime p , while ordp (q) = 0 for primes q ≠ p , and ordp (wz) = ordp (w) + ordp (z) . The notation νb (p) will mean the order of the integer b in the multiplicative group of the integers modulo p. We will say that p is a primitive prime factor of bm − 1 if m is the least integer such that p|(bm − 1). Thus p is a primitive prime factor of bm − 1 provided νb (p) = m . Given the Gaussian integer z ∈ Q[i] and the rational prime p ≡ 1 (mod 4) , let θp (z) denote ordp (z) − ordp (z) , where p and p are the two conjugate Gaussian primes dividing p and where we require 0 < (p) < (p) to make the definition of θp unambiguous. Note that
(8)
θp (w z) = θp (w ) + θp (z).
Given κ ∈ R , with 2 ≤ b ∈ Z and b not a proper power, we say that κ has a Z -linear or Q -linear 3 Reproduced by permission of the sculptor.
AMS
VOLUME 52, NUMBER 5
Machin-type BBP arctangent formula to the base b if and only if κ can be written as a Z -linear or Q linear combination (respectively) of generators of the form i 1 = log 1 + arctan bm bm ∞ (−1)k = bm (9) . b2mk (2k + 1) k=0 We shall also use the following result, first proved by Bang in 1886: Theorem 1. The only cases where bm − 1 has no primitive prime factor(s) are when b = 2, m = 6, bm − 1 = 32 · 7 or when b = 2N − 1, N ∈ Z, m = 2, bm − 1 = 2N+1 (2N−1 − 1) . We can now state the main result: Theorem 2. Given b > 2 and not a proper power, there is no Q -linear Machin-type BBP arctangent formula for π . Proof: It follows immediately from the definition of a Q -linear Machin-type BBP arctangent formula that any such formula has the form
π=
(10)
M 1 nm log(bm − i), n m=1
where n > 0 ∈ Z , nm ∈ Z , and M ≥ 1 , nM ≠ 0 . This implies that M
(11)
(bm − i)nm ∈ eniπ Q× = Q× .
m=1
For any b > 2 and not a proper power, it follows from Bang’s Theorem that b4M − 1 has a primitive prime factor, say p. Furthermore, p must be odd, since p = 2 can only be a primitive prime factor of bm − 1 when b is odd and m = 1 . Since p is a primitive prime factor, it does not divide b2M − 1 , and so p must divide b2M + 1 = (bM + i)(bM − i) . We cannot have both p|bM + i and p|bM − i, since this would give the contradiction that p|(bM + i)− (bM − i) = 2i . It follows that p ≡ 1 (mod 4) and that p factors as p = pp over Z[i], with exactly one of p, p dividing bM − i . Referring to the definition of θ, we see that we must have θp (bM − i) ≠ 0 . Furthermore, for any m < M , neither p nor p can divide bm − i , since this would imply p | b4m − 1 , 4m < 4M , contradicting the fact that p is a primitive prime factor of b4M − 1 . So for m < M , we have θp (bm − i) = 0 . Referring to equation (10) and using equation (8) and the fact that nM ≠ 0 , we get the contradiction
0 ≠ nM θp (bM − i) (12)
=
M m=1
MAY 2005
nm θp (bm − i) = θp (Q× ) = 0.
Thus our assumption that there was a b-ary Machintype BBP arctangent formula for π must be false.
Normality Implications of the BBP Formulas One interesting (and unanticipated) discovery is that the existence of these computer-discovered BBPtype formulas has implications for the age-old question of normality for several basic mathematical constants, including π and log 2 . What’s more, this line of research has recently led to a fullfledged proof of normality for an uncountably infinite class of explicit real numbers. Given a positive integer b , we will define a real number α to be b -normal if every m-long string of base-b digits appears in the base-b expansion of α with limiting frequency b−m . In spite of the apparently stringent nature of this requirement, it is well known from measure theory that almost all real numbers are b-normal, for all bases b. Nonetheless, there are very few explicit examples of b -normal numbers, other than the likes of Champernowne’s constant 0.123456789101112131415 . . . . In particular, although computations suggest that virtually all of the well-known irrational √ constants of mathematics (such as π , e, γ, log 2, 2 , etc.) are normal to various number bases, there is not a single proof—not for any of these constants, not for any number base. Recently one of the present authors (DHB) and Richard Crandall established the following result. Let p(x) and q(x) be integer-coefficient polynomials, with deg p < deg q , and q(x) having no zeroes for positive integer arguments. By an equidistributed sequence in the unit interval we mean a sequence (xn ) such that for every subinterval (a, b) , the fraction #[xn ∈ (a, b)]/n tends to b − a in the limit. The result is as follows: Theorem 3. A constant α satisfying the BBP-type formula ∞ p(n) α = bn q(n) n=1 is b -normal if and only if the associated sequence defined by x0 = 0 and, for n ≥ 1 , xn = {bxn−1 + p(n)/q(n)} (where {·} denotes fractional part as before), is equidistributed in the unit interval. For example, log 2 is 2 -normal if and only if the simple sequence defined by x0 = 0 and {xn = 2xn−1 + 1/n} is equidistributed in the unit interval. For π , the associated sequence is x0 = 0 and
xn =
16xn−1 +
120n2 − 89n + 16 512n4 − 1024n3 + 712n2 − 206n + 21
.
Full details of this result are given in [11] [15, Section 3.8]. It is difficult to know at the present time whether this result will lead to a full-fledged proof of normality for, say, π or log 2 . However, this approach NOTICES
OF THE
AMS
505
has yielded a solid normality proof for another class of reals: Given r ∈ [0, 1), let rn be the n-th binary digit of r . Then for each r in the unit interval, the constant ∞ 1 (13) αr = n 23n +rn 3 n=1 is 2 -normal and transcendental [12]. What’s more, it can be shown that whenever r ≠ s , then αr ≠ αs . Thus (13) defines an uncountably infinite class of distinct 2 -normal, transcendental real numbers. A similar conclusion applies when 2 and 3 in (13) are replaced by any pair of relatively prime integers greater than 1. Here we will sketch a proof of normality for one particular instance of these constants, namely n α0 = n≥1 1/(3n 23 ) . Its associated sequence can be seen to be x0 = 0 and xn = {2xn−1 + cn } , where cn = 1/n if n is a power of 3, and zero otherwise. This associated sequence is a very good approximation to the sequence ({2n α0 }) of shifted binary fractions of α0 . In fact, |{2n α0 } − xn | < 1/(2n) . The first few terms of the associated sequence are 1 2 , , 3 3 4 8 7 5 , , , , 9 9 9 9 13 26 25 , , , 27 27 27 13 26 25 , , , 27 27 27 13 26 25 , , , 27 27 27
0, 0, 0,
1 2 1 2 , , , , 3 3 3 3 1 2 4 8 , , , , 9 9 9 9 23 19 11 , , , 27 27 27 23 19 11 , , , 27 27 27 23 19 11 , , , 27 27 27
7 5 1 2 , , , , 9 9 9 9 22 17 7 , , , 27 27 27 22 17 7 , , , 27 27 27 22 17 7 , , , 27 27 27
4 8 7 , , , 9 9 9 14 1 , , 27 27 14 1 , , 27 27 14 1 , , 27 27
5 , 9 2 , 27 2 , 27 2 , 27
1 2 , , 9 9 4 8 , , 27 27 4 8 , , 27 27 4 8 , , 27 27
16 , 27 16 , 27 16 , 27
5 , 27 5 , 27 5 , 27
10 , 27 10 , 27 10 , 27
20 , 27 20 , 27 20 , 27
and so forth. The clear pattern is that of triply repeated segments, each of length 2 · 3m, where the numerators range over all integers relatively prime to and less than 3m+1. Note the very even manner in which this sequence fills the unit interval. Given any subinterval (c, d) of the unit interval, it can be seen that this sequence visits this subinterval no more than 3n(d − c) + 3 times, among the first n elements, provided that n > 1/(d − c) . It can then be shown that the sequence ({2j α}) visits (c, d) no more than 8n(d − c) times, among the first n elements of this sequence, so long as n is at least 1/(d − c)2 . The 2normality of α0 then follows from a result given in [28, p. 77]. Further details on these results are given in [15, Sec. 4.3], [6], [12].
Euler’s Multi-Zeta Sums In April 1993, Enrico Au-Yeung, an undergraduate at the University of Waterloo, brought to the attention of one of us (JMB) the curious result ∞ 1 2 −2 1 k 1 + + ··· + 2 k k=1 (14) 17 17π 4 = 4.59987 . . . ≈ ζ(4) = 4 360 506
NOTICES
OF THE
where ζ(s) = n≥1 n−s is the Riemann zeta function. Au-Yeung had computed the sum in (14) to 500,000 terms, giving an accuracy of five or six decimal digits. Suspecting that his discovery was merely a modest numerical coincidence, Borwein sought to compute the sum to a higher level of precision. Using Fourier analysis and Parseval’s equation, he wrote (15) 1 2π
π
n ∞ ( k=1 1k )2 t (π − t) log (2 sin ) dt = . 2 (n + 1)2 n=1 2
0
2
The series on the right of (15) permits one to evaluate (14), while the integral on the left can be computed using the numerical quadrature facility of Mathematica or Maple. When he did this, Borwein was surprised to find that the conjectured identity (14) holds to more than 30 digits. We should add here that by good fortune, 17/360 = 0.047222 . . . has period one and thus can plausibly be recognized from its first six digits, so that Au-Yeung’s numerical discovery was not entirely far-fetched. Borwein was not aware at the time that (14) follows directly from a 1991 result due to De Doelder and had even arisen in 1952 as a problem in the American Mathematical Monthly. What’s more, it turns out that Euler considered some related summations. Perhaps it was just as well that Borwein was not aware of these earlier results—and indeed of a large, quite deep and varied literature [21]— because pursuit of this and similar questions had led to a line of research that continues to the present day. First define the multi-zeta constant
ζ(s1 , s2 , · · · , sk ) :=
k
n1 >n2 >···>nk >0
j=1
−|sj |
nj
−nj
σj
,
where the s1 , s2 , . . . , sk are nonzero integers and the σj := signum(sj ) . Such constants can be considered as generalizations of the Riemann zeta function at integer arguments in higher dimensions. The analytic evaluation of such sums has relied on fast methods for computing their numerical values. One scheme, based on Hölder Convolution, is discussed in [22] and implemented in EZFace+, an online tool available at http://www.cecm.sfu. ca/projects/ezface+. We will illustrate its application to one specific case, namely the analytic identification of the sum
(16) S2,3 =
∞ 1 2 1 (k + 1)−3 . 1 − + · · · + (−1)k+1 2 k k=1
Expanding the squared term in (16), we have
AMS
VOLUME 52, NUMBER 5
(17)
(−1)i+j+1 = −2 ζ(3, −1, −1) + ζ(3, 2). ijk3 0 0 ,
NOTICES
1 −1
f (x) dx =
∞
f (g(t))g (t) dt
−∞ ∞
=h
wj f (xj ) + E(h),
j=−∞
OF THE
AMS
507
where xj = g(hj) and wj = g (hj) are abscissas and weights that can be precomputed. If g (t) and its derivatives tend to zero sufficiently rapidly for large t , positive and negative, then even in cases where f (x) has a vertical derivative or an integrable singularity at one or both endpoints, the resulting integrand f (g(t))g (t) is, in many cases, a smooth bell-shaped function for which the EulerMaclaurin formula applies. In these cases, the error E(h) in this approximation decreases faster than any power of h. Three suitable g functions are g1 (t) = tanh t , g2 (t) = erf t, and g3 (t) = tanh(π /2 · sinh t) . Among these three, g3 (t) appears to be the most effective for typical experimental math applications. For many integrals, “tanh-sinh” quadrature, as the resulting scheme is known, achieves quadratic convergence: reducing the interval h in half roughly doubles the number of correct digits in the quadrature result. This is another case where we have more heuristic than proven knowledge. As one example, recently the present authors, together with Greg Fee of Simon Fraser University in Canada, were inspired by a recent problem in the American Mathematical Monthly [2]. They found by using a tanh-sinh quadrature program, together with a PSLQ integer relation detection program, that if C(a) is defined by
1 C(a) =
0
√ arctan( x2 + a2 ) dx √ , x2 + a2 (x2 + 1)
then
C(0) = π log 2/8 + G/2, C(1) = π /4 − π 2/2 + 3 arctan( 2)/ 2, C( 2) = 5π 2 /96. Here G = k≥0 (−1)k /(2k + 1)2 is Catalan’s constant—the simplest number whose irrationality is not established but for which abundant numerical evidence exists. These experimental results then led to the following general result, rigorously established, among others: ∞ 0
√ arctan( x2 + a2 ) dx √ x2 + a2 (x2 + 1) π = √ 2 arctan( a2 − 1) − arctan( a4 − 1) . 2 a2 − 1
∞ L3 (s) = n=1 [1/(3n − 2)s − 1/(3n − 1)s ] . where Based on these experimental results, general results of this type have been conjectured but not yet rigorously established. A third example is the following:
(21)
1
L−7 (s) =
0
− 275184L3 (4)π 4 + 36288000L3 (3)ζ(5) − 30008L3 (2)π 6 − 57030120L3 (1)ζ(7) ] ,
508
NOTICES
OF THE
π /3
∞ n=0
tan t + √7 ? √ dt = L−7 (2) log tan t − 7
1 1 1 + − (7n + 1)s (7n + 2)s (7n + 3)s
+
1 1 1 . − − (7n + 4)s (7n + 5)s (7n + 6)s
The “identity” (21) has been verified to over 5000 decimal digit accuracy, but a proof is not yet known. It arises from the volume of an ideal tetrahedron in hyperbolic space, [15, pp. 90–1]. For algebraic topology reasons, it is known that the ratio of the left-hand to the right-hand side of (21) is rational. A related experimental result, verified to 1000 digit accuracy, is ?
=
0
−2J2 − 2J3 − 2J4 + 2J10 + 2J11 + 3J12 + 3J13 + J14 − J15 −J16 − J17 − J18 − J19 + J20 + J21 − J22 − J23 + 2J25 ,
where Jn is the integral in (21), with limits nπ /60 and (n + 1)π /60. The above examples are ordinary onedimensional integrals. Two-dimensional integrals are also of interest. Along this line we present a more recreational example discovered experimentally by James Klein—and confirmed by Monte Carlo simulation. It is that the expected distance between two random points on different sides of a unit square is 2 3
11 1 1 1 x2 + y 2 dx dy + 1 + (u − v)2 du dv 3 0 0 0 0 5 2 1 2 + log( 2 + 1) + , = 9 9 9
and the expected distance between two random points on different sides of a unit cube is 4 5
1111 x2 + y 2 + (z − w )2 dw dx dy dz 0
1 + 5
0
0
0
1111 1 + (y − u)2 + (z − w )2 du dw dy dz 0
=
√ log6 (x) arctan[x 3/(x − 2)] 1 dx = [−229635L3 (8) x+1 81648
+ 29852550L3 (7) log 3 − 1632960L3 (6)π 2 + 27760320L3 (5)ζ(3)
π /2
where
As a second example, recently the present authors empirically determined that 2 √ 3
24 √ 7 7
0
0
0
2 7 4 17 2− 3− + π 75 75 25 75 7 7 log 1 + 2 + log 7 + 4 3 . + 25 25
See [7] for details and some additional examples. It is not known whether similar closed forms exist for higher-dimensional cubes. AMS
VOLUME 52, NUMBER 5
Ramanujan’s AGM Continued Fraction Given a, b, η > 0 , define
Rη (a, b)
a
= η+
b2 4a2 η+ 9b2 η+ η+. ..
.
This continued fraction arises in Ramanujan’s Notebooks. He discovered the beautiful fact that Rη (a, b) + Rη (b, a) a+b = Rη , ab . 2 2 The authors wished to record this in [15] and wished to computationally check the identity. A first attempt to numerically compute R1 (1, 1) directly failed miserably, and with some effort only three reliable digits were obtained: 0.693 . . . . With hindsight, the slowest convergence of the fraction occurs in the mathematically simplest case, namely when a = b . Indeed R1 (1, 1) = log 2 , as the first primitive numerics had tantalizingly suggested. Attempting a direct computation of R1 (2, 2) using a depth of 20000 gives us two digits. Thus we must seek more sophisticated methods. From formula (1.11.70) of [16] we see that for 0 < b < a,
R1 (a, b)
aK(k) K(k ) π sech nπ , 2 n∈Z K 2 (k) + a2 n2 π 2 K(k) √ where k = b/a = θ22 /θ32 , k = 1 − k2 . Here θ2 , θ3 are Jacobian theta functions and K is a complete elliptic integral of the first kind. Writing the previous equation as a Riemann sum, we have ∞ sech(π x/(2a)) R(a) := R1 (a, a) = dx 1 + x2 0 (23) ∞ (−1)k+1 , = 2a 1 + (2k − 1)a k=1 (22)
=
where the final equality follows from the CauchyLindelof Theorem. This sum may also
be written 2a 1 1 1 3 as R(a) = 1+a F 2a + 2 , 1; 2a + 2 ; −1 . The latter form can be used in Maple or Mathematica to determine R(2) = 0.974990988798722096719900334529. . . .
This constant, as written, is a bit√difficult to recognize, but if one first divides by 2 , one can obtain, using the Inverse Symbolic Calculator, an online tool available at the URL http://www. cecm.sfu.ca/projects/ISC/ISCmain.html, that √ the quotient is π /2 − log(1 + 2) . Thus we conclude, experimentally, that MAY 2005
Figure 2. Dynamics and attractors of various iterations. R(2) = 2[π /2 − log(1 + 2)]. Indeed, it follows (see [19]) that 1 1/a t R(a) = 2 dt. 1 + t2 0 Note that R(1) = log 2 . No nontrivial closed form is known for R(a, b) with a ≠ b , although R1
√ 1 1 1 1 1 1 sech(nπ ) 2 β , , β , = 4π 4 4 8π 4 4 2 n∈Z 1 + n2
is close to closed. Here β denotes the classical Beta function. It would be pleasant to find a direct proof of (23). Further details are to be found in [19], [17], [16]. Study of these Ramanujan continued fractions has been facilitated by examining the closely related dynamical system t0 = 1, t1 = 1 , and
NOTICES
OF THE
AMS
509
There are some exceptional cases. JacobsenMasson theory [17], [18] shows that the even/odd fractions for R1 (i, i) behave “chaotically”; neither converge. Indeed, when a = b = i, (tn (i, i)) exhibit a fourfold quasi-oscillation, as n runs through values mod 4. Plotted versus n, the (real) sequence tn (i) exhibits the serpentine oscillation of four separate “necklaces”. The detailed asymptotic is 1 2 π 1 √ 1+O tn (i, i) = cosh π 2 n n (−1)n/2 cos(θ − log(2n)/2) n is even × (−1)(n+1)/2 sin(θ − log(2n)/2) n odd
Figure 3. The subtle fourfold serpent.
where θ := arg Γ ((1 + i)/2) . Analysis is easy given the following striking hypergeometric parametrization of (24) when a = b ≠ 0 (see [18]), which was both experimentally discovered and is computer provable:
(25)
tn (a, a) =
1 1 Fn (a) + Fn (−a), 2 2
where Fn (a) := −
an 21−ω 1 . 2 F1 ω, ω; n + 1 + ω; ω β(n + ω, −ω) 2
Here
Γ (n + 1) , and Γ (n + 1 + ω) Γ (−ω) 1 − 1/a ω := . 2
β(n + 1 + ω, −ω) :=
Figure 4. A period three dynamical system (odd and even iterates). 1 1 tn−2 , (24) tn := tn (a, b) = + ωn−1 1 − n n where ωn = a2 or b2 (from the Ramanujan continued fraction definition), depending on whether n is even or odd. If one studies this based only on numerical values, nothing is evident; one only sees that tn → 0 fairly slowly. However, if we look at this iteration pictorially, we learn significantly more. In particular, if we plot these √ iterates in the complex plane and then scale by n and color the iterations blue or red depending on odd or even n, then some remarkable fine structures appear; see Figure 2. With assistance of such plots, the behavior of these iterates (and the Ramanujan continued fractions) is now quite well understood. These studies have ventured into matrix theory, real analysis, and even the theory of martingales from probability theory [19], [17], [18], [23]. 510
NOTICES
OF THE
Indeed, once (25) was discovered by a combination of insight and methodical computer experiment, its proof became highly representative of the changing paradigm: both sides satisfy the same recursion and the same initial conditions. This can be checked in Maple, and if one looks inside the computation, one learns which confluent hypergeometric identities are needed for an explicit human proof. As noted, study of R devolved to hard but compelling conjectures on complex dynamics, with many interesting proven and unproven generalizations. In [23] consideration is made of continued fractions like
12 a12
S1 (a) = 1+
22 a22 1+
32 a32 . 1 + ..
for any sequence a ≡ (an )∞ n=1 and convergence properties obtained for deterministic and random sequences (an ). For the deterministic case the best results obtained are for periodic sequences, satisfying aj = aj+c for all j and some finite c . The dynamics are considerably more varied, as illustrated in Figure 4.
AMS
VOLUME 52, NUMBER 5
Coincidence and Fraud
C(x)
1
Coincidences do occur, and such examples drive home the need for reasonable caution in this enterprise. For example, the approximations 9801 3 log(640320), π ≈ 2 π ≈ √ 4412 163
0.8 0.6 0.4 0.2
occur for deep number theoretic reasons: the first good to fifteen places, the second to eight. By contrast
0
1
2 x
3
4
–0.2
eπ − π = 19.999099979189475768 . . . , n=2 n=5 n=10
most probably for no good reason. This seemed more bizarre on an eight-digit calculator. Likewise, as spotted by Pierre Lanchon recently, e = 10.10110111111000010101000101100 . . .
Figure 5. First few terms of
n≥1
cos(x/k) .
while π = 11.0010010000111111011010101000 . . .
have 19 bits agreeing in base two—with one reading right to left. More extended coincidences are almost always contrived, as illustrated by the following:
a few correct digits. Thus it is necessary to rewrite the integrand function in a form more suitable for computation. This can be done by writing m (27) f (x) = cos(2x) cos(x/k) exp(fm (x)), 1
∞ [n tanh(π /2)] 1 ≈ , 10n 81 n=1
∞ [n tanh(π )] 1 ≈ . 10n 81 n=1
The first holds to 12 decimal places, while the second holds to 268 places. This phenomenon can be understood by examining the continued fraction expansion of the constants tanh(π /2) and tanh(π ) : the integer 11 appears as the third entry of the first, while 267 appears as the third entry of the second. Bill Gosper, commenting on the extraordinary effectiveness of continued-fraction expansions to “see” what is happening in such problems, declared, “It looks like you are cheating God somehow.” A fine illustration is the unremarkable decimal α = 1.4331274267223117583 . . . whose continued fraction begins [1, 2, 3, 4, 5, 6, 7, 8, 9 . . .] and so most probably is a ratio of Bessel functions. Indeed, I0 (2)/I1 (2) was what generated the decimal. Similarly, π and e are quite different as continued fractions, less so as decimals. A more sobering example of high-precision “fraud” is the integral ∞ ∞ x (26) dx. π2 := cos(2x) cos n 0 n=1 The computation of a high-precision numerical value for this integral is rather challenging, due in part to the oscillatory behavior of n≥1 cos(x/n) (see Figure 2), but mostly due to the difficulty of computing high-precision evaluations of the integrand function. Note that evaluating thousands of terms of the infinite product would produce only MAY 2005
where we choose m > x , and where ∞ x (28) . fm (x) = log cos k k=m+1 The log cos evaluation can be expanded in a Taylor series [1, p. 75], as follows: log cos
∞ (−1) j 22j−1 (22j − 1)B2j x 2j x , = k j(2j)! k j=1
where B2j are Bernoulli numbers. Note that since k > m > x in (28), this series converges. We can now write fm (x)
=
=
=
∞
∞ (−1) j 22j−1 (22j − 1)B2j x 2j j(2j)! k k=m+1 j=1 ∞ ∞ (22j − 1)ζ(2j) 1 2j − x 2j 2j jπ k j=1 k=m+1 ∞ m (22j − 1)ζ(2j) 1 2j − ζ(2j) − x . jπ 2j k2j j=1 k=1
This can now be written in a compact form for computation as ∞ (29) fm (x) = − aj bj,m x2j , j=1
where
(22j − 1)ζ(2j) , jπ 2j m = ζ(2j) − 1/k2j .
aj = (30)
NOTICES
bj,m
k=1
OF THE
AMS
511
However, ∞ x x I8 := sinc(x)sinc · · · sinc dx 3 15 0 467807924713440738696537864469 = π 935615849440640907310521750000 ≈ 0.499999999992646π . When this was first found by a researcher using a well-known computer algebra package, both he and the software vendor concluded there was a “bug” in the software. Not so! It is easy to see that the limit of these integrals is 2 π1 , where ∞ ∞ x π1 := cos(x) cos (31) dx. n 0 n=1
Figure 6. Advanced Collaborative Environment in Vancouver. Computation of these b coefficients must be done to a much higher precision than that desired for the quadrature result, since two very nearly equal quantities are subtracted here. The integral can now be computed using, for example, the tanh-sinh quadrature scheme. The first 60 digits of the result are the following:
0.3926990816987241548078304229099 37860524645434187231595926812. . . .
37860524646174921888227621868. . . , reveals that they are not equal: the two values differ by approximately 7.407 × 10−43 . Indeed, these two values are provably distinct. The reason is 55 1/(2n + 1) > 2 > governed by the fact that n=1 54 n=1 1/(2n + 1) . See [16, Chap. 2] for additional details. A related example is the following. Recall the sinc function sin x sinc(x) := . x Consider the seven highly oscillatory integrals below. ∞
π sinc(x) dx = , 2 x π sinc(x)sinc I2 := dx = , 3 2 0 ∞ x x π I3 := sinc(x)sinc sinc dx = , 3 5 2 0 ..
0 ∞
. ∞
π x x · · · sinc dx = , 3 11 2 0 ∞ x x π I7 := sinc(x)sinc · · · sinc dx = . 3 13 2 0
I6 :=
512
sinc(x)sinc
NOTICES
OF THE
0
with the volume of the polyhedron PN given by N PN := {x : | ak xk | ≤ a1 , |xk | ≤ 1, 2 ≤ k ≤ N}, k=2
where x := (x2 , x3 , · · · , xN ) . If we let
CN := {(x2 , x3 , · · · , xN ) : −1 ≤ xk ≤ 1, 2 ≤ k ≤ N},
At first glance, this appears to be π /8 . But a careful comparison with a high-precision value of π /8 , namely 0.3926990816987241548078304229099
I1 :=
This can be seen via Parseval’s theorem, which links the integral ∞ IN := sinc(a1 x)sinc (a2 x) · · · sinc (aN x) dx
then
IN =
π Vol(PN ) . 2a1 Vol(CN )
Thus,the value drops precisely when the conN straint k=2 ak xk ≤ a1 becomes active N and bites the hypercube CN . That occurs when k=2 ak > a1 . 1 1 1 In the above, 3 + 5 + · · · + 13 < 1, but on addition 1
of the term 15 , the sum exceeds 1 , the volume π drops, and IN = 2 no longer holds. A similar analysis applies to π2 . Moreover, it is fortunate that we began with π1 or the falsehood of the identity analogous to that displayed above would have been much harder to see.
Further Directions and Implications In spite of the examples of the previous section, it must be acknowledged that computations can in many cases provide very compelling evidence for mathematical assertions. As a single example, recently Yasumasa Kanada of Japan calculated π to over one trillion decimal digits (and also to over one trillion hexadecimal digits). Given that such computations—which take many hours on large, stateof-the-art supercomputers—are prone to many types of error, including hardware failures, system software problems, and especially programming bugs, how can one be confident in such results? In Kanada’s case, he first used two different arctangent-based formulas to evaluate π to over one trillion hexadecimal digits. Both calculations
AMS
VOLUME 52, NUMBER 5
agreed that the hex expansion beginning at position 1,000,000,000,001 is B4466E8D21 5388C4E014. He then applied a variant of the BBP formula for π , mentioned in Section 3, to calculate these hex digits directly. The result agreed exactly. Needless to say, it is exceedingly unlikely that three different computations, each using a completely distinct computational approach, would all perfectly agree on these digits unless all three are correct. Another, much more common, example is the usage of probabilistic primality testing schemes. Damgard, Landrock, and Pomerance showed in 1993 that if an integer n has k bits, then the probability that it is prime, provided it passes the most commonly √ used probabilistic test, is greater than 1 − k2 42− k , and for certain k is even higher [25]. For instance, if n has 500 bits, then this probability is greater than 1 − 1/428m . Thus a 500-bit integer that passes this test even once is prime with prohibitively safe odds: the chance of a false declaration of primality is less than one part in Avogadro’s number (6 × 1023 ) . If it passes the test for four pseudorandomly chosen integers a, then the chance of false declaration of primality is less than one part in a googol (10100 ) . Such probabilities are many orders of magnitude more remote than the chance that an undetected hardware or software error has occurred in the computation. Such methods thus draw into question the distinction between a probabilistic test and a “provable” test. Another interesting question is whether these experimental methods may be capable of discovering facts that are fundamentally beyond the reach of formal proof methods, which, due to Gödel’s result, we know must exist; see also [24]. One interesting example, which has arisen in our work, is the following. We mentioned in Section 3 the fact that the question of the 2 -normality of π reduces to the question of whether the chaotic iteration x0 = 0 and
xn
=
16xn−1 +
120n2 − 89n + 16 512n4 − 1024n3 + 712n2 − 206n + 21
is any error among the remaining digits after the first million is less than 1.465 × 10−8 [15, Section 4.3]. Additional computations could be used to lower this probability even more. Although few would bet against such odds, these computations do not constitute a rigorous proof that the sequence (yn ) is identical to the hexadecimal expansion of π . Perhaps someday someone will be able to prove this observation rigorously. On the other hand, maybe not—maybe this observation is in some sense an “accident” of mathematics, for which no proof will ever be found. Perhaps numerical validation is all we can ever achieve here.
Conclusion We are only now beginning to digest some very old ideas: Leibniz’s idea is very simple and very profound. It’s in section VI of the Discours [de métaphysique]. It’s the observation that the concept of law becomes vacuous if arbitrarily high mathematical complexity is permitted, for then there is always a law. Conversely, if the law has to be extremely complicated, then the data is irregular, lawless, random, unstructured, patternless, and also incompressible and irreducible. A theory has to be simpler than the data that it explains, otherwise it doesn’t explain anything. —Gregory Chaitin [24]
,
where {·} denotes fractional part, are equidistributed in the unit interval. It turns out that if one defines the sequence yn = 16xn (in other words, one records which of the 16 subintervals of (0, 1) , numbered 0 through 15, xn lies in), that the sequence (yn ) , when interpreted as a hexadecimal string, appears to precisely generate the hexadecimal digit expansion of π . We have checked this to 1,000,000 hex digits and have found no discrepancies. It is known that (yn ) is a very good approximation to the hex digits of π , in the sense that the expected value of the number of errors is finite [15, Section 4.3] [11]. Thus one can argue, by the second Borel-Cantelli lemma, that in a heuristic sense the probability that there MAY 2005
Figure 7. Polyhedra in an immersive environment.
Chaitin argues convincingly that there are many mathematical truths which are logically and computationally irreducible—they have no good reason in the traditional rationalist sense. This in turn adds force to the desire for evidence even when proof may not be possible. Computer experiments
NOTICES
OF THE
AMS
513
can provide precisely the sort of evidence that is required. Although computer technology had its roots in mathematics, the field is a relative latecomer to the application of computer technology, compared, say, with physics and chemistry. But now this is changing, as an army of young mathematicians, many of whom have been trained in the usage of sophisticated computer math tools from their high school years, begin their research careers. Further advances in software, including compelling new mathematical visualization environments (see Figures 6 and 7), will have their impact. And the remarkable trend towards greater miniaturization (and corresponding higher power and lower cost) in computer technology, as tracked by Moore’s Law, is pretty well assured to continue for at least another ten years, according to Gordon Moore himself and other industry analysts. As Richard Feynman noted back in 1959, “There’s plenty of room at the bottom” [27]. It will be interesting to see what the future will bring. References [1] MILTON ABRAMOWITZ and IRENE A. STEGUN, Handbook of Mathematical Functions, New York, 1970. [2] Z AFAR A HMED , Definitely an integral, Amer. Math. Monthly 109 (2002), 670–1. [3] KENDALL E. ATKINSON, An Introduction to Numerical Analysis, Wiley and Sons, New York, 1989. [4] DAVID H. BAILEY, Integer relation detection, Comput. Sci. Engineering 2 (2000), 24–8. [5] ——— , A compendium of BBP-type formulas for mathematical constants, http://crd.lbl.gov/ ~dhbailey/dhbpapers/bbp-formulas.pdf (2003). [6] ——— , A hot spot proof of normality for the alpha constants, http://crd.lbl.gov/~dhbailey/ dhbpapers/alpha-normal.pdf (2005). [7] DAVID H. BAILEY, JONATHAN M. BORWEIN, VISHAA KAPOOR, and ERIC WEISSTEIN, Ten problems of experimental mathematics, http://crd.lbl.gov/~dhbailey/ dhbpapers/tenproblems.pdf (2004). [8] DAVID H. BAILEY, PETER B. BORWEIN, and SIMON PLOUFFE, On the rapid computation of various polylogarithmic constants, Math. of Comp. 66 (1997), 903–13. [9] DAVID H. BAILEY and DAVID J. BROADHURST, A seventeenthorder polylogarithm ladder, http://crd.lbl.gov/ ~dhbailey/dhbpapers/ladder.pdf (1999). [10] ——— , Parallel integer relation detection: Techniques and applications, Math. of Comp. 70 (2000), 1719–36. [11] DAVID H. BAILEY and RICHARD E. CRANDALL, Random generators and normal numbers, Experiment. Math. 10 (2001), 175–90. [12] ——— , Random generators and normal numbers, Experiment. Math. 11 (2004), 527–46. [13] DAVID H. BAILEY and XIAOYE S. LI, A comparison of three high-precision quadrature schemes, http://crd.lbl. gov/~dhbailey/dhbpapers/quadrature.pdf (2004). [14] DAVID H. BAILEY and SINAI ROBINS, Highly parallel, highprecision numerical quadrature, http://crd. lbl.gov/~dhbailey/dhbpapers/quadparallel.pdf (2004).
514
NOTICES
OF THE
[15] JONATHAN BORWEIN and DAVID BAILEY, Mathematics by Experiment, A K Peters Ltd., Natick, MA, 2004. [16] J O N A T H A N B O R W E I N , D A V I D B A I L E Y , and R O L A N D GIRGENSOHN, Experimentation in Mathematics: Computational Paths to Discovery, A K Peters Ltd., Natick, MA, 2004. [17] JONATHAN BORWEIN and RICHARD CRANDALL, On the Ramanujan AGM fraction. Part II: The complex-parameter case, Experiment. Math. 13 (2004), 287–96. [18] JONATHAN BORWEIN, RICHARD CRANDALL, DAVID BORWEIN, and RAYMOND MAYER, On the dynamics of certain recurrence relations, Ramanujan J. (2005). [19] JONATHAN BORWEIN, RICHARD CRANDALL, and GREG FEE, On the Ramanujan AGM fraction. Part I: The real-parameter case, Experiment. Math. 13 (2004), 275–86. [20] JONATHAN M. BORWEIN, DAVID BORWEIN, and WILLIAM F. GALWAY, Finding and excluding b-ary Machin-type BBP formulae, Canadian J. Math. 56 (2004), 897–925. [21] JONATHAN M. BORWEIN and DAVID M. BRADLEY, On two fundamental identities for Euler sums, http://www. cs.dal.ca/~jborwein/z21.pdf (2005). [22] J O N A T H A N M. B O R W E I N , D A V I D M. B R A D L E Y , DAVID J. BROADHURST, and PETR LISONˇ EK, Special values of multiple polylogarithms, Trans. Amer. Math. Soc. 353 (2001), 907–41. [23] J O N A T H A N M. B O R W E I N and D. R U S S E L L L U K E , Dynamics of generalizations of the AGM continued fraction of Ramanujan. Part I: Divergence, http://www.cs.dal.ca/~jborwein/BLuke.pdf (2004). [24] GREGORY CHAITIN, Irreducible complexity in pure mathematics, http://arxiv.org/math.HO/0411091 (2004). [25]I. D A M G A R D , P. L A N D R O C K , and C. P O M E R A N C E , Average case error estimates for the strong probable prime test, Math. of Comp. 61 (1993), 177–94. [26] HELAMAN R. P. FERGUSON, DAVID H. BAILEY, and STEPHEN ARNO, Analysis of PSLQ, an integer relation finding algorithm, Math. of Comp. 68 (1999), 351–69. [27] RICHARD FEYNMAN, There’s plenty of room at the bottom, http://engr.smu.edu/ee/smuphotonics/ Nano/FeynmanPlentyofRoom.pdf (1959). [28] L. KUIPERS and H. NIEDERREITER, Uniform Distribution of Sequences, Wiley-Interscience, Boston, 1974. [29] GORDON E. MOORE, Cramming more components onto integrated circuits, Electronics 38 (1965), 114–7.
AMS
VOLUME 52, NUMBER 5