Bandwidth Truncation for Chebyshev Polynomial and ultraspherical/Chebyshev Galerkin Discretizations of Differential Equations: Restrictions and Two Improvements

Zhu Huang
School of Energy and Power Engineering, Xi'an Jiaotong University, Xi'an, P.R. China

John P. Boyd
Department of Atmospheric, Oceanic & Space Science, University of Michigan, 2455 Hayward Avenue, Ann Arbor MI 48109
[email protected]

Abstract

The Petrov-Galerkin ultraspherical polynomial/Chebyshev polynomial discretization of the highest derivative of a differential equation is a diagonal matrix. The same is true for Fourier Galerkin discretizations. Nevertheless, the spectral discretizations of simple problems like u_xx + q(x)u = f(x) are usually dense matrices. The villain is the "multiplication matrix", the Galerkin representation of a term like q(x)u(x); unfortunately, this part of the Galerkin matrix is dense. However, if the ODE coefficient q(x) has a Chebyshev or Fourier series that converges much more rapidly than that of u(x), then it is possible to realize great cost savings at no loss of accuracy by truncating the full N × N Galerkin matrix to a banded matrix whose bandwidth m ≪ N. One of our themes is that when the spectral series for q(x) and u(x) have similar rates of convergence, as is almost universal when a nonlinear equation is linearized for a Newton-Krylov iteration, such "[accuracy] lossless" truncation is impossible. Nonlinearity is but one of many causes of this sort of solution/coefficient "equiconvergence". When bandwidth truncation is possible, though, our second theme is to show that a modest amount of floating point operations and memory can be saved by an unsymmetric truncation in which the number of elements retained to the left of the main diagonal is roughly double the number kept to the right. Our second improvement is to replace the M-term spectral series for q(x) by its [(M/2)/(M/2)] Chebyshev-Padé rational approximation. This sometimes allows one to halve the matrix bandwidth, reducing the linear algebra costs by a factor of four.

Keywords: pseudospectral; Chebyshev polynomials; spectral method; ultraspherical polynomials; Galerkin matrix

Preprint submitted to Elsevier, February 13, 2016

Contents

1 Introduction
2 Spectral Series: Rate of Convergence
3 General Structure of Multiplication Matrices
4 Galerkin Sums
  4.1 Row sums of Galerkin discretizations
  4.2 Good and Bad Bandwidth Truncation for the SIL Function
5 Bandwidth Truncation Versus Basis Reduction
  5.1 Recommendation
  5.2 The effects of a pre-asymptotic cliff on bandwidth truncation
6 Rates of Convergence of ODE Coefficients and the ODE Solution
7 Improvement I: Unsymmetrical Bandwidth Truncation
  7.1 Bandwidth truncation versus basis reduction
8 Improvement II: Chebyshev-Padé Approximations
9 Practical Advice
10 Summary

1. Introduction

Olver and Townsend have shown the great potential of Petrov-Galerkin spectral methods which use Chebyshev polynomial basis functions and Gegenbauer polynomial test functions; the Gegenbauer order is chosen to match the order of the differential equation [20, 22].¹ They write (pg. 469)²:

"At first glance, it appears that the multiplication operator and any truncation of it are dense. However, since q(x) is continuous with bounded variation, we are able to uniformly approximate q(x) with a finite number of Chebyshev coefficients to any desired accuracy. That is, for any ϵ > 0 there exists an m ∈ N such that

    max_{x ∈ [−1,1]} | q(x) − Σ_{k=0}^{m−1} q_k T_k(x) | < ϵ    (1)

where the q_k = (2/π) ∫_{−1}^{1} q(x) T_k(x) (1 − x^2)^{−1/2} dx are the usual Chebyshev coefficients of q(x). As long as m is large enough, to all practical purposes, we can use the truncated Chebyshev series to replace q(x). Recall that in our implementation we approximate this truncation by the polynomial interpolant. Hence, the N × N principal part of the multiplication matrix for q(x) is banded with bandwidth m_band for N > m_band. Moreover, m_band can be surprisingly small when q(x) is analytic or many times differentiable."

¹ Some purists restrict "Petrov-Galerkin" to instantiations in which the basis functions individually satisfy homogeneous boundary conditions; perhaps it would be more precise to label the Olver-Townsend work as "boundary-bordered Petrov-Galerkin", but we shall usually omit "boundary-bordered" for brevity.
² We replace their a(x) by our q(x).

Fig. 1 schematically illustrates a dense matrix (left) and a bandwidth-truncated matrix (middle). For a symmetric truncation to bandwidth m_band, there are (2 m_band + 1) nonzero elements in each row of the matrix (excluding boundary rows). The cost of factoring and solving a bandlimited matrix of bandwidth m_band is vastly cheaper than computing with a dense N × N matrix when N is large: O(m_band^2 N) versus O(N^3) when Crout reduction (Gaussian elimination) is applied. Thus, great savings are realized by truncating the dense matrix to a finite bandwidth so that all elements G_{jk} with |j − k| > m_band are set equal to zero.

In this article, we discuss three subtleties. The first is to identify situations where m_band ≪ N will not be possible. The second is to point out that an unsymmetric truncation, as illustrated in the right panel of Fig. 1, is often more efficient than a symmetrical pruning of diagonal rows. Third, approximating ODE coefficients by a Chebyshev-Padé approximant instead of a series often allows the bandwidth to be halved, saving a factor of four in operation count.

For simplicity, we examine only ordinary differential equations (ODEs). However, it will be obvious that the same considerations apply to Galerkin discretizations of partial differential equations as well, albeit with a more complicated analysis. The effectiveness of Chebyshev and Fourier spectral methods in general, and their applications in almost every field of science and engineering, is discussed in the books by the second author [5, 7] and by Trefethen, Fornberg and others [23, 11, 10, 13, 14]. The ultraspherical Petrov-Galerkin spectral method is closely related to, and in many cases identical with, apparently different strategies of "integration sparsification" [of the Galerkin matrix] such as doubly integrating the differential equations [4, 9, 12, 24, 25] and manipulating recurrence relations [8, 12].

In this work, both Fourier Galerkin and Chebyshev Petrov-Galerkin treatments of ordinary differential equations are analyzed. Ultraspherical test functions generate derivative-approximating Petrov-Galerkin matrices which are diagonal, and the same is true for Fourier Galerkin methods. Bandwidth truncation is then entirely a story narrated by the properties of the multiplication matrix. One helpful simplification is that the Chebyshev multiplication matrix for q(x) is identical with the Fourier Galerkin discretization of q(cos(t)), and thus the Fourier and ultraspherical/Chebyshev algorithms can be analyzed simultaneously.
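To make the cost comparison above concrete, the following minimal sketch (in Python with numpy/scipy, which are our assumptions rather than tools used in this paper) truncates a dense Galerkin matrix to bandwidth m_band and solves the system with a banded LU factorization; only the retained diagonals are passed to the solver, so the cost is O(m_band^2 N) rather than O(N^3).

    import numpy as np
    from scipy.linalg import solve_banded

    def solve_bandwidth_truncated(G, f, m):
        """Zero all elements of G with |j - k| > m implicitly, by copying only
        the retained diagonals into banded storage, then solve in O(m^2 N)."""
        n = G.shape[0]
        ab = np.zeros((2 * m + 1, n))      # diagonal-ordered form for solve_banded
        for d in range(-m, m + 1):         # d = (column index) - (row index)
            ab[m - d, max(d, 0):n + min(d, 0)] = np.diagonal(G, d)
        return solve_banded((m, m), ab, f)

    # Toy test: a diagonally dominant matrix whose elements decay away from the
    # diagonal, mimicking the multiplication matrix of a smooth coefficient q(x)
    rng = np.random.default_rng(0)
    N, m = 200, 5
    decay = 0.5 ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
    G = 10.0 * np.eye(N) + rng.standard_normal((N, N)) * decay
    f = rng.standard_normal(N)
    x = solve_bandwidth_truncated(G, f, m)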


Figure 1: Left: dense matrix, as is typical of Galerkin matrices; the squares represent nonzero matrix elements. Middle: a banded matrix with m = 2. To exhibit the symmetry of the truncation with respect to the main diagonal, the diagonal elements are solid. Right: an unsymmetric matrix truncation, m_left = 4 and m_right = 1.

The rate of convergence of the Chebyshev series controls the size of the elements of the multiplication matrix. We therefore begin with a short section that answers: how fast do spectral coefficients diminish?

2. Spectral Series: Rate of Convergence

Chapter 2 of [5], the review [6] and the old but valuable book of Gottlieb and Orszag [16] all give thorough discussions of spectral coefficient asymptotics, so only the briefest review will be given here. First, it is generic for functions analytic on the domain that the coefficients q_n, or at least the "envelopes" of the coefficients, decrease at a "geometric rate", that is, proportionally to p^n for some |p| < 1, as shown schematically in Fig. 2. Oscillations with degree are common, as also shown schematically in Fig. 2, but it is unlikely that bandwidth truncation or any other simple adaptive procedure can take advantage of them. The geometric decay of the envelope that tightly bounds the peaks in the coefficients from above will, however, be a major theme in what follows.

Often, coefficients will be O(1) up to some finite degree n_shelf, the "pre-asymptotic shelf", and then, metaphorically, fall off a cliff to rapid exponential decay. We shall analyze the consequences of such cliffs by looking also at an exaggeration of this behavior, a step function in degree:

    q_n ≈ 1, n ≤ n_shelf;    q_n ≈ 0, n > n_shelf    (2)


Figure 2: Generic convergence of Chebyshev polynomial and Fourier series (schematic; the vertical axis is coefficient magnitude on a logarithmic scale, the horizontal axis is degree n on a linear scale). The coefficients and error may exhibit three regimes. For small degree n, the coefficients may decrease very little, or decay at a rate different from the asymptotic rate that sets in for large n. This "pre-asymptotic" regime may be absent, or may be a fast decay that is replaced for larger degree by a slower decay. Because Chebyshev series converge exponentially fast, high degree coefficients often "plateau", that is, cease to decrease but instead bounce around randomly. This plateau is due to roundoff error; it typically is at ten to one thousand times the product of "machine epsilon" with the magnitude of the largest Chebyshev coefficient max_n |a_n|. Generically, Chebyshev coefficients and errors asymptote to an exponential rate of decay, proportional to exp(−µn) = p^n where µ is a positive constant, the "asymptotic rate of geometric convergence". The exponential is often multiplied by non-exponential, slower-than-exponential functions. Sometimes the asymptotic decay of coefficients and errors is monotonic, but oscillations in degree n are common. When the coefficients oscillate with degree, it is always possible to bound the curve rather tightly by an "envelope" [dashed line] which does decay monotonically.


3. General Structure of Multiplication Matrices

When a differential equation like u_xx + q(x)u = f(x) is discretized by a Galerkin method [Fourier basis] or Petrov-Galerkin method [Chebyshev basis], the second derivative contributes only a diagonal matrix to the discretization matrix, as noted earlier. The possibility of truncating a dense matrix to a banded matrix depends entirely upon the properties of the dense "multiplication matrix", which is the discretization of q(x)u(x). It therefore behooves us to thoroughly understand the structure of this matrix and how it depends upon the coefficient of the differential equation, q(x). The multiplication matrix for Chebyshev polynomials/cosines is

    M_{j,k} ≡ ∫_0^π cos(jt) q(t) cos(kt) dt         [Fourier cosine basis]
    M_{j,k} ≡ ∫_0^π cos(jt) q(cos(t)) cos(kt) dt    [Chebyshev polynomial basis]    (3)

These are identical in form except that in the Chebyshev case the function q(x) appears as q(cos(t)). Olver and Townsend observe on pg. 469 of [20] that the multiplication matrix is the sum of a Toeplitz matrix and an "almost Hankel" matrix, where the Toeplitz matrix elements are 2T_{k,k} = 2q_0, k = 1...N, and 2T_{k,k−j+1} = 2T_{k−j+1,k} = q_{j−1}, j = 2...N, k = j...N:

    [ 2q_0  q_1   q_2   q_3   q_4   q_5   q_6   q_7  ]
    [ q_1   2q_0  q_1   q_2   q_3   q_4   q_5   q_6  ]
    [ q_2   q_1   2q_0  q_1   q_2   q_3   q_4   q_5  ]
    [ q_3   q_2   q_1   2q_0  q_1   q_2   q_3   q_4  ]    (4)
    [ q_4   q_3   q_2   q_1   2q_0  q_1   q_2   q_3  ]
    [ q_5   q_4   q_3   q_2   q_1   2q_0  q_1   q_2  ]
    [ q_6   q_5   q_4   q_3   q_2   q_1   2q_0  q_1  ]
    [ q_7   q_6   q_5   q_4   q_3   q_2   q_1   2q_0 ]

and the "almost-Hankel" matrix is given by 2H_{1,j} = 0, j = 1...N, and 2H_{j,k} = q_{j+k−2}, j = 2...N, k = 1...N:

    [ 0     0     0     0      0      0      0      0     ]
    [ q_1   q_2   q_3   q_4    q_5    q_6    q_7    q_8   ]
    [ q_2   q_3   q_4   q_5    q_6    q_7    q_8    q_9   ]
    [ q_3   q_4   q_5   q_6    q_7    q_8    q_9    q_10  ]    (5)
    [ q_4   q_5   q_6   q_7    q_8    q_9    q_10   q_11  ]
    [ q_5   q_6   q_7   q_8    q_9    q_10   q_11   q_12  ]
    [ q_6   q_7   q_8   q_9    q_10   q_11   q_12   q_13  ]
    [ q_7   q_8   q_9   q_10   q_11   q_12   q_13   q_14  ]
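The following sketch (Python/numpy, our assumption; the paper specifies no software) assembles the multiplication matrix M = T + H directly from the patterns of Eqs. (4)-(5); indices are 0-based, so the almost-Hankel rule 2H_{j,k} = q_{j+k−2} becomes 2H[j,k] = q[j+k].

    import numpy as np

    def multiplication_matrix(qcoef, n):
        """n x n Chebyshev/cosine multiplication matrix M = T + H of
        Eqs. (4)-(5), built from coefficients q_0, q_1, ...; 0-based indexing."""
        q = np.zeros(2 * n)                # pad with zeros beyond the given degree
        q[:len(qcoef)] = qcoef
        j, k = np.indices((n, n))
        T = np.where(j == k, 2.0 * q[0], q[np.abs(j - k)]) / 2.0   # Toeplitz part
        H = np.where(j == 0, 0.0, q[j + k]) / 2.0                  # almost-Hankel part
        return T + H

    # For the SIL coefficients q_0 = 1, q_n = 2 p^n (Sec. 4.2), the elements
    # decay like p^{|j-k|} away from the main diagonal
    p = 0.5
    qcoef = np.concatenate([[1.0], 2.0 * p ** np.arange(1, 16)])
    print(np.round(multiplication_matrix(qcoef, 8), 3))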

The degree of the Hankel contribution is always larger than that of the Toeplitz term, except in the first column where the two are identical. This implies the following.

Theorem 1 (Degrees). Each element of the multiplication matrix is the sum of two coefficients of the series for q(x). One is the element in row j and column k of the Hankel matrix, which we shall denote by q_{H(j,k)}, where H(j,k) is the degree of the coefficient; the other is from the Toeplitz matrix, whose degree will be denoted T(j,k). For all row and column indices (j,k),

    T(j,k) ≤ H(j,k)    (6)

If |q_n| ≥ |q_{n+1}| for all n, this implies

    |M_{j,k}| ≤ 2 |T_{j,k}|    (7)

Theorem 2 (Bound). Suppose that the cosine coefficients of q(t) [Fourier case] or of q(cos(t)) [Chebyshev case] can be bounded by a function q(k):

    |q_k| ≤ q(k)   ∀ k > 0    (8)

Then

    |M_{j,j±k}| ≤ q(k)    (9)

In particular, if q has a spectral series with a geometric rate of convergence so that

    q(k) = C p^k    (10)

for some positive constants C and p < 1, then

    |M_{j,j±k}| ≤ C p^k    (11)

Thus, generically, the elements of the multiplication matrix decay exponentially fast with distance from the main diagonal. This would seem to be very favorable for bandwidth truncation. Unfortunately, as explained in the next section, the matrix elements are only part of the story.

4. Galerkin Sums

4.1. Row sums of Galerkin discretizations

Estimates of the multiplication matrix are all well and good, but what really matters are the "Galerkin sums", which are the components of the product of the multiplication matrix with the vector of the Chebyshev coefficients a_k of the solution of the differential equation, u(x). We can organize the terms in the sums into an array (the "Galerkin term array") of the same size as the multiplication matrix. The n-th sum, whose terms are the n-th row of the term array, is

    Σ_{k=1}^{N} M_{nk} a_k    (12)

The elements of the Galerkin term array are

    M̃_{j,j+k} ≡ M_{j,j+k} a_{j+k}    (13)

The term array is more complicated to analyze than the multiplication matrix because the array is a function of both u(x) and q(x), but it is also more illuminating, because a poor approximation of the Galerkin residual is disastrous. Before analyzing the general case with bounds and theorems, it is helpful to first work a concrete example.

4.2. Good and Bad Bandwidth Truncation for the SIL Function

A useful exemplar is the SIL function defined in Chapter 2 of [5]; this has both an explicit analytic form and a cosine series whose general coefficient is known explicitly; further, its geometric convergence is typical both of functions periodic and analytic on the real axis and of nonperiodic nonsingular functions expanded in Chebyshev polynomials. This function is known variously as the "Symmetric Imbricated Lorentzian", "Periodized Witch of Agnesi" and "Geometric Fourier Series" function, among other names:

    f^SIL(x) ≡ (1 − p^2) / [(1 + p^2) − 2p cos(x)]    (14)
             = 1 + 2 Σ_{n=1}^{∞} p^n cos(nx)           (15)
             = 1 + 2 Σ_{n=1}^{∞} exp(−µn) cos(nx)      (16)

where p < 1 and µ > 0 are constants related by

    p = exp(−µ)  ⟺  µ = −log(p)    (17)
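As a quick numerical sanity check of Eqs. (14)-(15), the sketch below (Python/numpy assumed) samples the closed form (14) and recovers the cosine coefficients 2p^n with the FFT.

    import numpy as np

    p, n = 0.5, 64
    t = 2.0 * np.pi * np.arange(n) / n
    f = (1 - p**2) / ((1 + p**2) - 2 * p * np.cos(t))     # Eq. (14)
    a = np.fft.rfft(f).real / (n / 2)                     # cosine coefficients
    a[0] /= 2                                             # DC term counted twice
    print(np.allclose(a[1:8], 2 * p ** np.arange(1, 8)))  # -> True, per Eq. (15)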

Larger µ, or equivalently smaller p, implies faster convergence and a smoother function. In the Chebyshev case, f^SILC(x) ≡ f^SIL(arccos(x)) will fulfill the same role. The crucial point is that a geometric rate of convergence, that is, with the envelope of the coefficients bounded by exp(−nµ) for some µ > 0, is generic. Different functions with the same µ will yield error graphs similar to those for the SIL function.

Fig. 3 shows the errors in solving a typical ordinary differential equation such that both the coefficient q and the solution u were chosen to be SIL functions with independent rates of geometric convergence. The ODE contains

an additional parameter Q to control the magnitude of q(x; Q, p) independently of its smoothness, but this plays only a secondary role: varying Q by a factor of about 60 does not change the qualitative behavior.

When q(x; p) has a spectral series that converges much faster than the solution u(x; p), there is no penalty in truncating the Galerkin matrix to bandwidth m, as illustrated in Fig. 3, provided that m ≳ N/5, roughly, for this example. The error falls geometrically with increasing bandwidth until it flattens out at some bandwidth m_plateau(N). The smoothest class of coefficients are polynomials; then q(x) has a finite Chebyshev series and the Petrov-Galerkin matrix is banded, and truncation to the actual bandwidth induces no error at all.

In contrast, when the coefficient of the ODE and the solution have equal rates of convergence, the error falls exponentially with increasing m until the matrix is the full N × N Galerkin matrix, as also shown in Fig. 3. For ODE coefficients no smoother than the ODE solution, any bandwidth truncation is error-inducing, and truncation to a small bandwidth throws away many decimal places of accuracy.

The lower graphs in Fig. 3 confirm the assertion in Sec. 2 that the type of singularity is much less important than the location of the singularity, which controls the asymptotic rate of convergence, µ = −log(p). The coefficient q has third-order poles in the lower left of Fig. 3 [second derivative of the SIL function], or singularities much weaker than simple poles, such as x log(x) branch points, in the lower right of Fig. 3. These changes multiply or divide the n-th Fourier coefficient, relative to 2p^n, by n^2 and n^{−2} respectively. Nevertheless, truncating the bandwidth yields an exponentially bad increase in error in the "equiconvergence" case that solution and coefficient have singularities the same distance from the real axis. We shall therefore return to the SIL function, and to geometric convergence that is not modulated by powers of degree, in the next section.

Theorem 3 (Right-of-the-Diagonal Bound). Suppose that the cosine coefficients of q(t) [Fourier case] or q(cos(t)) [Chebyshev case] can be bounded by a function q(k), and similarly for u:

    |q_k| ≤ q(k)   ∀ k > 0    (18)
    |u_k| ≤ u(k)   ∀ k > 0    (19)

then

    |M̃_{j,j+k}| ≤ q(k) u(j + k)    (20)

In particular, if u and q have spectral series with the same geometric rate of convergence so that

    q(k) = C p^k,   u(k) = C′ p^k    (21)

for some positive constants C, C′ and p < 1, then

    |M̃_{j,j+k}| ≤ C C′ p^{j+2k}    (22)

Figure 3: Maximum pointwise errors (errors in the L∞ norm, that is, max_{x∈[−1,1]} |u_approx(x) − u_exact(x)|) for Fourier cosine-Galerkin discretizations truncated at various bandwidths m. The ODE is u_xx + Q^2 q̃(x; p)u = f(x). For all four curves in all three panels (p = 0.125 or 0.5, Q = 1 or 61), f(x) is chosen so that the exact solution is u(x; P) = SIL(x; P) with P = 1/2. Top panel: q̃ is the SIL function. Lower left: same as the top panel except that the SIL function has been replaced by its second derivative, q(x; p) ≡ −Q^2 Σ_{n=1}^{N−1} p^n n^2 cos(nx). Lower right: same except that the SIL function has been replaced by its double integral, as defined by the series q = −Q^2 Σ_{n=1}^{N−1} (p^n / n^2) cos(nx).

5. Bandwidth Truncation Versus Basis Reduction

5.1. Recommendation

When the convergence-limiting singularity of an ODE coefficient q(x) coincides in location with that of the solution u(x), our recommendation is: do not truncate the N × N Galerkin matrix to a banded matrix with m < N. Equality of the rates of convergence, which is automatic when (but not restricted to when) the convergence-limiting singularities are coincident in the complex plane, implies that roughly equal numbers of terms will be needed to approximate both q(x) and u(x) to within a given tolerance. Truncating the series for q(x) [by bandwidth truncation] more drastically than that of the ODE solution u(x) makes no sense. In the rest of this section, we add theoretical analysis to the empirical evidence of Fig. 3 to support this recommendation.

Figure 4: Left: absolute value of the matrix elements in typical rows of a typical multiplication matrix. To leading order, the magnitude is only a function of distance from the main diagonal. Right: rows of the Galerkin term array, whose elements are M_{jk} a_k.

To work through an example, set u(x) = q(x) = 1 + Σ_m p^m cos(mx), and then retain just the largest terms in p. The array of Galerkin terms, multiplied


by one-half to eliminate annoying and irrelevant fractions, is

    (1/2) M̃ =
    [ 1     p^2   p^4   p^6   p^8   p^10  p^12  p^14 ]
    [ p     p     p^3   p^5   p^7   p^9   p^11  p^13 ]
    [ p^2   p^2   p^2   p^4   p^6   p^8   p^10  p^12 ]
    [ p^3   p^3   p^3   p^3   p^5   p^7   p^9   p^11 ]    (23)
    [ p^4   p^4   p^4   p^4   p^4   p^6   p^8   p^10 ]
    [ p^5   p^5   p^5   p^5   p^5   p^5   p^7   p^9  ]
    [ p^6   p^6   p^6   p^6   p^6   p^6   p^6   p^8  ]
    [ p^7   p^7   p^7   p^7   p^7   p^7   p^7   p^7  ]

The term array is very unsymmetric, as illustrated in Fig. 5:

    M̃_{n,n+m} ≈ p^n × p^{2m}    [right of diagonal]    (24)
    M̃_{n,n−m} ≈ p^n             [left of diagonal]     (25)
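These rates are easy to verify numerically; here is a small check (Python/numpy assumed) that builds the term array for the equiconvergence case and prints the base-p exponents of its entries.

    import numpy as np

    # Numerical check of Eqs. (24)-(25) for u = q = SIL with p = 1/2
    p, n = 0.5, 8
    c = np.concatenate([[1.0], 2.0 * p ** np.arange(1, 2 * n)])  # coefficients of q (= u)
    j, k = np.indices((n, n))
    M = (np.where(j == k, 2 * c[0], c[np.abs(j - k)])            # Toeplitz part
         + np.where(j == 0, 0.0, c[j + k])) / 2.0                # almost-Hankel part
    term_array = M * c[:n]          # element (j,k) of the term array is M[j,k]*a_k
    print(np.round(np.log(term_array) / np.log(p), 1))
    # The exponents grow with distance to the right of the diagonal (fast decay)
    # but are nearly constant along each row to its left, as in Eqs. (24)-(25)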

There is very fast exponential decay to the right of the main diagonal because the elements of the multiplication matrix are decaying as p^m and these multiply coefficients of u(x) which are decaying as p^m, too. There is no decay to the left of the main diagonal. One might wonder why bandwidth truncation would ever produce useful approximations. To understand the answer, we need to extend our analysis a little deeper. First, though, we shall summarize what we have learned so far.

If we take N terms in the series for u(x), then our error tolerance must be about p^N. We can consistently neglect terms proportional to p^{N+1}, but nothing larger. Fig. 5 illustrates the matrix elements of M̃ in the "equiconvergence" case that the solution and ODE coefficients have similar rates of convergence. If we truncate the bandwidth of the term array by deleting the gray-shaded elements, we obtain an approximation that retains all terms of O(p^4) and is therefore a satisfactory approximation for sufficiently small p. However, an equally accurate reduction is obtained by truncating the matrix dimension to five, which is equivalent to truncating the Galerkin basis at N = 5 Chebyshev polynomials; the full 5 × 5 matrix resulting from this truncation is bounded by the dashed lines in Fig. 5. Since this full matrix has fewer elements than the bandwidth-truncated 8 × 8 matrix, it is obvious that "basis reduction" is the better way.

5.2. The effects of a pre-asymptotic cliff on bandwidth truncation

The SIL function is unrepresentative in one regard: for many functions the spectral coefficients do not immediately begin their asymptotic descent, but instead oscillate at O(1) magnitude for n ≤ n_shelf. The asymptotic large-n formulas do not apply on this shelf or plateau in the magnitude of the spectral coefficients, which is therefore labeled "pre-asymptotic".

Figure 5: The 8 × 8 Galerkin term array for the case of equiconvergence. The narrowest bandwidth that excludes only elements of O(p^5) or smaller is bounded by the solid line; the elements excluded by the bandwidth truncation to m_band = 4 are gray-shaded. Truncating the Galerkin matrix to its upper 5 × 5 block ("basis reduction" to N = 5) is equally accurate in the sense of including all elements proportional to p^4 or larger, but it is cheaper to factor than the larger banded matrix. (The matrix has been multiplied by a factor of (1/2) to eliminate a lot of irrelevant factors of two.)

For example, the Chebyshev coefficients of cos(Mπx) oscillate without decay until N > M and then fall supergeometrically. Detailed analysis confirms the obvious: slow convergence in either u(x) or q(x) makes error-free bandwidth truncation harder. When bandwidth truncation is contra-indicated for an ODE whose coefficients fall geometrically fast, a pre-asymptotic shelf only makes bandwidth truncation even less desirable.


6. Rates of Convergence of ODE Coefficients and the ODE Solution

A full discussion would be a two-semester course, but some remarks are both brief and useful.

1. The ODE coefficients, such as q(x) in an ODE like u_xx + q(x)u = f(x), can indeed be much smoother, with faster spectral series convergence, than the ODE solution. An example is u_xx − u = 1/(1 + 25x^2), whose solution must inherit the poles of the inhomogeneous term at x = ±i/5 while q(x) is just a constant.

2. The linearization of a nonlinear ODE, as for a Newton's iteration, is an ODE in which the coefficients have the same singularities³ as the solutions, because the ODE coefficients are the solution. Nonlinear ODEs almost always linearize to ODEs with equiconvergence of solution and coefficients.

³ By "same" we mean singularities whose locations in the complex plane are coincident; an ODE coefficient which is the square of u will have double poles where u has only poles, but location is vastly more important for the asymptotic rate of convergence than is the order of the singularities, as stressed earlier here and in Chapter 2 of [5].

3. The one-dimensional Helmholtz equation is

    u_xx + Q^2 u = f(x)    (26)

where Q is a constant we shall dub the "Helmholtz constant". The homogeneous solutions are cos(Qx) and sin(Qx). As explained in [5, 16], the number of Chebyshev polynomials required to approximate a sinusoidal function on x ∈ [−1, 1] asymptotically approaches π polynomials per wavelength, which here translates to N ≥ Q. Thus a large spectral truncation N is essential for accurate homogeneous solutions if Q ≫ 1. However, the ultraspherical Petrov-Galerkin method generates a pentadiagonal matrix for the Helmholtz equation. Exploiting parity splits the problem into a pair of tridiagonal matrices of size N/2; a large Helmholtz constant and large-N-requiring particular solutions are not incompatible with extreme bandwidth truncation.

4. The situation is more complicated when

    q(x) = Q^2 q̃(x) + Q^2 q_0,   max |q̃(x)| = 1    (27)

where q_0 is the mean of q̃(x). Both the rate of convergence of q̃(x) and the magnitude of the coefficient will control the accuracy of bandwidth truncation.

7. Improvement I: Unsymmetrical bandwidth truncation

The elements of the multiplication matrix decay roughly symmetrically about the main diagonal, diminishing exponentially with increasing distance from the main diagonal either to the left or to the right. However, in solving differential


equations, it is necessary to multiply the elements of this matrix by the spectral coefficients of the solution. The magnitude of the contribution of each element to these matrix-vector products can be represented by an array (henceforth the "term array") in which each element is the corresponding element of the multiplication matrix multiplied by the corresponding coefficient a_n of the solution u(x). Fig. 4 [right] shows that the resulting term array is extremely unsymmetric about the main diagonal, decaying slowly, or perhaps not decaying at all, to the left of the main diagonal, but decaying very rapidly to the right. Fig. 5 reiterates this theme. It is therefore advantageous to delete matrix elements in an unsymmetrical way, deleting many more elements to the right of the main diagonal than to the left.

The theoretical justification for unsymmetric bandwidth truncation has already been presented. Because this analysis used the approximation M_{j,j±k} ≈ q_k, it is useful to see some concrete, approximation-free illustrations generated by actually solving ordinary differential equations, using the Petrov-Galerkin ultraspherical/Chebyshev method for some examples and the Fourier-Galerkin scheme for others.

Fig. 6 plots the isolines of the base-10 logarithm of the maximum pointwise error in solving the typical ODE

    u_xx + q(x)u = f(x)    (28)

using an ultraspherical/Chebyshev Petrov-Galerkin matrix which has been unsymmetrically truncated to a sparse matrix. The number of nonzero elements to the left of the diagonal is m_left while the number to the right of the main diagonal is m_right. The isolines are all L-shaped, that is, the union of a line segment parallel to the m_left axis and another segment parallel to the m_right axis. The contours have this shape because when m_right ≪ m_left [bottom of the plot], the error is dominated by the consequences of lopping off Petrov-Galerkin matrix elements that are right of the main diagonal. The error therefore improves exponentially as m_right increases [moving vertically on the plot]. However, eventually a plateau is reached for sufficiently large m_right and the isoline rotates ninety degrees to parallel the m_right axis. The reason is that for a fixed m_left, a saturation point is reached for increasing m_right in which the error is now limited by missing elements to the left, rather than to the right, of the main diagonal. The most efficient truncation is at the corner of each L: at a corner, increasing either of (m_left, m_right) alone does nothing; the error can be reduced only by simultaneously increasing both m_left and m_right.

In the left graph of Fig. 6, the ODE coefficient q(x) is very smooth, and the optimum is a truncation which is only slightly unsymmetric. When P > p, that is, when the series for the solution converges more slowly than that of the very smooth function q(x), the optimum choice is m_left ≈ m_right. When p = P [equal rates], m_left ≈ 2 m_right, as shown in the right panel of Fig. 6.
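The unsymmetric truncation is just as easy to exploit in software as the symmetric one. In the sketch below (Python/scipy assumed, generalizing the earlier banded-solve sketch), the lower and upper bandwidths are passed separately, so the factorization cost scales with the product of the retained widths rather than with N^2.

    import numpy as np
    from scipy.linalg import solve_banded

    def solve_unsymmetric_truncation(G, f, m_left, m_right):
        """Keep m_left diagonals left of the main diagonal and m_right to its
        right, discard the rest, and solve the truncated system."""
        n = G.shape[0]
        ab = np.zeros((m_left + m_right + 1, n))
        for d in range(-m_left, m_right + 1):     # d = column - row offset
            ab[m_right - d, max(d, 0):n + min(d, 0)] = np.diagonal(G, d)
        return solve_banded((m_left, m_right), ab, f)

    # e.g., the m_left = 2*m_right rule of thumb for the equiconvergence case:
    # x = solve_unsymmetric_truncation(G, f, m_left=8, m_right=4)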

Figure 6: Contours of the base-10 logarithm of the maximum pointwise error for the ODE u_xx + q(x)u = f(x) solved by the ultraspherical/Chebyshev Petrov-Galerkin method with unsymmetric truncation (m_left, m_right). Left panel [q(x) = 90000(sech(x) + tanh(x)), u(x) = sech(10x) + tanh(10x)]: the ODE coefficient q(x) is much smoother than the solution. The red guideline is the diagonal m_left = m_right [symmetric truncation] while the black line is m_left = 2 m_right. Right panel [q(x) = 90000(sech(10x) + tanh(10x)), u(x) = sech(10x) + tanh(10x)]: same as the left, but u and q have equal smoothness; the curves and colors depict the base-10 logarithm of the maximum pointwise error in solving u_xx + q(x; p)u = f(x). The corners of the L-shaped isolines are connected by the black guideline, m_left ≈ 2 m_right. (The red line is the symmetric truncation, which is far from optimum when the coefficient q(x) and the ODE solution have equal rates of convergence.)

When P < p, that is, when the series for u(x; P) converges much faster than the series for the coefficient q(x; p), m_left ≈ 3 m_right [not illustrated].

Another perspective is offered by shifting to nonperiodic ODEs solved by Chebyshev polynomials and making contour plots for symmetric truncations m in the plane spanned by m and the size of the Chebyshev basis N. The plots are triangular because the maximum bandwidth is limited to N. These Chebyshev m−N plots tell the same story as the Fourier m_left−m_right plots.

In the upper left of Fig. 7, the coefficient is very smooth. For fixed N, increasing the bandwidth m (i.e., moving vertically in the plot) shows a rapid decrease in error at first, but the contour lines quickly turn vertical. The error has reached a plateau controlled entirely by N; Olver and Townsend's strategy of drastic bandwidth truncation enormously reduces costs without a loss of accuracy.

In the upper right, q−u equiconvergence shows a very different pattern. For fixed N, the color gradient and the contours are horizontal almost all the way to the m = N diagonal line. Unlike the Fourier cases, a modest truncation of bandwidth, such as to m = 100 when N = 140, say, causes little loss of accuracy, but only a small fraction of the matrix elements are omitted, saving little over the cost of factoring the full N × N Galerkin matrix.

In the lowest plot, the contours near the bottom are horizontal. Unlike the upper left, however, the contours never turn vertical. Instead, for fixed N > 50, the error saturates at very tiny errors limited by machine precision.


Figure 7: Base-10 logarithm of the maximum pointwise error in solving ODEs of the form u_xx + q(x)u = f(x) using Chebyshev polynomials. The horizontal axis is N, the degree of the Chebyshev polynomial approximation to u(x). The vertical axis is the bandwidth m of a symmetric truncation that discards all elements G_{j,j+k} of the Galerkin matrix such that |k| > m. The hypotenuse of the triangle bounding the contours is m = N, which is no bandwidth truncation of the Galerkin matrix. Panels: upper left, q(x) = sech(x) + tanh(x) with u(x) = sech(10x) + tanh(10x); upper right, q(x) = sech(10x) + tanh(10x) with u(x) = sech(10x) + tanh(10x); bottom, q(x) = sech(2x) + tanh(2x) with u(x) = sech(x) + tanh(x).

It appears, and indeed is true, that one can truncate to a bandwidth m ≪ N without loss of accuracy. However, this is misleading. The coefficient q and solution u(x) are both very smooth compared to the other two panels: u(x) = sech(x) + tanh(x) has poles which are 10 times as far from the real axis as those of u(x) = sech(10x) + tanh(10x). A dense matrix of full bandwidth with N = 30 gives an approximation accurate to the roundoff plateau. It is therefore silly, except in multiple precision arithmetic, to use larger N. The lower panel of Fig. 7 is thus a parable reminding us that truncating the dimension N of the Galerkin matrix is significant for cost reduction, too. A full 30 × 30 matrix is much less expensive than a 100 × 100 matrix truncated to bandwidth m = 30 (roughly N^3/3 ≈ 10^4 operations versus m^2 N ≈ 10^5).

We repeated Fig. 7 with q multiplied by a large constant Q^2 = 90,000. The only major change was in the "over-resolved" plot: the huge magnitude of q(x) requires roughly twice as many Chebyshev polynomials as when q(x) is equally


smooth but O(1) in magnitude [not illustrated]. The key insight is unchanged by the magnitude of q(x): truncating the bandwidth of the Galerkin matrix to m ≪ N is accuracy-neutral only when the coefficients of the differential equation are much smoother than the solution.


7.1. Bandwidth truncation versus basis reduction

Fig. 8 schematically compares symmetric and unsymmetric bandwidth truncation with a third option, dubbed "basis reduction", which is to use a square m_band × m_band matrix: the standard Petrov-Galerkin matrix of the largest size whose bandwidth does not exceed m_band.

Figure 8: Symmetric bandwidth truncation, truncation to the right of the diagonal only, and basis reduction. Note that bandwidth truncation ignores the first two rows, which impose the boundary conditions. N = 8, m_band = 3.


Figure 9: Error norms (in the L∞ norm) versus bandwidth for four different ODEs when the Galerkin matrix is truncated for each by symmetrical truncation, right-of-diagonal truncation, and basis reduction. Panels: u = tanh(x − 1/3) with q = −1/(1 + 25x^2); u = tanh(x − 1/3) with q = −1 − exp(5x); u = tanh(cos(20(x − 1/13))) with q = −1000 + 900 cos(20x); u = 1/(1 + 25x^2) + 1/(1 + 25(x + 1)^2) with q = −1000u.


8. Improvement II: Chebyshev-Padé approximations

Suppose that the ODE coefficient q(x) is sufficiently smooth that it can be approximated to within the desired error tolerance by a Chebyshev series truncated at degree M, allowing truncation of the dense N × N Galerkin matrix to a banded matrix of bandwidth m ≪ N where m is proportional to, but not necessarily equal to, M. If one can approximate q(x) to the same accuracy by a rational function which is the ratio of two polynomials P and Q of degree M/2 (and this is not improbable, given that the rational approximation and the truncated Chebyshev series have the same number of degrees of freedom), then

    u_xx + q(x)u = f(x)  ⟺  Q u_xx + P u = Q f(x)    (29)
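A quick check of the mechanism (Python/numpy assumed): the multiplication matrix of a polynomial coefficient is exactly banded, with bandwidth equal to the polynomial degree, so replacing q by P/Q of degree M/2 and clearing the denominator as in Eq. (29), as the next paragraph notes, yields multiplication matrices of bandwidth M/2 instead of M.

    import numpy as np

    def mult_matrix(c, n):
        """Multiplication matrix (Toeplitz + almost-Hankel) for coefficients c."""
        cc = np.zeros(2 * n); cc[:len(c)] = c
        j, k = np.indices((n, n))
        return (np.where(j == k, 2 * cc[0], cc[np.abs(j - k)])
                + np.where(j == 0, 0.0, cc[j + k])) / 2.0

    deg, n = 4, 12
    c = np.zeros(deg + 1); c[deg] = 1.0        # Chebyshev coefficients of T_4(x)
    j, k = np.nonzero(np.abs(mult_matrix(c, n)) > 0)
    print("bandwidth:", np.abs(j - k).max())   # nonzeros confined to |j-k| <= 4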

The Petrov-Galerkin procedure then yields a matrix whose bandwidth is M/2. Halving the bandwidth reduces the cost of an LU factorization by a factor of four, and memory storage is reduced by a factor of two.

Having demonstrated the theoretical advantages of replacing ODE coefficients by rational approximations, the next step is to ask: do such approximations exist? And if so, how does one construct them?

Indeed, such approximations exist. We will begin with classical, Taylor-series-derived Padé approximants. Although these are highly nonuniform in space and therefore not suitable for spectral methods, which are usually blessed with errors that are almost uniform in x, Padé approximants provide useful insights into their much more uniform Chebyshev-Padé brethren. The [M/N] Padé approximant to a function f(x) is a polynomial of degree M divided by a polynomial of degree N which is chosen so that the leading terms of the power series of the approximant match the first (M + N + 1) terms of the power series of f(x). One might suppose that the approximant would be restricted to the same domain of convergence as the Taylor expansion from whence it came. In reality, the Padé approximant will usually converge on the entire real axis if f(x) is free of singularities there [1, 2, 3].

In the power series/Padé realm, it has been proven that for some classes of functions the diagonal approximants [equal numerator and denominator degrees] are not merely equal in accuracy to the comparable truncated series, but in some sense superior. For example, the power series of the Stieltjes function is asymptotic but has a zero radius of convergence; the diagonal Padé approximants converge at a subgeometric but exponential rate on the entire positive real axis, with an error proportional to exp(−4√(N/z)) for the [N/N] approximant (pg. 405 of [3]). For the exponential function, pg. 404 of [3] asserts that

    Error(degree-N Taylor series) ∼ 2^N Error([(N/2)/(N/2)] Padé)    (30)
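For a concrete instance of Eq. (30), scipy's classical Padé routine (our assumption; the paper does not use scipy) can build the [4/4] approximant of exp(x) from its degree-8 Taylor coefficients:

    import numpy as np
    from math import factorial
    from scipy.interpolate import pade

    an = [1.0 / factorial(k) for k in range(9)]   # Taylor coefficients of exp(x)
    p, q = pade(an, 4)                            # [4/4] Pade approximant
    x = 2.0
    print(abs(np.exp(x) - sum(a * x**k for k, a in enumerate(an))))  # Taylor error
    print(abs(np.exp(x) - p(x) / q(x)))   # Pade error: markedly smaller, cf. Eq. (30)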

The success of Padé approximants inspired the invention of Chebyshev and Fourier equivalents, known collectively as "Padé-spectral" approximations. The rational functions with equal numerator and denominator degrees are the "diagonal approximants" in the standard jargon of both Padé and spectral-Padé theory.

Taylor-series-derived Padé approximants are unique for a given f(x) and specified numerator and denominator degrees, but Chebyshev-Padé approximations come in several flavors [19, 18]. For our purposes, it will suffice to use "linear Chebyshev-Padé" or "Maehly Chebyshev-Padé" approximants. First, let

    P_M(x) = Σ_{j=0}^{M} p_j T_j(x)         (31)

    Q_N(x) = T_0 + Σ_{j=1}^{N} q_j T_j(x)   (32)

Note that, without loss of generality, we have normalized the coefficient of T_0 in the denominator to one so that P_M(x) and Q_N(x) together have only M + N + 1 undetermined coefficients. The linear Chebyshev-Padé approximants are defined by the requirement

    R(x) ≡ f(x) Q_N(x) − P_M(x) = 0    (33)

at each of N_col interpolation points, where we usually took N_col = 2N. This yields a matrix problem for the unknown spectral coefficients of the numerator and denominator polynomials.

Fig. 10 compares the errors made by a symmetric truncation of the N × N Galerkin matrix to a banded matrix of bandwidth m_band [the Olver-Townsend cost reduction] and the Chebyshev-Padé strategy with

    q(x) ≈ q^{[(m_band − 2)/(m_band − 2)]}(x)    (34)

followed by multiplication of the ODE by the denominator polynomial. The Padé approximation thus yields a matrix with only half the bandwidth of the bandwidth truncation with which it is being compared. The degree of the Padé approximant is not m_band but m_band − 2, because a constant q generates a pentadiagonal multiplication matrix and, in general, a polynomial of degree m_band − 2 generates (for the Petrov-Galerkin method applied to a second order ODE) a matrix of bandwidth m_band. Bandwidth truncation is unnecessary because the Chebyshev-Padé approximant, followed by clearing denominators before applying the Petrov-Galerkin method, constructs a matrix of bandwidth m_band.

Figs. 10-13 show that for most problems, the rational approximation to q(x) is a much better strategy than bandwidth truncation. Fig. 10 is a little silly because q(x) is a rational function, so all Chebyshev-Padé approximations with denominator degree two or higher are exact. A heptadiagonal matrix now yields near machine precision when N is sufficiently large. Rigidly sticking to a program of diagonalizing the highest derivative and reducing cost only by truncating a full multiplication matrix is here exposed as foolish.

Fig. 11 compares the Padé and bandwidth truncation strategies when the ODE coefficient is −1 + cos(10x). For small bandwidths, the Chebyshev-Padé approach is actually worse. The reason is that the cosine and sine functions have "pre-asymptotic shelves"; the error does not fall below O(1) until N > π M_wavelengths where M_wavelengths is the number of wavelengths of the oscillation. However, when the Chebyshev-Padé approach is awful, bandwidth truncation is awful, too. When the degree of the rational approximation is large enough for accuracy, the Chebyshev-Padé strategy is triumphant for this example, too.

The graphs are similar when q(x) = −1 − exp(5x), as shown in Fig. 12, except that there is no shelf of O(1) coefficients and error for the exponential function. The final example, with q(x) = cos(5x) sech(10x), is as lop-sided in favor of rational approximation as a soccer game whose final score is 10-0.
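Here is a sketch of the linear Chebyshev-Padé construction of Eqs. (31)-(33) (Python/numpy assumed; we oversample and solve by least squares, a variant of the collocation described above, so the sampling details are our assumptions).

    import numpy as np

    def linear_cheb_pade(f, M, N, ncol=None):
        """Linear (Maehly-type) Chebyshev-Pade approximant P_M/Q_N of f on
        [-1,1]: collocate the residual f(x)Q_N(x) - P_M(x) = 0 (Eq. 33), with
        Q_N normalized so its T_0 coefficient is one; solve by least squares."""
        if ncol is None:
            ncol = 2 * (M + N + 1)                         # oversampled grid
        x = np.cos(np.pi * np.arange(ncol) / (ncol - 1))   # Chebyshev-Lobatto points
        fx = f(x)
        T = np.polynomial.chebyshev.chebvander(x, max(M, N))   # T[:, j] = T_j(x)
        # Unknowns: numerator p_0..p_M, then denominator q_1..q_N
        A = np.hstack([T[:, :M + 1], -fx[:, None] * T[:, 1:N + 1]])
        coeffs = np.linalg.lstsq(A, fx, rcond=None)[0]
        return coeffs[:M + 1], np.concatenate([[1.0], coeffs[M + 1:]])

    # Example: a [10/10] approximant to one of the ODE coefficients used here
    pnum, qden = linear_cheb_pade(lambda x: -1.0 - np.exp(5.0 * x), 10, 10)
    xx = np.linspace(-1.0, 1.0, 1001)
    cheb = np.polynomial.chebyshev.chebval
    err = np.abs((-1.0 - np.exp(5.0 * xx)) - cheb(xx, pnum) / cheb(xx, qden)).max()
    print(f"max pointwise error: {err:.2e}")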

approach is actually worse. The reason is that the cosine and sine functions have “pre-asymptotic shelves”; the error does not fall below O(1) errors until N > πMwavelengths where Mwavelengths is the number of wavelengths of the oscillating. However, when the Chebyshev-Pad´e approach is awful, bandwidth truncation is awful, too. When the degree of the rational approximation is large enough for accuracy, the Chebyshev-Pad´e is triumphant for this example, too. The graphs are similar when q(x) − 1 − exp(5x) as shown in Figs. 12 except that there is no shelf of O(1) coefficients and error for the exponential functions. The final example with q(x) = cos(5x)sech(10x) is as lop-sided in favor of rational approximation as a soccer game whose final score is 10-0.

Figure 10: Maximum pointwise error in the ultraspherical/Chebyshev solution to u_xx + q(x)u = f(x) = −2u(1 − u^2) + qu where q(x) = (1 + x)/(1 + 9x^2), with inhomogeneous Dirichlet boundary conditions; the rather complicated forcing was chosen so that the exact solution is u = tanh(x − 1/3). Both methods solved matrices of bandwidth m_band, which is plotted as the horizontal axis. The red curve with x's shows the errors in bandwidth truncation as described previously. The black curve results from (i) replacing q(x) by the rational approximation which is the ratio of two polynomials, each of degree (m_band − 2), (ii) multiplication of the ODE by the denominator polynomial, and (iii) computation and solution of the Petrov-Galerkin matrix, which by construction, not truncation, is of bandwidth m_band. This case has a rational q(x), as listed in the title, so all rational approximations whose denominator degree is quadratic or higher are exact to within roundoff error.

We will not shrink from a catalogue of possible liabilities of the Padé approach. First, we have not made a systematic study of "high N" problems.

Figure 11: Same as the previous figure except that q(x) = −1 + cos(10x), with f(x) chosen so that the ODE has the same solution as in the previous figure. The Chebyshev expansion of cos(10x) has a sizeable "pre-asymptotic shelf", so it is not surprising that the error of the Padé approximant also has a shelf before plunging to values close to machine precision.

Perhaps multiplication by the denominator of the Padé approximation degrades the condition number so that the Petrov-Galerkin method is beaten up by floating point errors when N, the truncation of the Chebyshev series, is sufficiently large. Fig. 14 shows that the rational strategy works well for N as large as 500, though the accuracy plateaus at about 10^{-8}, showing that ill-conditioning for large N is worthy of future study.

Rational approximations have their difficulties, such as ill-conditioning, "Froissart doublets" and other sorrows, but they have been very useful in a wide variety of applications despite these complications. These difficulties will not be discussed here. Instead, we refer the reader to the new (and exceptional) papers by Gonnet, Güttel and Trefethen [15] and Pachón, Gonnet and van Deun [21], who devise new strategies that largely eliminate these worries. A final difficulty is that computing the rational approximation by interpolation requires solving a matrix problem whose cost is not insignificant compared to that of solving the Petrov-Galerkin linear algebra problem. Even so, the error graphs are eloquent advocates for rational approximations.


Figure 12: Same as the previous figure except that q(x) = −1 − exp(5x).


Figure 13: Same as the previous figure except that q(x) = cos(5x) sech(10x).


Figure 14: The left plot is the same as the previous figure except that q(x) = −1000 + 900 cos(20x) and the solution is u = tanh(cos(20(x − 1/13))); the legend labels the Chebyshev-Padé ("Pade") and bandwidth-truncation ("BT") curves. The middle plot shows the Chebyshev coefficients of u, and the right plot those of the numerator P and denominator Q. The solution u(x) needs N = 500 Chebyshev polynomials to be approximated to near machine precision. However, q(x) is approximated to the maximum useful precision by a [50/50] rational function; although not obvious from the magnitude of the coefficients of the numerator and denominator polynomials graphed on the right, the ratio of two polynomials of degree 50 is accurate to a maximum pointwise error of 1.25 × 10^{-10}. There is no improvement in applying a bandwidth greater than 50 for the Chebyshev-Padé method, or greater than 100 with symmetrical bandwidth truncation, to the 500 × 500 Petrov-Galerkin matrix.


9. Practical Advice

If one has absolutely zero information about the rate of decrease of the spectral coefficients of the solution, then one cannot sensibly trim the Petrov-Galerkin matrix; but then one cannot sensibly choose the truncation N of the spectral basis either. However, it is unusual to perform just one calculation in a given region of parameter space; rather, a thoughtful arithmurgist will learn from each additional calculation. A posteriori, the rate of convergence can be estimated by merely inspecting a graph of the spectral coefficients.

We will assume that the user begins with a desired error tolerance δ and an estimate for N that is sufficient, but barely sufficient, to achieve this tolerance. If the convergence is geometric with |a_n| ∼ O(p^n), then

    p ∼ δ^{1/N}    (35)

One may apply Chebyshev interpolation to compute the Chebyshev coefficients of the ODE coefficient q(x), and then estimate a constant p̃ such that the series coefficients for q(x) decrease proportionally to p̃^n. If p̃ ≥ p, meaning that the spectral series for the coefficient decreases as slowly as or more slowly than the series for the solution, then symmetric bandwidth truncation will fail. It may still be possible to reduce costs by using unsymmetrical truncation or the Chebyshev-Padé strategy. If, on the other hand, the series for the coefficient converges much faster than that of the solution, then drastic approximation of the Petrov-Galerkin matrix is possible. Using the theorem proved earlier and assuming that we have approximate bounds on the coefficients of q(x) and u(x), |q_k| ≤ q(k) and |u_k| ≤ u(k), we can delete all elements to the right of the main diagonal for which, with δ the desired error tolerance,

    M_{j,j+k} ⇒ 0   if   q(k) u(j + k) ≤ δ    (36)
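These a posteriori estimates are easy to automate; here is a sketch (Python/numpy assumed, with illustrative values for δ and N that are our assumptions).

    import numpy as np

    def decay_rate(coeffs):
        """Estimate the geometric decay rate p~ of a coefficient sequence by a
        least-squares line fit to log|a_n|, skipping the roundoff plateau."""
        a = np.abs(np.asarray(coeffs, dtype=float))
        n = np.nonzero(a > 1e-13 * a.max())[0]
        slope = np.polyfit(n, np.log(a[n]), 1)[0]
        return np.exp(slope)                 # coefficients behave like p^n

    delta, N = 1e-10, 60                     # illustrative tolerance and truncation
    p_solution = delta ** (1.0 / N)          # Eq. (35)
    qcoef = 2.0 * 0.125 ** np.arange(40)     # e.g. SIL-like coefficients, p = 1/8
    if decay_rate(qcoef) >= p_solution:
        print("equiconvergence: symmetric bandwidth truncation will fail")
    else:
        print("q(x) is much smoother: drastic bandwidth truncation is safe")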

Note that this right-only bandwidth truncation and the Chebyshev-Padé method are mutually exclusive: apply one or the other, but not both. The first step in applying the Chebyshev-Padé method is to expand q(x) as a diagonal rational approximation of increasing degree until

    |q(x) − q^{[M/M]}(x)| ≤ δ    (37)

then replace q by the rational approximation, multiply the ODE by the denominator of the rational approximation, and apply the Petrov-Galerkin procedure to generate the discretization matrix. Exploit the fact that the matrix has bandwidth m_band = M + 2.

10. Summary

The ultraspherical Petrov-Galerkin discretization was devised to reduce the matrix condition number, but as Olver and Townsend have emphasized, Galerkin

and Petrov-Galerkin matrices can often be truncated to banded matrices whose bandwidth m is small compared to the dimensionality N of the Galerkin matrix. Here, we have analyzed one restriction on, and two improvements to, such "bandwidth truncation".

The first point is that bandwidth truncation does not worsen the error if and only if all the coefficients of the differential equation have spectral series that converge much faster than that of the ODE solution u(x). "Equiconvergence", that is, ODEs such that u(x) and q(x) have similar rates of convergence, does not allow truncation of the dense Galerkin matrix to a banded matrix except at the price of a severe loss of accuracy. Nonlinearity usually implies equiconvergence and is therefore unfavorable to bandwidth truncation. On the other hand, an inhomogeneous term f(x) whose spectral series converges slowly will usually force the expansion of u(x) to converge slowly even if the ODE coefficient q(x) is smooth, and this is favorable to drastic bandwidth truncation.

The second point is that although the matrix elements decay exponentially with distance from the main diagonal in roughly symmetrical fashion, the corresponding inner product sums (matrix elements multiplied by coefficients of u(x)) decay very unsymmetrically. In the equiconvergence case, the most efficient procedure is a truncation that keeps roughly twice as many elements to the left of the main diagonal as to the right. Modest savings in storage (perhaps 25%) and execution time can thus be realized. Often, though, an even better option is simply to reduce the size of the Galerkin basis from N to something smaller.

A second improvement to bandwidth truncation can be realized by replacing the ODE coefficient by a diagonal Chebyshev-Padé approximation. After clearing the denominator, it is often possible to halve the bandwidth, reducing matrix factorization costs by about a factor of four.

These concepts generalize to other spectral expansions. The Hermite function Galerkin discretization of the second derivative is a pentadiagonal matrix, or tridiagonal if the basis is restricted to even or odd parity. Some important problems are trivially but exactly banded with low bandwidth. For example, the quantum quartic oscillator is the eigenproblem v_yy + (−y^2 − y^4)v = λv where λ is the eigenvalue; the Galerkin matrices, one for each parity, are pentadiagonal without approximation. When the coefficients of the differential equation are transcendental functions, bandwidth truncation is again nontrivial. Since a subgeometric, root-exponential rate of convergence is normal for Hermite series [5], a full analysis is left for another time [17]. Similarly, the Galerkin representation of the two-dimensional Laplace operator on the surface of a sphere is diagonal. These extensions of Galerkin sparsification to spherical geometry, etc., are best left to specific applications.

Olver and Townsend showed that the stigma of dense matrices associated with Chebyshev polynomial methods is often unjust, even for differential equations with complicated transcendental coefficients. They never claimed that it is always possible to generate an accurate Chebyshev discretization of small bandwidth. We have shown that there are important classes of problems where drastic bandwidth truncation comes only at the price of a drastic loss of accuracy. When truncation of the Petrov-Galerkin matrix to a matrix of low bandwidth is

possible, we have shown that (i) unsymmetrical truncation and (ii) application of Chebyshev-Padé approximations to ODE coefficients can significantly reduce costs still further.

Acknowledgments. Support was from NSF grant OCE 1059703, a China Scholarship Council grant and NSFC key program grant (No. 51236006) of China.

References

[1] G. A. Baker, Jr., Padé approximants, in Advances in Theoretical Physics, no. 1 in Advances in Theoretical Physics, Academic Press, New York, 1965, pp. 1-50.
[2] G. A. Baker, Jr., Essentials of Padé Approximants, Academic Press, New York, 1975. 220 pp.
[3] C. M. Bender and S. A. Orszag, Advanced Mathematical Methods for Scientists and Engineers: Asymptotic Methods and Perturbation Theory, Springer, New York, 1999. 594 pp.
[4] M. Berioz, T. Hagstrom, S. R. Lau, and R. H. Price, Multidomain, sparse, spectral-tau method for helically symmetric flow, Computers and Fluids, 102 (2014), pp. 250-265.
[5] J. P. Boyd, Chebyshev & Fourier Spectral Methods, Dover, New York, 2001.
[6] J. P. Boyd, Large-degree asymptotics and exponential asymptotics for Fourier coefficients and transforms, Chebyshev and other spectral coefficients, J. Engrg. Math., 63 (2009), pp. 355-399.
[7] J. P. Boyd, Solving Transcendental Equations: The Chebyshev Polynomial Proxy and Other Numerical Rootfinders, Perturbation Series and Oracles, SIAM, Philadelphia, 2014. 460 pp.
[8] C. W. Clenshaw, The numerical solution of linear differential equations in Chebyshev series, Proceedings of the Cambridge Philosophical Society, 53 (1957), pp. 134-149.
[9] E. A. Coutsias, T. Hagstrom, and D. Torres, An efficient spectral method for ordinary differential equations with rational function coefficients, Mathematics of Computation, 65 (1996), pp. 611-635.
[10] M. O. Deville, P. F. Fischer, and E. H. Mund, High-Order Methods for Incompressible Fluid Flow, vol. 9 of Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, Cambridge, 2002.
[11] B. Fornberg, A Practical Guide to Pseudospectral Methods, Cambridge University Press, New York, 1996.
[12] L. Fox and I. B. Parker, Chebyshev Polynomials in Numerical Analysis, Oxford University Press, London, 2nd ed., 1968.
[13] C.-I. Gheorghiu, Spectral Methods for Differential Problems, Casa Cartii de Stiinta, Cluj-Napoca, Romania, 2007. 157 pp. Out of print, but the author has made it available at http://www.ictp.acad.ro/gheorghiu/spectral.pdf.
[14] C.-I. Gheorghiu, Spectral Methods for Non-Standard Eigenvalue Problems: Fluid and Structural Mechanics and Beyond, Springer Briefs in Mathematics, Springer, New York, 2007. 120 pp.
[15] P. Gonnet, S. Güttel, and L. N. Trefethen, Robust Padé approximation via SVD, SIAM Rev., 55 (2013), pp. 101-117.
[16] D. Gottlieb and S. A. Orszag, Numerical Analysis of Spectral Methods, SIAM, Philadelphia, PA, 1977. 200 pp.
[17] Z. Huang and J. P. Boyd, When integration sparsification fails: Banded Galerkin discretizations for Hermite functions and rational Chebyshev functions on an infinite domain, and Chebyshev methods for solutions with C∞ endpoint singularities, J. Comput. Appl. Math., (2015). To be submitted.
[18] J. C. Mason and A. Crampton, Laurent-Padé approximants to four kinds of Chebyshev polynomial expansions. Part I. Maehly type approximations, Numer. Algorithms, 38 (2005), pp. 3-18.
[19] J. C. Mason and A. Crampton, Laurent-Padé approximants to four kinds of Chebyshev polynomial expansions. Part II. Clenshaw-Lord type approximations, Numer. Algorithms, 38 (2005), pp. 19-29.
[20] S. Olver and A. Townsend, A fast and well-conditioned spectral method, SIAM Rev., 55 (2013), pp. 462-489.
[21] R. Pachón, P. Gonnet, and J. van Deun, Fast and stable rational interpolation in roots of unity and Chebyshev points, SIAM J. Numer. Anal., 50 (2012), pp. 1713-1754.
[22] A. Townsend and S. Olver, The automatic solution of partial differential equations using a global spectral method, J. Comput. Phys., (2015). In press.
[23] L. N. Trefethen, Spectral Methods in MATLAB, Society for Industrial and Applied Mathematics, Philadelphia, 2000.
[24] A. Zebib, A Chebyshev method for the solution of boundary value problems, J. Comput. Phys., 53 (1984), pp. 443-455.
[25] A. Zebib, Removal of spurious modes encountered in solving stability problems by spectral methods, J. Comput. Phys., 70 (1987), pp. 521-525.
