J Supercomput (2006) 38:173–187 DOI 10.1007/s11227-006-7945-6

Spherical harmonic transforms for discrete multiresolution applications

J. A. R. Blais · D. A. Provins∗ · M. A. Soofi

© Springer Science + Business Media, LLC 2006

Abstract On the sphere, global Fourier transforms are non-Abelian and are usually called Spherical Harmonic Transforms (SHTs). Discrete SHTs are defined for various grids of data, but most applications have requirements in terms of preferred grids and polar considerations. Chebychev quadrature has proven most appropriate in discrete analysis and synthesis to very high degrees and orders. Multiresolution analysis and synthesis that involve convolutions, dilations and decimations are efficiently carried out using SHTs. The high-resolution global datasets becoming available from satellite systems require very high degree and order SHTs for proper representation of the fields. The implied computational efforts in terms of efficiency and reliability are very challenging. The efforts made to compute SHTs and their inverses to degrees and orders 3600 and higher are discussed with special emphasis on numerical stability and information preservation. Parallel and grid computations are imperative for a number of geodetic, geophysical and related applications where near kilometre resolution is required. Parallel computations have been investigated and preliminary results confirm the expectations in terms of efficiency. Further work is continuing on optimizing the computations.

Keywords Spherical harmonics · Spherical harmonic transform · Multiresolution analysis · Chebychev quadrature · Parallel computations

∗ Present address: Rakhit Petroleum Consulting Ltd., Calgary, AB.

J. A. R. Blais
Department of Geomatics Engineering, and Pacific Institute for the Mathematical Sciences, University of Calgary, Calgary, AB, T2N 1N4, Canada
e-mail: [email protected]

D. A. Provins · M. A. Soofi
Department of Geomatics Engineering, University of Calgary, Calgary, AB, T2N 1N4, Canada
e-mail: [email protected]
e-mail: [email protected]


1 Introduction

On the spherical Earth and in neighboring space, spherical harmonics are among the standard mathematical tools for analysis and synthesis. For example, geopotential models in the form of spherical harmonic series synthesize much observational information from gravity surveys, satellite orbit monitoring and topographical height data. Current geopotential models such as EGM96 of degree and order 360 [14] and GPM98a, b and c of degree and order 1800 [28, 29] are presently being upgraded and extended to degree and order 2160 [21]. Also, data over the celestial sphere, such as those obtained through the COBE, WMAP and other similar programs, are being analyzed using SHTs, and extensive efforts have been invested to achieve computational efficiency and reliability (e.g., [8, 9, 25]).

For discrete observations on the sphere, various quadrature schemes are used to obtain spectral transform coefficients of global data. Quadrature schemes are well known for such computations and are based on equiangular, equiareal or other discretization approaches. One such scheme is Gaussian quadrature, which requires the zeros of the Legendre polynomials to satisfy the orthogonality in discrete computations. But these zeros are not equispaced in latitude, making the Gaussian strategy of secondary value in applications where some regularity in the spatial grid is needed. The approach of Driscoll and Healy [6] using Chebychev quadrature is uniquely advantageous in this context in providing an exact transform for exact arithmetic using an equiangular grid. The approach, however, needs to be modified and extended for practical applications requiring very high degrees and orders. Thus, optimization is required to achieve efficiency in the synthesis and analysis of global data.

2 Discrete SHTs and applications

The orthogonal or Fourier expansion of a function $f(\theta, \lambda)$ on the sphere $S^2$ is given by

\[
f(\theta, \lambda) = \sum_{n=0}^{\infty} \sum_{|m| \le n} f_{n,m}\, Y_n^m(\theta, \lambda)
\]

using colatitude $\theta$ and longitude $\lambda$, where the basis functions $Y_n^m(\theta, \lambda)$ are called the spherical harmonics satisfying the (spherical) Laplace equation $\Delta_{S^2} Y_n^m(\theta, \lambda) = 0$, for all $|m| \le n$ and $n = 0, 1, 2, \ldots$. This is an orthogonal decomposition in the Hilbert space $L^2(S^2)$ of functions square integrable with respect to the standard rotation invariant measure $d\sigma = \sin\theta \, d\theta \, d\lambda$ on $S^2$. In particular, the Fourier or spherical harmonic coefficients appearing in the preceding expansion are obtained as inner products

\[
\begin{aligned}
f_{n,m} &= \int_{S^2} f(\theta, \lambda)\, \overline{Y}_n^m(\theta, \lambda)\, d\sigma \\
&= \sqrt{\frac{(2n+1)}{4\pi} \frac{(n-m)!}{(n+m)!}} \int_{S^2} f(\theta, \lambda)\, \overline{P}_{nm}(\cos\theta)\, e^{im\lambda}\, d\sigma \\
&= (-1)^m \sqrt{\frac{(2n+1)}{4\pi} \frac{(n-m)!}{(n+m)!}} \int_{S^2} f(\theta, \lambda)\, P_{nm}(\cos\theta)\, e^{im\lambda}\, d\sigma
\end{aligned}
\]


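in terms of the associated Legendre functions $\overline{P}_{nm}(\cos\theta) = (-1)^m P_{nm}(\cos\theta)$, with the overbar denoting the complex conjugate. In most practical applications, the functions $f(\theta, \lambda)$ are band-limited in the sense that only a finite number of those coefficients are nonzero, i.e. $f_{n,m} \equiv 0$ for all $n \ge N$. The usual geodetic spherical harmonic formulation is given as

\[
f(\theta, \lambda) = \sum_{n=0}^{\infty} \sum_{m=0}^{n} \left[ \tilde{C}_{nm} \cos m\lambda + \tilde{S}_{nm} \sin m\lambda \right] \tilde{P}_{nm}(\cos\theta)
\]

where

\[
\begin{Bmatrix} \tilde{C}_{nm} \\ \tilde{S}_{nm} \end{Bmatrix}
= \frac{1}{4\pi} \int_{S^2} f(\theta, \lambda)\, \tilde{P}_{nm}(\cos\theta)
\begin{Bmatrix} \cos m\lambda \\ \sin m\lambda \end{Bmatrix} d\sigma
\]

and

\[
\tilde{P}_{nm}(\cos\theta) = \sqrt{\frac{2(2n+1)(n-m)!}{(n+m)!}}\; P_{nm}(\cos\theta), \qquad
\tilde{P}_{n}(\cos\theta) = \sqrt{2n+1}\; P_{n}(\cos\theta)
\]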

are the normalized associated Legendre functions. Various discretization schemes can be adopted for the above formulation. A scheme given by Colombo [5] used discretization of the form $\theta_j = j\pi/N$, $j = 1, 2, \ldots, N$ and $\lambda_k = k\pi/N$, $k = 1, 2, \ldots, 2N$. Another discretization scheme is based on Legendre quadrature, which is known to provide an exact representation of polynomials of degrees up to $2N - 1$ using only $N$ data values at the zeros of the Legendre polynomials. Recently, Mohlenkamp [16] has used this quadrature in his sample implementation of a fast SHT. Also, Driscoll and Healy [6] have exploited these quadrature ideas in an exact algorithm (for exact arithmetic) in a reversible SHT using the grid $\theta_j = j\pi/2N$ and $\lambda_k = k\pi/N$, $j, k = 0, \ldots, 2N - 1$. The analysis is done using

\[
f_{n,m} = \frac{1}{N} \sqrt{\frac{\pi}{2}} \sum_{j=0}^{2N-1} \sum_{k=0}^{2N-1} a_j\, f(\theta_j, \lambda_k)\, \overline{Y}_n^m(\theta_j, \lambda_k)
\]

and, assuming $N$ to be a power of 2, the Chebychev quadrature weights $a_j$ are

\[
a_j = \frac{\sqrt{2}}{N} \sin\!\left(\frac{j\pi}{2N}\right) \sum_{h=0}^{N-1} \frac{1}{2h+1} \sin\!\left((2h+1)\frac{j\pi}{2N}\right)
\]

while the synthesis is simply

\[
f(\theta_j, \lambda_k) = \sum_{n=0}^{N-1} \sum_{|m| \le n} f_{n,m}\, Y_n^m(\theta_j, \lambda_k).
\]

Thus, to each parallel $\theta_j$ there corresponds a quadrature weight $a_j$ which is proportional to the sine of the colatitude. This type of quadrature has been mentioned by Sneeuw [26] as the (second) Neumann method, but only approximations are given. Other schemes lead to quadrature weights depending as well on degree $n$ and order $m$. Furthermore, the quadrature weights may consider block averaging and desmoothing (see e.g., [5, 7, 10]). Related formulations are also discussed in [2, 3, 22] and [23] for geophysical applications.

Notice that, for degree $N$, the above grids of $2N \times 2N$ equispaced points in latitude with $\Delta\theta = \pi/2N$ and in longitude with $\Delta\lambda = 2\pi/2N = \pi/N$ have $\Delta\theta = \tfrac{1}{2}\Delta\lambda$, which is not appropriate for numerous applications. Most geodetic and geophysical potential field applications require equiangular grids with $\Delta\theta = \Delta\lambda$ in latitude and longitude [5, 24, 28], and the poles do not need to be included.
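As an illustration of the Chebychev quadrature weights $a_j$ defined above, the following is a minimal sketch in Python with NumPy. The function name `chebyshev_weights` is hypothetical and the code is not taken from the authors' implementation; it only evaluates the closed-form expression for $a_j$ on the grid $\theta_j = j\pi/2N$.

```python
import numpy as np

def chebyshev_weights(N):
    """Return colatitudes theta_j = j*pi/(2N) and weights a_j for j = 0..2N-1.

    Hypothetical helper: a direct evaluation of the closed-form weights
    a_j = (sqrt(2)/N) sin(theta_j) sum_h sin((2h+1) theta_j) / (2h+1).
    """
    j = np.arange(2 * N)
    theta = j * np.pi / (2 * N)
    odd = 2 * np.arange(N) + 1                      # 2h+1 for h = 0..N-1
    series = (np.sin(np.outer(theta, odd)) / odd).sum(axis=1)
    return theta, np.sqrt(2.0) / N * np.sin(theta) * series

if __name__ == "__main__":
    theta, a = chebyshev_weights(6)                 # N need not be a power of 2 here
    print("sum of weights:", a.sum(), "expected sqrt(2):", np.sqrt(2.0))
    print("weight at the North pole (j = 0):", a[0])
```

Two properties are easy to check with this sketch: the weights sum to $\sqrt{2}$, and the weight at the North pole ($j = 0$) is exactly zero because of the leading $\sin\theta_j$ factor.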

3 Optimization of discrete SHTs

For the intended applications of discrete SHTs, the Driscoll and Healy [6] formulation has been modified and optimized for very high degrees and orders approaching 3600 and beyond.

First, instead of the mathematical normalization conventions as used in several software packages, such as SpherePack [1] and SpharmonicKit [18], the geodetic normalization and conventions have been adopted and implemented in the direct and inverse discrete SHTs. This follows the practice generally adopted in geopotential and gravity models [10, 14, 22, 28].

Second, the requirement for $N$ to be a power of 2 as stated explicitly in [6] does not seem to be needed in the included mathematical derivation of the Chebychev weights, but is only required for the second part of the publication on some computational optimization using a divide and conquer strategy. Extensive numerical testing has confirmed the preceding direct and inverse discrete SHT formulations to be as accurate as can be expected for all $N \ge 1$. Specific numerical results will be discussed below.

Third, complications due to inclusion of the North pole (i.e. $\theta = 0$) are avoided in one of two ways: first, by excluding $j = 0$ from the discretized latitudes, i.e., $\theta_j = j\pi/2N$ with $j = 1, \ldots, 2N - 1$, or second, by redefining $\theta_j$ as $\theta_j = (j + \tfrac{1}{2})\pi/2N$ for $j = 0, \ldots, 2N - 1$. In both cases, the discretization in longitude remains the same, i.e., $\lambda_k = \pi k/N$, $k = 0, \ldots, 2N - 1$. With the first option, $a_0 = 0$, implying that the North pole can be safely excluded. With the second option, the Chebychev weights $a_j$ are redefined as follows

\[
d_j = \frac{\sqrt{2}}{N} \sin\!\left(\frac{(j + 1/2)\pi}{2N}\right) \sum_{h=0}^{N-1} \frac{1}{2h+1} \sin\!\left((2h+1)\frac{(j + 1/2)\pi}{2N}\right)
\]

for $j = 0, \ldots, 2N - 1$, which are symmetric about mid-range.

Fourth, use has been made of the hemispherical symmetries in latitude and in the associated Legendre functions, i.e.

\[
P_{nm}(\cos(\pi - \theta)) = (-1)^{n+m} P_{nm}(\cos\theta)
\]

which can be verified using the definition of the associated Legendre functions [27]. This symmetry is very important for optimizing the efficiency of SHTs, as essentially only half as many associated Legendre functions need to be evaluated in practical applications.

Fifth, grids of $2N \times 2N$ nodes with $\Delta\theta = \tfrac{1}{2}\Delta\lambda$ have been modified to $2N \times 4N$ nodes with $\Delta\theta = \Delta\lambda$ for the majority of intended practical applications. This is achieved through normalization of the data vectors, as the orthogonality of vectors should be independent of their respective lengths. Equiangular grids with $\Delta\theta = \Delta\lambda$ are critical for numerous practical applications such as with observational information from space platforms [21, 26].

Sixth, the formulations for analysis and synthesis are rearranged to perform computations along each parallel independently, involving functions of longitude only. Thus, Discrete Fourier Transforms (DFTs) can be used and implemented as Fast Fourier Transforms (FFTs) for computational efficiency. Explicitly, using the geodetic formulation and convention, one has for synthesis, given normalized spherical harmonic coefficients $a_{nm}$ and $b_{nm}$ for $m \le n$, $n = 0, 1, 2, \ldots, N - 1$,

\[
\begin{aligned}
f(\theta, \lambda) &= \sum_{n=0}^{N-1} \sum_{m=0}^{n} (a_{nm} \cos m\lambda + b_{nm} \sin m\lambda)\, \tilde{P}_{nm}(\cos\theta) \\
&= \sum_{m=0}^{N-1} \sum_{n=m}^{N-1} (a_{nm} \cos m\lambda + b_{nm} \sin m\lambda)\, \tilde{P}_{nm}(\cos\theta) \\
&= \sum_{m=0}^{N-1} \left[ \left( \sum_{n=m}^{N-1} a_{nm} \tilde{P}_{nm}(\cos\theta) \right) \cos m\lambda + \left( \sum_{n=m}^{N-1} b_{nm} \tilde{P}_{nm}(\cos\theta) \right) \sin m\lambda \right]
\end{aligned}
\]

and defining

\[
\begin{Bmatrix} A_m(\theta) \\ B_m(\theta) \end{Bmatrix}
= \sum_{n=m}^{N-1} \begin{Bmatrix} a_{nm} \\ b_{nm} \end{Bmatrix} \tilde{P}_{nm}(\cos\theta)
\]

one has

\[
\begin{aligned}
f(\theta, \lambda) &= \sum_{m=0}^{N-1} \left\{ A_m(\theta) \cos m\lambda + B_m(\theta) \sin m\lambda \right\} \\
&= \sum_{m=0}^{N-1} \frac{1}{2} \left\{ [A_m(\theta) + i B_m(\theta)]\, e^{-im\lambda} + [A_m(\theta) - i B_m(\theta)]\, e^{im\lambda} \right\} \\
&= \frac{1}{2} \left\{ \mathrm{IDFT}\,[A_m(\theta) + i B_m(\theta)] + \overline{\mathrm{IDFT}\,[A_m(\theta) + i B_m(\theta)]} \right\} \\
&= \mathrm{Re}\; \mathrm{IDFT}\,[A_m(\theta) + i B_m(\theta)]
\end{aligned}
\]

assuming discrete longitudes $\lambda_k = k\pi/N$, $k = 0, 1, 2, \ldots, 2N - 1$, for unspecified discrete colatitudes $\theta$. Writing $C_m(\theta) = A_m(\theta) + i B_m(\theta)$, and correspondingly $c_{nm} = a_{nm} + i b_{nm}$, one then has

\[
f(\theta, \lambda_k) = \mathrm{Re}\; \mathrm{IDFT}\,[C_m(\theta)]
\]

where

\[
C_m(\theta) = \sum_{n=m}^{N-1} (a_{nm} + i b_{nm})\, \tilde{P}_{nm}(\cos\theta) = \sum_{n=m}^{N-1} c_{nm}\, \tilde{P}_{nm}(\cos\theta)
\]

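in which IDFT stands for inverse discrete Fourier transform. For analysis, given data $f(\theta_j, \lambda_k)$ at $\lambda_k = k\pi/N$ and $\theta_j = j\pi/2N$ or $(j + 1/2)\pi/2N$, $k, j = 0, 1, 2, \ldots, 2N - 1$, the normalized spherical harmonic coefficients $a_{nm}$ and $b_{nm}$ for $m \le n$, $n = 0, 1, 2, \ldots, N - 1$, can be evaluated as follows:

\[
\begin{Bmatrix} a_{nm} \\ b_{nm} \end{Bmatrix}
= \frac{\pi\,(-1)^m}{N} \sum_{j=0}^{2N-1} \sum_{k=0}^{2N-1} \tilde{P}_{nm}(\cos\theta_j)\, d_j\, f(\theta_j, \lambda_k)
\begin{Bmatrix} \cos m\lambda_k \\ \sin m\lambda_k \end{Bmatrix}
\]

or, using complex coefficients,

\[
\begin{aligned}
c_{nm} &= \frac{\pi\,(-1)^m}{N} \sum_{j=0}^{2N-1} \sum_{k=0}^{2N-1} d_j\, f(\theta_j, \lambda_k)\, e^{+im\lambda_k}\, \tilde{P}_{nm}(\cos\theta_j) \\
&= \frac{\pi}{N} \sum_{j=0}^{2N-1} (-1)^m d_j\, \tilde{P}_{nm}(\cos\theta_j)\, \mathrm{DFT}_k\,[f(\theta_j, \lambda_k)]
\end{aligned}
\]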

with the preceding quadrature weights $d_j$. Notice that in practice, depending on conventions, DFT and IDFT could be interchanged in the preceding derivation and, for computational efficiency, direct and inverse FFTs would be substituted.

Seventh, for degrees over 2000, the usual REAL*8 or double precision results have been seen to deteriorate rapidly with increasing degrees (e.g., [4]). Hence, REAL*16 or quadruple precision has been implemented for stable numerical results. Further details are included in the numerical analysis discussions below.

Eighth, parallel computations have been implemented for the SHT synthesis, which is achieved through domain decomposition in latitude. Similar schemes can be used for the analysis formulation in terms of FFTs per parallel of data. Optimization work is continuing to further improve the parallelization of synthesis and analysis. For very high degrees and orders, parallel FFT code in quadruple precision is required and still needs to be secured. Grid computation is also being explored to take advantage of the resources available through the Western Canada grid network (WestGrid, a network of computer clusters in Alberta and British Columbia, www.westgrid.ca).
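The per-parallel FFT arrangement of synthesis and analysis described in this section can be sketched as follows in Python with NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the half-shifted colatitudes $(j + 1/2)\pi/2N$ with the weights $d_j$ on a $2N \times 2N$ grid, it follows NumPy's FFT sign convention (so the spectrum is packed as $A_m - iB_m$ rather than $A_m + iB_m$), and the analysis scale factor $1/(2\sqrt{2}N)$ is derived here from the normalization $(1/4\pi)\int \tilde{P}_{nm}^2 \cos^2 m\lambda \, d\sigma = 1$ rather than taken from the paper, whose prefactor and $(-1)^m$ convention are not reproduced. All function names are hypothetical.

```python
import numpy as np

def normalized_legendre(nmax, theta):
    """Lower-triangular array P[n, m] of geodetically normalized Legendre functions."""
    c, s = np.cos(theta), np.sin(theta)
    P = np.zeros((nmax + 1, nmax + 1))
    P[0, 0] = 1.0
    if nmax >= 1:
        P[1, 1] = np.sqrt(3.0) * s
    for n in range(2, nmax + 1):                     # diagonal terms
        P[n, n] = np.sqrt((2 * n + 1) / (2 * n)) * s * P[n - 1, n - 1]
    for n in range(nmax):                            # subdiagonal terms
        P[n + 1, n] = np.sqrt(2 * n + 3) * c * P[n, n]
    for n in range(2, nmax + 1):                     # remaining terms, m = 0..n-2
        for m in range(n - 1):
            u = np.sqrt((2 * n - 1) * (2 * n + 1) / ((n - m) * (n + m)))
            v = np.sqrt((2 * n + 1) * (n + m - 1) * (n - m - 1)
                        / ((2 * n - 3) * (n + m) * (n - m)))
            P[n, m] = u * c * P[n - 1, m] - v * P[n - 2, m]
    return P

def shifted_grid(N):
    """Colatitudes (j + 1/2) pi / 2N and the corresponding weights d_j."""
    theta = (np.arange(2 * N) + 0.5) * np.pi / (2 * N)
    odd = 2 * np.arange(N) + 1
    d = np.sqrt(2.0) / N * np.sin(theta) * (np.sin(np.outer(theta, odd)) / odd).sum(axis=1)
    return theta, d

def synthesis(a, b, theta):
    """Grid values f[j, k] at longitudes k*pi/N, one inverse FFT per parallel."""
    N = a.shape[0]
    f = np.empty((theta.size, 2 * N))
    for j, t in enumerate(theta):
        P = normalized_legendre(N - 1, t)
        spec = np.zeros(2 * N, dtype=complex)
        spec[:N] = (a * P).sum(axis=0) - 1j * (b * P).sum(axis=0)   # A_m - i B_m
        f[j] = (2 * N * np.fft.ifft(spec)).real
    return f

def analysis(f, theta, d):
    """Recover a[n, m], b[n, m] with one forward FFT per parallel."""
    N = theta.size // 2
    a, b = np.zeros((N, N)), np.zeros((N, N))
    scale = 1.0 / (2.0 * np.sqrt(2.0) * N)           # assumption: derived for this sketch
    for j, t in enumerate(theta):
        P = normalized_legendre(N - 1, t)
        F = np.fft.fft(f[j])[:N]                     # sum_k f e^{-i m lambda_k}
        a += scale * d[j] * P * F.real
        b -= scale * d[j] * P * F.imag
    return a, b

def roundtrip_rms(N):
    """RMS of recovered minus initial 1/(n+1)^2 coefficients (b_n0 = 0)."""
    n, m = np.arange(N)[:, None], np.arange(N)[None, :]
    a = np.where(m <= n, 1.0 / (n + 1.0) ** 2, 0.0)
    b = np.where((m <= n) & (m > 0), 1.0 / (n + 1.0) ** 2, 0.0)
    theta, d = shifted_grid(N)
    a2, b2 = analysis(synthesis(a, b, theta), theta, d)
    return np.sqrt(np.mean((a2 - a) ** 2 + (b2 - b) ** 2))

if __name__ == "__main__":
    print("round-trip RMS for N = 16:", roundtrip_rms(16))
```

For small $N$ in double precision the round-trip RMS should sit near machine precision. The loop over parallels is independent for each $j$, which is what makes the domain decomposition in latitude mentioned above straightforward.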

4 Numerical analysis considerations

The computations involving high degree and order SHTs require normalized associated Legendre functions $\tilde{P}_{nm}(\cos\theta)$, which are given as

\[
\tilde{P}_{nm}(\cos\theta) = \sqrt{\frac{2(2n+1)(n-m)!}{(n+m)!}}\; P_{nm}(\cos\theta), \qquad
\tilde{P}_{n}(\cos\theta) = \sqrt{2n+1}\; P_{n}(\cos\theta)
\]

assuming the usual geodetic normalizations. Also, for large $n$ values it is imperative to use the recursive formulas for the normalized associated Legendre functions $\tilde{P}_{nm}(\cos\theta)$ directly, as otherwise significant numerical precision would be lost, especially for high latitudes. Not following this strategy leads to significant error in the computation of direct and inverse SHTs.

The normalized associated Legendre functions $\tilde{P}_{nm}(\cos\theta)$ are computed as a lower triangular matrix with the rows corresponding to the degrees $n$ and the columns corresponding to the orders $m$. Following [24], with the initialization

\[
\tilde{P}_{00}(\cos\theta) = 1.0 \quad \text{and} \quad \tilde{P}_{11}(\cos\theta) = \sqrt{3}\, \sin\theta
\]

the diagonal terms are computed as

\[
\tilde{P}_{nn}(\cos\theta) = \sqrt{\frac{2n+1}{2n}}\; \sin\theta\; \tilde{P}_{n-1,n-1}(\cos\theta)
\]

and the subdiagonal terms are computed as

\[
\tilde{P}_{n+1,n}(\cos\theta) = \sqrt{2n+3}\; \cos\theta\; \tilde{P}_{n,n}(\cos\theta).
\]

The remaining terms for $n \ge 2$ and $n - 2 \ge m \ge 0$ are then computed as

\[
\tilde{P}_{nm}(\cos\theta) = \sqrt{\frac{(2n-1)(2n+1)}{(n-m)(n+m)}}\; \cos\theta\; \tilde{P}_{n-1,m}(\cos\theta)
- \sqrt{\frac{(2n+1)(n+m-1)(n-m-1)}{(2n-3)(n+m)(n-m)}}\; \tilde{P}_{n-2,m}(\cos\theta).
\]

The above recursive formulas have been used in geodesy for various geopotential field applications and recently by Wenzel [28–30] for degrees and orders up to 1800 with accuracy ranging from $10^{-11}$ to $10^{-13}$ for all colatitudes. However, for degrees and orders approaching 2000, Wenzel has indicated that the numerical error increases to $10^{-3}$ in magnitude. This has been confirmed in computations involving eight byte floating point (i.e., double precision, REAL*8). Recently, some geodesists (e.g., [12, 13]) have re-examined these recursive formulations and modified the recurrence relation to produce scaled functions which may then be combined using Horner's technique of nested multiplication to create partial sums. Their synthesis (only) results are stable and numerically accurate to an approximate degree of 2700. They also noted that the values of the normalized associated Legendre functions for high degrees, particularly near the poles, range over thousands of orders of magnitude (see also [2, 3] and [23]). For very high degree and order computations, asymptotic analyses of the behavior of the associated Legendre functions seem to be required [27], and Mohlenkamp [16] has reformulated the spherical harmonic analysis and synthesis in terms of modified basis functions with advantageous numerical results.

Finally, the storage requirements for the normalized associated Legendre functions $\tilde{P}_{nm}(\cos\theta)$ of degrees and orders up to $N$ are $N(N+1)/2$ per colatitude $\theta$ or $N^2(N+1)$ per meridian. Hence for a $2N \times 2N$ grid, some $4N^3(N+1)$ quantities need to be stored to avoid recomputing the Legendre functions. The sine and cosine terms require $2N$ quantities per longitude or $4N^2$ in total. The Chebychev weights add another $2N$ quantities. Therefore, for very high degrees and orders, it is not always possible to store the spherical harmonics between analysis and synthesis to avoid recomputing them.
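The recursion for the normalized associated Legendre functions given in this section can be sketched as follows in Python with NumPy. The function name `normalized_legendre` is hypothetical, and the sketch works in ordinary double precision without the scaling or quadruple precision discussed above, so it is only meant for moderate degrees.

```python
import numpy as np

def normalized_legendre(nmax, theta):
    """Return P[n, m], the normalized functions for 0 <= m <= n <= nmax.

    Hypothetical helper following the initialization, diagonal,
    subdiagonal and general recursions of this section; entries with
    m > n are left at zero (lower triangular storage).
    """
    c, s = np.cos(theta), np.sin(theta)
    P = np.zeros((nmax + 1, nmax + 1))
    P[0, 0] = 1.0
    if nmax >= 1:
        P[1, 1] = np.sqrt(3.0) * s
    # Diagonal terms P[n, n] from P[n-1, n-1]
    for n in range(2, nmax + 1):
        P[n, n] = np.sqrt((2 * n + 1) / (2 * n)) * s * P[n - 1, n - 1]
    # Subdiagonal terms P[n+1, n] from P[n, n]
    for n in range(nmax):
        P[n + 1, n] = np.sqrt(2 * n + 3) * c * P[n, n]
    # Remaining terms for n >= 2 and 0 <= m <= n-2
    for n in range(2, nmax + 1):
        for m in range(n - 1):
            u = np.sqrt((2 * n - 1) * (2 * n + 1) / ((n - m) * (n + m)))
            v = np.sqrt((2 * n + 1) * (n + m - 1) * (n - m - 1)
                        / ((2 * n - 3) * (n + m) * (n - m)))
            P[n, m] = u * c * P[n - 1, m] - v * P[n - 2, m]
    return P

if __name__ == "__main__":
    theta = 0.7
    P = normalized_legendre(10, theta)
    # Closed form for degree 2, order 0: sqrt(5) * (3 cos^2(theta) - 1) / 2
    print(P[2, 0], np.sqrt(5.0) * (3 * np.cos(theta) ** 2 - 1) / 2)
    # With this normalization the squares in each row sum to 2n + 1
    print((P ** 2).sum(axis=1))
```

Two simple checks are available for such a routine: the row sums of squares equal $2n + 1$ for every colatitude, and evaluating at $\pi - \theta$ reproduces the values at $\theta$ up to the sign $(-1)^{n+m}$, which is the hemispherical symmetry exploited in Section 3.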

Table 1 SHT results for simulated series with unit (upper part) and 1/degree² (lower part) coefficients, and Δθ = ½Δλ, in REAL*8 precision on AMD 64 Athlon FX-53 PC

                          Synthesis/Analysis            Synthesis
Degrees   Grid            RMS (coef.)   Time (sec.)     RMS (data)    Time (sec.)

Unit coefficients
0–63      128 × 128       5.966e-15     0.01            7.763e-15     0.01
0–127     256 × 256       1.693e-14     0.08            4.315e-14     0.07
0–255     512 × 512       5.000e-14     0.69            1.906e-13     0.59
0–511     1024 × 1024     6.879e-14     6.19            2.943e-13     5.12
0–1023    2048 × 2048     2.817e-13     46.30           2.819e-12     37.99
0–1199    2400 × 2400     5.060e-13     73.73           5.860e-12     60.65
0–1499    3000 × 3000     7.121e-13     142.29          9.497e-12     117.34
0–1799    3600 × 3600     1.084e-12     243.12          1.584e-11     201.36

1/degree² coefficients
0–63      128 × 128       4.142e-17     0.01            3.844e-17     0.01
0–127     256 × 256       5.806e-17     0.08            9.084e-17     0.07
0–255     512 × 512       5.201e-17     0.74            1.212e-16     0.59
0–511     1024 × 1024     2.877e-17     6.45            5.578e-17     5.12
0–1023    2048 × 2048     3.437e-17     48.90           2.097e-16     38.02
0–1199    2400 × 2400     4.328e-17     78.34           3.804e-16     60.82
0–1499    3000 × 3000     5.057e-17     151.04          5.208e-16     117.3
0–1799    3600 × 3600     6.981e-17     260.38          6.067e-16     201.69

5 Numerical experimentation

As discussed in earlier sections, various approaches have been employed in using spherical harmonics for the analysis of geospatial data. Popular computer codes based on these approaches include Spherepack [1], experimental codes from [6] and derivatives [11, 18], and the codes developed by Mohlenkamp [15–17]. Experimentation with these codes by Provins [23] has shown that the scaling used in these codes is different from that expected for geodetic applications. For this study the modified formulation of Driscoll and Healy [6], described in Section 3, has been used with different grids and on different computer platforms. Both double and quadruple precision computations have been used. However, following extensive experimentation, double precision was restricted to degrees and orders less than 2000 for accuracy reasons. Test results using various grids with $\Delta\theta = \tfrac{1}{2}\Delta\lambda$ are shown in Table 1 with computations in double precision (i.e. REAL*8). With $\Delta\theta = \Delta\lambda$ and quadruple precision (i.e. REAL*16), results are shown in Table 2. For each table, the first synthesis started with unit coefficients, $a_{nm} = b_{nm} = 1$, except for $b_{n0} = 0$, for all degrees $n$ and orders $m$, which corresponds to white noise, while for the synthesis in the second part, coefficients corresponding to 1/degree², i.e. explicitly,

\[
a_{nm} = \frac{1}{(n+1)^2}, \qquad
b_{nm} = \begin{cases} 0 & \text{for } m = 0, \\ \dfrac{1}{(n+1)^2} & \text{otherwise,} \end{cases}
\]

Table 2 SHT results for simulated series with unit (upper part) and 1/degree² (lower part) coefficients, and Δθ = Δλ, in REAL*16 precision on DEC Alpha computer

                            Synthesis/Analysis            Synthesis
Degrees   Grid              RMS (coef.)   Time (sec.)     RMS (data)    Time (sec.)

Unit coefficients
0–255     512 × 1024        1.376e-31     28.49           3.526e-31     24.32
0–511     1024 × 2048       1.718e-31     219.77          6.721e-31     189.28
0–1023    2048 × 4096       5.815e-31     1736.11         3.076e-30     1487.78
0–2047    4096 × 8192       1.679e-30     13789.60        9.266e-30     11757.99
0–2186    4374 × 8748       1.435e-30     16467.16        7.405e-30     14303.93
0–2999    6000 × 12000      2.698e-30     41728.37        2.513e-29     36758.62
0–3199    6400 × 12800      6.564e-30     51670.81        5.270e-29     44556.38
0–3599    7200 × 14400      6.077e-30     73286.39        6.250e-29     63375.21

1/degree² coefficients
0–255     512 × 1024        7.968e-35     29.84           1.604e-34     24.41
0–511     1024 × 2048       5.387e-35     231.55          6.784e-35     188.65
0–1023    2048 × 4096       6.542e-35     1834.56         1.348e-34     1475.51
0–2047    4096 × 8192       8.611e-35     14604.00        1.911e-34     11653.81
0–2186    4374 × 8748       6.439e-35     17746.40        1.711e-34     14145.02
0–2999    6000 × 12000      7.538e-35     45030.47        3.510e-34     36361.90
0–3199    6400 × 12800      1.315e-34     55670.20        6.911e-34     44055.92
0–3599    7200 × 14400      8.604e-35     79021.25        6.310e-34     62648.45

for all degrees $n$ and orders $m$, were used to simulate a physically realizable situation.

The synthesis using the coefficients defined above is followed by analysis of the computed spatial grid values. To quantify the error in computations, the root-mean-square (RMS) of the difference between initial and recomputed coefficients is computed. A second synthesis is then performed using the recomputed coefficients and a second RMS is computed of the difference between the two sets of computed grid values. Thus, starting with arbitrary coefficients $\{c_{nm}\}$, the procedure can be summarized as follows:

\[
\mathrm{SHT}\big[\mathrm{SHT}^{-1}[\{c_{nm}\}]\big] - \{c_{nm}\} \;\rightarrow\; \text{RMS of first synthesis/analysis,}
\]

and

\[
\mathrm{SHT}^{-1}\big[\mathrm{SHT}[\mathrm{SHT}^{-1}[\{c_{nm}\}]]\big] - \mathrm{SHT}^{-1}[\{c_{nm}\}] \;\rightarrow\; \text{RMS of second synthesis.}
\]

Note that the first and second RMS values are in the spectral and spatial domains, respectively. These RMS values are important checks on the accuracy of computations. Figures 1(a) and (b) show the plots of the RMS values in logarithmic scale and the computation times corresponding to Table 1. Figures 2(a) and (b) show the plots of the RMS values in logarithmic scale and the computation times corresponding to Table 2.

The computations following the scheme outlined above are quite stable for degrees and orders up to 2000 when double precision (REAL*8) is used and up to 3600 with quadruple precision (REAL*16). Note that the RMS results for 1/degree² coefficients are especially stable for very high degrees and orders, and the computation times are generally $O(N^3)$. Further optimization is currently underway.

The intermediate storing of the Legendre functions was experimented with but not implemented because of the huge storage requirements for high degrees and orders. Memory allocation often becomes a problem with such computations and as mentioned before,

Fig. 1a SHT RMS values for synthesis/analysis [a] and synthesis [b] of simulated series (unit coefficients [RMS1] and 1/degree² coefficients [RMS2]) with Δθ = ½Δλ in REAL*8 precision on AMD 64 Athlon FX-53 PC.

Fig. 1b SHT time values for synthesis/analysis [S/A] and synthesis [S] of simulated series (unit coefficients [TIME1] and 1/degree² coefficients [TIME2]) with Δθ = ½Δλ in REAL*8 precision on AMD 64 Athlon FX-53 PC.

Fig. 2a SHT RMS values for synthesis/analysis [a] and synthesis [b] of simulated series (unit coefficients [RMS1] and 1/degree² coefficients [RMS2]) with Δθ = Δλ in REAL*16 precision on DEC Alpha computer.

Fig. 2b SHT time values for synthesis/analysis [S/A] and synthesis [S] of simulated series (unit coefficients [TIME1] and 1/degree² coefficients [TIME2]) with Δθ = Δλ in REAL*16 precision on DEC Alpha computer.

Fig. 3 Computation times for one, two and four processors in REAL*8 precision on DEC Alpha computers of WestGrid (Lattice) network (www.westgrid.ca).

parallelization of the code requires access to parallel FFT code in quadruple precision. Discussions are underway to secure access to such parallel code in quadruple precision, which is available from several commercial vendors.

For computations involving quadruple precision, several hours are required to perform synthesis and analysis for high degree and order spherical harmonic transforms. In the case of degree and order equal to 3600, the total required computation time is approximately 22 hours (Figure 2(b)). To reduce this, advantage can be taken of parallel computations. Limited experimentation with the message-passing-interface (MPI) approach to parallelization has shown promising results. This approach to parallelization is quite general and can be used in both shared memory and distributed memory environments. It is especially suitable for Beowulf clusters where standalone computers are networked together to perform computations in parallel.

The parallelization in synthesis computations is achieved through domain decomposition in latitude, i.e., spatial grid values can be computed parallel by parallel. The range of $\theta$ values can be broken into $p$ blocks, where $p$ is the number of processors being used. Each processor then computes grid values for the range of $\theta$ values assigned to it. Computation times involving one, two and four processors are shown in Figure 3, where a significant reduction in the processing time is noticeable. The efficiency in computation so achieved can be expressed in terms of speedup $S(p)$ with $p$ processors, where the speedup is $S(p) = t_s / t_p$, with $t_s$ denoting the time required for sequential computations and $t_p$ the time for parallel computations. From Figure 3 it can be seen that the speedup is approximately equal to two with two processors and four with four processors. Thus, the speedup $S(p)$ of computation is practically equal to the number of processors used. This linear speedup is achieved because of minimal overhead and latency: there is no communication or transfer of messages among processors during the computations. Also, as the range of $\theta$ values is divided equally among the processors, each processor carries the same load. Hence, the approach adopted provides a balanced and efficient parallelization of the synthesis. Work is in progress to implement a similar approach for the parallelization of the analysis part.
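The domain decomposition in latitude and the speedup measure $S(p) = t_s / t_p$ described above can be illustrated with a short Python sketch. It only shows how the parallel indices might be split into $p$ contiguous blocks and how the speedup is computed; no MPI calls are made, the helper names `latitude_blocks` and `speedup` are hypothetical, and the timings in the usage lines are made-up numbers rather than measurements from this study.

```python
import numpy as np

def latitude_blocks(num_parallels, p):
    """Split parallel indices 0..num_parallels-1 into p contiguous, near-equal blocks."""
    return np.array_split(np.arange(num_parallels), p)

def speedup(t_seq, t_par):
    """Speedup S(p) = t_s / t_p for sequential time t_s and parallel time t_p."""
    return t_seq / t_par

if __name__ == "__main__":
    # 7200 parallels (the largest grid of Table 2) shared among 4 processors
    blocks = latitude_blocks(7200, 4)
    for rank, block in enumerate(blocks):
        print(f"processor {rank}: parallels {block[0]}..{block[-1]} ({block.size} rows)")
    # Illustrative timings only (not from the paper)
    print("speedup:", speedup(t_seq=200.0, t_par=51.0))
```

Because each block of parallels can be synthesized without reference to any other block, a processor needs only the coefficients and its own range of $\theta$ values, which is consistent with the absence of interprocessor communication noted above.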

6 Example of geophysical application

To illustrate the applicability of SHTs in practice, a simple multiresolution geophysical application will be briefly described; other applications can be found in [23]. There are several models for the Earth's geopotential in the public domain. One standard reference is EGM96 of degree and order 360 [14], derived from surface gravity and satellite observations. In 1999, Wenzel extended this geopotential model to degree and order 1800 through iterative refinements using additional gravity observations. Pavlis et al. [21] have promised a better model of degree and order 2160 in 2006. Wenzel's model GPM98b is shown in Figure 4 with a resolution of 0.1 degree, or approximately 10 km at ground scale. The principal applications of such a model are in georeferencing and navigation using the Global Positioning System.

Another area of application is in geophysical exploration, where subsurface features can be extracted through multiresolution wavelet filtering such as with the Poisson kernel based wavelets. The simplest such wavelet filter corresponds to rescaling the magnitude of horizontal gradients of the geopotential (e.g., [19, 20]). With the GPM98b model, the results at ground scale are shown in Figure 5. The displayed subsurface features, especially in ocean areas, provide convincing evidence of the potential of this approach at global scales with SHTs. Exploration geophysicists have used this approach at regional scales (with FFTs) in conjunction with other observational information.

Fig. 4 Geoid using GPM98b with ground resolution of approximately 10 km.

Fig. 5 Horizontal gradient magnitude of geopotential model GPM98b.

7 Concluding remarks

Spherical harmonic transforms are necessary in global spectral analysis and in the optimization of spherical convolution operations in filtering, analytical continuation and other digital processing applications. For discrete analysis and synthesis applications, different quadrature and least-squares formulations exist in the literature, and different conventions make the intercomparative analyses quite challenging. Different strategies imply different grids such as equiangular, Gaussian, etc., and the normalization used in the resulting frequency spectra is often different.

Starting with the Chebychev quadrature approach of Driscoll and Healy [6], modifications and optimizations have resulted in very efficient computations of spherical harmonic transforms of high degrees and orders for analysis and synthesis applications. However, when using double precision computations, the errors in terms of RMS values increase significantly with degrees and orders 2000 and higher. When the spectral coefficients are of unit value and the degrees and orders are 1800, the RMS errors are of orders $10^{-16}$ to $10^{-12}$. The error increases with higher degree and order SHTs and it is $10^{-3}$ with degree and order 2000. When quadruple precision is used, the RMS errors are reduced significantly and are of orders $10^{-31}$ to $10^{-29}$ for degree and order 3600. Simulations involving spectral coefficients of the form 1/degree², which are more representative of practical situations, produce RMS errors of orders $10^{-17}$ to $10^{-16}$ with degrees and orders approaching 2000. The same simulations with degrees and orders 3600 produce RMS errors of orders $10^{-35}$ to $10^{-34}$.

Computation times with quadruple precision and degrees and orders approaching 3600 are of the order of several hours. Experimentation with the parallelization of the synthesis code has shown significant improvement in the efficiency of computations. For the limited experimentation of this study with domain decomposition in latitude, the speedup achieved in the computations is close to the number of processors used. This linear speedup is largely due to the absence of interprocessor communication, resulting in minimal overhead and latency. Also, the computational load is distributed equally to all the processors. Experimental work is continuing on the parallelization of the spectral analysis of spatial grid values. Also, parallel FFT in quadruple precision will be required for very high degrees and orders.

Acknowledgement The authors would like to acknowledge the sponsorship of the Natural Sciences and Engineering Research Council in the form of a Research Grant to the first author on Computational Tools for the Geosciences. Special thanks are hereby expressed to Dr. D. Phillips of Information Technologies, University of Calgary, for helping with the optimization of our code for different computer platforms. Comments and suggestions from a colleague, Dr. N. Sneeuw, are also gratefully acknowledged.

References
1. Adams JC, Swarztrauber PN (1997) SPHEREPACK 2.0: A model development facility. http://www.scd.ucar.edu/softlib/SPHERE.html
2. Blais JAR, Provins DA (2003) Optimization of computations in global geopotential field applications. In Lecture Notes in Computer Science, Computational Science – ICCS 2003, Part II, 2658, P. M. A. Sloot, D. Abramson, A. V. Bogdanov, J. J. Dongarra, A. Y. Zomaya, and Y. E. Gorbachev (eds.), Springer-Verlag, pp. 610–618
3. Blais JAR, Provins DA (2002) Spherical harmonic analysis and synthesis for global multiresolution applications. Journal of Geodesy, 76:29–35
4. Blais JAR, Soofi MA (2004) Spherical Harmonic Transforms and Global Computations. Geoid Workshop, Joint Meeting Canadian and American Geophysical Unions. Montreal, QC
5. Colombo O (1981) Numerical methods for harmonic analysis on the sphere. Report no. 310, Department of Geodetic Science and Surveying, The Ohio State University
6. Driscoll JR, Healy DM Jr (1994) Computing Fourier transforms and convolutions on the 2-Sphere. Advances in Applied Mathematics, 15:202–250
7. Gleason DM (1998) Obtaining minimally aliased geopotential coefficients from discrete data forms. Manuscripta Geodaetica, 14:149–162
8. Gorski KM, Hivon E, Wandelt BD (1998) Analysis issues for large CMB data sets. In Proceedings of Evolution of Large Scale Structure, Garching, Preprint from http://www.tac.dk/~healpix
9. Górski KM, Wandelt BD, Hivon E, Hansen FK, Banday AJ (1999) The HEALPix Primer. http://arxiv.org/abs/astro-ph/9905275
10. Hajela DP (1984) Optimal estimation of high degree gravity field from a global set of 1° × 1° anomalies to degree and order 250. Report no. 358, Department of Geodetic Science and Surveying, The Ohio State University
11. Healy D Jr, Rockmore D, Kostelec P, Moore S (1998) FFTs for the 2-Sphere - Improvements and variations. To appear in Advances in Applied Mathematics, Preprint from http://www.cs.dartmouth.edu/ geelong/publications
12. Holmes SA, Featherstone WE (2002) A unified approach to the Clenshaw summation and the recursive computation of very-high degree and order normalised associated Legendre functions. Journal of Geodesy, 76:279–299
13. Holmes SA, Featherstone WE (2002) SHORT NOTE: Extending simplified high-degree synthesis methods to second latitudinal derivatives of geopotential. Journal of Geodesy, 76:447–450
14. Lemoine FG, Kenyon SC, Factor JK, Trimmer RG, Pavlis NK, Chinn DS, Cox CM, Klosko SM, Luthcke SB, Torrence MH, Wang YM, Williamson RG, Pavlis EC, Rapp RH, Olson TR (1998) The development of the joint NASA GSFC and NIMA geopotential model EGM96. Technical Report NASA/TP-1998-206861, NASA Goddard Space Flight Center, Greenbelt, Maryland 20771, USA
15. Mohlenkamp MJ (2000) Fast spherical harmonic analysis: Sample code. http://amath.colorado.edu/faculty/mjm
16. Mohlenkamp MJ (1999) A fast transform for spherical harmonics. The Journal of Fourier Analysis and Applications, 5(2/3):159–184, Preprint from http://amath.colorado.edu/faculty/mjm
17. Mohlenkamp MJ (1997) A fast transform for spherical harmonics. PhD thesis, Yale University
18. Moore S, Healy D Jr, Rockmore D, Kostelec P (1998) SpharmonKit25: Spherical harmonic transform kit 2.5, http://www.cs.dartmouth.edu/~geelong/sphere
19. Moreau F, Gibert D, Holschneider M, Saracco G (1999) Identification of sources of potential fields with the continuous wavelet transform: Basic theory. Journal of Geophysical Research, 104:5003–5013
20. Moreau F, Gibert D, Holschneider M, Saracco G (1997) Wavelet analysis of potential fields. Inverse Problems, 13:165–178
21. Pavlis NK, Holmes SA, Kenyon S, Schmidt D, Trimmer R (2004) Gravitational potential expansion to degree 2160. Presentation at the Gravity, Geoid and Space Missions, GGSM 2004, Porto, Portugal
22. Pavlis NK (1988) Modeling and estimation of a low degree geopotential model from terrestrial gravity data. Report no. 386, Department of Geodetic Science and Surveying, The Ohio State University
23. Provins DA (2004) Earth synthesis: Determining earth's structure from geopotential fields, unpublished PhD thesis, University of Calgary, Calgary. Available from: http://www.geomatics.ucalgary.ca/links/GradTheses.html
24. Rapp RH (1982) A FORTRAN Program for the computation of gravimetric quantities from high degree spherical harmonic expansions. Report no. 334, Department of Geodetic Science and Surveying, The Ohio State University
25. Schwarzschild B (2003) WMAP Spacecraft maps the entire cosmic microwave sky with unprecedented precision. Physics Today, 21–24
26. Sneeuw N (1994) Global spherical harmonic analysis by least-squares and numerical quadrature methods in historical perspective. Geophys. J. Int., 118:707–716
27. Varshalovich DA, Moskalev AN, Khersonskij VK (1988) Quantum Theory of Angular Momentum. World Scientific Publishing, Singapore
28. Wenzel G (1998) Ultra high degree geopotential model GPM3E97A to degree and order 1800 tailored to Europe. In Proceedings of the Second Continental Workshop on the Geoid in Europe, Budapest, 1998, preprint from http://www.gik.uni-karlsruhe.de/~wenzel/gpm3e/gpm3e97a.htm
29. Wenzel G (1998) Ultra high degree geopotential models GPM98A, B and C to degree 1800. Bulletin of International Geoid Service, Milan, 1998b, preprint from http://www.gik.unikarlruhe.de/~wenzel/gpm98abc/gpm98abc.htm
30. Wenzel G (1985) Hochaufloesende Kugelfunktionsmodelle fuer das Gravitationspotential der Erde. Wissenschaftliche Arbeiten der Fachrichtung Vermessungswesen der Universität Hannover, Nr. 135, Hannover
