INVITED PAPER
Accelerated Direct Solution of the Method-of-Moments Linear System

In this paper, a direct method for solving integral equations, accelerated by compression of the method-of-moments impedance matrix, is presented.

By Alex Heldring, José Maria Tamayo, Eduard Úbeda, and Juan M. Rius, Senior Member IEEE
ABSTRACT | This paper addresses the direct (noniterative) solution of the method-of-moments (MoM) linear system, accelerated through block-wise compression of the MoM impedance matrix. Efficient matrix block compression is achieved using the adaptive cross-approximation (ACA) algorithm and truncated singular value decomposition (SVD) postcompression. Subsequently, a matrix decomposition is applied that preserves the compression and allows for fast solution by backsubstitution. Although not as fast as some iterative methods for very large problems, accelerated direct solution has several desirable features, including: few problem-dependent parameters; fixed time solution avoiding convergence problems; and high efficiency for multiple excitation problems [e.g., monostatic radar cross section (RCS)]. Emphasis in this paper is on the multiscale compressed block decomposition (MS-CBD) algorithm, introduced by Heldring et al., which is numerically compared to alternative fast direct methods. A new concise proof is given for the N^2 computational complexity of the MS-CBD. Some numerical results are presented, in particular, a monostatic RCS computation involving 1 043 577 unknowns and 1000 incident field directions, and an application of the MS-CBD to the volume integral equation (VIE) for inhomogeneous dielectrics.

KEYWORDS | Adaptive cross approximation (ACA); computational electromagnetics; fast solvers; integral equations; method of moments (MoM)
Manuscript received August 16, 2011; revised January 21, 2012; accepted March 27, 2012. Date of publication August 10, 2012; date of current version January 16, 2013. This work was supported by the Spanish Interministerial Commission on Science and Technology (CICYT) under Projects TEC2009-13897-C03-01, TEC2009-13897-C03-02, and TEC2010-20841-C04-02 and the CONSOLIDER CSD2008-00068.
A. Heldring, E. Úbeda, and J. M. Rius are with the AntennaLab, Department of Signal Processing and Telecommunications, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain (e-mail: [email protected]).
J. M. Tamayo is with ISAE, Campus ENSICA, 31500 Toulouse, France.
Digital Object Identifier: 10.1109/JPROC.2012.2193369
I. INTRODUCTION

The past decades have seen a profusion of so-called fast methods for solving the linear system generated by the method of moments (MoM) applied to the integral equation formulation of scattering and radiation problems in electromagnetics. Practically all of these methods share a common general framework: the linear system coefficient matrix (the impedance matrix) is approximated by a representation of a highly reduced size. This allows for storage of the impedance matrix for problems that are typically many orders of magnitude larger than before. Subsequently, the system is solved by an iterative algorithm such as conjugate gradients or the generalized minimum residual (GMRES) algorithm [1]. These algorithms involve the multiplication of the impedance matrix with a vector at each iteration. Thanks to the reduced size of the impedance matrix, the computational cost of the solution process is correspondingly reduced. Some examples of these iterative fast methods are the adaptive integral method (AIM) [2], the multilevel fast multipole algorithm (MLFMA) [3], the multilevel matrix decomposition algorithm (MLMDA) [4], [5], the adaptive cross approximation (ACA) [6], and the singular value decomposition–matrix decomposition algorithm (SVD-MDA) [7].

Although the above iterative methods have revolutionized the field of computational electromagnetics, allowing the solution of problems with millions and even hundreds of millions of unknowns [8], there are some caveats: depending on the problem, iterative solvers sometimes show slow or even
failed convergence. In general, it is difficult to know a priori how fast they will converge for a given problem. Often a preconditioner is necessary [9], which may turn out to be the bottleneck of the simulation. Some iterative methods, in particular the MLFMA, arguably the most successful one, depend on the kernel of the integral equation, which greatly complicates their use for problems involving inhomogeneous media. Apart from the inherent (discretization) errors of the MoM, the approximate representation introduces an error which typically depends on a number of parameters of the fast method. The optimum choice of these parameters often depends on the specific problem at hand. The preconditioner may also bring along a number of problem-dependent parameters to adjust. Finally, iterative methods are not optimal for solving linear systems that involve multiple independent (excitation) vectors, because the iterations must be executed one vector at a time. This affects, for example, monostatic RCS computations.

Recently, some publications have appeared proposing a different approach [10]-[12]. As above, the system matrix is approximated by a "compressed" representation. While this may entail some of the problems mentioned above, the chosen compression method is the ACA [13], which is entirely algebraic, so it does not depend on the specific problem at hand or on the kernel of the integral equation. Subsequently, rather than solving the system iteratively, it is solved directly, by matrix decomposition, meanwhile preserving the compression. This leads to a solution in a fixed time, bypassing the problem of convergence. Furthermore, once the matrix is decomposed, one cheaply solves for any number of excitation vectors simultaneously by backsubstitution.

The reduction in storage requirements and computation time of these direct methods with respect to uncompressed direct solution by lower-upper (LU) decomposition is very important. The gain is not just a constant factor independent of the problem size: the complexity (the scaling of the computational effort with the number of unknowns N involved in the problem) comes down from N^3 for straightforward LU decomposition to N^2 for compressed direct solution. This, granted, is still much higher than, for instance, the MLFMA, which has a complexity of N log N. For problems that are electrically very large, the direct methods cannot compete with the iterative methods. But for intermediate problems, roughly up to a million unknowns, the direct methods are competitive in terms of efficiency, in particular if multiple solutions are sought, but also for their robustness against difficult and badly conditioned problems.

Section II revisits the multiscale compressed block decomposition (MS-CBD), a direct method that was proposed by Heldring et al. [12], highlighting some new aspects that were not addressed in the original paper. In Section III, a comparative study of the MS-CBD with the alternative direct decomposition algorithms found in the literature is presented. In Section IV, some results are presented, both for problems involving perfectly conducting surfaces and dielectric volumes.
II. MULTISCALE COMPRESSED BLOCK DECOMPOSITION

A. Setup Phase

As mentioned in the introduction, the first (setup) phase of the MS-CBD method is the construction of the impedance matrix in "compressed" form. This amounts to a subdivision of the basis functions into subgroups, and using the ACA algorithm, followed by truncated singular value decomposition (SVD) recompression [7], to efficiently obtain a low-rank approximation of all the matrix subblocks that represent the interaction between two nonidentical subgroups (off-diagonal subblocks).

In detail, the MS-CBD proceeds as follows. First, the basis functions are hierarchically subdivided according to a binary tree. This implies splitting the full set of basis functions in half, then doing the same with both halves, etc., until a chosen minimum of basis functions per subdivision (subgroup) is reached. The following "splitting scheme" is adopted for splitting a subgroup at any level in half (a minimal sketch is given after this paragraph).
1) All basis functions in a group are assigned an "anchor point," typically their geometrical center.
2) The minimum and maximum Cartesian coordinates among all anchor points are determined. The splitting axis will be the dimension of largest extent.
3) The basis functions are sorted according to their coordinate along the splitting axis and the splitting is executed at the median (or the median plus one if the count is odd).
Other algorithms have been proposed in the literature, notably the geometrical octal-tree (e.g., [3]) and the cobble-stone technique [10]. However, the former does not guarantee an even distribution in terms of the number of basis functions, while the latter is adapted to multiple subgroups, not to hierarchical binary splitting.

The actual construction of the compressed matrix begins with a double loop over the subgroups at the highest level of the binary tree. The subgroup of the inner loop is compared to the subgroup of the outer loop and, according to a given criterion, one of two actions is taken: either the compressed matrix subblock representing the interaction between the two subgroups is directly calculated with ACA-SVD, or the double loop is taken one step down the binary tree and the same comparison is made for the smaller "child subgroups." The criterion, and this is instrumental to the efficiency of the MS-CBD, must ensure that all the compressed blocks at all levels of the tree shall be of rank

r <= r_max    (1)

where r_max is a chosen fixed value. Obviously, if the inner and outer subgroups are the same, the block is subdivided rather than directly compressed.
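As an illustration of the subdivision described above, the following minimal Python/NumPy sketch splits a group of basis functions at the median of the anchor-point coordinates along the dimension of largest extent, and applies the split recursively to build the binary tree. The function names, the dictionary-based tree representation, and the min_size parameter are ours, chosen for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def split_group(anchors, indices):
    """Split one subgroup in half along the Cartesian axis of largest extent.

    anchors : (N, 3) array of anchor points (e.g., the geometrical centers).
    indices : 1-D integer array with the basis functions belonging to the group.
    Returns two child index sets whose sizes differ by at most one.
    """
    pts = anchors[indices]
    extent = pts.max(axis=0) - pts.min(axis=0)   # extent along x, y, z
    axis = int(np.argmax(extent))                # splitting axis: largest extent
    order = indices[np.argsort(pts[:, axis])]    # sort along the splitting axis
    half = (len(order) + 1) // 2                 # median (median + 1 if odd)
    return order[:half], order[half:]

def build_tree(anchors, indices, min_size):
    """Hierarchical binary subdivision down to `min_size` basis functions per subgroup."""
    if len(indices) <= min_size:
        return {"indices": indices, "children": None}
    left, right = split_group(anchors, indices)
    return {"indices": indices,
            "children": (build_tree(anchors, left, min_size),
                         build_tree(anchors, right, min_size))}
```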
Eventually, the bottom of the tree is reached. Here, the self-interaction blocks are not compressed, but simply computed entirely. The number of levels in the binary tree is chosen such that the block size at the bottom is smaller than r_max, thus ensuring that no block in the entire impedance matrix violates (1).

The difficulty in implementing the above criterion is that the rank of the matrix blocks is not known a priori in general. All that can be said [14], [15] is that, asymptotically for electrically large subgroups, if both subgroups have a diameter D and circumscribing spheres that do not overlap, the degrees of freedom in their interaction, and thus the rank r of their interaction matrix, scale as

r ∝ (kD)^4 / (kR)^2    (2)
where R is the distance between the group centers. Of course, it is possible to try compressing every block directly using ACA and to abandon as soon as the criterion is violated, subdividing instead. But this is bound to result in a lot of superfluous work. Rather, we use (2) to make a rough estimate. With (2), considering that at the lowest level of the tree we cannot subdivide, so we compress even touching blocks, and that at the lowest level typically D ≈ λ, we define the following rule: compress if R > R_crit, with

R_crit = D^2 / λ    (3)
and subdivide otherwise. If the criterion is still violated, we abandon the compression and turn to subdivision. Conversely, if (3) prescribes to subdivide and all four resulting smaller compressed blocks turn out to be of rank smaller than r_max/2, we reject them and compress the single "parent block" instead. Although (2) is only an asymptotic value, quite inaccurate in practical cases [16], this procedure is surprisingly efficient. For example, in the setup phase for the largest problem of Section IV, the NASA Almond at 75 GHz, only 8% of the time was spent on "correcting" suboptimally compressed blocks.
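The block compression itself, ACA followed by truncated-SVD recompression, can be sketched as follows. This is a generic, textbook-style sketch of the standard partially pivoted ACA, not the authors' code; the entry-evaluation callback z(i, j) (one impedance-matrix element per call), the tolerance handling, and the helper names are assumptions made for illustration.

```python
import numpy as np

def aca(z, m, n, tol, max_rank):
    """Partially pivoted adaptive cross approximation of an m x n block.

    z(i, j) evaluates one impedance-matrix entry on demand.
    Returns U (m x r) and V (r x n) such that Z is approximately U @ V.
    """
    U, V = np.zeros((m, 0), complex), np.zeros((0, n), complex)
    norm2 = 0.0            # running estimate of ||U V||_F^2
    i, used_rows = 0, set()
    for _ in range(min(max_rank, m, n)):
        used_rows.add(i)
        row = np.array([z(i, j) for j in range(n)]) - U[i, :] @ V   # residual row
        j = int(np.argmax(np.abs(row)))
        pivot = row[j]
        if abs(pivot) < 1e-14:
            break
        v = row / pivot
        u = np.array([z(k, j) for k in range(m)]) - U @ V[:, j]     # residual column
        # update the Frobenius-norm estimate of the current approximation
        norm2 += (np.linalg.norm(u) * np.linalg.norm(v)) ** 2 \
                 + 2.0 * np.sum(np.real((U.conj().T @ u) * (V.conj() @ v)))
        U, V = np.column_stack([U, u]), np.vstack([V, v])
        if np.linalg.norm(u) * np.linalg.norm(v) <= tol * np.sqrt(norm2):
            break
        cand = np.abs(u)                     # next row pivot: largest entry of the
        cand[list(used_rows)] = -1.0         # residual column among unused rows
        i = int(np.argmax(cand))
    return U, V

def svd_recompress(U, V, tol):
    """Truncated-SVD recompression of a U @ V product (rank reduction)."""
    if U.shape[1] == 0:
        return U, V
    Qu, Ru = np.linalg.qr(U)
    Qv, Rv = np.linalg.qr(V.conj().T)
    u, s, vh = np.linalg.svd(Ru @ Rv.conj().T)
    r = int(np.sum(s > tol * s[0]))          # keep singular values above the threshold
    return (Qu @ u[:, :r]) * s[:r], vh[:r, :] @ Qv.conj().T
```

In the setup phase, a routine of this kind would be applied to every off-diagonal block selected by the R > R_crit rule above, with the result rejected (and the block subdivided or merged) whenever the resulting rank violates the criterion.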
B. Decomposition Phase

Once the setup phase is complete, the matrix is decomposed using the MS-CBD algorithm, which is based on a nested implementation of the partitioned matrix inverse formulas [17].
Function B = MS-CBD(Z)
1) if Z ≠ partitioned then B = LU of Z, return
2) B_11 = MS-CBD(Z_11)
3) for i = 2 to M
4)   for j = 1 to i − 1
5)     B_ji = CBD-mult(B_jj, Z_ji − Σ_{k=1}^{j−1} B_kj^T Z_ki)
6)     for k = 1 to j − 1
7)       B_ki = B_ki − B_kj B_ji
8)     end for
9)   end for
10)  B_ii = MS-CBD(Z_ii − Σ_{j=1}^{i−1} Z_ji^T B_ji)
11) end for

The algorithm is presented here for the general case of M partitions per level of the hierarchical subdivision. With the hierarchical subdivision explained above, the linear system matrix Z is subdivided into blocks Z_ij, i = 1, ..., M, j = 1, ..., M, and the diagonal blocks Z_ii are again subdivided recursively. The algorithm works on these blocks and returns a matrix B, which is subdivided according to the same hierarchical structure. For our present implementation, M = 2 at every level of the hierarchy. The algorithm then reduces to the one given in [12] (for symmetric matrices). Conversely, for the nonhierarchical (one-level) case, the algorithm reduces to the one given in [11]. Please note that the algorithm is given for symmetric matrices. The extension to nonsymmetric matrices is straightforward. The algorithm CBD-mult is given as follows.
Function X = CBD-mult(B, Y)
1) if B ≠ partitioned then X = U^{-1} L^{-1} Y, return
2) X_1 = CBD-mult(B_11, Y_1)
3) for i = 2 to M
4)   X_i = CBD-mult(B_ii, Y_i − Σ_{j=1}^{i−1} B_ji^T X_j)
5)   for j = 1 to i − 1
6)     X_j = X_j − B_ji X_i
7)   end for
8) end for

After the decomposition, the algorithm CBD-mult, which applies an MS-CBD decomposition B to a matrix Y from the left, is also used for solving the entire linear system, now with one or several simultaneous excitation vectors as its argument Y. It should be stressed here that the above algorithms give the formal decomposition procedure. In practice, most subblocks B and Z will be either compressed matrices or (nested) partitions thereof. Therefore, the operations (sums, products, and transposes) appearing in both algorithms need to be carefully defined for all possible cases. For details, see [12].
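To make the structure of the nested partitioned-inverse recursion concrete, the sketch below solves a 2 × 2 partitioned system with plain dense NumPy blocks via the Schur complement, recursing on the diagonal blocks. It is a simplified stand-in for illustration only: in the actual MS-CBD the off-diagonal blocks are stored as compressed (low-rank) factors, the decomposition is computed and stored once, and CBD-mult is then reused for every excitation vector.

```python
import numpy as np

def block_solve(Z, Y, nmin=64):
    """Solve Z X = Y by a nested 2 x 2 partitioned-inverse (Schur complement) recursion.

    Z : (n, n) dense system matrix; Y : (n, k) right-hand sides, one column per excitation.
    In MS-CBD the same recursion is used, but with compressed off-diagonal blocks
    and a stored, reusable factorization instead of solving from scratch.
    """
    n = Z.shape[0]
    if n <= nmin:                                 # bottom of the tree: direct LU solve
        return np.linalg.solve(Z, Y)
    m = n // 2                                     # binary (M = 2) partition
    Z11, Z12 = Z[:m, :m], Z[:m, m:]
    Z21, Z22 = Z[m:, :m], Z[m:, m:]
    Y1, Y2 = Y[:m], Y[m:]
    W = block_solve(Z11, np.hstack([Z12, Y1]), nmin)     # Z11^{-1} [Z12, Y1]
    A12, X1t = W[:, :Z12.shape[1]], W[:, Z12.shape[1]:]
    S = Z22 - Z21 @ A12                            # Schur complement of Z11
    X2 = block_solve(S, Y2 - Z21 @ X1t, nmin)      # lower part of the solution
    X1 = X1t - A12 @ X2                            # back-correct the upper part
    return np.vstack([X1, X2])

# usage: X = block_solve(Z, Y) with Y holding one column per excitation vector
```

Passing a multicolumn Y solves for all excitations in one pass, which is where the direct approach pays off for monostatic RCS sweeps.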
C. Computational Complexity

In [12], we showed that, for surface discretizations with a fixed number of samples per square wavelength, the storage requirements of the MS-CBD asymptotically (k → ∞) scale with N^{3/2} and the computation time with N^2, where N is the total number of unknowns involved.
However, the demonstration in [12] was rather convoluted. Here we present a more concise and intuitive proof.

We start with the storage. For surface discretizations, the number of basis functions n inside a subdomain with diameter D is proportional to (kD)^2. According to the compression or subdivision criterion in (3), subblocks of the hierarchical subdivision that are directly compressed rather than further subdivided have a fixed rank. If we replace r with a constant in (2) and substitute n for (kD)^2, we obtain n ∝ kR. This implies that matrix blocks that are compressed correspond to source and field domains that are separated by a distance R such that kR is proportional to the number of basis functions n in these blocks. Blocks with smaller R are subdivided and their children are compressed at a lower level. Blocks with larger R are children of parent blocks that have already been compressed at a higher level. This leads to the conclusion that at any level the size of compressed blocks is proportional to the electrical distance between the corresponding source and field domains: if for the same object we increase the frequency (and adapt the mesh to keep the mesh size about λ/10), compressed blocks representing interactions between groups at the same physical distance will increase in size proportionally to k. In other words, throughout the impedance matrix, the block sizes scale with k. The total number of unknowns in the problem N scales with k^2. Consequently, the block sizes grow with √N. Hence, globally, the number of nonzero elements associated with one row also grows with √N. Consequently, the storage scales with N^{3/2}.

The MS-CBD algorithm in Section II-B consists of a double loop (i, j) over the block indices. If i ≠ j, a third loop k over the block indices is executed containing a fixed number of matrix–matrix products, sums, and transposes. If i = j, the MS-CBD is invoked recursively, so the triple loop is extended over all nondiagonal blocks at all hierarchical levels. Asymptotically, the matrix–matrix products are dominant, so the complexity is determined by a triple loop over the block indices times the corresponding block–block products, which makes it identical to a matrix–matrix product in terms of complexity. All the blocks consist of compressed SVD decompositions and their global sizes scale with N^{3/2}. The computational effort therefore scales as the compressed product U_3 S_3 V_3 of two N × N compressed SVD decompositions, both with a rank proportional to √N, computed in the same way as the products inside the MS-CBD, namely

U_1 S' V_2 = U_1 (S_1 V_1 U_2 S_2) V_2
U'' S'' V'' = SVD(S')
U_3 S_3 V_3 = (U_1 U'') S'' (V'' V_2).    (4)
The most expensive operation in (4) is the product V_1 U_2 in the first line which, because the size of V_1 is √N × N and that of U_2 is N × √N, scales with N^2. This determines the complexity of the MS-CBD. Formally, this complexity is an upper limit for very large problems. Nevertheless, the experiment in Section IV-A shows that it yields a surprisingly accurate estimate for moderately large problems.
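For reference, a minimal NumPy sketch of the compressed product in (4): with each factorization stored as U (N × r), the singular values S, and the right factor V (r × N), only the small core S' = S_1 V_1 U_2 S_2 is formed and re-diagonalized, so the full N × N product is never built. The function and variable names are ours.

```python
import numpy as np

def compressed_product(U1, S1, V1, U2, S2, V2):
    """Product of two compressed factorizations, as in (4).

    Each factor is U (N x r), S (r,) singular values, V (r x N).
    Returns U3, S3, V3 with U3 diag(S3) V3 equal to (U1 diag(S1) V1) @ (U2 diag(S2) V2).
    Cost is dominated by V1 @ U2, i.e., O(r^2 N) instead of O(N^3).
    """
    core = (S1[:, None] * (V1 @ U2)) * S2[None, :]           # S' = S1 (V1 U2) S2, small
    Uc, Sc, Vc = np.linalg.svd(core, full_matrices=False)    # re-diagonalize the core
    return U1 @ Uc, Sc, Vc @ V2                              # U3 = U1 U'', V3 = V'' V2

# quick check with random low-rank factors
N, r = 1000, 30
U1, V1 = np.linalg.qr(np.random.randn(N, r))[0], np.random.randn(r, N)
U2, V2 = np.linalg.qr(np.random.randn(N, r))[0], np.random.randn(r, N)
S1, S2 = np.random.rand(r), np.random.rand(r)
U3, S3, V3 = compressed_product(U1, S1, V1, U2, S2, V2)
exact = (U1 * S1) @ ((V1 @ U2) * S2) @ V2        # full product, for verification only
err = np.linalg.norm(exact - (U3 * S3) @ V3)     # should be at machine-precision level
```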
III. ALTERNATIVE DIRECT COMPRESSED DECOMPOSITION METHODS

Apart from the MS-CBD, two different decomposition methods have been proposed in the literature. In [10], the block LU decomposition [18] is used. In [11], a multiblock (more than two blocks) implementation of the partitioned matrix inverse formulas, called the compressed block decomposition (CBD) method, is proposed. The MS-CBD is in fact a nested version of the latter, with the number of blocks per level restricted to M = 2. The block LU decomposition can also be converted into a multiscale version. For comparison, the multiscale LU decomposition algorithm (MS-LU) is given below, for symmetric matrices. For the nonhierarchical (one-level) case, it reduces to the one given in [10].
Function B = MS-LU(Z)
1) if Z ≠ partitioned then B = LU of Z, return
2) for i = 1 to M
3)   for j = 1 to i
4)     if j > 1 then D_{j−1} = LU-mult(B_{j−1,j−1}, B_{j−1,i})
5)     B_ji = Z_ji − Σ_{k=1}^{j−1} B_kj^T D_k
6)   end for
7)   B_ii = MS-LU(B_ii)
8) end for

The algorithm LU-mult is given as follows.
Function X = LU-mult(B, Y)
1) if B ≠ partitioned then X = U^{-1} L^{-1} Y, return
2) for i = 1 to M
3)   if i > 1 then D_{i−1} = LU-mult(B_{i−1,i−1}, V_{i−1})
4)   V_i = Y_i − Σ_{j=1}^{i−1} B_ji^T D_j
5) end for
6) for i = M to 1
7)   X_i = LU-mult(B_ii, V_i − Σ_{j=i+1}^{M} B_ij X_j)
8) end for

In the LU-mult algorithm, the loops at lines 2 and 6 represent the well-known LU forward substitution and backsubstitution.
Fig. 1. NASA Almond compressed matrix size as a function of the total number of unknowns. Multiblock versus multiscale compression.
The above leads to a choice between four different algorithms to accomplish the decomposition. The setup phase is identical between block LU and CBD, and also identical between MS-CBD and MS-LU. In order to assess the four options, we have applied all four to an identical problem, the computation of the monostatic RCS of a perfectly conducting surface, the NASA Almond [19], using the electric field integral equation (EFIE) and Rao–Wilton–Glisson (RWG) basis functions [20], at 6.8, 12, and 25 GHz, always adapting the discretization size to about 13 samples per wavelength. The ACA threshold was always ε = 10^{-5}. Fig. 1 shows the total size of the compressed impedance matrix (stored in single precision, only the upper diagonal part since the EFIE matrix is symmetric), for the single-scale multiblock methods on the one hand and the nested methods on the other, as a function of the total number N of basis functions. In both multiscale methods, the number of blocks per level is always equal to M = 2. Fig. 2 shows the time consumption of the decomposition phase for the four cases. Concerning the solution phase, we solved using either CBD-mult or LU-mult for 1000 incidence angles simultaneously. The MS-CBD was fastest with about 100 s for the highest frequency, then the CBD with 200 s, and then block LU with 400 s. The MS-LU was slowest with about 800 s. The RCS results for the four cases were practically identical.

Based on the above experiment, we conclude that, although block LU is more efficient than CBD, the reverse is true regarding the nested
versions of either method. The reason for this is that, while the actual factorization algorithm MS-CBD is slower than MS-LU, the multiplication function CBD-mult is much faster than LU-mult, since the latter needs to apply forward substitution and backsubstitution, as seen in the algorithms given above. In the nested implementations, the multiplication functions are called recursively from within the factorization routines and this becomes the dominant operation. Overall, the MS-CBD is the most efficient, especially for larger problems (upwards of 50 000 unknowns).

Note that the choice of the number of blocks M for the multiblock algorithms always involves a compromise between compressed matrix size and decomposition efficiency: fewer blocks result in a larger matrix but a faster decomposition. In the above experiments, M = 8 for the smallest case, which represents an optimum in terms of decomposition efficiency. For the larger cases, M = 15 and 31, respectively, following the rule for optimum complexity, M ∝ √N, derived in [21].
IV. NUMERICAL RESULTS

All the calculations reported in this paper have been done on a personal computer (PC) with a quad-core Intel(R) Xeon(R) X5482 processor at 3.20 GHz and 64 GB of memory. The code was implemented in Matlab, in single precision.
A. NASA Almond

This section presents results for monostatic RCS calculations on the NASA Almond [19] using the MS-CBD. First, the RCS was computed at 21 GHz using 99 915 RWG basis functions and an eight-level MS-CBD, varying the SVD postcompression threshold parameter ε (the ACA threshold is ten times lower [7]), in order to establish the value of ε necessary to obtain converged results. Fig. 3 shows the results. As can be observed, the curves for ε = 10^{-5} and ε = 10^{-6} are virtually indistinguishable. Then, the computation was repeated at 75 GHz, now with 1 043 577 basis functions and an 11-level MS-CBD (the average RWG edge length was 0.09λ in both cases). The target being identical, it was assumed that the value ε = 10^{-5} is sufficient also for the larger case. The result is shown in Fig. 4. At both frequencies, the number of incidence angles was 1000.

In Table 1, the storage requirements and computation times for the two cases are shown. It is noteworthy that, while the setup time for the large problem is some 13% higher than the value predicted by the theory in Section II-C based on the smaller problem, the decomposition time is about 6% lower. In any case, considering the limitations of the model of Section II-C, the correspondence is remarkable. However, there is one anomaly standing out in the results that needs further investigation: the decomposed matrix of the large problem is almost twice as large as the original compressed matrix. In itself this should not be surprising; the preservation of the compression rate has not been proven theoretically, it is merely observed in practice (and it is intuitively plausible, after all, that the information content of a matrix should not grow by inverting it). But for smaller problems, on the order of 100 000 unknowns, the difference is typically a few percent. We are presently unable to provide an explanation as to why this loss of compression rate should depend on the size of the problem.

Fig. 2. NASA Almond decomposition time as a function of the total number of unknowns. Four different algorithms.

Fig. 3. Monostatic RCS of the NASA Almond (inset) at 21 GHz, calculated with MS-CBD, 99 915 unknowns, and 1000 incidence angles. Three different SVD thresholds ε.

Fig. 4. Monostatic RCS of the NASA Almond at 75 GHz, calculated with MS-CBD, 1 043 577 unknowns, and 1000 incidence angles.

Fig. 5. Dielectric slab from [23]. Relative permittivities: ε_1 = 1.44 and ε_2 = 2.56. Dimensions: L = 5.017λ_0, h = 1.4324λ_0, d_1 = d_2 = 0.4181λ_0, where λ_0 is the free-space wavelength.
B. Dielectric Slab

In this section, we present an example of the MS-CBD performance for a different integral equation formulation: the volume integral equation with bitetrahedral basis functions [22] for problems involving inhomogeneous dielectrics. The example is a numerical experiment from the published literature [23]. It concerns a dielectric slab, shown in Fig. 5. In [23], the bistatic RCS of this slab is computed with a fast iterative method, using 206 200 unknowns. The total computation time was 38 h. We analyzed the same object using MS-CBD. Fig. 6 shows our result for two different values of the SVD threshold ε. As can be seen, the result has almost converged for ε = 10^{-2}. It also corresponds well with that of [23].
Table 1 Performance Parameters for MS-CBD Applied to NASA Almond at Two Different Frequencies
Fig. 6. Bistatic RCS of the dielectric slab from [23], computed with MS-CBD, using two different SVD thresholds.
Table 2 Performance Parameters for MS-CBD Applied to the Dielectric Slab From [23]. Number of Unknowns N = 210 711. Number of Levels L = 9. Two Different Values of the SVD Threshold
The performance parameters of our analysis are summarized in Table 2. It is noteworthy that the decomposition phase is very fast; the bottleneck of these computations is the setup phase.

V. CONCLUSION

This paper addresses the MS-CBD method for fast direct solution of the MoM linear system. A criterion for optimum matrix compression is derived and a concise proof is presented that the storage requirements and computation time scale with N^{3/2} and N^2, respectively, for asymptotically high frequency. The MS-CBD is compared to other matrix decomposition algorithms and found to be the most efficient, in particular for large problems. The monostatic RCS of the NASA Almond at a frequency requiring more than 1 000 000 unknowns is calculated for 1000 incidence angles. Also, the MS-CBD is demonstrated to be very efficient when applied in combination with the volume integral equation formulation for inhomogeneous dielectrics.

REFERENCES

[1] Y. Saad and M. Schultz, "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems," SIAM J. Sci. Stat. Comput., vol. 7, no. 3, pp. 856-869, 1986.
[2] E. Bleszynski, M. Bleszynski, and T. Jaroszewicz, "Adaptive integral method for solving large-scale electromagnetic scattering and radiation problems," Radio Sci., vol. 31, no. 5, pp. 1225-1251, Sep./Oct. 1996.
[3] J. Song, C. C. Lu, and W. C. Chew, "Multilevel fast multipole algorithm for electromagnetic scattering by large complex objects," IEEE Trans. Antennas Propag., vol. 45, no. 10, pp. 1488-1493, Oct. 1997.
[4] E. Michielssen and A. Boag, "A multilevel matrix decomposition algorithm for analyzing scattering from large structures," IEEE Trans. Antennas Propag., vol. 44, no. 8, pp. 1086-1093, Aug. 1996.
[5] J. M. Rius, J. Parrón, E. Úbeda, and J. R. Mosig, "Multilevel matrix decomposition algorithm for analysis of electrically large electromagnetic problems in 3-D," Microw. Opt. Technol. Lett., vol. 22, no. 3, pp. 177-185, Aug. 5, 1999.
[6] K. Zhao, M. N. Vouvakis, and J.-F. Lee, "The adaptive cross approximation algorithm for accelerated method of moments computations of EMC problems," IEEE Trans. Electromagn. Compat., vol. 47, no. 4, pp. 763-773, Nov. 2005.
[7] J. M. Rius, J. Parrón, A. Heldring, J. M. Tamayo, and E. Úbeda, "Fast iterative solution of integral equations with method of moments and matrix decomposition algorithm--singular value decomposition," IEEE Trans. Antennas Propag., vol. 56, no. 8 (Special Issue on Large and Multiscale Computational Electromagnetics), pp. 2314-2324, Aug. 2008.
[8] J. M. Taboada, L. Landesa, J. M. Bertolo, F. Obelleiro, J. L. Rodriguez, J. C. Mourino, and A. Gomez, "High scalability multipole method for the analysis of hundreds of millions of unknowns," in Proc. 3rd Eur. Conf. Antennas Propag., Berlin, Germany, Mar. 23-27, 2009, pp. 2753-2756.
[9] Y. Saad, Iterative Methods for Sparse Linear Systems. Boston, MA: PWS, 1996.
[10] J. Shaeffer, "Direct solve of electrically large integral equations for problem sizes to 1 M unknowns," IEEE Trans. Antennas Propag., vol. 56, no. 8, pp. 2306-2313, Aug. 2008.
[11] A. Heldring, J. M. Rius, J. M. Tamayo, J. Parrón, and E. Úbeda, "Fast direct solution of method of moments linear system," IEEE Trans. Antennas Propag., vol. 55, no. 11, pp. 3220-3228, Nov. 2007.
[12] A. Heldring, J. M. Rius, J. M. Tamayo, J. Parrón, and E. Úbeda, "Multiscale compressed block decomposition for fast direct solution of method of moments linear system," IEEE Trans. Antennas Propag., vol. 59, no. 2, pp. 526-536, Feb. 2011.
[13] M. Bebendorf, "Approximation of boundary element matrices," Numer. Math., vol. 86, pp. 565-589, 2000.
[14] A. S. Y. Poon, R. W. Brodersen, and D. N. C. Tse, "Degrees of freedom in multiple-antenna channels: A signal space approach," IEEE Trans. Inf. Theory, vol. 51, no. 2, pp. 523-536, Feb. 2005.
[15] O. M. Bucci, C. Gennarelli, and C. Savarese, "Representation of electromagnetic fields over arbitrary surfaces by a finite and nonredundant number of samples," IEEE Trans. Antennas Propag., vol. 46, no. 3, pp. 351-359, Mar. 1998.
[16] A. Heldring, J. M. Tamayo, and J. M. Rius, "On the degrees of freedom in the interaction between sets of elementary scatterers," in Proc. 3rd Eur. Conf. Antennas Propag., Berlin, Germany, Mar. 23-27, 2009, pp. 2511-2514.
[17] T. Banachiewicz, "Zur Berechnung der Determinanten, wie auch der Inversen, und zur darauf basierten Auflösung der Systeme linearer Gleichungen," Acta Astronom. Ser. C, vol. 3, pp. 41-67, 1937.
[18] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed. Baltimore, MD: Johns Hopkins Univ. Press, 1996.
[19] A. C. Woo, H. T. G. Wang, M. J. Schuh, and M. L. Sanders, "Benchmark radar targets for the validation of computational electromagnetics programs," IEEE Antennas Propag. Mag., vol. 35, no. 1, pp. 84-89, Feb. 1993.
[20] S. M. Rao, D. R. Wilton, and A. W. Glisson, "Electromagnetic scattering by surfaces of arbitrary shape," IEEE Trans. Antennas Propag., vol. AP-30, no. 3, pp. 409-418, May 1982.
[21] A. Heldring, J. M. Rius, and J. M. Tamayo, "Comments on 'Fast direct solution of method of moments linear system'," IEEE Trans. Antennas Propag., vol. 58, no. 3, pp. 1015-1016, Mar. 2010.
[22] D. H. Schaubert, D. R. Wilton, and A. W. Glisson, "A tetrahedral modeling method for electromagnetic scattering by arbitrarily shaped inhomogeneous dielectric bodies," IEEE Trans. Antennas Propag., vol. AP-32, no. 1, pp. 77-85, Jan. 1984.
[23] X. Nie, L.-W. Li, N. Yuan, T. S. Yeo, and Y. Gan, "Precorrected-FFT solution of the volume integral equation for 3-D inhomogeneous dielectric objects," IEEE Trans. Antennas Propag., vol. 53, no. 1, pp. 313-320, Jan. 2005.
ABOUT THE AUTHORS

Alex Heldring was born in Amsterdam, The Netherlands, on December 12, 1966. He received the M.S. degree in applied physics and the Ph.D. degree in electrical engineering from the Delft University of Technology, Delft, The Netherlands, in 1993 and 2002, respectively.
Currently, he is an Associate Professor at the Telecommunications Department, Universitat Politècnica de Catalunya, Barcelona, Spain. His special research interests include integral equation methods for electromagnetic problems.
José Maria Tamayo was born in Barcelona, Spain, on October 23, 1982. He received the degree in mathematics and the degree in telecommunications engineering from the Universitat Politècnica de Catalunya (UPC), Barcelona, Spain, both in 2006, and the Ph.D. degree in telecommunications engineering from UPC in 2011.
Currently, he holds a postdoctoral position at ISAE, Toulouse, France. His current research interests include accelerated numerical methods for solving electromagnetic problems.
Eduard Úbeda was born in Barcelona, Spain, in 1971. He received the Telecommunication Engineer degree and the Doctor Ingeniero degree from the Universitat Politècnica de Catalunya (UPC), Barcelona, Spain, in 1995 and 2001, respectively.
In 1996, he was with the Joint Research Center of the European Commission, Ispra, Italy. From 1997 to 2000, he was a Research Assistant with the Electromagnetic and Photonic Engineering group at UPC. From 2001 to 2002, he was a Visiting Scholar in the Electromagnetic Communication Laboratory, Electrical Engineering Department, Pennsylvania State University (PSU), University Park. Since 2003, he has been at UPC. He is the author of 15 papers in international journals and 35 papers in international conference proceedings. His main research interests are numerical computation of scattering and radiation using integral equations.
Juan M. Rius (Senior Member, IEEE) received the "Ingeniero de Telecomunicación" degree and the "Doctor Ingeniero" degree from the Universitat Politècnica de Catalunya (UPC), Barcelona, Spain, in 1987 and 1991, respectively.
In 1985, he joined the Electromagnetic and Photonic Engineering group at the Department of Signal Theory and Communications (TSC), UPC, where he currently holds a position of "Catedrático" (equivalent to Full Professor). From 1985 to 1988, he developed a new inverse scattering algorithm for microwave tomography in cylindrical geometry systems. Since 1989, he has been engaged in research on new and efficient methods for the numerical computation of electromagnetic scattering and radiation. He is the developer of the graphical electromagnetic computation (GRECO) approach for high-frequency RCS computation, the integral equation formulation of the measured equation of invariance (IE-MEI), and the multilevel matrix decomposition algorithm (MLMDA) in 3-D. His current interests are the numerical simulation of electrically large antennas and scatterers. He has held positions of "Visiting Professor" at EPFL, Lausanne, Switzerland, from May 1, 1996 to October 31, 1996; "Visiting Fellow" at City University of Hong Kong, Hong Kong, from January 3, 1997 to February 4, 1997; "CLUSTER Chair" at EPFL from December 1, 1997 to January 31, 1998; and "Visiting Professor" at EPFL from April 1, 2001 to June 30, 2001. He has more than 54 papers published or accepted in refereed international journals (31 in the IEEE TRANSACTIONS) and more than 150 papers in international conference proceedings.