This is a preprint of material published in Electrical Engineering (Archiv für Elektrotechnik), vol. 90, no. 2, Dec. 2007, pp. 143-146.
You may find the most current version of this file together with a BibTeX entry at http://www.reiszig.de/gunther/. The definitive publication is available at http://www.springerlink.com.
Fill reduction techniques for circuit simulation
Gunther Reißig
TU Berlin, Fakultät IV - Fachgebiet Regelungssysteme, Sekretariat EN 11, Einsteinufer 17, D-10587 Berlin, Germany, http://www.control.tu-berlin.de/~reiszig/
Received: 13 December 2006 / Accepted: 29 January 2007 / Published online: 15 March 2007
Abstract We investigate the performance of a combination of sophisticated local symmetric ordering methods with a simple symmetrization step on a test set of Jacobians obtained from modified nodal equations. It is demonstrated that using such ordering heuristics as a replacement for Markowitz’ algorithm may accelerate circuit simulation significantly.
Key words Sparse linear systems, Fill reduction, Pivot ordering
1 Background

When the behavior of an electrical circuit is to be simulated, numerical integration techniques are usually applied to its equations of modified nodal analysis. This requires the solution of systems of nonlinear equations, and, in turn, the solution of numerous linear equations of the form

    Ax = b,    (1)

where A is a real n×n matrix, typically nonsingular and extremely sparse, and x, b ∈ R^n [1,2,3]. Solving (1), sometimes with A, x, and b complex, is also necessary in other analyses, such as DC and small-signal analysis, and the efficiency with which this is done determines the quality of simulation tools as a whole to a great extent.

Although the coefficient matrices of the linear equations to be solved are unsymmetric and indefinite, several simulators, including TITAN [2], solve these equations directly without pivoting for numerical accuracy, see also [3,4,5]. Nevertheless, it is well known that a deliberate choice of the pivot ordering may be crucial for successfully solving equation (1) as a result of the fill introduced during the calculations [6].

Traditionally, circuit simulators have relied on the direct application of Markowitz’ algorithm and its variants to the typically unsymmetric coefficient matrix A from (1), see e.g. [3,7,8]. As ordering methods that take advantage of symmetry, so-called symmetric methods, are considerably more efficient than their competitors, it has been proposed to apply symmetric ordering methods to the matrix |A| + |A|^T, where X^T is the transpose of X and |X| denotes the matrix obtained from X by substituting each entry by its absolute value, e.g. [6]. That strategy makes sense only if A is nearly symmetric, which is usually not the case with circuit matrices. However, the trick allows for the following refinement: Initially, diagonal entries with Markowitz product zero are chosen as pivots, as many as possible. The zero-nonzero pattern obtained after those steps is that of a submatrix Ã obtained from A by removing the pivot rows and columns. Now, a symmetric ordering method is applied to |Ã| + |Ã|^T, thereby completing the pivot ordering for A. The pivot ordering obtained will be appropriate for A if Ã, rather than A itself, is nearly symmetric. The observation that matrices arising in circuit simulation usually do have the latter property is due to Reißig and Klimpel [9,10] and has recently been made by others as well [11,12]. (The merits of a further refinement, permutation to block upper triangular form, will be discussed in Section 2.)

Recent work has shown that sophisticated local symmetric ordering methods applied to circuit matrices symmetrized by the above technique yield orderings that are significantly better than those obtained from Markowitz’ algorithm and other classical ordering heuristics, in some cases at virtually no extra computational cost [8]. In this note we demonstrate that a combination of the above symmetrization technique with sophisticated local symmetric ordering methods outperforms Markowitz’ algorithm in terms of the resulting number of factorization operations when applied directly to the unsymmetric circuit matrices from modified nodal equations, as we had already announced in [13]. In other words, we show that this combination is capable of accelerating circuit simulation significantly.
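The symmetrization step described above can be sketched as follows. This is a minimal illustration on a zero-nonzero pattern only, not the implementation used in TITAN or in [8]; the function names `zero_markowitz_pivots`, `symmetrize`, and `unsymmetric_count` and the 4×4 example pattern are assumptions made for this sketch. It relies on the fact that a diagonal pivot with Markowitz product zero lies in a row or column with no other nonzero, so eliminating it creates no fill and the remaining pattern is simply a submatrix of A.

```python
def zero_markowitz_pivots(nonzeros, n):
    """Greedily pick diagonal pivots whose Markowitz product is zero.

    nonzeros: set of (row, col) positions of the structural nonzeros of A.
    Returns (pivots, remaining), where remaining is the pattern of the
    submatrix left after deleting the pivot rows and columns.
    """
    rows = {i: set() for i in range(n)}  # column indices present in each row
    cols = {j: set() for j in range(n)}  # row indices present in each column
    for i, j in nonzeros:
        rows[i].add(j)
        cols[j].add(i)

    pivots = []
    changed = True
    while changed:  # repeat until no further candidate exists
        changed = False
        for k in sorted(rows):
            # Markowitz product of the diagonal entry: (r_k - 1) * (c_k - 1)
            if k in rows[k] and (len(rows[k]) - 1) * (len(cols[k]) - 1) == 0:
                pivots.append(k)
                # No fill arises, so we only delete row k and column k.
                for j in list(rows[k]):
                    cols[j].discard(k)
                for i in list(cols[k]):
                    rows[i].discard(k)
                del rows[k], cols[k]
                changed = True
    remaining = {(i, j) for i, js in rows.items() for j in js}
    return pivots, remaining


def symmetrize(pattern):
    """Pattern of |X| + |X|^T; absolute values rule out cancellation."""
    return pattern | {(j, i) for i, j in pattern}


def unsymmetric_count(pattern):
    """Number of structurally unsymmetric nonzeros."""
    return sum((j, i) not in pattern for i, j in pattern)


# Hypothetical 4x4 pattern: unsymmetric as a whole, symmetric after the
# zero-Markowitz-product pivots 0 and 3 have been removed.
A = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 1), (2, 2), (3, 2), (3, 3)}
pivots, A_tilde = zero_markowitz_pivots(A, 4)
sym = symmetrize(A_tilde)
```

In this toy pattern, two of the eight nonzeros of A are structurally unsymmetric, while the remaining submatrix is symmetric, so a symmetric ordering method applied to `sym` loses nothing by ignoring the asymmetry of A.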
2 Computational results

We compare Markowitz’ algorithm with a combination of a simple symmetrization step described in Section 1 and the symmetric ordering methods MMD, AMMF_3^{1/2}, and MMF^{1/2} from [8]. The latter three are variants of the Minimum Degree [14], the Approximate Minimum Mean Local Fill, and the Minimum Mean Local Fill algorithm [15]; see [8] and the references given there for details. Our test suite of input data consists of 15 matrices extracted from the circuit simulator TITAN [2]. Our tests have been performed on one of the CPUs of a SUN Enterprise E4500 workstation with 6 Gbytes of memory. The running times for Markowitz’ algorithm, MMD and AMMF_3^{1/2}, which are based on bounds on the local fill, appeared to be roughly the same, and MMF^{1/2}, which is based on exact local fill counts, was more than one order of magnitude slower [8].

Our primary measure for comparing ordering algorithms is the number

    \sum_{i=1}^{n} c_i (1 + r_i)    (2)

of factorization operations determined by the pivot orderings obtained, which represents divisions and multiplications, where c_i and r_i are the number of off-diagonal nonzeros in column and row i of the factors L and U, respectively, of P A P^T, and P is the permutation matrix corresponding to the pivot ordering.

We chose the implementation of the diagonal pivoting version of Markowitz’ algorithm [7,8] of the circuit simulator TITAN [2] as our reference algorithm. For all other algorithms, we report ratios, i.e., quantities calculated for these algorithms divided by the corresponding quantities for the reference algorithm. We judge algorithms by comparing the geometric mean of the ratios calculated. For example, by “the number of factorization operations for method M is x% less than for method M’ ” we express the following: The quotient of the geometric mean over the ratios of operation counts measured for method M and the corresponding geometric mean for method M’ is 1 − x/100. x is rounded to two decimal digits. For the combinations of Markowitz’ algorithm with symmetric ordering methods applied to the problems “bag” through “sei” listed in Tab. 1, we report the arithmetic mean of the number of factorization operations over 11 runs. For Markowitz’ algorithm, and for the problems buc through X of Tab. 1, we report the result of one run only.

Table 1 Circuit matrices and performance of Markowitz’ algorithm and its combination with the MMD, AMMF_3^{1/2} and MMF^{1/2} algorithms. n0, m0, and u0 denote the number of rows, nonzeros, and structurally unsymmetric nonzeros, respectively. m1 and u1 denote the number of nonzeros and structurally unsymmetric nonzeros, respectively, in the matrices remaining after the initial steps of Markowitz’ algorithm. o denotes the number of factorization operations (2), for the combinations divided by the corresponding value for Markowitz’ algorithm. The values in the row “geometric mean” are geometric means of the ratios reported in the respective column. The values of n0 and m0 are exact, all ratios are rounded to a precision of 10^-2, and the remaining quantities are rounded to two decimal digits.

Problem                   m1/m0  u0/m0  u1/m1   Markowitz’ A.  MMD   AMMF_3^{1/2}  MMF^{1/2}
Name   n0      m0         in %   in %   in %    o              o     o             o
bag    143956  995495     93     7.3    0       3.5 · 10^7     0.95  0.76          0.68
cor    123159  891955     74     26     0       1.8 · 10^6     0.94  0.94          0.94
eng    4951    42410      80     20     0       1.1 · 10^6     0.97  0.90          0.88
jac    935     21769      83     17     0       1.0 · 10^6     1.12  1.27          0.92
m8     10534   236683     83     17     0       3.9 · 10^7     0.77  0.59          0.43
m14    15944   417551     84     16     0       7.1 · 10^7     0.79  0.63          0.55
m24    26202   709681     86     13     0       4.1 · 10^8     0.94  0.68          0.49
m40    156115  2039389    88     12     0       4.2 · 10^9     0.81  0.57          0.48
sei    3573    291742     96     3.7    0       2.5 · 10^8     0.61  0.38          0.28
buc    10049   78116      58     51     17      8.4 · 10^5     0.92  0.77          0.69
gue    88714   549047     69     32     2.3     1.6 · 10^7     0.82  0.64          0.43
te     59716   398494     64     36     0.032   7.6 · 10^5     1.00  1.02          0.99
tei    174880  1205722    67     33     0.0099  4.7 · 10^6     1.08  0.96          0.86
xch    10948   83644      57     52     17      8.3 · 10^5     1.02  0.78          0.73
X      16492   170454     67     40     11      2.8 · 10^6     0.86  0.76          0.63
geometric mean                                                 0.90  0.75          0.63

From the problem data presented in [8] and Tab. 1, we see that the initial steps of Markowitz’ algorithm only slightly reduce the dimension of the problem, but remove 4%–43% of the nonzeros. Moreover, while 3.7%–52% of the nonzeros of the original matrices are structurally unsymmetric, those initial steps remove most or all of
them. In particular, the remaining matrices for the problems “bag” through “sei” are symmetric, and only about 0.01%–17% of the nonzeros of the other remaining matrices are structurally unsymmetric.

As observed in [8], the advantages of the symmetric ordering methods over Markowitz’ algorithm carry over to the combination of those methods with a symmetrization step for problems leading to symmetric remaining matrices. The results presented in Tab. 1 show that these advantages carry over to the combination even if the remaining matrices are structurally unsymmetric. (Tab. 1 also corrects three mistakes contained in Table 1 of [13].) Overall, the combination of Markowitz’ algorithm with the MMD, AMMF_3^{1/2}, and MMF^{1/2} algorithms leads to 10%, 25%, and 37% fewer factorization operations, respectively, than Markowitz’ algorithm alone. Furthermore, it is evident that the running time of the above combination with the MMD algorithm should never exceed that of an analogous but unsymmetric implementation of Markowitz’ algorithm. In fact, the code of Markowitz’ algorithm we used was always much slower than its combination with both the MMD and the AMMF_3^{1/2} heuristic.

In general, permuting coefficient matrices to block triangular form [16], of which removal of pivots with zero Markowitz product – the technique we applied – is a first step, followed by ordering and factoring the diagonal blocks, can speed up the solution of linear equations even further. However, we found that the effect of that improvement is insignificant if applied to circuit matrices: For the circuit problems bag through sei from Tab. 1, removal of pivots with zero Markowitz product leads to irreducible matrices in all cases except m40. For those problems from Tab. 1 that lead to reducible matrices, the dimension and the number of nonzeros of the largest diagonal block would always be at least 97.6% and 98.7%, respectively, of the corresponding numbers for the whole matrix. Furthermore, the number of structurally unsymmetric nonzeros would be reduced in two cases only and by approximately 0.1%.

A final decision on which of those heuristics is best would not only depend on the kind of circuits to be simulated, but also on the computer architecture and the specific numerical factorization algorithm used [17] and is beyond the scope of this paper. However, as shown in [13], the savings in factorization operations of the MMF^{1/2} over the AMMF_3^{1/2} heuristic may very well reduce the overall simulation time.
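To make the operation count (2) used throughout this section concrete, the following sketch evaluates it by symbolic Gaussian elimination on a zero-nonzero pattern. It is an illustration only, not the code used for the experiments: the function name `factorization_operations` and the 4×4 arrow-shaped example are assumptions, and the sketch presumes diagonal pivots that are structurally nonzero and no numerical cancellation.

```python
def factorization_operations(nonzeros, order):
    """Evaluate sum_i c_i * (1 + r_i) for LU factorization of P A P^T.

    nonzeros: set of (row, col) positions of the structural nonzeros of A.
    order:    pivot ordering; order[k] is the index eliminated at step k.
    """
    n = len(order)
    pos = {v: k for k, v in enumerate(order)}  # original index -> step
    rows = [set() for _ in range(n)]
    cols = [set() for _ in range(n)]
    for i, j in nonzeros:
        rows[pos[i]].add(pos[j])
        cols[pos[j]].add(pos[i])

    ops = 0
    for k in range(n):
        below = [i for i in cols[k] if i > k]  # off-diagonals in column k of L
        right = [j for j in rows[k] if j > k]  # off-diagonals in row k of U
        # c_k divisions plus c_k * r_k multiplications at step k
        ops += len(below) * (1 + len(right))
        for i in below:  # record the fill created by eliminating pivot k
            for j in right:
                rows[i].add(j)
                cols[j].add(i)
    return ops


# Hypothetical arrow pattern: full first row and column plus the diagonal.
arrow = {(i, i) for i in range(4)}
arrow |= {(0, j) for j in range(1, 4)} | {(j, 0) for j in range(1, 4)}

ops_dense_first = factorization_operations(arrow, [0, 1, 2, 3])
ops_dense_last = factorization_operations(arrow, [3, 2, 1, 0])
```

For this pattern, eliminating the dense row and column first fills the whole matrix and costs 20 operations, whereas eliminating them last creates no fill and costs 6, which is the kind of difference a pivot ordering heuristic is meant to exploit.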
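The summary figures quoted above can be recomputed from the per-problem ratios of Tab. 1. The sketch below is only a consistency check on the rounded table entries, not part of the original experiments; the names `ratios`, `geometric_mean`, and `percent_fewer` are introduced here for illustration, and the keys "AMMF" and "MMF" abbreviate AMMF_3^{1/2} and MMF^{1/2}.

```python
import math

# Ratios o(method) / o(Markowitz) for the problems bag ... X, as in Tab. 1.
ratios = {
    "MMD": [0.95, 0.94, 0.97, 1.12, 0.77, 0.79, 0.94, 0.81,
            0.61, 0.92, 0.82, 1.00, 1.08, 1.02, 0.86],
    "AMMF": [0.76, 0.94, 0.90, 1.27, 0.59, 0.63, 0.68, 0.57,
             0.38, 0.77, 0.64, 1.02, 0.96, 0.78, 0.76],
    "MMF": [0.68, 0.94, 0.88, 0.92, 0.43, 0.55, 0.49, 0.48,
            0.28, 0.69, 0.43, 0.99, 0.86, 0.73, 0.63],
}


def geometric_mean(values):
    """Geometric mean, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_fewer(gm_method, gm_reference=1.0):
    """x such that gm_method / gm_reference = 1 - x/100, rounded."""
    return round(100 * (1 - gm_method / gm_reference))


means = {name: geometric_mean(vals) for name, vals in ratios.items()}
savings = {name: percent_fewer(gm) for name, gm in means.items()}
```

Rounded to two digits, `means` reproduces the "geometric mean" row of Tab. 1 (0.90, 0.75, 0.63), and `savings` gives the 10%, 25%, and 37% stated in the text, since the reference algorithm has ratio 1 by definition.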
3 Conclusions

We have shown that for the purpose of circuit simulation, a combination of sophisticated local symmetric ordering methods with a simple symmetrization step yields pivot orderings significantly better than those obtained from Markowitz’ algorithm alone, in some cases at virtually no extra computational cost, and that this combination is capable of accelerating circuit simulation significantly.

Acknowledgements I thank P. I. Barton (MIT, Cambridge), G. Denk (Qimonda, München), U. Feldmann (Qimonda, München), F. Grund (Weierstraß-Institut, Berlin), T. Klimpel (München), and A. Reibiger (TU Dresden, Dresden) for their valuable hints, comments, and encouragement.

References
1. L. O. Chua, C. A. Desoer, and E. S. Kuh. Linear and Nonlinear Circuits. McGraw–Hill, 1987.
2. U. Feldmann, U. A. Wever, Q. Zheng, R. Schultz, and H. Wriedt. Algorithms for modern circuit simulation. Int. J. Electron. Commun. (AEÜ), 46(4):274–285, 1992.
3. L. W. Nagel. SPICE 2: A computer program to simulate semiconductor circuits. Technical Report ERL-M520, Univ. of Calif. Berkeley, Electronic Res. Lab., Berkeley, CA, 1975.
4. I. N. Hajj, P. Yang, and T. N. Trick. Avoiding zero pivots in the modified nodal approach. IEEE Trans. Circuits and Systems, 28(4):271–278, 1981.
5. G.-L. Tan. An algorithm for avoiding zero pivots in the modified nodal approach. IEEE Trans. Circuits and Systems, 33(4):431–434, 1986.
6. J. Vlach and K. Singhal. Computer methods for circuit analysis and design. Van Nostrand Reinhold, 1983.
7. H. M. Markowitz. The elimination form of the inverse and its application to linear programming. Management Sci., 3:255–269, 1957.
8. G. Reißig. Local fill reduction techniques for sparse symmetric linear systems. Electr. Eng., Published online: 7 November 2006, 2006. Avail. at author’s homepage.
9. G. Reißig and T. Klimpel. Fill-In Minimierung in der Schaltkreissimulation. internal rept., Infineon Technologies, MP PTS, München, 28 Mar. 2000.
10. G. Reißig and T. Klimpel. Verfahren zum computergestützten Vorhersagen des Verhaltens eines durch Differentialgleichungen beschreibbaren Systems. Patent appl. DE 101 03 793 A 1 (pend.), 28 Jan. 2001. (“Method for computer-aided prediction of the behavior of a system described by differential equations”, in German).
11. T. A. Davis. A column pre-ordering strategy for the unsymmetric-pattern multifrontal method. ACM Trans. Math. Software, 30(2):167–195, 2004.
12. A. Basermann, U. Jaekel, M. Nordhausen, and K. Hachiya. Parallel iterative solvers for sparse linear systems in circuit simulation. Future Generation Computer Systems, 21:1275–1284, 2005.
13. G. Reißig. A new method for ordering sparse matrices and its performance in circuit simulation. In G. Horton, editor, Proc. 18th Europ. Simulation Multiconference (ESM), Magdeburg, Germany, June 13-16, 2004, pages 216–221. The Society for Modeling and Simulation Intern. (SCS), SCM Publishing House, 2004.
14. A. George and J. W. Liu. The evolution of the minimum degree ordering algorithm. SIAM Rev., 31(1):1–19, Mar. 1989.
15. E. Rothberg and S. C. Eisenstat. Node selection strategies for bottom-up sparse matrix ordering. SIAM J. Matrix Anal. Appl., 19(3):682–695, July 1998.
16. I. S. Duff, A. M. Erisman, and J. K. Reid. Direct methods for sparse matrices. Oxford University Press, 1986.
17. I. J. Lustig, R. E. Marsten, and D. F. Shanno. The interaction of algorithms and architectures for interior point methods. In P. M. Pardalos, editor, Advances in optimization and parallel computing, pages 190–204. North-Holland, 1992.