PAMM · Proc. Appl. Math. Mech. 7, 1021201–1021202 (2007) / DOI 10.1002/pamm.200701105
Additive preconditioning in matrix computations

V. Y. Pan¹,∗, B. Murphy¹,∗∗, R. E. Rosholt¹,∗∗∗, D. Ivolgin²,†, G. Qian²,‡, I. Taj-Eddin²,§, Y. Tang²,¶, and X. Yan²

¹ Department of Mathematics and Computer Science, Lehman College of the City University of New York, Bronx, NY 10468, USA
² Ph.D. Program in Computer Science, The City University of New York, New York, NY 10016, USA
We combine our novel SVD-free additive preconditioning with aggregation and other relevant techniques to facilitate the solution of a linear system of equations and other fundamental matrix computations. Our analysis and experiments show the power of our algorithms, guide us in selecting the most effective policies of preconditioning and aggregation, and provide some new insights into these and related subjects. Compared to the popular SVD-based multiplicative preconditioners, our additive preconditioners are generated more readily and for a much larger class of matrices. Furthermore, they better preserve matrix structure and sparseness and have a wider range of applications (e.g., they facilitate the solution of a consistent singular linear system of equations and of the eigenproblem).
Multiplicative preconditioning is a popular technique for facilitating the solution of a nonsingular linear system of n equations, Ay = f. Originally, preconditioning meant the transition to equivalent but better conditioned linear systems MAy = Mf, ANx = f, or more generally MANx = Mf for y = Nx and nonsingular n × n preconditioners M and N, closely linked to the SVD A = SΣT^H. Such systems can be solved faster and/or more accurately, except that SVD-based preconditioners M and/or N easily destroy matrix structure, and their computation is costly except for some special classes of matrices A.

In our alternative we add preconditioners P = UV^H of smaller rank r to obtain better conditioned matrices C = A + P. Hereafter we call such preconditioners additive preconditioners, or APs, write σ_j(A) for the jth largest singular value of a nonsingular n × n matrix A, and write cond A = σ_1(A)/σ_n(A) for its condition number. Knowing the SVD A = SΣT^H, one can compute APs UV^H of rank r such that cond C = σ_{r+1}(A)/σ_{n-r}(A), but even without the SVD we still readily generate rank-r APs UV^H such that cond C has the order σ_1(A)/σ_{n-r}(A). Namely, it is enough if the matrices U and V are well conditioned, the ratio ||A||/||UV^H|| is neither large nor small, and the r × r southernmost blocks (the last r rows) of the matrices S^H U and T^H V are well conditioned. The latter property is likely to hold automatically unless the matrices U and/or V are selected by an adversary.

To facilitate the solution of a system Ay = f and other matrix computations, we combine our additive preconditioning with various other techniques. Given an AP UV^H and a well conditioned matrix C, we can apply the Sherman–Morrison–Woodbury (SMW) formula A^{-1} = (C - UV^H)^{-1} = C^{-1} + C^{-1} U G^{-1} V^H C^{-1} to confine the conditioning problems to computing and inverting the r × r Schur aggregate G = I_r - V^H C^{-1} U. For smaller r this inversion is simple, but we still need to compute the matrix C^{-1}U and the aggregate G with extended precision. We extend the classical iterative refinement algorithm to this task and apply some advanced semi-numerical algorithms that output higher precision sums and products in double-precision computations.

We can avoid iterative refinement by applying our dual SMW formula C_- = (A^{-1} + VU^H)^{-1} = A - A V H^{-1} U^H A, which confines divisions to the stage of inverting the q × q dual Schur aggregate H = I_q + U^H A V. We compute dual APs VU^H of rank q such that cond C_- has the order σ_{q+1}(A)/σ_n(A). If this makes the system C_- y = f well conditioned, then the original conditioning problems are confined to computing the dual aggregate H. By solving linear systems Ay = f with these primal and dual algorithms, we output det A = (det G) det C = (det H) det C_- as a by-product. We specify how to extend these algorithms recursively if the rank of the AP UV^H or VU^H is too small to decrease cond A as we desire. Our tests show high accuracy of the output values of the determinants and the solutions of linear systems for ill conditioned inputs.
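To make the primal recipe above concrete, here is a minimal numpy sketch; it is our illustration rather than the code behind the reported tests, it works with real matrices (so V^H becomes V^T), and it runs in plain double precision, so its output accuracy is limited exactly where the text invokes extended precision for C^{-1}U and G.

```python
import numpy as np

def ap_smw_solve(A, f, r, rng):
    """Solve A y = f through a rank-r additive preconditioner C = A + U V^T
    and the SMW formula. Double precision only; the paper computes C^{-1} U
    and the Schur aggregate G with extended precision."""
    n = A.shape[0]
    U = rng.standard_normal((n, r))      # random matrices are well conditioned
    V = rng.standard_normal((n, r))      # with high probability
    U *= np.linalg.norm(A, 2) / np.linalg.norm(U @ V.T, 2)  # ||UV^T|| ~ ||A||
    C = A + U @ V.T                      # the AP lifts the r smallest sigma_j(A)
    CiU = np.linalg.solve(C, U)          # C^{-1} U
    Cif = np.linalg.solve(C, f)          # C^{-1} f
    G = np.eye(r) - V.T @ CiU            # r x r Schur aggregate
    # SMW: A^{-1} = (C - U V^T)^{-1} = C^{-1} + C^{-1} U G^{-1} V^T C^{-1}
    y = Cif + CiU @ np.linalg.solve(G, V.T @ Cif)
    return y, C, G

# Ill conditioned test input: r = 2 singular values of size 1e-12.
rng = np.random.default_rng(1)
n, r = 100, 2
S, _ = np.linalg.qr(rng.standard_normal((n, n)))
T, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = S @ np.diag(np.r_[np.ones(n - r), np.full(r, 1e-12)]) @ T.T
f = rng.standard_normal(n)

y, C, G = ap_smw_solve(A, f, r, rng)
print(np.linalg.cond(A), np.linalg.cond(C))  # ~1e12 versus a modest value
print(np.linalg.norm(A @ y - f))             # residual; the paper refines it further
# By-product: det A = (det G)(det C), compared via log|det|.
print(np.linalg.slogdet(A)[1], np.linalg.slogdet(G)[1] + np.linalg.slogdet(C)[1])
```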
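A matching sketch of the dual recipe, under the same caveats: here the ill conditioning of A comes from its q largest singular values, and y = A^{-1}f is recovered from C_-^{-1} = A^{-1} + VU^T as y = C_-^{-1}f - V(U^Tf). The scaling step computes ||A^{-1}|| directly for brevity, although only its order of magnitude is needed; the last line checks the by-product det A = (det H) det C_-.

```python
import numpy as np

def dual_ap_smw_solve(A, f, q, rng):
    """Solve A y = f through a rank-q dual AP: C_-^{-1} = A^{-1} + V U^T.
    C_- is assembled from multiplications by A alone; the only inversion
    is that of the q x q dual Schur aggregate H = I_q + U^T A V."""
    n = A.shape[0]
    U = rng.standard_normal((n, q))
    V = rng.standard_normal((n, q))
    # Scale so that ||V U^T|| ~ ||A^{-1}|| (computed directly here for brevity).
    V *= np.linalg.norm(np.linalg.inv(A), 2) / np.linalg.norm(V @ U.T, 2)
    H = np.eye(q) + U.T @ (A @ V)                     # dual Schur aggregate
    Cm = A - (A @ V) @ np.linalg.solve(H, U.T @ A)    # C_- = A - A V H^{-1} U^T A
    y = np.linalg.solve(Cm, f) - V @ (U.T @ f)        # A^{-1} f = C_-^{-1} f - V U^T f
    return y, Cm, H

# Test input whose q = 2 largest singular values are of size 1e12.
rng = np.random.default_rng(1)
n, q = 100, 2
S, _ = np.linalg.qr(rng.standard_normal((n, n)))
T, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = S @ np.diag(np.r_[np.full(q, 1e12), np.ones(n - q)]) @ T.T
f = rng.standard_normal(n)

y, Cm, H = dual_ap_smw_solve(A, f, q, rng)
print(np.linalg.cond(A), np.linalg.cond(Cm))  # ~1e12 versus a modest value
# Residual limited by cancellation in forming C_-; this is the stage the
# paper protects with extended-precision computation of the aggregate H.
print(np.linalg.norm(A @ y - f))
# By-product: det A = (det H)(det C_-), compared via log|det|.
print(np.linalg.slogdet(A)[1], np.linalg.slogdet(H)[1] + np.linalg.slogdet(Cm)[1])
```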
∗ Supported by PSC CUNY Awards 69330–0038 and 69350–0038.
∗∗ e-mail: [email protected]
∗∗∗ e-mail: [email protected]
† e-mail: [email protected]
‡ e-mail: [email protected]
§ e-mail: [email protected]
¶ e-mail: [email protected]
e-mail: [email protected]
Our techniques are particularly effective for computing a basis for the null space N(A) of a singular matrix A or a single vector in this space. Suppose an n × n matrix A has nullity r = n - rank A. Then the matrix C = A + UV^H is likely to be nonsingular for random n × r matrices U and V, and if so, the null space N(A) is the range of the null space aggregate C^{-1}U (see the first sketch below). Moreover, cond C is likely to be of the order of cond A for weakly random and well conditioned matrices U and V, provided the ratio ||A||/||UV^H|| is neither large nor small. Thus we remove the singularity but keep the order of the condition number. The Schur aggregates and the null space aggregates can be viewed as a bridge from APs to aggregation methods (compare the aggregation methods in [1], which evolved into Algebraic Multigrid in the 1980s).

We can effectively combine our null space and linear solving techniques where the input matrix A is both singular and ill conditioned; e.g., we can reduce a nonsingular but ill conditioned linear system Ay = f to computing a vector (1, y^T)^T in the null space of the matrix (-f, A). This alternative to using the SMW formula leads to the same benefits and shortcomings.

The eigenspace associated with an eigenvalue λ of a matrix A is the null space of the matrix λI - A, and so we can extend our techniques to the matrix eigenproblem. In particular, every step of the inverse power iteration solves an ill conditioned linear system (λ̃I - A)y = w, even where λ̃ approximates a simple and well conditioned eigenvalue λ; we instead solve a well conditioned linear system with the matrix λ̃I - A + UV^H (see the second sketch below). As an obvious advantage, we can employ CG-type algorithms. Our analysis and extensive tests show that the resulting acceleration of every step of the inverse iteration does not slow down its convergence.

We prove all our claims formally and confirm them experimentally. Our extensive numerical and semi-numerical tests, designed by the first author, were carried out by all coauthors together; all other results of this work are due to the first author. Their detailed description can be found in [2], in the relevant 2007 Technical Reports of the Computer Science Department of the Graduate Center of CUNY, and in the bibliography therein.
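Here is a sketch of the null space aggregate in the square case, again our illustration in plain double precision with real matrices: A has nullity r, a scaled random rank-r AP makes C = A + UV^T nonsingular, and the columns of C^{-1}U span N(A).

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 50, 3
# A singular n x n matrix with nullity r = n - rank A.
A = rng.standard_normal((n, n - r)) @ rng.standard_normal((n - r, n))

U = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
U *= np.linalg.norm(A, 2) / np.linalg.norm(U @ V.T, 2)  # ||UV^T|| ~ ||A||
C = A + U @ V.T               # nonsingular with high probability
Z = np.linalg.solve(C, U)     # null space aggregate C^{-1} U

print(np.linalg.norm(A @ Z) / np.linalg.norm(Z))  # ~0: range(Z) = N(A)
print(np.linalg.cond(C))      # modest: singularity removed, conditioning kept
```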
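And a sketch of additively preconditioned inverse iteration: every solve with the nearly singular matrix B = λ̃I - A is rerouted through the well conditioned C = B + uv^T (a rank-1 AP, for a simple eigenvalue) via the SMW formula; since C is well conditioned, the direct solves below could be replaced by CG-type iterations. The test matrix and the shift are our own choices.

```python
import numpy as np

def inverse_iteration_ap(A, mu, steps=6, seed=3):
    """Inverse power iteration for the eigenpair of A nearest to the shift mu.
    Each ill conditioned solve with B = mu*I - A goes through the well
    conditioned C = B + u v^T and the 1 x 1 Schur aggregate g (SMW)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    B = mu * np.eye(n) - A
    u = rng.standard_normal((n, 1))
    v = rng.standard_normal((n, 1))
    u *= np.linalg.norm(B, 2) / np.linalg.norm(u @ v.T, 2)
    C = B + u @ v.T                     # well conditioned; CG-friendly
    Ciu = np.linalg.solve(C, u)[:, 0]   # C^{-1} u
    g = 1.0 - v[:, 0] @ Ciu             # 1 x 1 Schur aggregate
    y = rng.standard_normal(n)
    for _ in range(steps):              # y <- B^{-1} y via SMW, normalized
        Ciy = np.linalg.solve(C, y)
        y = Ciy + Ciu * ((v[:, 0] @ Ciy) / g)
        y /= np.linalg.norm(y)
    return y, y @ A @ y                 # unit eigenvector, Rayleigh quotient

# Example: symmetric A, shift within 1e-6 of its smallest eigenvalue.
rng = np.random.default_rng(4)
M = rng.standard_normal((60, 60))
A = (M + M.T) / 2
lam = np.linalg.eigvalsh(A)[0]
y, rho = inverse_iteration_ap(A, lam + 1e-6)
print(abs(rho - lam), np.linalg.norm(A @ y - rho * y))  # both tiny
```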
References
[1] W. L. Miranker and V. Y. Pan, Methods of aggregation, Linear Algebra and Its Applications 29, 231–257 (1980).
[2] V. Y. Pan, D. Ivolgin, B. Murphy, R. E. Rosholt, I. Taj-Eddin, Y. Tang, and X. Yan, Additive preconditioning and aggregation in matrix computations, Computers & Mathematics with Applications, in press.