Application of Determinant Maximization Programming to the SOR and Chebyshev Methods for Complex Linear Systems
Wai-Shing Luk, Jan Janssen, and Stefan Vandewalle

Wai-Shing Luk is with the Department of Computer Science and Engineering, Chinese University of Hong Kong, Shatin, N.T., Hong Kong. E-mail: [email protected]
Jan Janssen and Stefan Vandewalle are with the Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, Heverlee, Belgium. E-mail: {jan.janssen, [email protected]
Abstract: In the SOR and Chebyshev methods for solving complex linear systems, one crucial procedure is to find an ellipse that encloses the spectrum of a given complex matrix and gives the smallest convergence factor. For the Chebyshev method, the original Manteuffel procedure requires the ellipse to be symmetric about the real axis, which is far from optimal for most complex matrices. Moreover, the procedure requires a combinatorial search over all possible ellipses. A new approach is proposed that approximates the optimal ellipse problem by a determinant maximization programming problem with linear matrix inequality constraints. The approximated problem can be solved efficiently by the recently developed interior-point method. Moreover, both the SOR and the Chebyshev case can be handled in a unified manner in this approach.
I. Introduction
There have been extensive studies on the successive overrelaxation (SOR) method and the Chebyshev method for solving large linear systems (see e.g. [20], [19], [21], [5], [2], and [12], [13], [3], [1]). Although the complex case has been considered in the literature, only the use of real parameters for those methods appears to have been investigated. In this paper, methods with complex parameters are considered for the complex linear system

\[
  Ax = b, \qquad A \in \mathbb{C}^{n \times n}, \quad b, x \in \mathbb{C}^{n}. \tag{1}
\]

This work was motivated by the recent development of convolution-based waveform relaxation methods for solving large sparse systems of linear ordinary differential equations [15], [10], [7], [9]. Consider the linear system of ordinary differential equations

\[
  C\dot{u}(t) + Gu(t) = f(t), \qquad u(0) = u_0,
\]

where $C, G \in \mathbb{R}^{n \times n}$. Convolution-based waveform relaxation methods can be viewed as conceptually applying classical iterative techniques to the corresponding complex linear system in the Laplace (frequency) domain,

\[
  (sC + G)\hat{u}(s) = \hat{f}(s) + Cu_0, \qquad s \in \mathbb{C}, \ \operatorname{Re}(s) \geq 0,
\]

where $\hat{h}(s)$ denotes the Laplace transform $\int_0^\infty e^{-st} h(t)\,dt$ of a function $h(t)$. The actual manipulations in these methods, however, remain in the time domain.
Since a multiplication in the Laplace domain corresponds to a convolution product in the time domain, these methods invoke a certain number of convolution products, hence their name. Previously, the development of the convolution successive overrelaxation (CSOR) method has motivated researchers to extend the theory of the classical SOR method to the complex domain [8]. A similar development of the convolution Chebyshev method [9] leads us to study the Chebyshev method in the complex domain. Besides this, other possible applications include solving partial differential equations that model processes involving complex coefficient functions and complex boundary conditions (see e.g. [4]).

In [8], a complex SOR method with a complex relaxation factor is studied for the case that $A$ is consistently ordered. Here, we mainly consider the Chebyshev method applied to (1) for a general $A$. Most of the original theory of the Chebyshev method can easily be extended to the complex case. However, an important problem remains to be solved, namely to find the best ellipse that encloses the spectrum of $A$. In the literature, Manteuffel's procedure is usually used [12]. However, the method outlined in [12] only considers real matrices; the ellipses are restricted to be symmetric about the real axis, and the procedure cannot be extended directly to the complex case. Moreover, as pointed out in [6], it requires a combinatorial search over all possible ellipses to check for feasibility and optimality. Ho derived a similar method that avoids searching all possible ellipses [6]. Yet that method still cannot be extended directly to the complex case, and in the worst case it requires a combinatorial search. A similar optimal ellipse problem occurs in the complex SOR theory when determining the optimal relaxation factor [7]. For that problem, no good method has been found yet.

We observe that the Chebyshev ellipse problem and the SOR ellipse problem can be viewed in a unified manner as a nonlinear programming problem with linear matrix inequality (LMI) constraints. In the Chebyshev method, the objective function is usually taken to be the asymptotic convergence factor. We notice that an ellipse with a smaller area usually gives a better convergence rate, and hence finding the ellipse with the smallest area is a good alternative in practice. Thus we suggest replacing the minimization of the asymptotic convergence factor by the minimization of the area of the ellipse. The resulting problem is classified as a determinant maximization (maxdet) problem [18].
It turns out that the maxdet problem can be solved efficiently by the recently developed interior-point method [18]. Note that a smaller ellipse does not necessarily give better convergence. However, since the spectrum is seldom known exactly, it is more practical in most applications to find the smallest ellipse instead of the "best" one. For the estimation of eigenvalues in the Chebyshev method, the reader is referred to [13], [3], [1], [14]. Note that the methods described in those papers can be extended to the complex case in a straightforward manner.

In this paper, we mainly discuss the optimal ellipse problem for the Chebyshev method. The technique can, however, also be applied to the SOR method, and we show an example in Section IV. The rest of this paper is organized as follows. We start with a review of the Chebyshev method for solving complex linear systems in Section II, and point out in more detail the crucial problem of finding an ellipse that encloses the spectrum in the complex domain. This leads us to discuss the determinant maximization problem in Section III. We present our approach and examples in Section IV. Finally, numerical experiments are given in Section V.
II. The Chebyshev method for complex systems

In this section, the Chebyshev method for solving complex linear systems is briefly reviewed. A more detailed discussion can be found in [3], [12], [13]. Consider a polynomial iterative method for solving linear system (1). Such a method can be expressed as an iteration

\[
  x_k = x_0 + \varphi_{k-1}(A)\, r_0,
\]

where $\varphi_{k-1}(z)$ is a polynomial of degree $k-1$ and $x_0$ is the initial guess vector. The residual vector $r_k = b - Ax_k$ can be shown to be given by

\[
  r_k = \bigl(I - A\varphi_{k-1}(A)\bigr) r_0 = p_k(A)\, r_0,
\]

where $I$ denotes the identity matrix. The polynomial $p_k(z)$ is called the residual polynomial.

The asymptotic convergence behavior of a polynomial iterative method is determined by

\[
  \max_j |p_k(\lambda_j)|, \qquad \lambda_j \in \sigma(A),
\]

where $\sigma(A)$ denotes the spectrum of $A$. If $\sigma(A)$ is known exactly, we may construct a polynomial such that $\max_j |p_k(\lambda_j)|$ is minimal. In practice, however, the eigenvalues of $A$ are not known precisely during the iterations, so it is more practical to consider the residual polynomial on a fixed region containing the spectrum. If the region is given by an ellipse, it can be shown that Chebyshev polynomials are optimal in the sense of the asymptotic convergence rate.

Let $F(d, c, a)$ denote the ellipse in the complex plane that contains the spectrum of $A$:

\[
  F(d, c, a) = \{\, z \in \mathbb{C} : |z - d + c| + |z - d - c| \leq 2a \,\}, \tag{2}
\]

where $d, c \in \mathbb{C}$ and $a \in \mathbb{R}$. The parameter $d$ is the center of the ellipse, the two foci are located at $d + c$ and $d - c$, and $a$ is the length of the longer semi-axis. Manteuffel shows that the Chebyshev iterative process depends only on $d$ and $c$. More precisely, the residual polynomial is given by the scaled and translated Chebyshev polynomial

\[
  p_k(z) = \frac{T_k\!\left(\frac{d - z}{c}\right)}{T_k\!\left(\frac{d}{c}\right)},
\]

where $T_k(z)$ is the classical Chebyshev polynomial of degree $k$. Let the asymptotic convergence factor of $p_k(z)$ at a point $\lambda$ be defined by

\[
  \kappa(\lambda; d, c) = \lim_{k \to \infty} |p_k(\lambda)|^{1/k}.
\]

It can be shown that for the Chebyshev iteration,

\[
  \kappa(\lambda; d, c) =
  \left| \frac{(d - \lambda) + \sqrt{(d - \lambda)^2 - c^2}}{d + \sqrt{d^2 - c^2}} \right|, \tag{3}
\]

where the branch of the square root is chosen such that $\sqrt{d^2} = d$.
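For reference, (3) is easy to evaluate numerically. The sketch below is our own illustration, not code from the paper: for each square root it picks the root of larger modulus, which reproduces the limit in the definition of $\kappa$ (the growth rate $|T_k(w)|^{1/k}$ tends to the modulus of the dominant root $w \pm \sqrt{w^2 - 1}$) and is consistent with the branch convention stated above.

    import numpy as np

    def _dominant(w):
        # Dominant root of w + sqrt(w^2 - 1): the one of larger modulus;
        # |T_k(w)|^(1/k) tends to its modulus as k -> infinity.
        s = np.sqrt(complex(w) ** 2 - 1.0)
        return w + s if abs(w + s) >= abs(w - s) else w - s

    def conv_factor(lam, d, c):
        # Asymptotic convergence factor kappa(lambda; d, c) of equation (3).
        return abs(_dominant((d - lam) / c)) / abs(_dominant(d / c))

The factor for a whole spectrum is then max(conv_factor(lam, d, c) for lam in eigs), which is exactly the objective of the mini-max problem introduced next.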
In order to get fast convergence, one has to find the ellipse that minimizes the maximum value of $\kappa(\lambda_j; d, c)$:

\[
  \min_{d, c \in \mathbb{C}} \; \max_{\lambda_j \in \sigma(A)} \kappa(\lambda_j; d, c). \tag{4}
\]

Denote by $S$ the set of eigenvalues of $A$ lying on the convex hull of $\sigma(A)$. It can be shown that the elements of $S$ completely determine the mini-max problem (4) [12]. That is, for any $d$ and $c$, we have

\[
  \max_{\lambda_j \in \sigma(A)} \kappa(\lambda_j; d, c) = \max_{\lambda_j \in S} \kappa(\lambda_j; d, c).
\]

Therefore, problem (4) can be simplified to

\[
  \min_{d, c \in \mathbb{C}} \; \max_{\lambda_j \in S} \kappa(\lambda_j; d, c). \tag{5}
\]

In [12], a procedure for finding the optimal solution of (5) is discussed. However, the discussion is restricted to the case that $A$ is real and the ellipses are symmetric about the real axis. We summarize the procedure as follows. In the real case, equation (3) is determined by the two real variables $d$ and $c^2$, and is denoted by $\kappa(\lambda; d, c^2)$. Since $A$ is real, its eigenvalues are either real or occur in complex conjugate pairs. Denote by $S^+$ the positive hull, i.e. the set of elements of $S$ in the upper half of the complex plane, $S^+ = \{\lambda_i \in S : \operatorname{Im}(\lambda_i) \geq 0\}$. The optimal ellipse passes through either two or three eigenvalues on $S^+$. Consider a family of ellipses that pass through a pair of eigenvalues on $S^+$, say $\lambda_1$ and $\lambda_2$. The ellipse that gives the minimum of $\max\{\kappa(\lambda_1; d, c^2), \kappa(\lambda_2; d, c^2)\}$ can be derived analytically by finding the local minimum on the intersection of the two surfaces

\[
  \kappa(\lambda_1; d, c^2) = \kappa(\lambda_2; d, c^2).
\]

This ellipse is called the pair-wise best ellipse. The result utilizes the alternative theorem [12], which is only valid for real variables. An ellipse that passes through three eigenvalues on $S^+$ is uniquely determined and can be derived analytically. This ellipse is called the three-way ellipse. Basically, the algorithm searches all possible pair-wise best ellipses and three-way ellipses, and selects the one that encloses all eigenvalues on $S^+$ and gives the minimal convergence rate. For more details, the reader is referred to the original paper [12].

In the complex case, there is no closed-form solution for this mini-max problem. In order to find the optimal parameters, a general optimization routine could be used. In the numerical experiments presented in Section V, we used the optimization routine minimax from the MATLAB Optimization Toolbox. In our experience, however, this approach may fail to find the solution for some initial guesses.

In this paper, we present another approach to this minimization problem. The method is based on two observations. One is that problem (5) can be formulated as a nonlinear programming problem with linear matrix inequality (LMI) constraints. The other is that an ellipse with a smaller area usually gives better convergence. If we use the area as the objective function, the problem can be cast as a determinant maximization programming problem with LMI constraints. This subject is reviewed in the next section.
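As an aside, a direct attack on (5) with a general-purpose optimizer might look as follows in Python. This is our hypothetical analogue of the MATLAB minimax call above (the function name, variable packing, and choice of the Nelder-Mead method are ours); it reuses conv_factor from the earlier sketch and inherits the same sensitivity to the initial guess, since the objective is non-convex and nonsmooth.

    import numpy as np
    from scipy.optimize import minimize

    def direct_minimax(eigs, d0, c0):
        # Problem (5): minimize the largest convergence factor over the
        # hull eigenvalues; d, c are packed as [Re d, Im d, Re c, Im c].
        def objective(v):
            d, c = complex(v[0], v[1]), complex(v[2], v[3])
            return max(conv_factor(lam, d, c) for lam in eigs)

        res = minimize(objective, [d0.real, d0.imag, c0.real, c0.imag],
                       method="Nelder-Mead")  # derivative-free simplex search
        return complex(res.x[0], res.x[1]), complex(res.x[2], res.x[3])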
III. Review of determinant maximization programming

This section outlines the determinant maximization programming (maxdet) problem described in [18]. Let $|K|$ denote the determinant of a matrix $K$. Consider the following optimization problem in the variable $x \in \mathbb{R}^m$:

\[
  \begin{array}{ll}
    \text{minimize}   & g^T x + \log |G(x)^{-1}| \\
    \text{subject to} & G(x) > 0, \\
                      & F(x) \geq 0,
  \end{array}
\]

where

\[
  G(x) = G_0 + x_1 G_1 + \cdots + x_m G_m \in \mathbb{R}^{l \times l}, \qquad
  F(x) = F_0 + x_1 F_1 + \cdots + x_m F_m \in \mathbb{R}^{q \times q}.
\]

The matrices $G_j \in \mathbb{R}^{l \times l}$ and $F_j \in \mathbb{R}^{q \times q}$ are symmetric. The notation $G(x) > 0$ means that $G(x)$ is positive definite, and $F(x) \geq 0$ means that $F(x)$ is positive semidefinite. This is a convex optimization problem with linear matrix inequality (LMI) constraints: both the objective function and the constraints are convex. The maxdet problem generalizes the semidefinite programming problem, in which $G(x) = 1$, i.e.,

\[
  \begin{array}{ll}
    \text{minimize}   & g^T x \\
    \text{subject to} & F(x) \geq 0.
  \end{array}
\]

One advantage of formulating problems as semidefinite programs rather than as other optimization problems is that the interior-point method can be applied efficiently [17]. The interior-point method has polynomial time complexity and solves practical problems very efficiently. Recently, the interior-point techniques have been further developed for the determinant maximization problem [18].
In this paper, we attempt to apply this technique to solve the mini-max problem in the Chebyshev method. In [18], the problem of determining the minimum-volume ellipsoid that contains a given set of points is formulated as a maxdet problem. For our present purposes, we recall the two-dimensional case. Let the points be $\chi_1, \ldots, \chi_M \in \mathbb{R}^2$, and let the ellipse be given by

\[
  \mathcal{E} = \{\, \chi \in \mathbb{R}^2 : \|A\chi + b\| \leq 1 \,\}, \tag{6}
\]

where $A = A^T > 0 \in \mathbb{R}^{2 \times 2}$ and $b \in \mathbb{R}^2$. The area of $\mathcal{E}$ is proportional to $|A^{-1}|$. The problem of determining the minimum-area ellipse can then be formulated as the following determinant maximization programming problem:

\[
  \begin{array}{ll}
    \text{minimize}   & \log |A^{-1}| \\
    \text{subject to} & A = A^T > 0, \\
                      & \|A\chi_j + b\| \leq 1, \quad j = 1, \ldots, M.
  \end{array} \tag{7}
\]

Note that, by a Schur complement argument, the norm constraints $\|A\chi_j + b\| \leq 1$ can be expressed as matrix inequalities of the form

\[
  \begin{bmatrix}
    I & A\chi_j + b \\
    (A\chi_j + b)^T & 1
  \end{bmatrix} \geq 0.
\]
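Problem (7) is nowadays easy to state in an off-the-shelf modeling tool. The following sketch uses CVXPY as a stand-in for the maxdet software of [18] (CVXPY is our substitution; any solver that handles log-det objectives will do). Maximizing $\log |A|$ is equivalent to minimizing $\log |A^{-1}|$.

    import cvxpy as cp

    def min_area_ellipse(points):
        # Problem (7): smallest-area ellipse {x : ||A x + b|| <= 1}
        # containing the rows of `points` (an M x 2 array).
        A = cp.Variable((2, 2), PSD=True)  # A = A^T, positive semidefinite
        b = cp.Variable(2)
        constraints = [cp.norm(A @ p + b) <= 1 for p in points]
        prob = cp.Problem(cp.Maximize(cp.log_det(A)), constraints)
        prob.solve()
        return A.value, b.value

For our application, the rows of points are the coordinates $[\operatorname{Re}(\lambda_j), \operatorname{Im}(\lambda_j)]$ of the eigenvalues on the convex hull, as described in the next section.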
IV. Our approach

Now we propose a method for solving the optimal ellipse problem by applying the technique of the previous section. We identify the complex plane with $\mathbb{R}^2$ such that the set $\mathcal{E}$ in (6) represents the same ellipse as defined by (2). The mini-max problem (4) can first be reformulated as a nonlinear programming problem with linear matrix inequality (LMI) constraints:

\[
  \begin{array}{ll}
    \text{minimize}   & \max_j \kappa(\lambda_j; d, c) \\
    \text{subject to} & A = A^T > 0, \\
                      & \|A\tilde{\lambda}_j + b\| \leq 1, \quad j = 1, \ldots, M,
  \end{array} \tag{8}
\]

where the objective function $\max_j \kappa(\lambda_j; d, c)$ is the asymptotic convergence factor given by (3), and $\tilde{\lambda}_j = [\operatorname{Re}(\lambda_j), \operatorname{Im}(\lambda_j)]$ is the $\mathbb{R}^2$ representation of the eigenvalue $\lambda_j$. It is difficult to apply the interior-point method to this programming problem directly because the objective function is non-convex. However, it is usually observed that the ellipse with the smallest area is also a "good" one. This can be explained by the following fact: if an ellipse $\mathcal{E}_A$ is totally enclosed by another ellipse $\mathcal{E}_B$, then the virtual asymptotic averaged spectral radius of $\mathcal{E}_A$ is smaller than that of $\mathcal{E}_B$ [9, p. 7]. In other words, the optimal ellipse cannot enclose other ellipses, and hence it has to be small. Therefore, we use (7) as an approximation to (8) to find a "good" ellipse. As a result, we can apply the interior-point method developed in [18].

Figure 1 illustrates an example of using this technique to find the smallest ellipse. The points marked "+" are the eigenvalues of a randomly generated matrix of size $100 \times 100$. We first construct the convex hull of the spectrum in order to reduce the number of constraints. This construction is of low cost; its time complexity is $O(n \log m)$, where $n$ is the input size and $m$ is the output size [11].
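A minimal sketch of this reduction step, assuming SciPy (whose ConvexHull is a Quickhull implementation rather than the output-sensitive algorithm of [11], which makes no practical difference here):

    import numpy as np
    from scipy.spatial import ConvexHull

    def hull_points(eigs):
        # Keep only the eigenvalues on the convex hull of the spectrum;
        # only these points need to enter the constraints of problem (7).
        pts = np.column_stack([np.real(eigs), np.imag(eigs)])
        return pts[ConvexHull(pts).vertices]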
Then the ellipse is found by using the maxdet solver developed by Vandenberghe et al. (see [18] for information on how to obtain the corresponding software maxdet). Note that the ellipse has only five parameters: three in $A$ and two in $b$. In our experience, the problem can be solved very efficiently. The Chebyshev parameters $d$ and $c$ are then read off from the computed $A$ and $b$; a sketch of this conversion follows Figure 1.

Fig. 1. The smallest ellipse for the Chebyshev method in solving a complex linear system.
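The paper does not spell out how $d$ and $c$ are recovered from the solution of (7). Under the identification of $\mathbb{R}^2$ with the complex plane, one way is the following, which is our derivation: the center is $-A^{-1}b$, the semi-axis lengths are the reciprocal eigenvalues of $A$, and the foci lie along the major axis at distance $\sqrt{a_{\max}^2 - a_{\min}^2}$ from the center.

    import numpy as np

    def ellipse_to_dc(A, b):
        # Recover the center d and focal half-distance c of (2), as
        # complex numbers, from the ellipse {x : ||A x + b|| <= 1} of (6).
        center = -np.linalg.solve(A, b)
        w, V = np.linalg.eigh(A)               # eigenvalues in ascending order
        major, minor = 1.0 / w[0], 1.0 / w[1]  # semi-axis lengths
        focal = np.sqrt(max(major ** 2 - minor ** 2, 0.0))
        u = V[:, 0]                            # unit vector along the major axis
        return complex(center[0], center[1]), focal * complex(u[0], u[1])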
One advantage of formulating the problem as a nonlinear programming problem is that other related problems can be handled in a similar manner. Here, three examples are given. First, one can use this method to find the smallest ellipse in the Chebyshev iteration for computing a few eigenvalues of a large sparse matrix [16]. Second, other restrictions can easily be added in order to solve other optimal ellipse problems. For example, in the real nonsymmetric case the ellipse is required to be symmetric about the real axis, so that the iteration parameters are real. In that case, one can restrict $A$ to be a diagonal matrix and $b = [x_b, 0]^T$; the resulting ellipse has only three parameters. Figure 2 illustrates an example of using the same technique to find the smallest such ellipse.

Fig. 2. The smallest ellipse for the Chebyshev method in solving a nonsymmetric real linear system. The ellipse is restricted to be symmetric about the real axis.

Third, the optimal ellipse problem of the complex SOR method can be solved by this technique. The ellipses in this problem are required to be centered at the origin, a restriction that is formulated easily by taking $b = [0, 0]^T$; the resulting ellipse again has only three parameters. Figure 3 illustrates an example; a sketch covering both restricted variants follows the figure.

Fig. 3. The smallest ellipse for the complex SOR method. The center of the ellipse is restricted to be at the origin.
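Both restrictions amount to re-declaring the variables in the min_area_ellipse sketch above. Again, this is our illustration, not the authors' code, and the CVXPY attribute names are assumptions of that library:

    import cvxpy as cp
    import numpy as np

    def restricted_min_area_ellipse(points, center_at_origin=False):
        # A restricted to be diagonal; b = [0, 0]^T for the complex SOR
        # case, b = [x_b, 0]^T for the real nonsymmetric Chebyshev case.
        a = cp.Variable(2, pos=True)   # diagonal entries of A
        xb = cp.Variable()
        b = np.zeros(2) if center_at_origin else cp.hstack([xb, 0.0])
        constraints = [cp.norm(cp.multiply(a, p) + b) <= 1 for p in points]
        # For diagonal A, log|A| = log a_1 + log a_2.
        prob = cp.Problem(cp.Maximize(cp.sum(cp.log(a))), constraints)
        prob.solve()
        return np.diag(a.value), (b if center_at_origin else b.value)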
V. Numerical experiments
Consider the two-dimensional complex Helmholtz equation

\[
  -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} - \sigma_1 u + i \sigma_2 u = f,
  \qquad i = \sqrt{-1},
\]

on the unit square $(x, y) \in (0, 1) \times (0, 1)$, where $\sigma_1 \in \mathbb{R}$ and $\sigma_2$ is a real function (cf. [4], Example 6.2). We assume that the solution satisfies zero Dirichlet boundary conditions. A standard five-point discretization on a rectangular mesh with mesh size $h = 1/(m + 1)$ yields the linear system (1) with $A \in \mathbb{C}^{n \times n}$, where $n$ is the number of grid points of a uniform $m \times m$ grid. The matrix $A$ is of the form

\[
  A = L - \sigma_1 h^2 I + i h^2 D, \qquad D = \operatorname{diag}(d_1, d_2, \ldots, d_n),
\]

where $L$ is the usual Laplacian matrix and $D$ is a diagonal matrix whose diagonal elements are the values of $\sigma_2$ at the grid points. The values $d_j$ are chosen as random numbers in $[0, 10]$. The right-hand side $b$ has constant components $1 + i$ in this experiment. For demonstration purposes, the spectrum is computed explicitly in the experiments.

First, we use the routine maxdet to find the ellipse with the smallest area and compute the asymptotic convergence factor $\max_{\lambda_j \in \sigma(A)} \kappa(\lambda_j; d, c)$. The experiment is repeated with different mesh sizes $h$; the value of $\sigma_1$ is set to 100 in this experiment. The results are given in Table I. Next, the ellipse found by maxdet is refined by the routine minimax from the MATLAB Optimization Toolbox to obtain better convergence. The results are given in Table II. Comparing Table I with Table II, we observe that the ellipse with minimal area already leads to a good convergence factor.
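For concreteness, the test matrix can be assembled as follows. This is our reconstruction from the formula above; the random seed and the Kronecker-product construction of the five-point Laplacian are our choices:

    import numpy as np
    import scipy.sparse as sp

    def helmholtz_matrix(m, sigma1, seed=0):
        # Five-point discretization of the complex Helmholtz operator on a
        # uniform m x m grid: A = L - sigma1*h^2*I + i*h^2*D.
        h = 1.0 / (m + 1)
        n = m * m
        T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m))
        L = sp.kron(sp.eye(m), T) + sp.kron(T, sp.eye(m))  # 2-D Laplacian
        d = np.random.default_rng(seed).uniform(0.0, 10.0, n)  # sigma2 values
        return (L - sigma1 * h**2 * sp.eye(n) + 1j * h**2 * sp.diags(d)).tocsr()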
TABLE I
The areas and the convergence factors associated with the smallest ellipses calculated by maxdet

  h      area     $\max_j \kappa(\lambda_j; d, c)$
  1/4    1.31     0.893
  1/8    0.385    0.983
  1/16   0.0542   0.990
  1/32   0.0127   0.995

TABLE II
The areas and the convergence factors associated with the ellipses refined by minimax

  h      area     $\max_j \kappa(\lambda_j; d, c)$
  1/4    1.79     0.872
  1/8    0.467    0.977
  1/16   0.0652   0.989
  1/32   0.0128   0.994

TABLE III
Convergence factors associated with the ellipses calculated by maxdet

  $\sigma_1$   0        40       70       90       100
  a.c.f.(T)    0.9054   0.9378   0.9653   0.9849   0.9951
  a.c.f.(M)    0.9075   0.9380   0.9654   0.9850   0.9952

TABLE IV
Convergence factors associated with the ellipses refined by minimax

  $\sigma_1$   0        40       70       90       100
  a.c.f.(T)    0.8946   0.9288   0.9594   0.9822   0.9942
  a.c.f.(M)    0.8935   0.9276   0.9595   0.9813   0.9951
The above experiment is repeated with $h = 1/32$ and with four different values of $\sigma_1$. Note that the spectra for different $\sigma_1$ differ only by a shift, so that the parameters of the smallest ellipse need to be calculated only once. We implement a scheme of the Chebyshev method given in [12, Section 2.5] and solve the linear system with the parameters obtained by maxdet. The convergence behavior is plotted in Figure 4. We measure the experimental convergence factor by calculating $(\|r_k\| / \|r_0\|)^{1/k}$ when the number of iterations $k$ is sufficiently large. The results are given in Table III. The entries of the a.c.f.(T) row are the calculated asymptotic convergence factors $\max_j \kappa(\lambda_j; d, c)$, and those of the a.c.f.(M) row are the measured ones. As the table shows, the measured convergence rates agree with the theoretical values. Finally, similar results are obtained after refining the parameters by the routine minimax, as shown in Table IV.

Fig. 4. The residual plot (relative norm of the residual versus the number of iterations) of the Chebyshev method in solving the complex Helmholtz equation with different values of $\sigma_1$.
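The scheme of [12, Section 2.5] is not reproduced in this paper. For the reader's convenience, a standard three-term Chebyshev recurrence with ellipse center $d$ and focal half-distance $c$, which works verbatim with complex $d$ and $c$, is sketched below. This is our formulation following standard presentations (see e.g. [13]), not necessarily the exact scheme used in the experiments; it also returns the measured factor $(\|r_k\|/\|r_0\|)^{1/k}$ described above.

    import numpy as np

    def chebyshev(A, b, d, c, x0=None, maxiter=500, tol=1e-8):
        # Chebyshev iteration for A x = b, assuming the spectrum of A is
        # enclosed by the ellipse with center d and focal half-distance c.
        x = np.zeros_like(b) if x0 is None else x0.copy()
        r = b - A @ x
        r0_norm = np.linalg.norm(r)
        sigma = d / c
        rho = 1.0 / sigma
        p = r / d
        for k in range(1, maxiter + 1):
            x = x + p
            r = r - A @ p
            if np.linalg.norm(r) <= tol * r0_norm:
                break
            rho_new = 1.0 / (2.0 * sigma - rho)          # three-term recurrence
            p = rho_new * rho * p + (2.0 * rho_new / c) * r
            rho = rho_new
        return x, (np.linalg.norm(r) / r0_norm) ** (1.0 / k)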
VI. Conclusion and future work

In this paper, the optimal ellipse problem for the Chebyshev method has been investigated. We have observed that the problem can be approximated by a determinant maximization programming problem with LMI constraints. Numerical experiments show that the approximated problem can be solved efficiently by the recently developed interior-point method, and that the resulting parameters still achieve a good convergence rate. In our experiments, the exact eigenvalues were assumed to be known. In practice, an adaptive method should be used, in analogy with [13], where the convex hull of $\sigma(A)$ is estimated during the iterations. The applicability of the technique in the context of the complex SOR method also needs further investigation.

Acknowledgments

The first author would like to acknowledge that most of his work on this paper was done while he was with the Department of Computer Science, Katholieke Universiteit Leuven.
References

[1] D. Calvetti, G. H. Golub, and L. Reichel. An adaptive Chebyshev iterative method for nonsymmetric linear systems based on modified moments. Numer. Math., 67:21-40, 1994.
[2] R. C. Y. Chin and T. A. Manteuffel. An analysis of block successive overrelaxation for a class of matrices with complex spectra. SIAM J. Numer. Anal., 30:564-585, 1988.
[3] H. C. Elman, Y. Saad, and P. E. Saylor. A hybrid Chebyshev Krylov subspace algorithm for solving nonsymmetric systems of linear equations. SIAM J. Sci. Stat. Comput., 7:840-855, 1986.
[4] R. W. Freund. Conjugate gradient-type methods for linear systems with complex symmetric coefficient matrices. SIAM J. Sci. Stat. Comput., 13:425-448, 1992.
[5] L. Hageman and D. M. Young. Applied Iterative Methods. Academic Press, New York, 1981.
[6] D. Ho. Tchebychev acceleration technique for large scale nonsymmetric matrices. Numer. Math., 56:721-734, 1990.
[7] M. Hu, K. Jackson, J. Janssen, and S. Vandewalle. Remarks on the optimal convolution kernel for CSOR waveform relaxation. Advances in Comp. Math., 7(1-2):135-156, 1997.
[8] M. Hu, K. Jackson, and B. Zhu. Complex optimal SOR parameters and convergence regions. Technical report, Department of Computer Science, University of Toronto, Canada, 1995. Working Notes.
[9] J. Janssen and S. Vandewalle. Convolution-based Chebyshev acceleration of waveform relaxation methods. Technical Report TW256, Departement Computerwetenschappen, Katholieke Universiteit Leuven, April 1997.
[10] J. Janssen and S. Vandewalle. On SOR waveform relaxation methods. SIAM J. Numer. Anal., 34(6):2456-2481, Dec. 1997.
[11] D. G. Kirkpatrick and R. Seidel. The ultimate planar convex hull algorithm? SIAM J. Comput., 15:287-299, 1986.
[12] T. A. Manteuffel. The Tchebychev iteration for nonsymmetric linear systems. Numer. Math., 28:307-327, 1977.
[13] T. A. Manteuffel. Adaptive procedure for estimation of parameters for the nonsymmetric Chebyshev iteration. Numer. Math., 31:183-208, 1978.
[14] T. A. Manteuffel and G. Starke. On hybrid iterative methods for nonsymmetric systems of linear equations. Numer. Math., 73:489-506, 1996.
[15] M. W. Reichelt, J. K. White, and J. Allen. Optimal convolution SOR acceleration of waveform relaxation with application to parallel simulation of semiconductor devices. SIAM J. Sci. Comput., 16(5):1137-1158, Sept. 1995.
[16] Y. Saad. Chebyshev acceleration techniques for solving nonsymmetric eigenvalue problems. Math. Comput., 42:567-588, 1984.
[17] L. Vandenberghe and S. Boyd. Semidefinite programming. SIAM Rev., 38:49-95, 1996.
[18] L. Vandenberghe, S. Boyd, and S.-P. Wu. Determinant maximization with linear matrix inequality constraints. SIAM J. Matrix Anal. Appl., April 1998. To appear.
[19] R. S. Varga. Matrix Iterative Analysis. Automatic Computation Series. Prentice Hall, Englewood Cliffs, NJ, 1962.
[20] D. M. Young. Iterative methods for solving partial difference equations of elliptic type. Trans. Amer. Math. Soc., 76:92-111, 1954.
[21] D. M. Young. Iterative Solution of Large Linear Systems. Academic Press, New York, 1971.