Block and Seed BiCGSTAB algorithms for nonsymmetric multiple linear systems

A. El Guennouni*, K. Jbilou†

This paper is dedicated to the memory of Ruediger Weiss.

Abstract

In this paper we give a block version of the BiCGSTAB algorithm for solving multiple linear systems, derived from some algebraic orthogonalities. These orthogonalities allow us to define a new procedure, called PNB (Partial Near Breakdown), in order to stabilize the convergence of the block BiCGSTAB algorithm. We also introduce a seed version of the (block) BiCGSTAB algorithm. Finally, some numerical examples are reported to illustrate the effectiveness of the proposed methods.

AMS subject classification: 65F10.
Key words: block Krylov subspace, block methods, Lanczos method, multiple right-hand sides, nonsymmetric linear systems, seed system.

*Laboratoire d'Analyse Numérique et d'Optimisation, UFR IEEA-M3, Université des Sciences et Technologies de Lille, F-59650 Villeneuve d'Ascq, France. Email: [email protected].
†Université du Littoral, Zone Universitaire de la Mi-Voix, Bâtiment H. Poincaré, 50 rue F. Buisson, BP 699, F-62228 Calais Cedex, France. Email: [email protected] (address for correspondence).


1 Introduction

Many applications require the solution of several linear systems of equations with the same coefficient matrix and different right-hand sides. This problem can be written as

$$AX = B, \qquad (1.1)$$

where $A$ is an $N \times N$ nonsingular and nonsymmetric real matrix, $B$ and $X$ are $N \times s$ rectangular matrices whose columns are $b^{(1)}, b^{(2)}, \ldots, b^{(s)}$ and $x^{(1)}, x^{(2)}, \ldots, x^{(s)}$ respectively, and $s$ is of moderate size ($s \ll N$). When $N$ is small, we can use the LU factorization: we first decompose the matrix $A$ at a cost of $O(N^3)$ operations and then solve for each right-hand side at a cost of $O(N^2)$. So, for direct methods, the main cost is the decomposition of $A$, and each right-hand side is then solved efficiently by making use of this decomposition. However, for large $N$, direct methods become very expensive. Instead of applying an iterative method to each linear system, it is more efficient to use a method that treats all the systems simultaneously. Thus, several generalizations of the classical Krylov subspace methods have been developed in the last years.
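As a quick illustration of this point (our sketch, not taken from the paper; matrix and sizes are arbitrary test values), the following NumPy/SciPy fragment factors $A$ once and then reuses the factorization for all $s$ right-hand sides:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

N, s = 500, 10
A = np.random.rand(N, N) + N * np.eye(N)   # a well-conditioned test matrix
B = np.random.rand(N, s)                   # s right-hand sides

lu, piv = lu_factor(A)                     # O(N^3) decomposition, done once
# each column is then solved at O(N^2) cost from the stored factors
X = np.column_stack([lu_solve((lu, piv), B[:, i]) for i in range(s)])

print(np.linalg.norm(B - A @ X))           # residual check
```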

The first class of these methods contains the block solvers, such as the block biconjugate gradient algorithm (Bl-BCG) [16] (see [21] for a stabilized version), the block generalized minimal residual algorithm (Bl-GMRES) introduced in [25] and studied in [22], the block quasi-minimal residual algorithm (Bl-QMR) [8] (see also [14]) and, recently, the block BiCGSTAB algorithm (Bl-BiCGSTAB) [3]. These methods construct, at step $k$, an approximation of the form $X_k = X_0 + Z_k$, where $Z_k$ belongs to the block Krylov subspace $\mathcal{K}_k(A, R_0) = \mathrm{span}\{R_0, A R_0, \ldots, A^{k-1} R_0\}$, with $R_0 = B - A X_0$ and $X_0 \in \mathbb{R}^{N \times s}$ an initial guess. The correction $Z_k$ is determined by using a (semi)minimization or an orthogonality property. We note that block methods such as the block Lanczos method [10], [9] and the block Arnoldi method [20] are also used for solving large eigenvalue problems. The purpose of these block methods is to converge faster than their single right-hand side counterparts.

Another approach consists in projecting the initial residual matrix globally onto a matrix Krylov subspace. This second class contains the global GMRES algorithm [11] and global Lanczos-based methods [12]. A third class of methods uses a single linear system, called the seed system, and the corresponding Krylov subspace. The initial residuals of the other linear systems are projected onto this Krylov subspace. The whole process is repeated with another seed system until all the systems are solved. This procedure has been used in [24] and [1] for the conjugate gradient method, and in [23] for nonsymmetric matrices $A$ with the GMRES algorithm [19]. This approach is very effective when the right-hand sides are closely related [1]. Other seed methods can also be found in [17], [18] and [27].

In this paper we derive the block BiCGSTAB algorithm from some algebraic orthogonalities. This allows us to introduce a new variant of this method. Finally, we propose an attractive BiCGSTAB-based method for solving multiple linear systems with closely related right-hand sides.

The remainder of the paper is organized as follows. Section 2 recalls the block BCG algorithm. In section 3, we introduce a block version of the BiCGSTAB algorithm. In section 4, we define a new variant called the Partial Near Breakdown block BiCGSTAB algorithm (PNB Bl-BiCGSTAB); this method is introduced to cure some near-breakdown situations and also to stabilize the convergence of Bl-BiCGSTAB. In section 5 we propose the seed BiCGSTAB method. Section 6 reports on some numerical examples.

Throughout this paper, we use the following notations. For $X$ and $Y$ two $N \times s$ matrices, we consider the inner product $\langle X, Y \rangle_F = \mathrm{tr}(X^T Y)$, where $\mathrm{tr}(Z)$ denotes the trace of the square matrix $Z$ and $X^T$ the transpose of the matrix $X$. The associated norm is the Frobenius norm, denoted by $\| \cdot \|_F$. We will use the notation $\langle \cdot\,, \cdot \rangle_2$ for the usual inner product in $\mathbb{R}^N$. For a matrix $V \in \mathbb{R}^{N \times s}$, the block Krylov subspace $\mathcal{K}_k(A, V)$ is the subspace generated by the columns of the matrices $V, A V, \ldots, A^{k-1} V$.
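For concreteness, here is a tiny NumPy check of this trace inner product (our sketch; the function name is ours):

```python
import numpy as np

def inner_F(X, Y):
    """Frobenius inner product <X, Y>_F = tr(X^T Y)."""
    return np.trace(X.T @ Y)

X = np.random.rand(6, 3)
Y = np.random.rand(6, 3)
# <X, X>_F is the squared Frobenius norm of X
assert np.isclose(inner_F(X, X), np.linalg.norm(X, 'fro')**2)
# the elementwise form np.sum(X * Y) is an equivalent, cheaper evaluation
assert np.isclose(inner_F(X, Y), np.sum(X * Y))
```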

2 The block BCG algorithm

The block biconjugate gradient (Bl-BCG) algorithm was first proposed by O'Leary [16] for solving problem (1.1). This algorithm is a generalization of the well-known single right-hand side BCG algorithm [7]. Bl-BCG computes two sets of direction matrices $\{P_0, \ldots, P_k\}$ and $\{\tilde P_0, \ldots, \tilde P_k\}$ that span the block Krylov subspaces $\mathcal{K}_{k+1}(A, R_0)$ and $\mathcal{K}_{k+1}(A^T, \tilde R_0)$, where $R_0 = B - A X_0$ and $\tilde R_0$ is an arbitrary $N \times s$ matrix.

The algorithm is summarized as follows [16].

ALGORITHM 1: block BCG

$X_0$ is an $N \times s$ initial guess, $R_0 = B - A X_0$;
$\tilde R_0$ is an arbitrary $N \times s$ matrix;
$P_0 = R_0$, $\tilde P_0 = \tilde R_0$;
For $k = 1, 2, \ldots$ compute
  $\alpha_k = (\tilde P_{k-1}^T A P_{k-1})^{-1}\, \tilde R_{k-1}^T R_{k-1}$;
  $X_k = X_{k-1} + P_{k-1}\, \alpha_k$;
  $R_k = R_{k-1} - A P_{k-1}\, \alpha_k$;
  $\tilde\alpha_k = (P_{k-1}^T A^T \tilde P_{k-1})^{-1}\, R_{k-1}^T \tilde R_{k-1}$;
  $\tilde R_k = \tilde R_{k-1} - A^T \tilde P_{k-1}\, \tilde\alpha_k$;
  $\beta_k = (\tilde R_{k-1}^T R_{k-1})^{-1}\, \tilde R_k^T R_k$;
  $\tilde\beta_k = (R_{k-1}^T \tilde R_{k-1})^{-1}\, R_k^T \tilde R_k$;
  $P_k = R_k + P_{k-1}\, \beta_k$;
  $\tilde P_k = \tilde R_k + \tilde P_{k-1}\, \tilde\beta_k$;
end.
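The following NumPy fragment is a minimal sketch of ALGORITHM 1 (our illustration, not the authors' code); the breakdown monitoring discussed below is omitted:

```python
import numpy as np

def bl_bcg(A, B, X0, R0t, maxit=100, tol=1e-8):
    """Minimal sketch of ALGORITHM 1 (block BCG); no breakdown checks."""
    X = X0.copy()
    R = B - A @ X0              # R_0
    Rt = R0t.copy()             # arbitrary shadow residual \tilde R_0
    P, Pt = R.copy(), Rt.copy()
    for _ in range(maxit):
        AP = A @ P
        # alpha_k and its shadow counterpart: s x s linear solves
        alpha = np.linalg.solve(Pt.T @ AP, Rt.T @ R)
        alphat = np.linalg.solve(P.T @ (A.T @ Pt), R.T @ Rt)
        X = X + P @ alpha
        Rnew = R - AP @ alpha
        Rtnew = Rt - (A.T @ Pt) @ alphat
        # beta_k uses the old residual products in the first factor
        beta = np.linalg.solve(Rt.T @ R, Rtnew.T @ Rnew)
        betat = np.linalg.solve(R.T @ Rt, Rnew.T @ Rtnew)
        R, Rt = Rnew, Rtnew
        P = R + P @ beta
        Pt = Rt + Pt @ betat
        if np.linalg.norm(R, 'fro') < tol:
            break
    return X, R
```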

The algorithm breaks down if one of the matrices $\tilde P_{k-1}^T A P_{k-1}$ or $\tilde R_{k-1}^T R_{k-1}$ is singular. Note also that the coefficient matrices computed in the algorithm are solutions of $s \times s$ linear systems. The matrices of these systems could be very ill-conditioned, and this would affect the iterates computed by Bl-BCG.

The Bl-BCG algorithm uses short three-term recurrences, but in many situations the algorithm exhibits very irregular convergence behavior. This problem can be overcome by using a block smoothing technique as defined in [13] or a block QMR procedure [8]. A stabilized version of the Bl-BCG algorithm has been proposed in [21]; this method converges quite well but is generally more expensive than Bl-BCG. The residual matrices and the direction matrices generated by ALGORITHM 1 satisfy the properties collected in the following proposition. Other Lanczos-based block methods are proposed in [2, 4].

Proposition 1 [16] If no breakdown occurs, the matrices computed by the Bl-BCG algorithm satisfy the following relations:

(1) $\tilde R_i^T R_j = 0$ and $\tilde P_i^T A P_j = 0$ for $i < j$;
(2) $\mathrm{span}\{P_0, \ldots, P_k\} = \mathrm{span}\{R_0, \ldots, A^k R_0\}$;
(3) $\mathrm{span}\{\tilde P_0, \ldots, \tilde P_k\} = \mathrm{span}\{\tilde R_0, \ldots, (A^T)^k \tilde R_0\}$;
(4) $R_k - R_0 \in \mathcal{K}_k(A, R_0)$, and the columns of $R_k$ are orthogonal to $\mathcal{K}_k(A^T, \tilde R_0)$.

3 The block BiCGSTAB algorithm

In [3], we defined the block BiCGSTAB algorithm by using right matrix orthogonal polynomials. In this section, we show how to derive Bl-BiCGSTAB from some algebraic orthogonalities. Note that this technique is also new when $s = 1$ (classical BiCGSTAB [26]). Consider the multiple linear system

$$AX = B.$$

Let $R_k^b$ and $P_k^b$ denote the $k$-th residual and the $k$-th direction produced by Bl-BCG. We define the $k$-th residual of Bl-BiCGSTAB by

$$R_k = (I - \omega_k A)\, S_k, \qquad (3.1)$$

where

$$S_k = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, R_k^b.$$

The parameter $\omega_k$ is chosen to minimize the F-norm of $R_k$, so, setting $T_k = A S_k$,

$$\omega_k = \frac{\langle T_k, S_k \rangle_F}{\langle T_k, T_k \rangle_F}.$$

The direction matrix of Bl-BiCGSTAB is defined by $P_k = (I - \omega_k A)\, Q_k$ with $Q_k = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, P_k^b$. Together with the orthogonality relations of Proposition 1, these definitions lead to the following algorithm.

ALGORITHM 2: Bl-BiCGSTAB

$X_0$ is an $N \times s$ initial guess, $R_0 = B - A X_0$;
$\tilde R_0$ is an arbitrary $N \times s$ matrix; $P_0 = R_0$;
For $k = 1, 2, \ldots$ compute
  $V_{k-1} = A P_{k-1}$;
  $\alpha_k = (\tilde R_0^T V_{k-1})^{-1} (\tilde R_0^T R_{k-1})$;
  $S_k = R_{k-1} - V_{k-1}\, \alpha_k$;
  $T_k = A S_k$;
  $\omega_k = \langle T_k, S_k \rangle_F \,/\, \langle T_k, T_k \rangle_F$;
  $X_k = X_{k-1} + P_{k-1}\, \alpha_k + \omega_k S_k$;
  $R_k = S_k - \omega_k T_k$;
  $\beta_k = -(\tilde R_0^T V_{k-1})^{-1} (\tilde R_0^T T_k)$;
  $P_k = R_k + (P_{k-1} - \omega_k V_{k-1})\, \beta_k$;
end.
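The following NumPy fragment is a minimal sketch of ALGORITHM 2 (our illustration, with our own names); breakdown handling is omitted:

```python
import numpy as np

def bl_bicgstab(A, B, X0, R0t, maxit=200, tol=1e-8):
    """Minimal sketch of Bl-BiCGSTAB (ALGORITHM 2); no breakdown handling."""
    X = X0.copy()
    R = B - A @ X0                 # R_0
    P = R.copy()                   # P_0
    for _ in range(maxit):
        V = A @ P                  # V_{k-1} = A P_{k-1}
        alpha = np.linalg.solve(R0t.T @ V, R0t.T @ R)
        S = R - V @ alpha
        T = A @ S
        omega = np.sum(T * S) / np.sum(T * T)   # <T,S>_F / <T,T>_F
        X = X + P @ alpha + omega * S
        R = S - omega * T
        beta = -np.linalg.solve(R0t.T @ V, R0t.T @ T)
        P = R + (P - omega * V) @ beta
        if np.linalg.norm(R, 'fro') < tol:
            break
    return X, R
```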

Remark 1
1) For Bl-BiCGSTAB, we have the orthogonalities $\tilde R_0^T S_k = 0$ and $\tilde R_0^T A Q_k = 0$.
2) If $s = 1$, Algorithm 2 reduces to the classical BiCGSTAB algorithm [26], which was defined by using recurrences of orthogonal polynomials. Note that the expression of the coefficient $\beta_k$ computed by Algorithm 2 with $s = 1$ is simpler than the one given in [26].
3) At step $k$, a breakdown will occur in Algorithm 2 if the matrix $\tilde R_0^T A P_{k-1}$ is singular.

4 The Partial Near Breakdown procedure

Recall that the block BiCGSTAB algorithm can suffer from a possible (near) breakdown when the matrix $\tilde R_0^T A P_{k-1}$ is 'near' singular; the iterates of block BiCGSTAB would then be affected. A frequent situation corresponds to the case where the matrix $\tilde R_0^T A P_k$ is 'near' singular but not zero; we call this situation a Partial Near Breakdown (PNB). It contains in particular the so-called deflation problem treated in the block QMR algorithm [8], based on the Lanczos process with multiple starting vectors; see also [15] for a detailed discussion for the block conjugate gradient algorithm. To remedy this situation, we must be able to satisfy the following constraints:

1. Define a procedure which can detect "numerically" the PNB problem.
2. Detect also the linear systems causing this situation.
3. Drop these linear systems (which will be called the dropped systems) and ensure the continuation of the algorithm with a smaller block size. The linear systems that are not dropped will be called the current systems.
4. Define the iterates of the dropped systems using all the information obtained before the detection of the PNB situation.

Concerning the first two constraints, we suggest the use of the modified Gram-Schmidt process (MGS). It allows us to detect "numerically" the linear systems causing the PNB problem and to compute a QR decomposition of the matrix to be inverted in the algorithm. Given a matrix $C \in \mathbb{R}^{s \times s}$, the MGS process is summarized as follows.

ALGORITHM 3: $[U, Q, I_{pnb}] = \mathrm{MGS}(C, tol)$

1: Initialize $I_{pnb} = \emptyset$;
2: For $i = 1, \ldots, s$ do
     $\tilde q = C(:, i)$;
     for $j = 1, \ldots, i-1$ do
       if $j \notin I_{pnb}$
         $u_{j,i} = \langle q_j, \tilde q \rangle_2$;   $\tilde q = \tilde q - u_{j,i}\, q_j$;
       endif
     end
3:   $u_{i,i} = \|\tilde q\|_2$;
4:   if $u_{i,i} < tol$ then $I_{pnb} = I_{pnb} \cup \{i\}$; else $q_i = \tilde q / u_{i,i}$; endif
   end

The set $I_{pnb}$ contains the indices of the columns that are (almost) linearly dependent, and $tol$ is a chosen tolerance.
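A direct NumPy transcription of ALGORITHM 3 reads as follows (our sketch; function and variable names are ours, with the triangular factor called U as above):

```python
import numpy as np

def mgs(C, tol):
    """Sketch of ALGORITHM 3: modified Gram-Schmidt with column dropping.
    Returns the triangular factor U, the orthonormal factor Q and the list
    I_pnb of indices of (almost) linearly dependent columns."""
    s = C.shape[1]
    Q = np.zeros_like(C, dtype=float)
    U = np.zeros((s, s))
    I_pnb = []
    for i in range(s):
        q = C[:, i].astype(float).copy()
        for j in range(i):
            if j not in I_pnb:          # orthogonalize against kept columns only
                U[j, i] = Q[:, j] @ q
                q -= U[j, i] * Q[:, j]
        U[i, i] = np.linalg.norm(q)
        if U[i, i] < tol:
            I_pnb.append(i)             # column i triggers the PNB procedure
        else:
            Q[:, i] = q / U[i, i]
    return U, Q, I_pnb
```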

To satisfy the last two constraints, we propose to drop all the systems whose columns are in the set $I_{pnb}$ and then to continue the algorithm with a smaller block size, using the same iterations as in the block BiCGSTAB algorithm; this will be justified later. The iterates of the dropped systems will be defined by using some orthogonalities of the dropped residuals against the right current block Krylov subspace. In the sequel we use the following notations:

- $s_c$ and $s_d$ are the numbers of the current and dropped systems, respectively.
- $R_k$, $P_k$ and $S_k$ (of dimension $N \times s_c$) denote the current iterates of the Bl-BiCGSTAB algorithm.
- $R_k^d$, $P_k^d$ and $S_k^d$ (of dimension $N \times s_d$) are the dropped iterates of Bl-BiCGSTAB.

The main idea is to preserve the orthogonalities of the $k$-th current and dropped iterates against the right current block Krylov subspace $\mathcal{K}_k(A^T, \tilde R_0^{s_c})$, where $\tilde R_0^{s_c}$ is the $N \times s_c$ matrix obtained from $\tilde R_0$ by deleting the columns that previously caused the PNB problem.

4.1 Construction of the current iterates

After detecting the PNB problem and dropping the linear systems causing this situation, the $k$-th current iterates are constructed as follows:

$$R_k = (I - \omega_k A)\, S_k, \qquad (4.1)$$

where $S_k$ is given by

$$S_k = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, R_k^{b,s_c}, \qquad (4.2)$$

$\omega_k$ is chosen such that the F-norms of the current and dropped residual matrices are minimized, and $R_k^{b,s_c}$ is an $N \times s_c$ matrix that is orthogonal to the current block Krylov subspace $\mathcal{K}_k(A^T, \tilde R_0^{s_c})$. By induction, we can show that $R_k^{b,s_c}$ satisfies the following coupled two-term recurrences:

$$R_k^{b,s_c} = R_{k-1}^{b,s_c} - A P_{k-1}^{b,s_c}\, \alpha_k, \qquad (4.3)$$
$$P_k^{b,s_c} = R_k^{b,s_c} + P_{k-1}^{b,s_c}\, \beta_k, \qquad (4.4)$$

where the matrix parameters $\alpha_k$ and $\beta_k$ are chosen to ensure the following orthogonalities:

$$R_k^{b,s_c} \perp \mathcal{K}_k(A^T, \tilde R_0^{s_c}), \qquad (4.5)$$

and

$$P_k^{b,s_c} \perp A^T \mathcal{K}_k(A^T, \tilde R_0^{s_c}). \qquad (4.6)$$

The current Bl-BiCGSTAB direction matrix $P_k$ is defined by

$$P_k = (I - \omega_k A)\, Q_k, \qquad (4.7)$$

where

$$Q_k = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, P_k^{b,s_c}. \qquad (4.8)$$

The $s_c \times s_c$ matrix parameters $\alpha_k$ and $\beta_k$ appearing in the previous equations are uniquely determined. In fact, the orthogonalities (4.5) and (4.6) imply that

$$(\tilde R_0^{s_c})^T S_k = 0_{s_c} \quad \text{and} \quad (\tilde R_0^{s_c})^T A Q_k = 0.$$

Hence, we have

$$\alpha_k = \big((\tilde R_0^{s_c})^T A P_{k-1}\big)^{-1} \big((\tilde R_0^{s_c})^T R_{k-1}\big),$$
$$\beta_k = -\big((\tilde R_0^{s_c})^T A P_{k-1}\big)^{-1} \big((\tilde R_0^{s_c})^T A S_k\big).$$

Finally, from (4.1)-(4.4), (4.7) and (4.8), we obtain recurrence formulas for the current iterates which are similar to those of Bl-BiCGSTAB (without the Partial Near Breakdown procedure) with a smaller block size.

4.2 Construction of the dropped iterates

Rdk = (I ? !k A)Skd;

(4:9)

where

(4:10) Skd = (I ? !k? A) : : : (I ? ! A) Rb;d k : and Rb;d k is an n  sd matrix that is orthogonal to the right current block Krylov subspace Kk (AT ; R~ s ). Thus it can be expressed as 1

1

c 0

b;s  ; b;d Rb;d k k = Rk? ? APk?

(4:11)

c 1

1

where k is determined from the following orthogonality R~ s T Skd = 0s ; c 0

therefore

c

k = (R~s T APk? )? R~s T Rdk? : c 0

1

9

1

c 0

1

From (4.9), (4.10) and (4.11) we obtain Rdk = Skd ? !k ASkd; Skd = Rdk? ? APk? k : Finally to minimize the F-norm of Rk and Rdk , we choose !k as follows !k =< [Sk Skd]; [Sk Skd] >F = < [ASk ASkd]; [ASk ASkd] >F : The Bl-BiCGSTAB with Partial Near Breakdown procedure (PNB Bl-BiCGSTAB) is summarized as follows 1

1

ALGORITHM 4: PNB Bl-BiCGSTAB

- Initializations: choose $X_0$ and $\tilde R_0$; $s_c = s$; $s_d = 0$; $R_0 = B - A X_0$; $P_0 = R_0$;
- Iterations: for $k = 1, 2, \ldots$
    While $\mathrm{card}(I_{pnb}) \neq 0$, do
      $[U_{k-1}, Q_{k-1}, I_{pnb}] = \mathrm{MGS}(\tilde R_0^T A P_{k-1}, tol)$;
      $s_c = s_c - \mathrm{card}(I_{pnb})$; $s_d = s_d + \mathrm{card}(I_{pnb})$;
      drop the columns of $X_{k-1}$, $\tilde R_0$, $R_{k-1}$ and $P_{k-1}$ whose indices $i \in I_{pnb}$;
      construct the new dropped iterates $X_{k-1}^d$ and $R_{k-1}^d$;
    End while
    $\alpha_k = U_{k-1}^{-1} Q_{k-1}^T (\tilde R_0^T R_{k-1})$;   $S_k = R_{k-1} - A P_{k-1}\, \alpha_k$;
    $\alpha_k^d = U_{k-1}^{-1} Q_{k-1}^T (\tilde R_0^T R_{k-1}^d)$;   $S_k^d = R_{k-1}^d - A P_{k-1}\, \alpha_k^d$;
    $\omega_k = \langle [S_k \; S_k^d], A[S_k \; S_k^d] \rangle_F \,/\, \langle A[S_k \; S_k^d], A[S_k \; S_k^d] \rangle_F$;
    $R_k = S_k - \omega_k A S_k$;   $R_k^d = S_k^d - \omega_k A S_k^d$;
    $X_k = X_{k-1} + P_{k-1}\, \alpha_k + \omega_k S_k$;   $X_k^d = X_{k-1}^d + P_{k-1}\, \alpha_k^d + \omega_k S_k^d$;
    $\beta_k = -U_{k-1}^{-1} Q_{k-1}^T (\tilde R_0^T A S_k)$;
    $P_k = R_k + (P_{k-1} - \omega_k A P_{k-1})\, \beta_k$;
  end.

Here $U_{k-1}$ and $Q_{k-1}$ are the triangular and orthonormal factors returned by ALGORITHM 3, so that $U_{k-1}^{-1} Q_{k-1}^T = (\tilde R_0^T A P_{k-1})^{-1}$ on the current block.

In practice, the invariance of the current block Krylov subspace could be reached before the convergence of the dropped systems. In this case, we restart the algorithm with new current iterates chosen among the dropped ones and a new direction matrix $P_k^d$ such that

$$P_k^d = (I - \omega_k A)\, Q_k^d, \qquad Q_k^d = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, P_k^{b,d}, \qquad P_k^{b,d} = R_k^{b,d} + P_{k-1}^{b,s_c}\, \beta_k^d.$$

Imposing the $A^T$-orthogonality of $P_k^{b,d}$ to $\mathcal{K}_k(A^T, \tilde R_0^{s_c})$, we obtain

$$\beta_k^d = -\big((\tilde R_0^{s_c})^T A P_{k-1}\big)^{-1} \big((\tilde R_0^{s_c})^T A S_k^d\big).$$

This technique allows us to preserve the orthogonalities of the new current iterates against the corresponding right current block Krylov subspace.
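To make the dropping mechanism concrete, here is a small NumPy sketch (ours, not from the paper) of the detection step at iteration $k$; it assumes the mgs function sketched after ALGORITHM 3 and splits the column indices into current and dropped sets:

```python
import numpy as np

def detect_and_drop(R0t, A, P, tol=1e-8):
    """One PNB detection step: factor C = R0t^T A P with the tolerance-based
    MGS sketched earlier and report which systems should be dropped."""
    C = R0t.T @ (A @ P)            # the s_c x s_c matrix to be 'inverted'
    U, Q, I_pnb = mgs(C, tol)      # QR factors and near-dependent columns
    current = [i for i in range(C.shape[1]) if i not in I_pnb]
    return U, Q, current, I_pnb

# The coefficients of ALGORITHM 4 would then be obtained from the kept
# columns, e.g. alpha_k = solve(U, Q.T @ (R0t.T @ R_prev)) on `current`.
```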

5 The seed BiCGSTAB method

In this section, we present a new method for solving linear systems with multiple right-hand sides. This method consists in selecting one of the systems, called the seed system, solving it by the BiCGSTAB algorithm, and then constructing the residuals of the other systems by using the information obtained from the iterates of the seed system. The whole process is repeated with another unsolved system as seed until all the systems are solved. The advantage of the new seed method, when compared with other seed methods, is that the convergence of all the systems is ensured in at most $N$ (total) iterations.

We note also that in practice, for example in wave scattering problems [24], in time-marching methods for PDEs [6] or in structural mechanics problems [5], the right-hand sides are not arbitrary. They are usually discrete values of some function $b(t)$; in other words, $b^{(i)} = b(t_i)$, $i = 1, \ldots, s$. Assuming that $b(t) \in C^k$, it can be shown that the solutions $x^{(i)}$ are discrete values of some function $x(t) \in C^k$. So if the $b^{(i)}$'s are close, we may also expect the $x^{(i)}$'s to be close [1]. In these situations, 'near' linear dependence may arise among the right-hand sides. This makes the block methods ineffective, and they may even suffer from a near-breakdown problem; special treatment is needed to handle this situation. The seed methods, however, are very effective when the right-hand sides $b^{(i)}$ are close: it usually takes only a few restarts to solve all the systems. A theoretical analysis of this phenomenon for the seed conjugate gradient method is given in [1].

In the sequel, we denote by $\sigma$ the index of the seed system. To solve it, we use the BiCGSTAB algorithm with another choice of the parameter $\omega$. So, given $\tilde r_0 \in \mathbb{R}^N$, the seed iterates are defined by

$$r_k^{(\sigma)} = s_k^{(\sigma)} - \omega_k A s_k^{(\sigma)},$$

where

$$s_k^{(\sigma)} = r_{k-1}^{(\sigma)} - \alpha_k A p_{k-1}^{(\sigma)},$$

and

$$p_k^{(\sigma)} = r_k^{(\sigma)} + \beta_k \big(p_{k-1}^{(\sigma)} - \omega_k A p_{k-1}^{(\sigma)}\big).$$

Note that $r_k^{(\sigma)}$ and $p_k^{(\sigma)}$ are also given by

$$r_k^{(\sigma)} = (I - \omega_k A) \cdots (I - \omega_1 A)\, r_k^{(\sigma),b}, \qquad (5.1)$$
$$p_k^{(\sigma)} = (I - \omega_k A) \cdots (I - \omega_1 A)\, p_k^{(\sigma),b}, \qquad (5.2)$$

where $r_k^{(\sigma),b}$ and $p_k^{(\sigma),b}$ are the $k$-th residual and the $k$-th direction produced by the BCG algorithm. Thus

$$r_k^{(\sigma),b} \perp \mathcal{K}_k(A^T, \tilde r_0), \qquad (5.3)$$

and

$$p_k^{(\sigma),b} \perp A^T \mathcal{K}_k(A^T, \tilde r_0). \qquad (5.4)$$

This leads to

$$\alpha_k = \frac{\langle \tilde r_0, r_{k-1}^{(\sigma)} \rangle_2}{\langle \tilde r_0, A p_{k-1}^{(\sigma)} \rangle_2}, \qquad \beta_k = -\frac{\langle \tilde r_0, A s_k^{(\sigma)} \rangle_2}{\langle \tilde r_0, A p_{k-1}^{(\sigma)} \rangle_2}.$$

The nonseed residual matrix is constructed as

$$R_k = (I - \omega_k A)\, S_k, \qquad (5.5)$$

where

$$S_k = (I - \omega_{k-1} A) \cdots (I - \omega_1 A)\, R_k^b, \qquad (5.6)$$

and $R_k^b$ is a matrix that is orthogonal to the Krylov subspace $\mathcal{K}_k(A^T, \tilde r_0)$, i.e.

$$R_k^b \perp \mathcal{K}_k(A^T, \tilde r_0). \qquad (5.7)$$

A simple induction argument shows that $R_k^b$ can be expressed as

$$R_k^b = R_{k-1}^b - A p_{k-1}^{(\sigma),b}\, \gamma_k, \qquad (5.8)$$

where $p_{k-1}^{(\sigma),b}$ is the BCG direction of the seed system and the row vector $\gamma_k$ is determined by using the orthogonality (5.7), which leads to $\tilde r_0^T S_k = 0$; therefore

$$\gamma_k = \frac{\tilde r_0^T R_{k-1}}{\langle \tilde r_0, A p_{k-1}^{(\sigma)} \rangle_2}.$$

From (5.5), (5.6) and (5.8) we obtain

$$R_k = S_k - \omega_k A S_k, \qquad S_k = R_{k-1} - A p_{k-1}^{(\sigma)}\, \gamma_k.$$

Finally, to minimize the F-norms of the seed and nonseed residuals $r_k^{(\sigma)}$ and $R_k$, we choose $\omega_k$ as

$$\omega_k = \frac{\big\langle [s_k^{(\sigma)} \; S_k],\, A\,[s_k^{(\sigma)} \; S_k] \big\rangle_F}{\big\langle A\,[s_k^{(\sigma)} \; S_k],\, A\,[s_k^{(\sigma)} \; S_k] \big\rangle_F}.$$

Remark 2 The expression of $R_k^b$ in (5.8) can be replaced by

$$R_k^b = R_{k-1}^b - A p_{k-1}\, \tilde\gamma_k,$$

for any direction $p_{k-1}$ which verifies

$$p_{k-1} \perp A^T \mathcal{K}_k(A^T, \tilde r_0),$$

where $\tilde\gamma_k$ is determined from the orthogonality (5.7).

The result of Remark 2 is very important. It shows that, at any iteration, we can change the seed system for another one among the nonseed systems. More precisely, suppose that we want to change the seed system at the $m$-th iteration, and denote by $\sigma'$ the index of the new seed system. We define the direction corresponding to the new seed system by

$$p_m^{(\sigma')} = r_m^{(\sigma')} + \beta_m \big(p_{m-1}^{(\sigma)} - \omega_m A p_{m-1}^{(\sigma)}\big),$$

where

$$\beta_m = -\frac{\langle \tilde r_0, A s_m^{(\sigma')} \rangle_2}{\langle \tilde r_0, A p_{m-1}^{(\sigma)} \rangle_2}. \qquad (5.9)$$

In fact, by construction, we have

$$r_m^{(\sigma')} = (I - \omega_m A) \cdots (I - \omega_1 A)\, r_m^{(\sigma'),b} \quad \text{and} \quad p_{m-1}^{(\sigma)} = (I - \omega_{m-1} A) \cdots (I - \omega_1 A)\, p_{m-1}^{(\sigma),b},$$

such that

$$r_m^{(\sigma'),b} \perp \mathcal{K}_m(A^T, \tilde r_0), \qquad p_{m-1}^{(\sigma),b} \perp A^T \mathcal{K}_{m-1}(A^T, \tilde r_0).$$

Thus we can express $p_m^{(\sigma')}$ as

$$p_m^{(\sigma')} = (I - \omega_m A) \cdots (I - \omega_1 A)\, p_m^{(\sigma'),b},$$

where

(

)

(

(

0

0

)

(

0

) ( )

1

)

(

0

)

0

( ) 0

0

0

( ) 0

( )

0

0

( )

0

( )

( ) 1

0

1

(

(

0

0

( )

( ) 1 ( )

( )

0

( )

( )

(

0

(

(

)

2

2

( ) 1

( )

0

0

)

( )

2

0

( ) 1

( ) 1

)

( )

0

( )

( ) 1

( )

( )

)

)

0 ( ) 1

0

( ) 1

1 ( )

( )

2

0

( ) 1

2

0 ( ) 1

( ) 1

( ) 1

2

2
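The following NumPy fragment is a minimal sketch of ALGORITHM 5 with a fixed seed (our illustration; breakdown handling and seed switching are omitted). It exploits the fact that the projection coefficient $\gamma_k$ reduces to the BiCGSTAB $\alpha_k$ on the seed column itself, so all residuals can be updated together:

```python
import numpy as np

def seed_bicgstab(A, B, X0, r0t, maxit=500, tol=1e-8):
    """Sketch of seed BiCGSTAB with a fixed seed column sigma = 0."""
    X = X0.copy()
    R = B - A @ X0
    sigma = 0                        # seed index, chosen arbitrarily
    p = R[:, sigma].copy()           # seed direction p_0
    for _ in range(maxit):
        Ap = A @ p
        denom = r0t @ Ap             # <r0t, A p>
        gamma = (r0t @ R) / denom    # row vector; gamma[sigma] is the seed alpha
        S = R - np.outer(Ap, gamma)
        AS = A @ S
        omega = np.sum(AS * S) / np.sum(AS * AS)   # F-norm minimization
        X = X + np.outer(p, gamma) + omega * S
        R = S - omega * AS
        beta = -(r0t @ AS[:, sigma]) / denom       # uses the seed column s_k
        p = R[:, sigma] + beta * (p - omega * Ap)
        if np.max(np.linalg.norm(R, axis=0)) < tol:
            break
    return X, R
```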

6 Numerical examples

This section reports on some experimental results. Our examples have been coded in Matlab and executed on a SUN SPARC workstation. The initial guess was $X_0 = 0$ and $B = \mathrm{rand}(N, s)$, where the function rand creates an $N \times s$ random matrix with coefficients uniformly distributed in [0,1], except for the last two examples.

Example 1. In this example, we compared the performance of Bl-BiCGSTAB, Bl-QMR and Bl-GMRES(10). The tests were stopped as soon as

$$\frac{\max_{i=1,\ldots,s} \|R_k(:,i)\|_2}{\max_{i=1,\ldots,s} \|R_0(:,i)\|_2} \le 10^{-6}.$$

The test matrix $A_1$ represents the five-point discretization of the operator

$$L(u) = -u_{xx} - u_{yy} + \beta\, u_x$$

on the unit square $[0,1] \times [0,1]$ with Dirichlet boundary conditions. The discretization was performed using a grid size of $h = 1/61$, which yields a matrix of dimension $N = 3600$. The matrix $A_1$ is nonsymmetric and has positive definite symmetric part. We chose $\beta = 100$. The number of right-hand sides was $s = 10$. We set $\tilde R_0 = R_0$ in the Bl-BiCGSTAB algorithm. Figure 1 reports the convergence history of Bl-BiCGSTAB and Bl-QMR: we plotted the maximum residual norms versus the flops (number of arithmetic operations). As observed from Figure 1, Bl-BiCGSTAB returns the best results. We note that for this experiment Bl-GMRES(10) requires $1.1 \times 10^{10}$ flops to achieve convergence, and so it is not represented in Figure 1. A sketch of the construction of $A_1$ is given below.
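For reproducibility, here is one way to assemble this five-point discretization with SciPy (our sketch, not the authors' Matlab code; the Kronecker-sum construction and the grid of 60 interior points per direction are our assumptions consistent with $h = 1/61$):

```python
import scipy.sparse as sp

def five_point_operator(n=60, beta=100.0):
    """Five-point discretization of L(u) = -u_xx - u_yy + beta*u_x on the
    unit square with Dirichlet boundary conditions; h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    # 1D second-difference operator (for the -u'' terms)
    D2 = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n)) / h**2
    # 1D centered first-difference operator (for the u_x term)
    D1 = sp.diags([-1, 0, 1], [-1, 0, 1], shape=(n, n)) / (2 * h)
    I = sp.identity(n)
    # assemble -d2/dx2 - d2/dy2 + beta d/dx via Kronecker sums
    return sp.kron(I, D2) + sp.kron(D2, I) + beta * sp.kron(I, D1)

A1 = five_point_operator()    # 3600 x 3600, as in the example
print(A1.shape)
```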

[Figure 1 shows the maximum residual norm (log scale) versus flops (x 10^8) for Bl-BiCGSTAB and Bl-QMR.]

Figure 1: A = A_1, s = 10.

Example 2. For this example, we used two matrices from the Harwell-Boeing collection, $A_2$ and $A_3$, where $A_2$ = Saylr4 ($N = 3564$, $nnz(A_2) = 22316$) and $A_3$ = Utm1700a ($N = 1700$, $nnz(A_3) = 21313$). We also used ILUT as a preconditioner, with a dropping tolerance $\tau = 10^{-4}$. We compared the performance (in terms of flops) of the Bl-BiCGSTAB algorithm, the Bl-GMRES(10) algorithm and the Bi-CGSTAB algorithm applied to each single right-hand side. Two different numbers of right-hand sides were taken: $s = 5$ and $s = 10$. The iterations were stopped when

$$\max_{i=1,\ldots,s} \|R_k(:,i)\|_2 \le 10^{-9}.$$

In Tables 1 and 2, $f_1(s)$ and $f_2(s)$ represent the following ratios:

$f_1(s)$ = flops(Bl-BiCGSTAB) / flops(Bi-CGSTAB),
$f_2(s)$ = flops(Bl-BiCGSTAB) / flops(Bl-GMRES(10)).

We note that flops(Bi-CGSTAB) corresponds to the flops required for solving the $s$ linear systems by the Bi-CGSTAB algorithm.

Table 1: test for the matrix $A_2$ = Saylr4

           f_1(s)   f_2(s)
  s = 5     0.77     0.06
  s = 10    0.89     0.12

As shown in Tables 1 and 2, Bl-BiCGSTAB is less expensive than Bl-GMRES(10) and than Bi-CGSTAB applied to each single right-hand side system.

Table 2: test for the matrix $A_3$ = Utm1700a

           f_1(s)   f_2(s)
  s = 5     0.49     0.62
  s = 10    0.38     0.36

Example 3. In this example we compared Bl-BiCGSTAB, PNB Bl-BiCGSTAB and Bl-QMR (with a deflation procedure). In Figure 2 we plotted the maximum residual norms versus the flops required for the three algorithms to converge. The iterations were stopped when

$$\max_{i=1,\ldots,s} \|R_k(:,i)\|_2 \le 10^{-8}.$$

The tolerance used for detecting the Partial Near Breakdown situation and the deflation problem was $tol = 10^{-8}$. The test matrix used in Figure 2 was $A_4$ = Sherman1. The size of this matrix is $N = 1000$, and $nnz(A_4) = 3750$. We chose $s = 8$, and a maximum of 1000 iterations was allowed for the Bl-QMR algorithm. As shown in Figure 2, Bl-BiCGSTAB failed to converge and Bl-QMR did not achieve convergence within 1000 iterations, while PNB Bl-BiCGSTAB converged.

[Figure 2 shows max_i ||R_k(:,i)||_2 (log scale) versus flops (x 10^8) for Bl-BiCGSTAB, Bl-QMR and PNB Bl-BiCGSTAB.]

Figure 2: A = A_4 = Sherman1, s = 8.

Example 4. In this example, we tested the effectiveness of the seed BiCGSTAB algorithm for solving multiple linear systems with closely related right-hand sides. The test matrix used for this example was $A_5$ = add32. The size of this matrix is $N = 4960$, and $nnz(A_5) = 19848$. The tests were stopped as soon as

$$\frac{\max_{i=1,\ldots,s} \|R_k(:,i)\|_2}{\max_{i=1,\ldots,s} \|R_0(:,i)\|_2} \le 10^{-6}.$$

The right-hand side matrix $B$ is the $N \times s$ matrix whose elements are

$$B(i,j) = \sin\Big(\frac{2\pi}{N}(i + j - 2)\Big).$$

So the $j$-th column $b^{(j)}$ of the matrix $B$ is obtained by shifting the components of the column $b^{(j-1)}$ by one position, the first component being replaced by the last one. Figure 3 reports the convergence history of the seed BiCGSTAB: we plotted the residual norms, on a logarithmic scale, of the $s$ linear systems versus the iterations.

[Figure 3 shows the residual norms (log scale) of the s linear systems versus iterations for seed BiCGSTAB.]

Figure 3: A = A_5, s = 20.

Remark that three restarts were necessary to achieve the convergence of the $s$ linear systems. We note that no strategy was used to choose the seed linear system. In Table 3 we list the number of matrix-vector products required for convergence by the seed BiCGSTAB algorithm and by the BiCGSTAB algorithm applied to each linear system, for different values of $s$: $s = 10$, $s = 20$ and $s = 30$.

Table 3: test for the matrix add32 (matrix-vector products)

           BiCGSTAB applied to       seed BiCGSTAB
           each right-hand side
  s = 10          890                     731
  s = 20         1806                    1387
  s = 30         2684                    2299

Example 5. For this last example we compared seed BiCGSTAB with BiCGSTAB applied to each linear system, using ILUT as a preconditioner (with a dropping tolerance of $\tau = 10^{-5}$). The test matrix was $A_6$ = Sherman5 ($N = 3312$, $nnz(A_6) = 20793$). Different values of $s$ were used: $s = 10$, $s = 30$ and $s = 50$. The right-hand side matrix $B$ was the same as for Example 4.

[Figure 4 shows the residual norms (log scale) versus iterations for seed BiCGSTAB.]

Figure 4: A = A_6, s = 30.

As seen in Figure 4, one restart was necessary to achieve convergence for this experiment. In Table 4, we report the matrix-vector products required by the seed BiCGSTAB and BiCGSTAB algorithms.

Table 4: test for the matrix Sherman5 (matrix-vector products)

           BiCGSTAB applied to       seed BiCGSTAB
           each right-hand side
  s = 10           80                      54
  s = 30          240                     165
  s = 50          400                     289

7 Conclusion

In this paper, we first proposed a block BiCGSTAB algorithm for solving multiple linear systems. The method was obtained directly from linear algebra relations (without using right matrix orthogonal polynomials). This allowed us to define a new procedure for curing near-breakdown situations. The derived algorithm (PNB Bl-BiCGSTAB) was numerically compared to other algorithms and was more effective. When the columns of the right-hand side matrix $B$ are closely related, which is the case in some practical problems, we also proposed a new, more adaptive method (seed BiCGSTAB). The numerical tests show that the proposed methods are very attractive.

References

[1] T. Chan and W. Wang, Analysis of projection methods for solving linear systems with multiple right-hand sides, SIAM J. Sci. Comput., 18(1997), pp. 1698-1721.
[2] A. El Guennouni, Mise en oeuvre et variantes par bloc des méthodes de type Lanczos, Ph.D. thesis, Université des Sciences et Technologies de Lille, France, 2000.
[3] A. El Guennouni, K. Jbilou and H. Sadok, A block BiCGSTAB algorithm for multiple linear systems, Tech. Report 86, LMPA (1999), Université du Littoral, Calais, France.
[4] A. El Guennouni, K. Jbilou and H. Sadok, The block Lanczos method for linear systems with multiple right-hand sides, Tech. Report 90, LMPA (1999), Université du Littoral, Calais, France.
[5] C. Farhat and F. X. Roux, Implicit parallel processing in structural mechanics, Technical Report CU-CSSC-93-26, Center for Aerospace Structures, University of Colorado, November 1993.
[6] P. F. Fischer, Projection techniques for iterative solution of Ax = b with successive right-hand sides, Technical Report 93-90, ICASE, December 1993.
[7] R. Fletcher, Conjugate gradient methods for indefinite systems, in G. A. Watson, editor, Proceedings of the Dundee Biennial Conference on Numerical Analysis 1974, pp. 73-89, Springer-Verlag, New York, 1975.
[8] R. Freund and M. Malhotra, A block-QMR algorithm for non-Hermitian linear systems with multiple right-hand sides, Linear Algebra Appl., 254(1997), pp. 119-157.
[9] G. H. Golub and R. R. Underwood, A block Lanczos algorithm for computing the q algebraically largest eigenvalues and a corresponding eigenspace of large, sparse, real symmetric matrices, Proc. of 1974 IEEE Conf. on Decision and Control, Phoenix, 1974.
[10] G. H. Golub and R. R. Underwood, The block Lanczos method for computing eigenvalues, in Mathematical Software 3, J. R. Rice, ed., Academic Press, New York, 1977, pp. 364-377.
[11] K. Jbilou, A. Messaoudi and H. Sadok, Global FOM and GMRES algorithms for matrix equations, Appl. Numer. Math., 31(1999), pp. 49-63.
[12] K. Jbilou and H. Sadok, Global Lanczos-based methods with applications, Technical Report LMA 42, Université du Littoral, Calais, France, 1997.
[13] K. Jbilou, Smoothing iterative block methods for linear systems with multiple right-hand sides, J. Comput. Appl. Math., 107(1999), pp. 97-109.
[14] M. Malhotra, R. Freund and P. M. Pinsky, Iterative solution of multiple radiation and scattering problems in structural acoustics using a block quasi-minimal residual algorithm, Comp. Meth. Appl. Mech. Eng., 146(1997), pp. 173-196.
[15] A. Nikishin and A. Yeremin, Variable block CG algorithms for solving large sparse symmetric positive definite linear systems on parallel computers, I: General iterative scheme, SIAM J. Matrix Anal. Appl., 16(1995), pp. 1135-1153.
[16] D. O'Leary, The block conjugate gradient algorithm and related methods, Linear Algebra Appl., 29(1980), pp. 293-322.
[17] B. Parlett, A new look at the Lanczos algorithm for solving symmetric systems of linear equations, Linear Algebra Appl., 29(1980), pp. 323-346.
[18] Y. Saad, On the Lanczos method for solving symmetric linear systems with several right-hand sides, Math. Comp., 48(1987), pp. 651-662.
[19] Y. Saad and M. H. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput., 7(1986), pp. 856-869.
[20] M. Sadkane, Block Arnoldi and Davidson methods for unsymmetric large eigenvalue problems, Numer. Math., 64(1993), pp. 687-706.
[21] V. Simoncini, A stabilized QMR version of block BiCG, SIAM J. Matrix Anal. Appl., 18(1997), pp. 419-434.
[22] V. Simoncini and E. Gallopoulos, Convergence properties of block GMRES and matrix polynomials, Linear Algebra Appl., 247(1996), pp. 97-119.
[23] V. Simoncini and E. Gallopoulos, An iterative method for nonsymmetric systems with multiple right-hand sides, SIAM J. Sci. Comput., 16(1995), pp. 917-933.
[24] C. Smith, A. Peterson and R. Mittra, A conjugate gradient algorithm for the treatment of multiple incident electromagnetic fields, IEEE Trans. Antennas Propagat., 37(1989), pp. 1490-1493.
[25] B. Vital, Etude de quelques méthodes de résolution de problèmes linéaires de grande taille sur multiprocesseur, Ph.D. thesis, Université de Rennes, Rennes, France, 1990.
[26] H. Van Der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Stat. Comput., 13(1992), pp. 631-644.
[27] H. Van Der Vorst, An iterative solution method for solving f(A)x = b, using Krylov subspace information obtained for the symmetric positive definite matrix A, J. Comput. Appl. Math., 18(1987), pp. 249-263.
