Numer. Math. (1999) 82: 1–20
Numerische Mathematik
© Springer-Verlag 1999
Electronic Edition
Block and asynchronous two-stage methods for mildly nonlinear systems

Zhong-Zhi Bai¹⋆, Violeta Migallón²⋆⋆, José Penadés²⋆⋆, Daniel B. Szyld³⋆⋆⋆

¹ State Key Laboratory of Scientific/Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, P.O. Box 2719, Beijing 100080, P.R. China; e-mail: [email protected]
² Departamento de Ciencia de la Computación e Inteligencia Artificial, Universidad de Alicante, E-03071 Alicante, Spain; e-mail: [email protected], [email protected]
³ Department of Mathematics, Temple University, Philadelphia, Pennsylvania 19122-6094, USA; e-mail: [email protected]
Received June 5, 1997 / Revised version received December 29, 1997
Summary. Block parallel iterative methods for the solution of mildly nonlinear systems of equations of the form Ax = Φ(x) are studied. Two-stage methods, where the solution of each block is approximated by an inner iteration, are treated. Both synchronous and asynchronous versions are analyzed, and both pointwise and blockwise convergence theorems are provided. The case where there are overlapping blocks is also considered. The analysis of the asynchronous method when applied to linear systems includes cases not treated before in the literature.

Mathematics Subject Classification (1991): 65H10, 65F10

1. Introduction

In this paper we consider the solution of the mildly nonlinear system

(1)    Ax = Φ(x),
⋆ Supported by the Chinese National Natural Science Foundation Project 19601036
⋆⋆ Supported by Spanish CICYT grant numbers TIC96-0718-C02-02 and PB96-1054CV02-01
⋆⋆⋆ Supported by National Science Foundation grants INT-9521226 and DMS-9625865
Correspondence to: D.B. Szyld
Numerische Mathematik Electronic Edition page 1 of Numer. Math. (1999) 82: 1–20
where A ∈ R^{n×n} is nonsingular and Φ : R^n → R^n is a nonlinear function with certain local smoothness properties; see Sects. 3 and 4. This equation arises in many problems of science and engineering, and in particular in discretizations of certain nonlinear differential equations, e.g., of the form ∆u = σ(u); see e.g., [36]. We are interested in solution methods of (1) in which the matrix A is partitioned into L by L blocks A_{ij} ∈ R^{n_i × n_j}, with Σ_{ℓ=1}^{L} n_ℓ = n, i.e.,
A ∈ L_n(n_1, n_2, . . . , n_L) = {A ∈ R^{n×n} | A = (A_{ℓk}), A_{ℓk} ∈ R^{n_ℓ × n_k}, 1 ≤ ℓ, k ≤ L},

and A_{ℓℓ} are nonsingular for ℓ = 1, . . . , L. When the context is clear we will simply use L_n for L_n(n_1, n_2, . . . , n_L). This partition may correspond to a partition of the underlying grid, or of the domain of the differential equation being studied, or it may originate from a partitioning algorithm of the sparse matrix A, as done, e.g., in [6], [35]. In particular, we are interested in several parallel generalizations of the following block Jacobi algorithm. Let x_i be the vector at the ith iteration. Assume that the vectors x_i and Φ(x_i) are partitioned into subvectors x_i^{(ℓ)}, Φ(x_i)^{(ℓ)} ∈ R^{n_ℓ}, ℓ = 1, . . . , L, conformally with the partition of A, i.e., x_i ∈ V_n(n_1, n_2, . . . , n_L) = {x ∈ R^n | x = (x^{(1)T}, . . . , x^{(L)T})^T, x^{(ℓ)} ∈ R^{n_ℓ}, 1 ≤ ℓ ≤ L}.

Algorithm 1 (Block Jacobi). Given an initial vector x_0,

For i = 1, 2, . . . , until convergence.
    For ℓ = 1 to L

(2)        Solve A_{ℓℓ} x_i^{(ℓ)} = − Σ_{k=1, k≠ℓ}^{L} A_{ℓk} x_{i−1}^{(k)} + Φ(x_{i−1})^{(ℓ)}.
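As a concrete illustration (not part of the original paper), the following is a minimal serial sketch of Algorithm 1 in Python/NumPy. The test matrix, the partition into two 2×2 blocks, and the nonlinearity Φ(x) = tanh(x) + 1 are hypothetical choices for which the iteration contracts (A is an M-matrix, Φ is 1-Lipschitz, and ρ(A^{-1}) < 1; cf. Lemma 2.7 below).

```python
import numpy as np

def block_jacobi(A, phi, blocks, x0, iters=60):
    """Algorithm 1 (Block Jacobi) for A x = phi(x).
    `blocks` is a list of index arrays partitioning {0, ..., n-1}."""
    n = A.shape[0]
    x = x0.copy()
    for _ in range(iters):
        x_new = np.empty_like(x)
        for I in blocks:
            K = np.setdiff1d(np.arange(n), I)   # indices of the other blocks
            # (2): solve A_ll x_i^(l) = -sum_{k != l} A_lk x_{i-1}^(k) + phi(x_{i-1})^(l)
            rhs = phi(x)[I] - A[np.ix_(I, K)] @ x[K]
            x_new[I] = np.linalg.solve(A[np.ix_(I, I)], rhs)
        x = x_new
    return x

# Hypothetical test problem.
A = np.array([[ 4., -1.,  0.,  0.],
              [-1.,  4., -1.,  0.],
              [ 0., -1.,  4., -1.],
              [ 0.,  0., -1.,  4.]])
phi = lambda x: np.tanh(x) + 1.0
blocks = [np.array([0, 1]), np.array([2, 3])]
x = block_jacobi(A, phi, blocks, np.zeros(4))
residual = np.linalg.norm(A @ x - phi(x))
```

In a parallel implementation each of the L block solves inside the inner loop would be assigned to a different processor.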
The block methods considered in this paper include, in particular, those in which (2) is not solved exactly, but instead approximated using an (inner) iterative method. These are block two-stage methods, also called inner/outer iterations, and have been studied extensively for linear and nonlinear systems; see e.g., [12], [13], [14], [25], [32], and the references given therein. A point two-stage method for the solution of (1), i.e., when L = 1, was recently studied in [1]. Algorithm 1 as well as its two-stage generalizations are ideal for parallel processing, since up to L different processors can each solve or approximate one of the problems (2). These algorithms are synchronous in the sense that to begin the calculation of the ith iterate, each
processor has to wait until all processors have completed their calculation of the (i − 1)th iterate. In this paper, we also study asynchronous block methods, i.e., methods in which each processor begins a new calculation without waiting for the others to complete their respective tasks. Asynchronous methods have the potential of converging much faster than synchronous methods, especially when there is load imbalance, e.g., when one of the systems (2) takes much longer to solve than all the others; see e.g., [13], [24], [27], [31]. A block asynchronous method for the solution of (1) was analyzed in [9] using a different approach, and without considering the two-stage case.

In the following section we present some definitions and preliminary results used in the paper. In Sect. 3, we present a general framework to study the block methods and prove their convergence, while in Sect. 4 we analyze the convergence of the asynchronous methods. Our results apply to a rather general class of nonsingular matrices, including block H-matrices; see e.g., [2], [38], [40]. We introduce a very general computational model for these asynchronous iterations. Thus, our convergence proofs include a large class of methods, including those with overlap.

2. Preliminaries

Given a vector x ∈ R^n, we say that it is nonnegative (positive), denoted x ≥ 0 (x > 0), if all components of x are nonnegative (positive). Similarly, if x, y ∈ R^n, x ≥ y (x > y) means that x − y ≥ 0 (x − y > 0). For a vector x ∈ R^n, |x| denotes the vector whose components are the absolute values of the corresponding components of x. These definitions carry over immediately to matrices. Given x > 0, we consider the vector norm

(3)    ‖y‖_x = max_{1≤j≤n} |y_j| / x_j .

This vector norm is monotonic, and for every matrix B ∈ R^{n×n} it satisfies ‖ |B| ‖_x = ‖B‖_x, where ‖B‖_x denotes the matrix norm of B induced by the vector norm defined in (3); see e.g., [30].
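The weighted max-norm (3) and the identity ‖ |B| ‖_x = ‖B‖_x are easy to verify numerically. The sketch below (a hypothetical weight vector and matrix, not from the paper) uses the standard closed form of the induced norm, ‖B‖_x = max_i (|B|x)_i / x_i.

```python
import numpy as np

def vec_norm_x(y, x):
    # (3): ||y||_x = max_j |y_j| / x_j, for a positive weight vector x > 0
    return np.max(np.abs(y) / x)

def mat_norm_x(B, x):
    # induced matrix norm: ||B||_x = max_i (|B| x)_i / x_i
    return np.max((np.abs(B) @ x) / x)

x = np.array([1.0, 2.0, 0.5])
B = np.array([[ 0.2, -0.1,  0.0],
              [ 0.3,  0.1, -0.2],
              [ 0.0,  0.4,  0.1]])
norm_B = mat_norm_x(B, x)
norm_absB = mat_norm_x(np.abs(B), x)   # should equal norm_B
```

Consistency of the pair of norms, ‖By‖_x ≤ ‖B‖_x ‖y‖_x, also holds and can be spot-checked with the same data.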
A nonsingular matrix A is called an M-matrix if it has nonpositive off-diagonal entries and it is monotone (i.e., A^{-1} ≥ O); see e.g., [7], [42]. By ρ(A) we denote the spectral radius of the square matrix A.

We define the following subset of L_n used in the analysis of iterative methods for block H-matrices; see, e.g., [3], [4], [17], [40]. Again, we do not write the parameters (n_1, n_2, . . . , n_L) when they are clear from the context.

L_{n,I}(n_1, n_2, . . . , n_L) = {A = (A_{ℓk}) ∈ L_n | A_{ℓℓ} ∈ R^{n_ℓ × n_ℓ} nonsingular, ℓ = 1, . . . , L}.

For a matrix A ∈ L_n, let D(A) = Diag(A_{11}, A_{22}, . . . , A_{LL}), i.e., its block-diagonal part. Thus, A ∈ L_{n,I} if and only if A ∈ L_n and D(A) is nonsingular. For any matrix A = (a_{ij}) ∈ R^{n×n}, we define its comparison matrix <A> = (α_{ij}) by α_{ii} = |a_{ii}|, α_{ij} = −|a_{ij}|, i ≠ j. Similarly, for A ∈ L_{n,I} we define its type-I and type-II comparison matrices ⟨A⟩ = (⟨A⟩_{ij}) ∈ R^{L×L} and ⟨⟨A⟩⟩ = (⟨⟨A⟩⟩_{ij}) ∈ R^{L×L} as ⟨A⟩_{ii} = ‖A_{ii}^{-1}‖^{-1}, ⟨A⟩_{ij} = −‖A_{ij}‖, i ≠ j, and ⟨⟨A⟩⟩_{ii} = 1, ⟨⟨A⟩⟩_{ij} = −‖A_{ii}^{-1} A_{ij}‖, i ≠ j, i, j = 1, 2, . . . , L, respectively; see [2], [18], [38]. We also define, for A ∈ L_n, the block absolute value [A] = (‖A_{ij}‖) ∈ R^{L×L}. The definition for a vector v ∈ V_n = V_n(n_1, n_2, . . . , n_L) is analogous. Here ‖ · ‖ is any consistent matrix norm satisfying ‖I‖ = 1. This block absolute value has the following properties.

Lemma 2.1. [2] Let A, B ∈ L_n, x, y ∈ V_n and γ ∈ R. Then,
(a) |[A] − [B]| ≤ [A + B] ≤ [A] + [B], |[x] − [y]| ≤ [x + y] ≤ [x] + [y],
(b) [AB] ≤ [A][B], [Ax] ≤ [A][x], and
(c) [γA] ≤ |γ|[A], [γx] ≤ |γ|[x].

Following [37], A is said to be an H-matrix if <A> is an M-matrix. We say that A ∈ L_{n,I} is a Type-I (Type-II) block H-matrix if ⟨A⟩ (⟨⟨A⟩⟩) is an M-matrix. We denote this by A ∈ H_B^I (A ∈ H_B^II). It follows that H_B^I ⊂ H_B^II, with the inclusion being strict.

Lemma 2.2.
(a) If A ∈ R^{n×n} is an H-matrix, then |A^{-1}| ≤ <A>^{-1} [33], [37].
(b) If A ∈ H_B^I ⊂ L_{n,I}, then [A^{-1}] ≤ ⟨A⟩^{-1} [2].
(c) If A ∈ H_B^II ⊂ L_{n,I}, then [A^{-1}] ≤ ⟨⟨A⟩⟩^{-1} [D(A)^{-1}] [2].
Definition 2.3. Let A ∈ R^{n×n}. The representation A = M − N is called a splitting if M is nonsingular. It is called a convergent splitting if ρ(M^{-1}N) < 1. A splitting A = M − N is called
(a) regular if M^{-1} ≥ O and N ≥ O [42],
(b) weak regular if M^{-1} ≥ O and M^{-1}N ≥ O [7], [36],
(c) an H-splitting if <M> − |N| is an M-matrix [25],
(d) an H-compatible splitting if <A> = <M> − |N| [25],
(e) an H_B^I-compatible splitting if ⟨A⟩ = ⟨M⟩ − [N], and
(f) an H_B^II-compatible splitting if ⟨⟨A⟩⟩ = ⟨⟨M⟩⟩ − [D(M)^{-1}N].

Lemma 2.4. Let A = M − N be a splitting.
(a) If the splitting is an H-splitting, then A and M are H-matrices and ρ(M^{-1}N) ≤ ρ(<M>^{-1}|N|) < 1.
(b) If the splitting is H-compatible and A is an H-matrix, then it is an H-splitting and thus convergent.
(c) If the splitting is H_B^I-compatible, then both A and M ∈ H_B^I.
(d) If the splitting is H_B^II-compatible, then both A and M ∈ H_B^II.
Numerische Mathematik Electronic Edition page 4 of Numer. Math. (1999) 82: 1–20
Block and asynchronous two-stage methods for mildly nonlinear systems
5
Proof. Parts (a) and (b) are shown in [25]. Parts (c) and (d) follow from the definitions and some simple bounds; see [2]. □

Lemma 2.5. [41] Let H_1, H_2, . . . , H_i, . . . be a sequence of nonnegative matrices in R^{n×n}. If there exist a real number 0 ≤ θ < 1 and a vector v > 0 in R^n such that H_i v ≤ θv, i = 1, 2, . . . , then ρ(K_j) ≤ θ^j < 1, where K_j = H_j · · · H_2 · H_1, and therefore lim_{j→∞} K_j = O.
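Definition 2.3(d) and Lemma 2.4(a) can be illustrated numerically. The sketch below (a hypothetical 3×3 example, not from the paper) builds the comparison matrix <A>, forms the point Jacobi splitting A = M − N with M = diag(A), which is H-compatible since <A> = <M> − |N| holds entrywise when M is diagonal, and checks the bound ρ(M^{-1}N) ≤ ρ(<M>^{-1}|N|) < 1.

```python
import numpy as np

def comparison(A):
    """Comparison matrix <A>: alpha_ii = |a_ii|, alpha_ij = -|a_ij| for i != j."""
    C = -np.abs(A)
    np.fill_diagonal(C, np.abs(np.diag(A)))
    return C

def rho(B):
    """Spectral radius."""
    return np.max(np.abs(np.linalg.eigvals(B)))

# Hypothetical H-matrix: its comparison matrix is strictly diagonally dominant.
A = np.array([[ 5., -1.,  2.],
              [ 1.,  6., -2.],
              [-2.,  1.,  7.]])
M = np.diag(np.diag(A))          # point Jacobi splitting A = M - N
N = M - A
# Definition 2.3(d): <A> = <M> - |N| for diagonal M.
h_compatible = np.allclose(comparison(A), comparison(M) - np.abs(N))
# Lemma 2.4(a): rho(M^{-1} N) <= rho(<M>^{-1} |N|) < 1.
r_actual = rho(np.linalg.solve(M, N))
r_bound = rho(np.linalg.solve(comparison(M), np.abs(N)))
```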
Definition 2.6. A mapping Φ : R^n → R^n is called P-bounded (block P-bounded) if there exists a nonnegative matrix P ∈ R^{n×n} (P ∈ R^{L×L}) such that |Φ(x) − Φ(y)| ≤ P|x − y| ([Φ(x) − Φ(y)] ≤ P[x − y]), for all x, y ∈ R^n. Furthermore, if ρ(P) < 1, Φ is said to be a (block) P-contracting mapping.

Lemma 2.7. Let A ∈ R^{n×n} be nonsingular. Then the mildly nonlinear system (1) has a unique solution provided that any one of the following conditions holds.
(a) A is a monotone matrix, Φ is P-bounded, and ρ(A^{-1}P) < 1.
(b) A is an H-matrix, Φ is P-bounded, and ρ(<A>^{-1}P) < 1.
(c) A ∈ H_B^I, Φ is block P-bounded, and ρ(⟨A⟩^{-1}P) < 1.
(d) A ∈ H_B^II, Φ is block P-bounded, and ρ(⟨⟨A⟩⟩^{-1}[D(A)^{-1}]P) < 1.

Proof. Parts (a) and (b) can be found in [1]. Parts (c) and (d) are shown in a similar way. □

3. Block methods

We present a general framework which includes, as a particular case, the block two-stage method described in Sect. 1. To that end, consider the (outer and inner) splittings

A = B_ℓ − C_ℓ ,    B_ℓ = M_ℓ − N_ℓ ,    ℓ = 1, . . . , L,

and a set of diagonal nonnegative matrices E_ℓ such that

(4)    Σ_{ℓ=1}^{L} E_ℓ = I.
The sequence q(ℓ, i) indicates, e.g., the number of inner iterations an iterative method uses to approximate the solution of the ℓth system (2) at the ith iteration.
Algorithm 2 (Nonlinear Two-stage Multisplitting). Given an initial vector x_0, and a sequence of numbers of inner iterations q(ℓ, i), ℓ = 1, . . . , L, i = 1, 2, . . .

For i = 1, 2, . . . , until convergence.
    For ℓ = 1 to L
        y_{ℓ,0} = x_{i−1}
        For j = 1 to q(ℓ, i)

(5)            M_ℓ y_{ℓ,j} = N_ℓ y_{ℓ,j−1} + C_ℓ x_{i−1} + Φ(x_{i−1})

(6)    x_i = Σ_{ℓ=1}^{L} E_ℓ y_{ℓ,q(ℓ,i)}.
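A direct serial transcription of Algorithm 2 follows (a sketch only, not from the paper). The block sizes, the choice B_ℓ = D(A) with E_ℓ selecting the ℓth block, the inner point Jacobi splitting, and the nonlinearity are all hypothetical choices, made so that the hypotheses of Theorem 3.3 below are satisfied.

```python
import numpy as np

def two_stage_multisplitting(A, phi, splittings, E_list, q, x0, outer=100):
    """Algorithm 2. `splittings` is a list of tuples (B_l, C_l, M_l, N_l) with
    A = B_l - C_l and B_l = M_l - N_l; the E_l are nonnegative diagonal weights
    summing to the identity; q(l, i) gives the number of inner iterations."""
    x = x0.copy()
    for i in range(1, outer + 1):
        contributions = []
        for l, ((B, C, M, N), E) in enumerate(zip(splittings, E_list)):
            y = x.copy()                  # y_{l,0} = x_{i-1}
            rhs = C @ x + phi(x)          # fixed during the inner sweeps
            for _ in range(q(l, i)):
                y = np.linalg.solve(M, N @ y + rhs)   # (5)
            contributions.append(E @ y)
        x = sum(contributions)            # (6)
    return x

# Hypothetical data: two 2x2 blocks, B_l = D(A) (block diagonal), point Jacobi inner.
A = np.array([[ 4., -1.,  0.,  0.],
              [-1.,  4., -1.,  0.],
              [ 0., -1.,  4., -1.],
              [ 0.,  0., -1.,  4.]])
phi = lambda x: np.tanh(x) + 1.0
Db = A.copy(); Db[1, 2] = Db[2, 1] = 0.0   # block diagonal D(A)
M = np.diag(np.diag(A))
splittings = [(Db, Db - A, M, M - Db)] * 2
E_list = [np.diag([1., 1., 0., 0.]), np.diag([0., 0., 1., 1.])]
x = two_stage_multisplitting(A, phi, splittings, E_list,
                             q=lambda l, i: 2, x0=np.zeros(4))
residual = np.linalg.norm(A @ x - phi(x))
```

Note that processor ℓ only needs the rows of (5) where E_ℓ is nonzero; the sketch performs the full solve for simplicity.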
The concept of multisplittings, first introduced in [34], provides a very general setting to study parallel block methods; see, e.g., [2], [3], [4], [13], [17], [19], [21], [28], [31], [39]. This general setting encompasses cases, e.g., where there is overlap, i.e., where more than one processor computes approximations to the same variable, and the weighting matrices E_ℓ have positive entries smaller than 1; see e.g., [22], [29]. The two-stage generalization of Algorithm 1 can be recovered from Algorithm 2 by the appropriate choice of diagonal matrices E_ℓ, and by choosing, e.g., B_ℓ to be the block diagonal D(A); cf. [13], [24], and the comments after Theorem 3.1. We emphasize that only the components of y_{ℓ,j} for which the diagonal matrix E_ℓ is nonzero are used in (6). Thus, (5) is to be interpreted more as a representation of the work in the ℓth processor, usually involving of the order of n_ℓ variables, than as a global operation involving all n variables.

In order to analyze the convergence properties of Algorithm 2, we can write the ith iteration vector as follows, cf. [13],

(7)    x_i = Σ_{ℓ=1}^{L} E_ℓ [ (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} x_{i−1} + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} (C_ℓ x_{i−1} + Φ(x_{i−1})) ],

or equivalently

(8)    x_i = Σ_{ℓ=1}^{L} E_ℓ [ (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} x_{i−1} + (I − (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)}) B_ℓ^{-1} (C_ℓ x_{i−1} + Φ(x_{i−1})) ].
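The equivalence of (7) and (8) rests on the identity Σ_{j=0}^{q−1} (M^{-1}N)^j M^{-1} = (I − (M^{-1}N)^q) B^{-1} for B = M − N, which can be sanity-checked numerically on random data (a hypothetical check, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 5, 4
M = np.diag(rng.uniform(2.0, 3.0, n))      # any nonsingular M with small N works
N = rng.uniform(-0.2, 0.2, (n, n))
B = M - N
H = np.linalg.solve(M, N)                  # H = M^{-1} N
# left-hand side: sum_{j=0}^{q-1} H^j M^{-1}
lhs = sum(np.linalg.matrix_power(H, j) for j in range(q)) @ np.linalg.inv(M)
# right-hand side: (I - H^q) B^{-1}
rhs = (np.eye(n) - np.linalg.matrix_power(H, q)) @ np.linalg.inv(B)
agree = np.allclose(lhs, rhs)
```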
The next theorem is our first local convergence result: if the splittings satisfy certain minimum convergence properties, and the initial guess is
close to the solution, we have convergence, provided that as the iterations continue, the system is better approximated. This result is similar in spirit to that of [14].

Theorem 3.1. Let A ∈ R^{n×n} be a nonsingular matrix. Let Φ : R^n → R^n be continuously differentiable in an open neighborhood of a solution x⋆ of the mildly nonlinear system (1). Assume that the splittings A = B_ℓ − C_ℓ, ℓ = 1, . . . , L, satisfy ‖B_ℓ^{-1}(C_ℓ + Φ′(x⋆))‖_∞ < 1, ℓ = 1, . . . , L. Assume further that the splittings B_ℓ = M_ℓ − N_ℓ, ℓ = 1, . . . , L, are convergent. Then, if lim_{i→∞} q(ℓ, i) = ∞, ℓ = 1, . . . , L, there exists a δ > 0 such that for every initial vector x_0 ∈ S_δ(x⋆) = {x ∈ R^n | ‖x − x⋆‖_∞ < δ} the sequence of vectors generated by Algorithm 2 converges to x⋆.

Proof. Since ρ(M_ℓ^{-1}N_ℓ) < 1 and lim_{i→∞} q(ℓ, i) = ∞, ℓ = 1, . . . , L, given any ϵ > 0 there exists an integer i_0 such that ‖(M_ℓ^{-1}N_ℓ)^{q(ℓ,i)}‖_∞ ≤ ϵ for all i ≥ i_0, ℓ = 1, . . . , L. Let β be a positive constant satisfying ‖B_ℓ^{-1}(C_ℓ + Φ′(x⋆))‖_∞ ≤ β < 1, ℓ = 1, . . . , L. Let ε_i = x_i − x⋆ be the error vector at the ith iteration. Since x⋆ is a fixed point of (7), or equivalently (8), we obtain after some algebraic manipulations that, for i = 1, 2, . . . ,

ε_i = Σ_{ℓ=1}^{L} E_ℓ [ (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} ε_{i−1} + (I − (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)}) (J_ℓ(x⋆) ε_{i−1} + B_ℓ^{-1} y(x⋆, x_{i−1})) ],

where J_ℓ(x⋆) = B_ℓ^{-1}(C_ℓ + Φ′(x⋆)), ℓ = 1, . . . , L, and y(x⋆, x) = Φ(x) − Φ(x⋆) − Φ′(x⋆)(x − x⋆). Then, for i = 1, 2, . . . ,

(9)    ‖ε_i‖_∞ ≤ max_{1≤ℓ≤L} { ‖(M_ℓ^{-1}N_ℓ)^{q(ℓ,i)}‖_∞ ‖ε_{i−1}‖_∞ + (1 + ‖(M_ℓ^{-1}N_ℓ)^{q(ℓ,i)}‖_∞) (‖J_ℓ(x⋆)‖_∞ ‖ε_{i−1}‖_∞ + ‖B_ℓ^{-1}‖_∞ ‖y(x⋆, x_{i−1})‖_∞) }.

On the other hand, by the hypotheses we can assume, without loss of generality, that Φ is continuously differentiable on the convex set S_ϵ(x⋆) = {x ∈ R^n | ‖x − x⋆‖_∞ < ϵ}. Hence (see e.g., [36, Exercise 3.2.6]), there exists α > 0 such that

(10)    ‖y(x⋆, x)‖_∞ ≤ α‖x − x⋆‖_∞² for all x ∈ S_ϵ(x⋆).
From (9), we can consider 0 < δ < ϵ small enough such that, if x_0 ∈ S_δ(x⋆), then

(11)    ‖x_i − x⋆‖_∞ < ϵ for all i < i_0.

Furthermore, from (9) again, if i ≥ i_0 we have

(12)    ‖ε_i‖_∞ ≤ max_{1≤ℓ≤L} { (ϵ + (1 + ϵ)β)‖ε_{i−1}‖_∞ + (1 + ϵ)‖B_ℓ^{-1}‖_∞ ‖y(x⋆, x_{i−1})‖_∞ } = ϱ(ϵ)‖ε_{i−1}‖_∞ + γ(ϵ)‖y(x⋆, x_{i−1})‖_∞,

where ϱ(ϵ) = ϵ + (1 + ϵ)β and γ(ϵ) = (1 + ϵ) max_{1≤ℓ≤L} {‖B_ℓ^{-1}‖_∞}. Then, for i = i_0, using (10), (11) and (12) we obtain

‖ε_{i_0}‖_∞ ≤ ϱ(ϵ)‖ε_{i_0−1}‖_∞ + γ(ϵ)α‖ε_{i_0−1}‖_∞² < ϱ(ϵ)‖ε_{i_0−1}‖_∞ + ϵγ(ϵ)α‖ε_{i_0−1}‖_∞ = ϕ(ϵ)‖ε_{i_0−1}‖_∞,

where ϕ(ϵ) = ϱ(ϵ) + ϵγ(ϵ)α. Without loss of generality we can take ϵ small enough that ϕ(ϵ) < 1. Then, by induction we easily obtain ‖ε_i‖_∞ < ϕ(ϵ)‖ε_{i−1}‖_∞ with ϕ(ϵ) < 1, for all i ≥ i_0, and the proof is complete. □
Several comments on alternative hypotheses for Theorem 3.1 are in order. First, the infinity norm can be replaced by any weighted max-norm associated with a positive vector; see (3) and Sect. 4. Second, when all the outer splittings in Algorithm 2 are the same, i.e., B_ℓ = B, C_ℓ = C, ℓ = 1, . . . , L, the assumption ‖B^{-1}(C + Φ′(x⋆))‖_∞ < 1 can be replaced by the more general ρ(B^{-1}(C + Φ′(x⋆))) < 1. This applies, in particular, to the two-stage version of Algorithm 1. Third, the fact that Φ is continuously differentiable in an open neighborhood of x⋆ can be relaxed to the following set of assumptions: Φ is Lipschitz continuous; Φ is B-differentiable, i.e., for every z ∈ R^n there exists a positively homogeneous function BΦ(z) : R^n → R^n, called the B-derivative of Φ at z, such that

lim_{v→0} (Φ(z + v) − Φ(z) − BΦ(z)(v)) / ‖v‖ = 0;

and the associated function BΦ is Lipschitz continuous at x⋆; see e.g., [1], [36]. We note that B-derivatives are similar to directional derivatives. These hypotheses were used in [1] for the proof of convergence of the point two-stage method with a fixed number of inner iterations.

In the next results, we prove the convergence of Algorithm 2 for any number of inner iterations in a few general cases, namely, when the matrix A is monotone, or when it is an H-matrix or a block H-matrix of either type. We impose further conditions on the outer and inner splittings.
Theorem 3.2. Let A ∈ R^{n×n} be a monotone matrix. Let the splittings A = B_ℓ − C_ℓ, ℓ = 1, . . . , L, be regular and the splittings B_ℓ = M_ℓ − N_ℓ, ℓ = 1, . . . , L, be weak regular. Assume further that Φ : R^n → R^n is a P-bounded mapping such that ρ(A^{-1}P) < 1. Then, the nonlinear two-stage multisplitting Algorithm 2 converges to the unique solution of the mildly nonlinear system (1), for any initial vector x_0 and any sequence of numbers of inner iterations q(ℓ, i) ≥ 1, ℓ = 1, . . . , L, i = 1, 2, . . . .

Proof. From Lemma 2.7 it follows that there exists a unique x⋆ ∈ R^n such that Ax⋆ = Φ(x⋆). Let ε_i = x_i − x⋆ be the error vector at the ith iteration of Algorithm 2. Since x⋆ is a fixed point of (7), it follows that, for i = 1, 2, . . . ,

(13)    ε_i = Σ_{ℓ=1}^{L} E_ℓ [ (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} ε_{i−1} + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} (C_ℓ ε_{i−1} + Φ(x_{i−1}) − Φ(x⋆)) ].

Then, using the inequalities

(14)    M_ℓ^{-1}N_ℓ ≥ O and M_ℓ^{-1}C_ℓ ≥ O, ℓ = 1, . . . , L,

and the fact that Φ is P-bounded, we obtain the bound

|ε_i| ≤ Σ_{ℓ=1}^{L} E_ℓ T_i^{(ℓ)} |ε_{i−1}|, i = 1, 2, . . . ,

where

(15)    T_i^{(ℓ)} = (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} (C_ℓ + P) ≥ O.

On the other hand, some algebraic manipulation yields the identity

(M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} C_ℓ = I − Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} A,

and we rewrite (15) as

(16)    T_i^{(ℓ)} = I − Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} (A − P).
Moreover, since the matrices A^{-1} and P are nonnegative, and ρ(A^{-1}P) < 1, the matrix A − P = A(I − A^{-1}P) is monotone. Consider any fixed vector e > 0 (e.g., e = (1, 1, . . . , 1)^T) and v = (A − P)^{-1} e. Since (A − P)^{-1} ≥ O and no row of (A − P)^{-1} can have all null entries, we get v > 0. By the same arguments, M_ℓ^{-1} e > 0, ℓ = 1, . . . , L. Then, we have from (16) that

T_i^{(ℓ)} v = v − Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} e = v − M_ℓ^{-1} e − Σ_{j=1}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} e.

Because e > 0 and the matrices M_ℓ^{-1}N_ℓ and M_ℓ^{-1} are nonnegative, it follows that T_i^{(ℓ)} v ≤ v − M_ℓ^{-1} e. Moreover, since T_i^{(ℓ)} v ≥ 0 and v − M_ℓ^{-1} e < v, there exist constants 0 ≤ θ_ℓ < 1, ℓ = 1, . . . , L, such that T_i^{(ℓ)} v ≤ θ_ℓ v, ℓ = 1, . . . , L, i = 1, 2, . . . . Hence, setting θ = max_{1≤ℓ≤L} {θ_ℓ}, we get Σ_{ℓ=1}^{L} E_ℓ T_i^{(ℓ)} v ≤ θv. By Lemma 2.5, this implies that the sequence of error vectors tends to zero, and the proof is complete. □
As was pointed out in [23], the hypothesis on the outer splittings in Theorem 3.2 can be relaxed: it suffices that A = B_ℓ − C_ℓ also be weak regular and that, in addition, M_ℓ^{-1}C_ℓ ≥ O; see (14).
Theorem 3.3. Let A ∈ Rn×n be an H-matrix. Let the splittings A = B` − C` , and B` = M` − N` , ` = 1, . . . , L, be H-compatible. Assume further that Φ : Rn → Rn is a P -bounded mapping such that ρ(< A >−1 P ) < 1. Then, the nonlinear two-stage multisplitting Algorithm 2 converges to the unique solution of the mildly nonlinear system (1), for any initial vector x0 and any sequence of numbers of inner iterations q(`, i) ≥ 1, ` = 1, . . . , L, i = 1, 2, . . . .
Proof. By Lemma 2.4, the matrices M_ℓ, ℓ = 1, . . . , L, are H-matrices. Therefore, using Lemma 2.2, we have the inequalities |M_ℓ^{-1}| ≤ <M_ℓ>^{-1}. Thus, if ε_i = x_i − x⋆ is the error at the ith iteration of Algorithm 2, it follows from (13) that

|ε_i| ≤ Σ_{ℓ=1}^{L} E_ℓ [ |M_ℓ^{-1}N_ℓ|^{q(ℓ,i)} |ε_{i−1}| + Σ_{j=0}^{q(ℓ,i)−1} |M_ℓ^{-1}N_ℓ|^j |M_ℓ^{-1}| (|C_ℓ| |ε_{i−1}| + |Φ(x_{i−1}) − Φ(x⋆)|) ] ≤ Σ_{ℓ=1}^{L} E_ℓ T̃_i^{(ℓ)} |ε_{i−1}|, i = 1, 2, . . . ,

where

(17)    T̃_i^{(ℓ)} = (<M_ℓ>^{-1}|N_ℓ|)^{q(ℓ,i)} + Σ_{j=0}^{q(ℓ,i)−1} (<M_ℓ>^{-1}|N_ℓ|)^j <M_ℓ>^{-1} (|C_ℓ| + P).
The matrices Σ_{ℓ=1}^{L} E_ℓ T̃_i^{(ℓ)} can be considered as the matrices used in the proof of Theorem 3.2 to bound the error vector ε̃_i (i.e., |ε̃_i| ≤ Σ_{ℓ=1}^{L} E_ℓ T̃_i^{(ℓ)} |ε̃_{i−1}|) of an iterative process corresponding to Algorithm 2 for the solution of the mildly nonlinear system <A>x = Φ(x), with outer splittings <A> = <B_ℓ> − |C_ℓ| and inner splittings <B_ℓ> = <M_ℓ> − |N_ℓ|, ℓ = 1, . . . , L. That system and these splittings satisfy the hypotheses of Theorem 3.2, and therefore we can affirm that there exist a positive vector ṽ and a constant 0 ≤ θ < 1 such that Σ_{ℓ=1}^{L} E_ℓ T̃_i^{(ℓ)} ṽ ≤ θṽ. Hence, by Lemma 2.5 the proof is complete. □
In the case that all the outer splittings in Algorithm 2 are the same, i.e., B_ℓ = B, C_ℓ = C, ℓ = 1, . . . , L, e.g., for the two-stage version of Algorithm 1, the assumption in Theorem 3.3 that the outer splittings be H-compatible can be replaced by the less restrictive hypothesis that they be H-splittings.

The next convergence result corresponds to the Type-I and Type-II block H-matrices. Its proof is similar to those of Theorems 3.2 and 3.3, but note that many of the inequalities are in R^L and not in R^n.

Theorem 3.4. Let A ∈ H_B^I (H_B^II) ⊂ L_{n,I}(n_1, n_2, . . . , n_L). Let the splittings A = B_ℓ − C_ℓ and B_ℓ = M_ℓ − N_ℓ, ℓ = 1, 2, . . . , L, be H_B^I-compatible (H_B^II-compatible and such that D(M_ℓ) = D(B_ℓ) = D(A)), and let the weighting matrices E_ℓ, ℓ = 1, 2, . . . , L, satisfying (4), satisfy in addition Σ_{ℓ=1}^{L} [E_ℓ] ≤ I, the L × L identity matrix. Assume further that Φ : R^n → R^n is a block P-bounded mapping such that ρ(⟨A⟩^{-1}P) < 1 (ρ(⟨⟨A⟩⟩^{-1}[D(A)^{-1}]P) < 1). Then, the nonlinear two-stage multisplitting Algorithm 2 converges to the unique solution of the mildly nonlinear system
(1), for any initial vector x_0 and any sequence of numbers of inner iterations q(ℓ, i) ≥ 1, ℓ = 1, . . . , L, i = 1, 2, . . . .

Proof. From Lemma 2.7 it follows that (both in the Type-I and Type-II cases) there exists a unique vector x⋆ ∈ R^n such that Ax⋆ = Φ(x⋆). Thus, if we let ε_i = x_i − x⋆ be the error at the ith iteration of Algorithm 2, then (13) holds.

For the Type-I case, we note that the H_B^I-compatibility of the splittings implies that M_ℓ ∈ H_B^I, and thus by Lemma 2.2 (b) we have that [M_ℓ^{-1}] ≤ ⟨M_ℓ⟩^{-1}, ℓ = 1, 2, . . . , L. Using this last inequality, taking block absolute values on both sides of (13), and applying Lemma 2.1 and the block P-boundedness of Φ, we obtain the inequality

(18)    [ε_i] ≤ Σ_{ℓ=1}^{L} [E_ℓ] T̃_{i,I}^{(ℓ)} [ε_{i−1}], i = 1, 2, . . . ,

where

(19)    T̃_{i,I}^{(ℓ)} = (⟨M_ℓ⟩^{-1}[N_ℓ])^{q(ℓ,i)} + Σ_{j=0}^{q(ℓ,i)−1} (⟨M_ℓ⟩^{-1}[N_ℓ])^j ⟨M_ℓ⟩^{-1} ([C_ℓ] + P)

is a nonnegative matrix in R^{L×L}. After some manipulations we obtain the following identity, cf. (16):

T̃_{i,I}^{(ℓ)} = I − Σ_{j=0}^{q(ℓ,i)−1} (⟨M_ℓ⟩^{-1}[N_ℓ])^j ⟨M_ℓ⟩^{-1} (⟨A⟩ − P).

Furthermore, since ⟨A⟩^{-1} and P are nonnegative matrices and, by hypothesis, ρ(⟨A⟩^{-1}P) < 1, the matrix ⟨A⟩ − P is monotone. Now, in a way similar to the proof of Theorem 3.2, we can deduce that there exist a positive vector v ∈ R^L and a nonnegative constant θ ∈ [0, 1) such that

T̃_{i,I}^{(ℓ)} v ≤ θv, ℓ = 1, 2, . . . , L, i = 1, 2, . . . .

By Lemma 2.5, we immediately get [ε_i] → 0 as i → ∞. This implies that ε_i → 0 as i → ∞, completing the proof of the Type-I case.

For the Type-II case, let us denote P̃ = [D(A)^{-1}]P and, for ℓ = 1, 2, . . . , L, B̃_ℓ = D(A)^{-1}B_ℓ, C̃_ℓ = D(A)^{-1}C_ℓ, M̃_ℓ = D(A)^{-1}M_ℓ, and Ñ_ℓ = D(A)^{-1}N_ℓ. With this notation, observe that M̃_ℓ ∈ H_B^I, and that by Lemma 2.2 (b) we have [M̃_ℓ^{-1}] ≤ ⟨M̃_ℓ⟩^{-1}, ℓ = 1, 2, . . . , L. Using these relations, taking block absolute values on both sides of (13) as before, after
inserting D(A)D(A)^{-1} in the appropriate places, we obtain

[ε_i] ≤ Σ_{ℓ=1}^{L} [E_ℓ] [ (⟨M̃_ℓ⟩^{-1}[Ñ_ℓ])^{q(ℓ,i)} + Σ_{j=0}^{q(ℓ,i)−1} (⟨M̃_ℓ⟩^{-1}[Ñ_ℓ])^j ⟨M̃_ℓ⟩^{-1} ([C̃_ℓ] + P̃) ] [ε_{i−1}].
This expression has the same form as (18), with matrices of the same structure as (19). To complete the proof, we note that the splittings D(A)^{-1}A = B̃_ℓ − C̃_ℓ and B̃_ℓ = M̃_ℓ − Ñ_ℓ are H_B^I-compatible and correspond to the solution of the system D(A)^{-1}Ax = D(A)^{-1}Φ(x), which satisfies the hypotheses of the Type-I case. □

We should point out that the hypotheses of Theorem 3.4 can be changed to apply to the solution of another nonlinear system

(20)    Hx = Φ(x),

where the matrix H ∈ L_{n,I} is equimodular to A, i.e., it belongs to the set Ω_B^I(A) = {H = (H_{ij}) ∈ L_{n,I} | ‖H_{ii}^{-1}‖ = ‖A_{ii}^{-1}‖, ‖H_{ij}‖ = ‖A_{ij}‖, i ≠ j, i, j = 1, . . . , L} in the Type-I case, and to the set Ω_B^II(A) = {H = (H_{ij}) ∈ L_{n,I} | ‖H_{ii}^{-1}H_{ij}‖ = ‖A_{ii}^{-1}A_{ij}‖, i, j = 1, . . . , L} in the Type-II case; see [2]. Furthermore, Theorem 3.4 also applies, essentially with the same proof, to matrices A (or H) which may not have a nonsingular D(A), or which do not belong to H_B^I or H_B^II, but for which there exist nonsingular matrices R and S such that RAS has those properties; see [2]. These comments apply to Theorem 4.4 in the next section as well.

We conclude this section by remarking that for the inner iterations one can choose, in addition to the classical splittings corresponding to Jacobi, Gauss-Seidel, etc., relaxed methods such as JOR, SOR, AOR, etc. In the latter cases, the relaxation parameters need to be chosen in such a way that the splittings satisfy the hypotheses of Theorems 3.2, 3.3 and 3.4. This was done for the point methods in [1], and the same considerations carry through to the block methods studied here.

4. Asynchronous iterations

The computational model we consider for the asynchronous iterations is as follows. Each processor, say the ℓth processor, starts a cycle of computations by collecting the most recent vectors computed by the other processors, say the processors k, k ≠ ℓ. Let us call this cycle the ith iteration (which will in general be different from the ith iteration in the synchronous case). Thus, the
iteration subscript is increased every time a processor starts a new cycle of computations. The vectors computed by the other processors will be from older cycles, some from cycle i − 1, but many others older than that; we call these earlier cycles r(k, i), i.e., the cycle in which processor k computed the vector used at the beginning of the ith cycle (from this definition, condition (21) below follows directly). In other words, in order to compute a new vector x_i^{(ℓ)}, the ℓth processor collects the vectors x_{r(k,i)}^{(k)} and uses the weighting matrices to form the vector Σ_{k=1}^{L} E_k x_{r(k,i)}^{(k)}. This sequence of weighted vectors can be considered in practice as the sequence of iterate vectors. Recall that in fact not all the components of the vectors x_{r(k,i)}^{(k)} are needed in the computations, so that the local storage is of order n and not nL.

Formally, we define the sets J_i ⊆ {1, 2, . . . , L}, i = 1, 2, . . . , by ℓ ∈ J_i if the ℓth processor starts its computation of a new iterate at the ith step. As is customary in the description and analysis of asynchronous algorithms, we assume that the subscripts r(ℓ, i) and the sets J_i satisfy the following conditions. They appear as classical conditions in convergence results for asynchronous iterations; see e.g., [5], [8], [15], [20].

(21)    r(ℓ, i) < i for all ℓ = 1, 2, . . . , L, i = 1, 2, . . . .
(22)    lim_{i→∞} r(ℓ, i) = ∞ for all ℓ = 1, 2, . . . , L.
(23)    The set {i | ℓ ∈ J_i} is unbounded for all ℓ = 1, 2, . . . , L.
With this notation, the asynchronous counterpart of Algorithm 2 can be described by the following algorithm.

Algorithm 3 (Nonlinear Asynchronous Two-stage Multisplitting). Given the initial vectors x_0^{(ℓ)} = x_0, ℓ = 1, . . . , L.

For i = 1, 2, . . .

(24)    x_i^{(ℓ)} = x_{i−1}^{(ℓ)} if ℓ ∉ J_i, and

x_i^{(ℓ)} = (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} Σ_{k=1}^{L} E_k x_{r(k,i)}^{(k)} + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} ( C_ℓ Σ_{k=1}^{L} E_k x_{r(k,i)}^{(k)} + Φ( Σ_{k=1}^{L} E_k x_{r(k,i)}^{(k)} ) ) if ℓ ∈ J_i.
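The effect of the model (24) can be simulated serially by keeping a history of iterates and reading randomly delayed entries from it; the sketch below (not from the paper, with hypothetical data and a bounded random delay standing in for true processor asynchrony) does this for the block Jacobi-type choice of splittings, with J_i a single block per step.

```python
import numpy as np

def async_two_stage(A, phi, blocks, x0, steps=400, q=3, max_delay=2, seed=0):
    """Serial simulation of Algorithm 3: at step i a single 'processor' l
    (J_i = {l}) updates its block from a randomly delayed earlier iterate,
    so that r(k, i) < i as required by condition (21)."""
    rng = np.random.default_rng(seed)
    n = len(x0)
    history = [x0.copy()]              # history[i] holds iterate i
    x = x0.copy()
    for i in range(1, steps + 1):
        l = (i - 1) % len(blocks)      # each block updated infinitely often (23)
        d = int(rng.integers(0, max_delay + 1))
        x_read = history[max(0, i - 1 - d)]   # delayed information
        I = blocks[l]
        K = np.setdiff1d(np.arange(n), I)
        # q inner point Jacobi iterations on the l-th diagonal block system
        rhs = phi(x_read)[I] - A[np.ix_(I, K)] @ x_read[K]
        M = np.diag(np.diag(A[np.ix_(I, I)]))
        N = M - A[np.ix_(I, I)]
        y = x_read[I].copy()
        for _ in range(q):
            y = np.linalg.solve(M, N @ y + rhs)
        x = x.copy()
        x[I] = y                       # the other blocks keep their old values
        history.append(x)
    return x

# Hypothetical test problem, as in the synchronous sketches.
A = np.array([[ 4., -1.,  0.,  0.],
              [-1.,  4., -1.,  0.],
              [ 0., -1.,  4., -1.],
              [ 0.,  0., -1.,  4.]])
phi = lambda x: np.tanh(x) + 1.0
x = async_two_stage(A, phi, [np.array([0, 1]), np.array([2, 3])], np.zeros(4))
residual = np.linalg.norm(A @ x - phi(x))
```

In a real implementation each ℓ runs on its own processor and the `history` list is replaced by whatever iterate currently sits in shared memory; the simulation only mimics the resulting pattern of stale reads.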
In order to analyze Algorithm 3, we consider the operators G(i) = (G^{(1)}(i), . . . , G^{(L)}(i)), with G^{(ℓ)}(i) : R^{nL} → R^n defined, for ŷ ∈ R^{nL}, ℓ = 1, . . . , L, and i = 1, 2, . . . , by

G^{(ℓ)}(i)(ŷ) = (M_ℓ^{-1}N_ℓ)^{q(ℓ,i)} Qŷ + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{-1}N_ℓ)^j M_ℓ^{-1} (C_ℓ Qŷ + Φ(Qŷ)),

where Q = [E_1, . . . , E_ℓ, . . . , E_L] ∈ R^{n×nL}. The asynchronous iteration (24) can then be rewritten as the following iteration.

(25)    x_i^{(ℓ)} = x_{i−1}^{(ℓ)} if ℓ ∉ J_i, and x_i^{(ℓ)} = G^{(ℓ)}(i)(x_{r(1,i)}^{(1)}, . . . , x_{r(ℓ,i)}^{(ℓ)}, . . . , x_{r(L,i)}^{(L)}) if ℓ ∈ J_i.

The following lemma, which is a special case of Theorem 3.2 in [26], is used in our convergence proofs.

Lemma 4.1. Let G(i) be a sequence of operators on R^{nL} having a common fixed point x̂⋆. Let ‖ · ‖_ℓ be a norm on R^n, ℓ = 1, . . . , L. Let a ∈ R^L, a > 0, and denote by ‖ · ‖_a the weighted max-norm ‖x̂‖_a = max_{1≤ℓ≤L} { (1/a_ℓ) ‖x̂^{(ℓ)}‖_ℓ }. For all i = 1, 2, . . . , assume that there exists a constant 0 ≤ α < 1 such that ‖G(i)x̂ − x̂⋆‖_a ≤ α‖x̂ − x̂⋆‖_a, for all x̂ ∈ R^{nL}. Assume further that the sequence r(ℓ, i) and the sets J_i, ℓ = 1, . . . , L, i = 1, 2, . . . , satisfy conditions (21)–(23). Then the asynchronous iteration (25) converges to x̂⋆ for any initial vector x̂_0.

Theorem 4.2. Let A ∈ R^{n×n} be a monotone matrix. Let the splittings A = B_ℓ − C_ℓ, ℓ = 1, . . . , L, be regular and the splittings B_ℓ = M_ℓ − N_ℓ, ℓ = 1, . . . , L, be weak regular. Suppose that Φ : R^n → R^n is a P-bounded mapping such that ρ(A^{-1}P) < 1. Assume further that the sequence r(ℓ, i) and the sets J_i, ℓ = 1, . . . , L, i = 1, 2, . . . , satisfy conditions (21)–(23). Then, the nonlinear asynchronous two-stage multisplitting Algorithm 3 converges to (x⋆^T, . . . , x⋆^T)^T ∈ R^{nL}, where x⋆ is the unique solution of the mildly nonlinear system (1), for any initial vectors x_0^{(ℓ)}, ℓ = 1, . . . , L, and any sequence of numbers of inner iterations q(ℓ, i) ≥ 1, ℓ = 1, . . . , L, i = 1, 2, . . . .

Proof. From the proof of Theorem 3.2 we know that there exist a positive vector v and a constant 0 ≤ θ < 1 such that

(26)    T_i^{(ℓ)} v ≤ θv, ℓ = 1, . . . , L, i = 1, 2, . . . ,

where the matrices T_i^{(ℓ)} are defined in (15). Let us define v̂ = (v^T, . . . , v^T)^T ∈ R^{nL} and x̂⋆ = (x⋆^T, . . . , x⋆^T)^T ∈ R^{nL}. As x̂⋆ is a fixed point of G(i), it is easy to prove that

|G^{(ℓ)}(i)x̂ − x⋆| ≤ T_i^{(ℓ)} Q |x̂ − x̂⋆|, ℓ = 1, . . . , L, i = 1, 2, . . . ,
for all x̂ ∈ R^{nL}. Thus

(27)  |G(i)x̂ − x̂_*| ≤ H(i)|x̂ − x̂_*|,  i = 1, 2, … ,

where

(28)  H(i) = [T_i^{(1)}Q; … ; T_i^{(L)}Q] ∈ R^{nL×nL}

is the matrix whose ℓth block of n rows is T_i^{(ℓ)}Q. From (28) and (26), it follows that H(i)v̂ ≤ θv̂. Hence, using the monotonic vector norm ‖·‖_v̂ defined in (3), we obtain ‖H(i)‖_v̂ ≤ θ, and then, by (27),

‖G(i)x̂ − x̂_*‖_v̂ ≤ θ‖x̂ − x̂_*‖_v̂,  i = 1, 2, … .

Since the norm ‖ŷ‖_v̂ on R^{nL} can be expressed in the form ‖ŷ‖_v̂ = max_{1≤ℓ≤L} ‖ŷ^{(ℓ)}‖_v, for any ŷ = (ŷ^{(1)T}, … , ŷ^{(L)T})^T, the convergence follows from Lemma 4.1. □
Again, as with the hypotheses of Theorem 3.2, in Theorem 4.2 the outer splittings need not be regular, but only weak regular with the additional hypothesis (14). We remark that an alternative proof could be obtained using the theory of paracontracting operators; see [16].

Theorem 4.3. Let A ∈ R^{n×n} be an H-matrix. Let the splittings A = B_ℓ − C_ℓ and B_ℓ = M_ℓ − N_ℓ, ℓ = 1, … , L, be H-compatible. Suppose that Φ : R^n → R^n is a P-bounded mapping such that ρ(⟨A⟩^{−1}P) < 1. Assume further that the sequence r(ℓ, i) and the sets J_i, ℓ = 1, … , L, i = 1, 2, … , satisfy conditions (21)–(23). Then, the nonlinear asynchronous two-stage multisplitting Algorithm 3 converges to (x_*^T, … , x_*^T)^T ∈ R^{nL}, where x_* is the unique solution of the mildly nonlinear system (1), for any initial vectors x_0^{(ℓ)}, ℓ = 1, … , L, and any sequence of numbers of inner iterations q(ℓ, i) ≥ 1, ℓ = 1, … , L, i = 1, 2, … .

Proof. Using the same notation as in Theorem 4.2 we can prove, for all x̂ ∈ R^{nL}, that

|G^{(ℓ)}(i)x̂ − x_*| ≤ T̃_i^{(ℓ)} Q |x̂ − x̂_*|,  ℓ = 1, … , L,  i = 1, 2, … ,

where the matrices T̃_i^{(ℓ)} are defined in (17). By the proof of Theorem 3.3 we have T̃_i^{(ℓ)} ṽ ≤ θṽ for some ṽ ∈ R^n, ṽ > 0, and ℓ = 1, … , L, and thus the theorem follows in the same manner as Theorem 4.2. □
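To make the structure of these two-stage multisplitting iterations concrete, the following is a minimal sequential sketch of the synchronous counterpart of the method analyzed above, on toy data chosen to satisfy the hypotheses of Theorem 4.2. Everything here is illustrative: the matrix A, the choice Φ(x) = 0.1 sin(x) + 1, the two block splittings, and the helper names `two_stage_multisplitting` and `make_splitting` are our own assumptions, not objects defined in the paper.

```python
import numpy as np

def two_stage_multisplitting(A, Phi, splittings, weights, q=3, iters=2000, tol=1e-12):
    """Synchronous two-stage multisplitting sweep: for each splitting
    A = B_l - C_l, B_l = M_l - N_l, run q inner iterations on
    B_l y = C_l x + Phi(x) starting from x, then combine with the E_l."""
    x = np.zeros(A.shape[0])
    for _ in range(iters):
        updates = []
        for (M, N, C) in splittings:
            b = C @ x + Phi(x)              # outer data, frozen during the inner loop
            y = x.copy()
            for _ in range(q):              # inner stationary iterations
                y = np.linalg.solve(M, N @ y + b)
            updates.append(y)
        x_new = sum(E @ y for E, y in zip(weights, updates))
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

# Toy data satisfying the hypotheses of Theorem 4.2 (assumed for illustration):
# A = tridiag(-1, 2, -1) is an M-matrix (hence monotone), and
# Phi(x) = 0.1*sin(x) + 1 is P-bounded with P = 0.1*I and rho(A^{-1}P) < 1.
n = 8
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
Phi = lambda x: 0.1*np.sin(x) + 1.0

# L = 2 splittings: B_l keeps the l-th diagonal block of A (and the diagonal
# elsewhere), a regular outer splitting; M_l = diag(B_l) gives a weak regular
# inner splitting.
def make_splitting(rows):
    B = np.diag(np.diag(A))
    B[np.ix_(rows, rows)] = A[np.ix_(rows, rows)]
    M = np.diag(np.diag(B))
    return M, M - B, B - A                  # (M_l, N_l, C_l)

half = n // 2
splittings = [make_splitting(range(half)), make_splitting(range(half, n))]
E1 = np.diag([1.0]*half + [0.0]*half)
weights = [E1, np.eye(n) - E1]              # partition of the identity

x = two_stage_multisplitting(A, Phi, splittings, weights)
print(np.linalg.norm(A @ x - Phi(x), np.inf))  # residual of Ax = Phi(x), tiny
```

In a parallel implementation each splitting's inner loop would run on its own processor; here the loop over splittings simply plays both roles in turn.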
Analogously to Theorems 4.3 and 3.4, we can establish the following convergence theorem for the systems (1) and (20). Since this theorem can be proved in the same way as Theorem 3.4, with only slight modifications, we omit its proof.
Theorem 4.4. Let A ∈ H_B^I (H_B^{II}) ⊂ L_{n,I}(n_1, n_2, … , n_L). Let the splittings A = B_ℓ − C_ℓ and B_ℓ = M_ℓ − N_ℓ, ℓ = 1, 2, … , L, be H_B^I-compatible (H_B^{II}-compatible and such that D(M_ℓ) = D(B_ℓ) = D(A)), and let the weighting matrices E_ℓ, ℓ = 1, 2, … , L, satisfying (4), satisfy in addition Σ_{ℓ=1}^{L} [E_ℓ] ≤ I, the L × L identity matrix. Assume further that Φ : R^n → R^n is a block P-bounded mapping such that ρ(⟨A⟩^{−1}P) < 1 (ρ(⟨⟨A⟩⟩^{−1}[D(A)^{−1}]P) < 1). Then, the nonlinear asynchronous two-stage multisplitting Algorithm 3 converges to (x_*^T, … , x_*^T)^T ∈ R^{nL}, where x_* is the unique solution of the mildly nonlinear system (1), for any initial vectors x_0^{(ℓ)}, ℓ = 1, … , L, and any sequence of numbers of inner iterations q(ℓ, i) ≥ 1, ℓ = 1, … , L, i = 1, 2, … .

We point out that when Φ(x) = b ∈ R^n, i.e., when system (1) is linear, Algorithm 3 reduces to Algorithm 6 in [13], and thus Theorems 4.2, 4.3 and 4.4 apply to that case as well. In fact, the results here are more general, since we do not assume that the weighting matrices form a partition of the identity, i.e., that the entries of each E_ℓ are 0 or 1. In particular, Theorems 4.2, 4.3 and 4.4 provide general convergence results for two-stage multisplitting methods with overlap.

We end the paper with a discussion of a different asynchronous computational model, analogous to the one used in [11] for linear systems. Namely, for i = 0, 1, 2, … , and writing ℓ = j_i,

(29)  x_{i+r_i} = (I − E_{j_i}) x_{i+r_i−1} + E_{j_i} [ (M_ℓ^{−1}N_ℓ)^{q(ℓ,i)} x_i + Σ_{j=0}^{q(ℓ,i)−1} (M_ℓ^{−1}N_ℓ)^j M_ℓ^{−1}(C_ℓ x_i + Φ(x_i)) ],
where {j_i}_{i=0}^∞, 1 ≤ j_i ≤ L, is a sequence of integers that indicates the processor which updates the approximation to the solution at the ith iteration, and r_i − 1 is the number of times that processors other than the j_i th processor update the approximation of the solution during the time interval in which the j_i th processor's calculations are performed. The computational model (29) is based on Model B of [10], and conditions (21)–(23) are replaced by the assumption that the sequence {j_i}_{i=0}^∞ is regulated; see e.g. [13] for differences and analogies between both sets of conditions. The proof of the convergence of the iteration (29), with the same hypotheses as in Theorems 3.2, 3.3 and 3.4, follows in a similar way as the proof of [31, Theorem 3.2], which in turn is based on [10, Theorem 2.2], and as the proof of [4, Theorem 1].
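The model (29) can be simulated sequentially: at step i a round-robin (hence regulated) processor j_i overwrites its weighted components using a stale iterate, while the remaining components are carried over from the latest iterate. The sketch below is illustrative only; the toy data (A, Φ, the two splittings), the fixed delay, and the helper names `make_splitting` and `bracket` are our own assumptions.

```python
import numpy as np

# Toy data of the kind covered by Theorem 4.2 (assumed for illustration):
n = 8
A = 2.0*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # monotone M-matrix
Phi = lambda x: 0.1*np.sin(x) + 1.0                     # P-bounded, rho(A^{-1}P) < 1

half = n // 2
def make_splitting(rows):
    B = np.diag(np.diag(A))
    B[np.ix_(rows, rows)] = A[np.ix_(rows, rows)]
    M = np.diag(np.diag(B))
    return M, M - B, B - A                               # (M_l, N_l, C_l)

sp = [make_splitting(range(half)), make_splitting(range(half, n))]
E = [np.diag([1.0]*half + [0.0]*half),
     np.diag([0.0]*half + [1.0]*half)]                   # partition of the identity

def bracket(x, l, q=3):
    """The bracketed term of (29): q inner iterations for splitting l,
    using outer data frozen at the (stale) iterate x."""
    M, N, C = sp[l]
    b = C @ x + Phi(x)
    y = x.copy()
    for _ in range(q):
        y = np.linalg.solve(M, N @ y + b)
    return y

# Sequential simulation of (29): processor j_i = i mod L updates (a regulated
# sequence), reading an iterate that is `delay` steps old, as in Model B.
delay = 4
hist = [np.zeros(n)]                                     # x_0, x_1, ... as produced
for i in range(4000):
    j = i % 2
    stale = hist[max(0, len(hist) - 1 - delay)]          # old data seen by processor j
    hist.append((np.eye(n) - E[j]) @ hist[-1] + E[j] @ bracket(stale, j))

x = hist[-1]
print(np.linalg.norm(A @ x - Phi(x), np.inf))            # residual of Ax = Phi(x)
```

With a partition of the identity, as here, this produces the same kind of sequence as Algorithm 3; the difference described below appears only when the E_ℓ overlap.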
The two asynchronous models (24) and (29) produce the same sequence of iterate vectors when the weighting matrices E_ℓ, ℓ = 1, … , L, form a partition of the identity, but when there is overlap the models are different. In Algorithm 3, the components calculated by the ℓth processor use only information calculated by the other processors (k ≠ ℓ). In model (29), however, the ℓth processor introduces into its computations some older information computed by the ℓth processor itself in a previous step.

References

1. Bai, Z.-Z. (1997): A class of two-stage iterative methods for systems of weakly nonlinear equations. Numerical Algorithms 14, 295–319
2. Bai, Z.-Z. (1995): Parallel matrix multisplitting block relaxation methods. Mathematica Numerica Sinica 17, 238–252. In Chinese
3. Bai, Z.-Z. (1998): A class of blockwise matrix multi-splitting multi-parameter block relaxation methods. In: Shi, Z.-C., Mori, M. (eds.) Proceedings of the Third China-Japan Joint Seminar on Numerical Mathematics, Science Press of China, Beijing, pp. 1–10
4. Bai, Z.-Z. (1997): Asynchronous parallel matrix multisplitting block relaxation iteration methods. Numerical Mathematics, A Journal of Chinese Universities (Chinese Series) 19, 28–39. In Chinese
5. Baudet, G.M. (1978): Asynchronous iterative methods for multiprocessors. Journal of the Association for Computing Machinery 25, 226–244
6. Benzi, M., Choi, H., Szyld, D.B. (1997): Threshold ordering for preconditioning nonsymmetric problems. In: Golub, G., et al. (eds.) Scientific Computing, Proceedings of the Workshop, 10–12 March 1997, Hong Kong, Springer, Singapore, pp. 159–165
7. Berman, A., Plemmons, R.J. (1994): Nonnegative Matrices in the Mathematical Sciences. Academic Press, New York, third edition, 1979. Reprinted by SIAM, Philadelphia
8. Bertsekas, D.P., Tsitsiklis, J.N. (1989): Parallel and Distributed Computation. Prentice-Hall, Englewood Cliffs, New Jersey
9. Bhaya, A., Kaszkurewicz, E., Mota, F. (1991): Asynchronous block-iterative methods for almost linear equations. Linear Algebra and its Applications 154–156, 487–508
10. Bru, R., Elsner, L., Neumann, M. (1988): Models of parallel chaotic iteration methods. Linear Algebra and its Applications 103, 175–192
11. Bru, R., Migallón, V., Penadés, J. (1994): Chaotic inner-outer iterative schemes. In: Lewis, J.G. (ed.) Proceedings of the Fifth SIAM Conference on Applied Linear Algebra, pp. 434–438. SIAM, Philadelphia
12. Bru, R., Migallón, V., Penadés, J. (1995): Chaotic methods for the parallel solution of linear systems. Computing Systems in Engineering 6, 385–390
13. Bru, R., Migallón, V., Penadés, J., Szyld, D.B. (1995): Parallel, synchronous and asynchronous two-stage multisplitting methods. Electronic Transactions on Numerical Analysis 3, 24–38
14. Dembo, R.S., Eisenstat, S.C., Steihaug, T. (1982): Inexact Newton methods. SIAM Journal on Numerical Analysis 19, 400–408
15. El Tarazi, M.N. (1982): Some convergence results for asynchronous algorithms. Numerische Mathematik 39, 325–340. In French
16. Elsner, L., Koltracht, I., Neumann, M. (1992): Convergence of sequential and asynchronous nonlinear paracontractions. Numerische Mathematik 62, 305–319
17. Evans, D.J., Bai, Z.-Z. (1997): Blockwise matrix multi-splitting multi-parameter block relaxation methods. International Journal of Computer Mathematics 64, 103–118
18. Feingold, D.G., Varga, R.S. (1962): Block diagonally dominant matrices and generalizations of the Gerschgorin circle theorem. Pacific Journal of Mathematics 12, 1241–1250
19. Frommer, A. (1989): Parallel nonlinear multisplitting methods. Numerische Mathematik 56, 269–282
20. Frommer, A. (1991): On asynchronous iterations in partially ordered spaces. Numerical Functional Analysis and Optimization 12, 315–325
21. Frommer, A., Mayer, G. (1992): On the theory and practice of multisplitting methods in parallel computation. Computing 49, 63–74
22. Frommer, A., Pohl, B. (1995): A comparison result for multisplittings and waveform relaxation methods. Numerical Linear Algebra with Applications 2, 335–346
23. Frommer, A., Schwandt, H. (1997): A unified representation and theory of algebraic additive Schwarz and multisplitting methods. SIAM Journal on Matrix Analysis and Applications 18, 893–912
24. Frommer, A., Schwandt, H., Szyld, D.B. (1997): Asynchronous weighted additive Schwarz methods. Electronic Transactions on Numerical Analysis 5, 48–61
25. Frommer, A., Szyld, D.B. (1992): H-splittings and two-stage iterative methods. Numerische Mathematik 63, 345–356
26. Frommer, A., Szyld, D.B. (1994): Asynchronous two-stage iterative methods. Numerische Mathematik 69, 141–153
27. Fuster, R., Migallón, V., Penadés, J. (1996): Parallel chaotic extrapolated Jacobi-like methods. Linear Algebra and its Applications 247, 237–250
28. Huang, C.-M., O'Leary, D.P. (1993): A Krylov multisplitting algorithm for solving linear systems of equations. Linear Algebra and its Applications 194, 9–29
29. Jones, M.T., Szyld, D.B. (1996): Two-stage multisplitting methods with overlapping blocks. Numerical Linear Algebra with Applications 3, 113–124
30. Householder, A.S. (1975): The Theory of Matrices in Numerical Analysis. Blaisdell, Waltham, Mass., 1964. Reprinted by Dover, New York
31. Mas, J., Migallón, V., Penadés, J., Szyld, D.B. (1996): Non-stationary parallel relaxed multisplitting methods. Linear Algebra and its Applications 241/243, 733–748
32. Moré, J.J. (1971): Global convergence of Newton-Gauss-Seidel methods. SIAM Journal on Numerical Analysis 8, 325–336
33. Neumaier, A. (1984): New techniques for the analysis of linear interval equations. Linear Algebra and its Applications 58, 273–325
34. O'Leary, D.P., White, R.E. (1985): Multi-splittings of matrices and parallel solution of linear systems. SIAM Journal on Algebraic and Discrete Methods 6, 630–640
35. O'Neil, J., Szyld, D.B. (1990): A block ordering method for sparse matrices. SIAM Journal on Scientific and Statistical Computing 11, 811–823
36. Ortega, J.M., Rheinboldt, W.C. (1970): Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, San Diego
37. Ostrowski, A.M. (1937): Über die Determinanten mit überwiegender Hauptdiagonale. Commentarii Mathematici Helvetici 10, 69–96
38. Polman, B. (1987): Incomplete blockwise factorization of (block) H-matrices. Linear Algebra and its Applications 90, 119–132
39. Renaut, R.A., Mittelmann, H.D., He, Q. (1994): Parallel multisplittings: overview and extensions. In: Lewis, J.G. (ed.) Proceedings of the Fifth SIAM Conference on Applied Linear Algebra, pp. 34–38. SIAM, Philadelphia
40. Robert, F. (1969): Blocs H-matrices et convergence des méthodes itératives classiques par blocs. Linear Algebra and its Applications 2, 223–265
41. Robert, F., Charnay, M., Musy, F. (1975): Itérations chaotiques série-parallèle pour des équations non-linéaires de point fixe. Aplikace Matematiky 20, 1–38
42. Varga, R.S. (1962): Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs, New Jersey